**5.1 Static Resource-Pool-Single Partition (SRPSP) policy**

In SRPSP, the TRPS algorithm has two phases: a *mapping phase* and an *execution phase*.

The mapping phase (Fig. 2) is performed before the execution of the first task. Each task in T has a given resource-pool Δ. Thus, for each task Ti ∈ T, the resource-pool Γi = Δ. In the *mapping phase*, a mapping between each task and the most appropriate set of resources it needs is determined. To create this mapping, TRPS iteratively calls the algorithm at the RA for each task in T.
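The mapping step described above can be sketched as follows. This is only an illustrative sketch, not the chapter's implementation: the names `mapping_phase`, `first_k_allocator`, and `delta`, the use of strings to identify tasks and nodes, and the placeholder allocation rule are all assumptions, since the actual algorithm run at the RA is not specified in this section.

```python
from typing import Callable, Dict, FrozenSet, Iterable

# Hypothetical types: tasks and resources are identified by plain strings.
Pool = FrozenSet[str]
Allocator = Callable[[str, Pool], Pool]


def mapping_phase(tasks: Iterable[str], delta: Pool, allocate: Allocator) -> Dict[str, Pool]:
    """Build the task -> resource-set mapping before any task executes.

    Every task is offered the same static pool `delta`, as in SRPSP.
    """
    mapping: Dict[str, Pool] = {}
    for task in tasks:
        chosen = allocate(task, delta)
        # The allocator may only pick resources from the pool it was given.
        if not chosen <= delta:
            raise ValueError(f"allocator chose resources outside the pool for {task}")
        mapping[task] = chosen
    return mapping


def first_k_allocator(k: int) -> Allocator:
    """Placeholder allocator (an assumption): take the first k resources by name."""
    def allocate(task: str, pool: Pool) -> Pool:
        return frozenset(sorted(pool)[:k])
    return allocate


delta = frozenset({"n1", "n2", "n3", "n4"})
mapping = mapping_phase(["T1", "T2", "T3"], delta, first_k_allocator(2))
for task, resources in mapping.items():
    print(task, sorted(resources))
```

With this placeholder allocator, all three tasks are mapped to the same two nodes, while the other two nodes appear in no mapping. A real allocator would choose resources per task, but because every task is offered the same pool, overlapping mappings of this kind remain possible.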

In the *execution phase*, the first task in T is executed first. TRPS then iterates through the tasks in T and chooses the next task for which the complete set of required resources is available. Every task whose resources, as allocated by the RA, are all available starts executing; every task with at least one allocated resource not yet available waits in a queue. Once a task completes, it is removed from T and the resources it releases are added to the free resource set. The queue of waiting tasks is then checked again in First-In-First-Out (FIFO) order to see whether the resource demand of any waiting task can now be satisfied: if all the resources of a particular task are available, it starts execution, the next task in the queue is checked, and so on. When T = {}, all tasks have been assigned resources.

Fig. 2. Mapping Phase of Static Resource Pool Policies

**5.2 Static Resource-Pool-Single Partition with Backfilling (SRPSP+BF) policy**

SRPSP+BF is an improvement of SRPSP. A drawback of SRPSP is that the performance of the system may deteriorate due to two factors. First, there is contention for resources, as each task has to wait until the complete set of resources assigned to it during the mapping phase becomes available. Second, some resources may not be utilized at all, because a resource may not become part of the mapping of any task. Thus, at a particular instant there may be resources that are available but not utilized while tasks are waiting in the queue (because the complete resource-pools associated with the waiting tasks are not available).

The *mapping phase* of SRPSP+BF is similar to that of SRPSP. In the *execution phase*, SRPSP+BF starts just like SRPSP. Once all the tasks whose resources are available have started to execute, SRPSP+BF tries to take advantage of the unused resources by combining them into a single resource-pool. This resource-pool is given to the first task waiting in the queue and is passed to the Resource Allocator for resource assignment. This process is called *backfilling*. Backfilling is repeated until there is no unused resource in the system.

**6. Architectural templates for RA**

This section presents the concept of Architectural Templates that are used by the resource allocation algorithm deployed at the RA. In addition to deciding how to decompose a task into its constituent jobs, an Architectural Template divides the available resources into different entities and assigns each of them a specialized role. We have identified the following five roles that can be assigned to the entities:

1. **Source:** A single Grid node where the raw data file of Ti is located.

2. **Sink:** A set of Grid nodes where the processed file is to be delivered.

3. **Compute-Farm:** A set of Grid nodes dedicated to processing the raw data file in parallel.

4. **Data-Farm:** A set of Grid nodes used for replicating the data.

5. **Egress Node:** A node where files are combined after being processed at the compute-farm.

Note that a particular node may be part of a different entity at different times. For example, a resource may be best utilized in a compute-farm for processing a particular job at one time, thus being part of the compute-farm entity, but the same node may be used more effectively in a data-farm for processing a job of another task at another time, thus being part of a data-farm entity. For each type of task, a set of appropriate templates is given as input to the Resource Allocator. In this paper we assume that the same set of templates, described later, can be used for every task in the bag-of-tasks. Thus, an Architectural Template specifies the structure of the suggested functional domains into which the available resources are to be divided. This section briefly discusses a set of templates suitable for PBDT tasks.

**6.1 2-Tier Architectural Templates**

In 2-Tier Templates only the source and the sink nodes are used for both processing and data transfer. There are two different types of 2-Tier Templates: 2-Tier-a and 2-Tier-b. In a 2-Tier-a Template, the source node is used for data processing. Fig. 3 (a) shows the process when the 2-Tier-a architecture is used in a system. TRPS co-ordinates with the Task RA (1) and gives it a PBDT task and a resource pool (which is the set of all available nodes for this task). The Task RA sends an acknowledgment signal back to TRPS (2). The Task Workflow Engine, deployed at the Lower Level, L1, signals the source node to start the processing of data at the

