*w-TG: A Combined Algorithm to Optimize the Runtime of the Grid-Based Workflow Within an SLA Context*

**3. Related works**

• Let *R* be the set of Grid RMSs. This set includes a finite number of RMSs, which provide static information about controlled resources and the current reservations/assignments.

• Let *S* be the set of sub-jobs in a given workflow, including all sub-jobs with their resource and runtime requirements.

• Let *E* be the set of edges in the workflow, which express the dependencies between the sub-jobs and the necessity of data transfers between the sub-jobs.

• Let *Ki* be the set of resource candidates of sub-job *si*. This set includes all RMSs which can run sub-job *si*, *Ki* ⊂ *R*.

Based on the given input, the required solution includes two sets defined in Formulas 1 and 2.

*M* = {(*si*, *rj*, *start*, *stop*) | *si* ∈ *S*, *rj* ∈ *Ki*} (1)

*N* = {(*eik*, *start*, *stop*) | *eik* ∈ *E*} (2)

If the solution does not have a start, stop slot for each *si*, it becomes a configuration as defined in Formula 3.

*a* = {(*si*, *rj*) | *si* ∈ *S*, *rj* ∈ *Ki*} (3)

A feasible solution must satisfy the following conditions:

• **Criterion 1:** All *Ki* ≠ ∅. There is at least one RMS in the candidate set of each sub-job.

• **Criterion 2:** The dependencies of the sub-jobs are resolved and the execution order remains unchanged.

• **Criterion 3:** The capacity of an RMS must be equal to or greater than the requirement at any time slot. Each RMS provides a profile of currently available resources and can run many sub-jobs of a single flow both sequentially and in parallel. Those sub-jobs which run on the same RMS form a profile of resource requirements. For each RMS *rj* running sub-jobs of the Grid workflow, and for each time slot in the profile of available resources and the profile of resource requirements, the required resources must not exceed the available resources.

| ID | cpu | speed | stor | exp | s | d | bw | s | d | bw |
|----|-----|-------|------|-----|---|---|----|---|---|----|
|    | 256 | 1000 | 3000 | 4 | 1 | 2 | 1 | 2 | 1 | 1 |
|    | 128 | 1000 | 2000 | 3 | 1 | 3 | 3 | 3 | 1 | 3 |
|    | 256 | 1000 | 3000 | 4 | 1 | 4 | 2 | 4 | 1 | 2 |
|    | 256 | 1000 | 3000 | 4 | 1 | 5 | 3 | 5 | 1 | 3 |
|    | 256 | 1000 | 3000 | 4 | 1 | 6 | 2 | 6 | 1 | 2 |
|    | 64 | 1000 | 1000 | 2 | 2 | 3 | 1 | 3 | 2 | 1 |

Table 2. Sample RMS configurations
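To make Formulas 1 and 2 and the feasibility criteria concrete, the solution can be represented with plain data structures. The following Python sketch is ours, not the chapter's implementation; the candidate sets, edges, and the toy feasibility check for Criteria 1 and 2 are illustrative assumptions (Criterion 3's capacity profiles are omitted):

```python
from collections import namedtuple

# One entry of the set M from Formula 1: sub-job s_i runs on RMS r_j
# in the time window [start, stop).
Assignment = namedtuple("Assignment", "sub_job rms start stop")

# Hypothetical candidate sets K_i (Criterion 1: every set must be non-empty).
K = {"s1": {"r1", "r2"}, "s2": {"r2"}, "s3": {"r1"}}

# Workflow edges E: (source sub-job, target sub-job).
E = [("s1", "s2"), ("s1", "s3")]

def is_feasible(M, K, E):
    """Check Criteria 1 and 2 for a candidate solution M."""
    # Criterion 1: each sub-job has at least one candidate RMS.
    if any(not cand for cand in K.values()):
        return False
    by_job = {a.sub_job: a for a in M}
    # Every sub-job must be mapped to one of its candidate RMSs.
    if any(a.rms not in K[a.sub_job] for a in M):
        return False
    # Criterion 2: a sub-job may only start after all predecessors stop.
    return all(by_job[src].stop <= by_job[dst].start for src, dst in E)

M = [Assignment("s1", "r1", 0, 4),
     Assignment("s2", "r2", 5, 9),
     Assignment("s3", "r1", 4, 7)]
print(is_feasible(M, K, E))  # True
```

A full checker would also walk the availability profile of each RMS per time slot to enforce Criterion 3.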

The mapping of Grid workflows has received a lot of attention from the scientific community. In the literature there are many methods for mapping a Grid workflow to Grid resources within different contexts. Among those, the old but well-known Condor-DAGMan algorithm from the Condor project (Condor (2004)) is still used in some present Grid systems. This algorithm makes local decisions about which job to send to which resource and considers only jobs which are ready to run at any given instance. Also using a dynamic scheduling approach, Duan et al. (2006) and Ayyub et al. (2007) apply several techniques to frequently rearrange the workflow and reschedule it in order to reduce its runtime. Those methods are not suitable for a resource-reservation context because a fee is charged whenever a reservation is canceled; frequent rescheduling may therefore increase the cost of running the workflow.

Deelman et al. (2004) presented an algorithm which maps Grid workflows onto Grid resources based on existing planning technology. This work focuses on encoding the problem to be compatible with the input format of specific planning systems, thus transforming the mapping problem into a planning problem. Although this is a flexible way of pursuing different objectives, including some SLA criteria, significant disadvantages appeared regarding time-intensive computation, long response times, and the missing consideration of Grid-specific constraints.

In Mello et al. (2007), the authors describe a load-balancing algorithm for Grid computing environments called RouteGA. The algorithm uses GA techniques to provide an

equal load distribution based on the capacity of the computing resources. Our work is different from the work of Mello et al. in two main aspects.

#### **Max-min algorithm**

Max-min's metric is the maximum MCT (Minimum Completion Time). The expectation is to overlap long-running tasks with short-running ones Berman et al. (2005); Casanova et al. (2000). To adapt the max-min algorithm to our problem, we analyze the workflow into a set of sub-jobs in sequential layers. Sub-jobs in the same layer do not depend on each other. For each sub-job in a layer, we find the RMS which can finish that sub-job the earliest. The sub-job in the layer with the latest such finish time is then assigned to its determined RMS. A more detailed description of the algorithm can be found in Quan (2007).

#### **Sufferage algorithm**

The rationale behind sufferage is that a host should be assigned to the task that would "suffer" the most if not assigned to that host. For each task, its sufferage value is defined as the difference between its best MCT and its second-best MCT. Tasks with a higher sufferage value take precedence Berman et al. (2005); Casanova et al. (2000). To adapt the sufferage algorithm to our problem, we analyze the workflow into a set of sub-jobs in sequential layers. Sub-jobs in the same layer do not depend on each other. For each sub-job in a layer, we find the earliest and the second-earliest finish time of that sub-job. The sub-job in the layer with the largest difference between the earliest and the second-earliest finish time is assigned to its determined RMS. A more detailed description of the algorithm can be found in Quan (2007).

#### **GRASP algorithm**

In this approach, a number of iterations are made to find the best possible mapping of jobs to resources for a given workflow Blythe et al. (2005). In each iteration, an initial allocation is constructed in a greedy phase. The initial allocation algorithm computes, on each pass, the tasks whose parents have already been scheduled, and considers every possible resource for each such task. A more detailed description of the algorithm can be found in Quan (2007).

#### **w-DCP algorithm**

The DCP algorithm is based on the principle of continuously shortening the longest path (also called the critical path, CP) in the task graph by scheduling tasks on the current CP to an earlier start time. This principle was applied to scheduling workflows with parameter-sweep tasks on global Grids by Tianchi Ma et al. in Ma et al. (2005). We proposed a version of the DCP algorithm for our problem in Quan (2007).

The experimental results show that the quality of the solutions found by those algorithms is not sufficient Quan (2007). To overcome the poor performance of the methods in the literature, we proposed the w-Tabu algorithm in previous work Quan (2007). An overview of the w-Tabu algorithm is presented in Algorithm 1.

The assigning sequence is based on the latest start\_time of each sub-job. Sub-jobs with a smaller latest start time are assigned earlier. Each solution in the reference solution set can be thought of as a starting point for the local search, so the solutions should be spread as widely as possible in the search space. To satisfy this spread requirement, the number of identical *sub-job* : *RMS* mappings between any two solutions must be as small as possible. The improvement procedure, based on Tabu search, uses some specific techniques to reduce the computation time. More information about the w-Tabu algorithm can be found in Quan (2007).
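The layer-based adaptations above all start by analyzing the workflow into sequential layers of mutually independent sub-jobs. Assuming the workflow is a DAG given as an edge list, one standard way to build such layers is by longest-path depth; a minimal sketch (function and variable names are ours, not the chapter's):

```python
def layers(edges, jobs):
    """Group DAG nodes into sequential layers: a node's layer index is the
    length of the longest dependency chain leading to it, so nodes in the
    same layer never depend on each other."""
    preds = {j: [] for j in jobs}
    for src, dst in edges:
        preds[dst].append(src)
    depth = {}
    def d(j):
        if j not in depth:
            depth[j] = 1 + max((d(p) for p in preds[j]), default=-1)
        return depth[j]
    for j in jobs:
        d(j)
    out = [[] for _ in range(max(depth.values()) + 1)]
    for j, lv in depth.items():
        out[lv].append(j)
    return out

jobs = ["s1", "s2", "s3", "s4"]
edges = [("s1", "s2"), ("s1", "s3"), ("s2", "s4"), ("s3", "s4")]
print(layers(edges, jobs))  # [['s1'], ['s2', 's3'], ['s4']]
```

Each layer can then be scheduled in turn by whichever selection rule (min-min, max-min, sufferage) is in use.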


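The sufferage rule itself is easy to state in code. The sketch below picks, within one layer, the sub-job with the largest gap between its best and second-best estimated finish times; the dictionary layout and names are assumptions for illustration, not the chapter's code:

```python
def pick_by_sufferage(finish_times):
    """finish_times: {sub_job: {rms: estimated finish time}} for one layer.
    Returns (sub_job, rms): the job whose gap between its best and
    second-best finish time is largest, paired with its best RMS."""
    best_job, best_rms, best_gap = None, None, -1.0
    for job, per_rms in finish_times.items():
        ordered = sorted(per_rms.items(), key=lambda kv: kv[1])
        first = ordered[0]
        # With a single candidate RMS the job would "suffer" unboundedly
        # if denied that host; model this as an infinite gap.
        gap = ordered[1][1] - first[1] if len(ordered) > 1 else float("inf")
        if gap > best_gap:
            best_job, best_rms, best_gap = job, first[0], gap
    return best_job, best_rms

layer = {"s2": {"r1": 10, "r2": 14}, "s3": {"r1": 9, "r2": 21}}
print(pick_by_sufferage(layer))  # ('s3', 'r1')
```

Here *s3* wins because losing *r1* would cost it 12 time units, versus only 4 for *s2*.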
Related to the mapping of task graphs to resources, there is also the multiprocessor scheduling problem for precedence-constrained task graphs Garey et al. (1979); Kohler et al. (1974). As this is a well-known problem, the literature records many methods for it, which can be classified into several groups Kwok et al. (1999). The classic approach is based on the so-called list scheduling technique Adam et al. (1974); Coffman et al. (1976). More recent approaches are UNC (Unbounded Number of Clusters) scheduling Gerasoulis et al. (1992); Sarkar (1989), BNP (Bounded Number of Processors) scheduling Adam et al. (1974); Kruatrachue et al. (1987); Sih et al. (1993), TDB (Task Duplication Based) scheduling Colin et al. (1991); Kruatrachue et al. (1988), APN (Arbitrary Processor Network) scheduling Rewini et al. (1990), and genetic approaches Hou et al. (1994); Shahid et al. (1994). Our problem differs from the multiprocessor scheduling problem in several respects. In the multiprocessor scheduling problem all processors are identical, while in our problem the RMSs are heterogeneous. Each task in our problem can be a parallel program, whereas each task in the other problem is a strictly sequential program. Moreover, each node in the other problem can process only one task at a time, while each RMS in our problem can process several sub-jobs at a time. For these reasons, we cannot directly apply the proposed techniques to our problem.

In recent works Berman et al. (2005); Blythe et al. (2005); Casanova et al. (2000); Ma et al. (2005), the authors describe algorithms which concentrate on scheduling workflows with parameter-sweep tasks on Grid resources. The common objective of those algorithms is optimizing the makespan, defined as the time from when execution starts until the last job in the workflow is completed. Subtasks in this kind of workflow can be grouped into layers, and there is no dependency among subtasks in the same layer. All proposed algorithms assume each task is a sequential program and each resource is a compute node. By using several heuristics, all those algorithms perform the mapping very quickly. Our workflow in DAG form can also be transformed into a workflow with parameter-sweep tasks, and thus we have applied all those algorithms to our problem.
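As a small illustration of the makespan definition above: given the (sub-job, RMS, start, stop) tuples of Formula 1, the makespan is simply the span from the earliest start to the latest stop. The helper below is an assumed toy, not the chapter's code:

```python
def makespan(M):
    """Makespan of a mapped workflow: time from the first sub-job's start
    to the last sub-job's completion. M holds (sub_job, rms, start, stop)
    tuples as in Formula 1."""
    starts = [start for (_, _, start, _) in M]
    stops = [stop for (_, _, _, stop) in M]
    return max(stops) - min(starts)

M = [("s1", "r1", 0, 4), ("s2", "r2", 5, 9), ("s3", "r1", 4, 7)]
print(makespan(M))  # 9
```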

#### **Min-min algorithm**

Min-min uses the minimum MCT (Minimum Completion Time) as its metric, meaning that the task that can be completed the earliest is given priority. The motivation behind min-min is that assigning tasks to the hosts that will execute them fastest leads to an overall reduced finish time Berman et al. (2005); Casanova et al. (2000). To adapt the min-min algorithm to our problem, we analyze the workflow into a set of sub-jobs in sequential layers. Sub-jobs in the same layer do not depend on each other. For each sub-job in a layer, we find the RMS which can finish that sub-job the earliest. The sub-job in the layer with the earliest such finish time is then assigned to its determined RMS. A more detailed description of the algorithm can be found in Quan (2007).
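A minimal sketch of the min-min selection over one layer, under the simplifying assumption that the finish-time estimates are fixed (a full implementation would update the RMS availability profiles after each assignment); the names and data layout are ours:

```python
def min_min_pick(finish_times):
    """One min-min step: among all (sub_job, rms) pairs in the layer,
    pick the pair with the earliest estimated finish time.
    finish_times: {sub_job: {rms: estimated finish time}}."""
    return min(((job, rms, t)
                for job, per_rms in finish_times.items()
                for rms, t in per_rms.items()),
               key=lambda x: x[2])

def schedule_layer(finish_times):
    """Repeatedly apply the min-min rule until every sub-job in the
    layer is assigned; returns the assignment order."""
    order = []
    remaining = {j: dict(p) for j, p in finish_times.items()}
    while remaining:
        job, rms, _ = min_min_pick(remaining)
        order.append((job, rms))
        del remaining[job]
    return order

layer = {"s2": {"r1": 10, "r2": 14}, "s3": {"r1": 9, "r2": 21}}
print(schedule_layer(layer))  # [('s3', 'r1'), ('s2', 'r1')]
```

Swapping the `min` in `min_min_pick` selection for the layer-wide latest best finish time yields the max-min variant described earlier.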
