**w-DCP algorithm**

The DCP algorithm is based on the principle of continuously shortening the longest path (also called the critical path, CP) in the task graph by scheduling tasks on the current CP to an earlier start time. This principle was applied to scheduling workflows with parameter sweep tasks on global Grids by Tianchi Ma et al. in Ma et al. (2005). We proposed a version of the DCP algorithm for our problem in Quan (2007).
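As a rough illustration of the quantity that DCP tries to shorten, the sketch below computes the critical path of a small task graph. The workflow, the runtimes and the function name `critical_path` are hypothetical. This is not the DCP scheduler of Ma et al. (2005) or Quan (2007), only the longest-path computation such a scheduler relies on, and it ignores data transfer times.

```python
from graphlib import TopologicalSorter

def critical_path(runtime, preds):
    """Return (length, path) of the longest path in a task DAG.

    runtime: dict task -> execution time
    preds:   dict task -> list of predecessor tasks
    """
    finish = {}      # earliest finish time of each task
    best_pred = {}   # predecessor on the longest path into each task
    for task in TopologicalSorter(preds).static_order():
        # The task can start once its slowest predecessor has finished.
        prev = max(preds.get(task, []), key=finish.get, default=None)
        start = finish[prev] if prev is not None else 0
        finish[task] = start + runtime[task]
        best_pred[task] = prev
    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    length, path = finish[node], []
    while node is not None:
        path.append(node)
        node = best_pred[node]
    return length, path[::-1]

# Hypothetical 5-task workflow (times in arbitrary units).
runtime = {"sj0": 3, "sj1": 5, "sj2": 2, "sj3": 4, "sj4": 1}
preds = {"sj0": [], "sj1": ["sj0"], "sj2": ["sj0"],
         "sj3": ["sj1", "sj2"], "sj4": ["sj3"]}
length, path = critical_path(runtime, preds)
print(length, path)
```

A DCP-style scheduler would recompute this path after every scheduling decision, because moving one task earlier can make a different path critical.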

The experimental results show that the quality of the solutions found by those algorithms is not sufficient, as reported in Quan (2007). To overcome the poor performance of the methods in the literature, we proposed the w-Tabu algorithm in our previous work, Quan (2007). An overview of the w-Tabu algorithm is presented in Algorithm 1.

The assigning sequence is based on the latest start\_time of the sub-job: sub-jobs with a smaller latest start time are assigned earlier. Each solution in the reference solution set can be thought of as a starting point for the local search, so the solutions should be spread as widely as possible over the search space. To satisfy this spread requirement, the number of identical *sub-job*:*RMS* mappings between two solutions must be as small as possible. The improvement procedure, which is based on Tabu search, uses some specific techniques to reduce the computation time. More information about the w-Tabu algorithm can be found in Quan (2007).
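The assigning sequence mentioned at the start of this paragraph can be sketched with a backward pass over the workflow graph. This is a minimal sketch under stated assumptions, not the procedure of Quan (2007): the workflow, the data sizes, the `deadline` parameter and the function names are all hypothetical, and each transfer time is taken as data size divided by a fixed bandwidth.

```python
from graphlib import TopologicalSorter

def latest_start_times(runtime, succs, transfer, deadline):
    """Latest start time of every sub-job under ideal conditions.

    runtime:  dict sub-job -> runtime
    succs:    dict sub-job -> list of successor sub-jobs
    transfer: dict (src, dst) -> transfer time
    deadline: assumed time by which the whole workflow must finish
    """
    latest = {}
    # Passing successors as "dependencies" makes static_order() yield
    # every successor before the sub-jobs that feed it.
    for sj in TopologicalSorter(succs).static_order():
        latest_stop = min(
            (latest[s] - transfer.get((sj, s), 0) for s in succs.get(sj, [])),
            default=deadline,
        )
        latest[sj] = latest_stop - runtime[sj]
    return latest

def assigning_sequence(latest):
    """Sub-jobs with a smaller latest start time are assigned earlier."""
    return sorted(latest, key=lambda sj: (latest[sj], sj))

# Hypothetical workflow; data sizes and bandwidth are illustrative only.
runtime = {"sj0": 3, "sj1": 5, "sj2": 6, "sj3": 4, "sj4": 1}
succs = {"sj0": ["sj1", "sj2"], "sj1": ["sj3"], "sj2": ["sj3"],
         "sj3": ["sj4"], "sj4": []}
data = {("sj0", "sj1"): 10, ("sj0", "sj2"): 20, ("sj1", "sj3"): 10,
        ("sj2", "sj3"): 10, ("sj3", "sj4"): 10}
bandwidth = 10
transfer = {edge: size / bandwidth for edge, size in data.items()}

latest = latest_start_times(runtime, succs, transfer, deadline=20)
sequence = assigning_sequence(latest)
print(sequence)
```

In this example sj2 is assigned before sj1 although both depend only on sj0, because its longer runtime leaves it less slack.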

**Determining the makespan**

The fitness value is based on the makespan of the workflow. To determine the makespan of a citizen, we have to calculate the timetable of the whole workflow. The algorithm for computing the timetable is presented in Algorithm 3. The start and stop times of each sub-job are determined by searching the resource reservation profile. The start and stop times of each data transfer are determined by searching the bandwidth reservation profile. This procedure will satisfy Criteria 2, 3 and 4.

After determining the timetable, we have a solution. For our sample workflow, the solution for configuration 1 in Figure 6b, including the timetable for sub-jobs and the timetable for data transfers, is presented in Table 3. The timetable for sub-jobs includes the RMS and the start and stop times of each sub-job. The timetable for data transfers includes the source and destination sub-jobs (S-D sj), the source and destination RMSs (S-D rms), and the start and stop times of each data transfer. The makespan of this sample solution is 64.
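Algorithm 3 is not reproduced in this section. As a hedged illustration of what searching a reservation profile can look like, the sketch below finds the earliest start time at which a sub-job fits into an RMS. The profile representation (a list of `(start, end, cpus)` reservations), the function name `earliest_start` and all numbers are assumptions made for this example and are not taken from the original algorithm.

```python
def earliest_start(reservations, capacity, need, duration, ready):
    """Earliest time >= ready at which `need` CPUs are free for `duration`.

    reservations: list of (start, end, cpus) already booked, end exclusive
    capacity:     total number of CPUs of the RMS
    """
    def fits(t):
        # Usage can only rise at the start of a reservation, so it is enough
        # to check t and every reservation start inside (t, t + duration).
        points = [t] + [s for s, _, _ in reservations if t < s < t + duration]
        for p in points:
            used = sum(c for s, e, c in reservations if s <= p < e)
            if used + need > capacity:
                return False
        return True

    # The earliest feasible start is either `ready` or the end of a reservation.
    candidates = sorted({ready} | {e for _, e, _ in reservations if e > ready})
    for t in candidates:
        if fits(t):
            return t
    return None  # only reached when need > capacity

# Hypothetical profile of one RMS with 8 CPUs.
profile = [(0, 10, 4), (5, 20, 2), (25, 30, 6)]
start = earliest_start(profile, capacity=8, need=4, duration=8, ready=2)
stop = start + 8
print(start, stop)
```

The same kind of search could be applied to a bandwidth reservation profile by treating the link capacity as `capacity` and the required rate as `need`.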


#### **Algorithm 1** w-Tabu algorithm

1: Determine assigning sequence for all sub-jobs of the workflow

2: Generate reference solution set

3: **for all** solution in reference set **do**

4: Improve the solution as far as possible with the modified Tabu search

5: **end for**

6: Pick the solution with best result

### **4. w-GA algorithm**

#### **4.1 Standard GA**

The standard application of the GA to find the minimal makespan of a workflow within an SLA context is presented in Algorithm 2. We call it the n-GA algorithm.

#### **Algorithm 2** n-GA algorithm

1: Determine assigning sequence for all sub-jobs of the workflow

2: Generate reference configuration set

3: **while** *num*\_*mv < max* **do**

4: Evaluate the *makespan* of each configuration

5: a" = best configuration

6: Add a" to the new population

7: **while** the new population is not enough **do**

8: Select parent couple configurations according to their *makespan*

9: Crossover the parent with a probability to form new configurations

10: Mutate the new configuration with a probability

11: Put the new configuration to the new population

12: **end while**

13: *num*\_*mv* ← *num*\_*mv* + 1

14: **end while**

15: return a"

#### **Determining the assigning sequence**

The sequence in which runtimes are determined for the sub-jobs of the workflow in an RMS can also affect the final makespan, especially when many sub-jobs are assigned to the same RMS. As in the w-Tabu algorithm, the assigning sequence is based on the latest *start*\_*time* of the sub-job: sub-jobs with a smaller latest start time are assigned earlier. The complete procedure can be found in Quan (2007); here we outline the main steps. We determine the earliest and the latest start time for each sub-job of the workflow under ideal conditions. The time needed for a data transfer between sub-jobs is computed by dividing the amount of data by a fixed bandwidth. The latest start/stop time of each sub-job and each data transfer depends only on the workflow topology and the runtimes, not on the resource context. Those parameters can be determined by using conventional graph algorithms.

#### **Generating the initial population**

In the n-GA algorithm, the citizen is encoded as described in Figure 5. We use this conventional encoding because it naturally represents a configuration, which makes it very convenient to evaluate the timetable of the solution.

Fig. 5. The encoded configuration. Sub-jobs sj0, sj1, sj2, sj3, sj4, ..., sjn are mapped to RMS 1, RMS 3, RMS 5, RMS 2, RMS 3, ..., RMS k, so the encoded configuration is 1 3 5 2 3 ... k.

Each sub-job has different resource requirements, and there are many RMSs with different resource configurations. The first step is to find, among those heterogeneous RMSs, the ones that can meet the requirements of the sub-job. The matching between the sub-job's resource requirements and the RMS's resource configuration is done by several logical conditions in the WHERE clause of an SQL SELECT command. This step will satisfy Criterion 1. The set of candidate lists is the configuration space of the mapping problem.
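The matching step can be illustrated with an in-memory SQLite database. The table layout, the column names (`cpus`, `memory_mb`, `storage_gb`) and all values below are hypothetical, since the original schema is not given here. The sketch only shows how conditions in a WHERE clause can select the candidate RMSs for each sub-job.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rms (id INTEGER PRIMARY KEY, cpus INTEGER, "
    "memory_mb INTEGER, storage_gb INTEGER)"
)
# Hypothetical resource configurations of four RMSs.
conn.executemany("INSERT INTO rms VALUES (?, ?, ?, ?)", [
    (1, 64, 2048, 500),
    (2, 128, 1024, 1000),
    (3, 32, 4096, 200),
    (4, 256, 4096, 2000),
])

def candidate_rms(conn, cpus, memory_mb, storage_gb):
    """Return the ids of all RMSs that meet the sub-job's requirements."""
    rows = conn.execute(
        "SELECT id FROM rms "
        "WHERE cpus >= ? AND memory_mb >= ? AND storage_gb >= ? "
        "ORDER BY id",
        (cpus, memory_mb, storage_gb),
    )
    return [row[0] for row in rows]

# Hypothetical requirements: (cpus, memory_mb, storage_gb) per sub-job.
sub_jobs = {"sj0": (48, 1024, 100), "sj1": (100, 1024, 800)}
config_space = {sj: candidate_rms(conn, *req) for sj, req in sub_jobs.items()}
print(config_space)
```

Each entry of `config_space` is one candidate list; taken together they form the configuration space described above.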

The crossover operation of the GA reduces the distance between two configurations. Thus, to be able to search over a wide area, the initial population should be distributed widely. To satisfy this space-spreading requirement, the number of identical sub-job:RMS mappings between two configurations must be as small as possible. We apply the same algorithm as in Quan (2007) to create the initial set of configurations. The number of members in the initial population depends on the number of available RMSs and the number of sub-jobs.
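The spreading requirement can be made concrete by counting the identical sub-job:RMS mappings between two configurations. The generation scheme below is only one simple possibility, assumed for illustration, and is not the algorithm of Quan (2007): configuration *k* takes the *k*-th entry (cyclically) of every candidate list. The function names and the candidate lists are hypothetical.

```python
def same_mappings(a, b):
    """Number of sub-jobs mapped to the same RMS in both configurations."""
    return sum(x == y for x, y in zip(a, b))

def spread_population(candidates, size):
    """Build `size` configurations by cycling through each candidate list.

    candidates: candidates[j] is the list of RMSs able to run sub-job j
    """
    return [[c[k % len(c)] for c in candidates] for k in range(size)]

# Hypothetical candidate lists for four sub-jobs.
candidates = [[1, 2, 4], [2, 4], [1, 3], [2, 3, 4]]
population = spread_population(candidates, size=3)
for config in population:
    print(config)
print(same_mappings(population[0], population[1]))
```

With this scheme two configurations share the mapping of a sub-job only when their indices differ by a multiple of the length of that sub-job's candidate list, so short candidate lists limit how far the population can be spread.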

For example, from Tables 1 and 2, the configuration space of the sample problem is presented in Figure 6a. The initial population is presented in Figure 6b.
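Starting from such an initial population, the main loop of Algorithm 2 might be sketched as follows. Several details are assumptions, because the listing does not fix them: selection uses weights inversely related to the makespan, crossover is one-point, mutation re-draws one gene from its candidate list, and the initial population is drawn at random rather than spread. The `toy_makespan` function is a stand-in for the timetable-based evaluation, and all parameter names and values are hypothetical.

```python
import random

def n_ga(candidates, makespan, pop_size=20, max_gen=50,
         p_cross=0.8, p_mut=0.1, seed=0):
    """Minimal GA loop following the steps of the n-GA listing."""
    rng = random.Random(seed)
    n = len(candidates)
    # Step 2 (simplified): random initial configurations.
    population = [[rng.choice(c) for c in candidates] for _ in range(pop_size)]
    best = population[0]
    for _ in range(max_gen):                              # steps 3 and 13
        scores = [makespan(c) for c in population]        # step 4
        best = population[scores.index(min(scores))]      # step 5
        new_population = [best[:]]                        # step 6
        # Assumed selection rule: smaller makespan gives a larger weight.
        weights = [1.0 / (1.0 + s) for s in scores]
        while len(new_population) < pop_size:             # step 7
            mum, dad = rng.choices(population, weights=weights, k=2)  # step 8
            child = mum[:]
            if n > 1 and rng.random() < p_cross:          # step 9
                cut = rng.randrange(1, n)
                child = mum[:cut] + dad[cut:]
            if rng.random() < p_mut:                      # step 10
                j = rng.randrange(n)
                child[j] = rng.choice(candidates[j])
            new_population.append(child)                  # step 11
        population = new_population
    return best                                           # step 15

# Hypothetical candidate lists and a toy fitness (NOT a real makespan).
candidates = [[1, 2, 4], [2, 4], [1, 3], [2, 3, 4]]
toy_makespan = lambda config: sum(config)
result = n_ga(candidates, toy_makespan, seed=1)
print(result, toy_makespan(result))
```

Because the best configuration of each generation is copied into the next one, the makespan of the returned configuration never gets worse from one generation to the next.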
