**3. Motivation**

The motivation of this chapter is to develop a grid scheduling algorithm that achieves high utilization of grid resources, speeds up convergence to an optimal or sub-optimal solution, shortens the time consumed by the algorithm as much as possible, improves the load balancing of grid resources, and minimizes the schedule length, i.e., the makespan.

The quality of the solutions produced by Yin's algorithm for grid scheduling is low. Yin proposed a GA in which mutation changes all genes of the chromosome (Yin et al., 2007). This type of mutation is not always suitable for complex optimization problems such as grid task scheduling: it destroys the information in the chromosomes and does not help to find an optimal or near-optimal solution for the problem at hand. Moreover, the initial population of Yin's algorithm is generated randomly, without using any heuristic in the initialization phase of the GA. Heuristics allow the GA to start its search closer to the region of the optimal solution, which reduces the time the algorithm needs to reach a reasonable solution quality.

The four subfigures 2(a), 2(b), 2(c) and 2(d) show that the single exchange mutation (two genes are exchanged), represented by dashed curves, approaches the optimal solution in fewer generations than the all-genes-changed mutation, represented by solid curves. Furthermore, subfigures 2(c) and 2(d) highlight the importance of using the heuristic algorithm Random-MCT (described in subsection 5.2) at the initialization stage of the GA.

In this chapter, two models are introduced in terms of the random initialization of tasks and resources. The first is the Random Model (RM), which follows the uniform distribution in generating the matrix *EETij*, with low-value ranges for the computing capacities of resources and the workloads of tasks. The second is the Expected Time to Compute (ETC)<sup>11</sup> model, in which the workloads of tasks and the computing capacities of resources are generated randomly in different ranges, low and high. Figure 3 shows the relationships among the RM and ETC models, on the one hand, and the algorithms RGSGCS and MSA, on the other hand.

(a) Makespan results with one-point crossover and random initialization of GA

(b) Makespan results with two-point crossover and random initialization of GA

(c) Makespan results with Random-MCT and one-point crossover

(d) Makespan results with Random-MCT and two-point crossover

Fig. 2. Makespan results of experiment 8 (listed in Table 3), which consists of 1000 tasks and 50 resources, with one-/two-point crossover, with/without Random-MCT, and with two/all genes exchanged.

Fig. 3. The relationship among the RM/ETC models and RGSGCS/MSA

#### **4. Problem formulation**


Problem formulation is a fundamental step that helps in understanding the problem at hand. This chapter considers a grid with a sufficient number of tasks arriving at the GA for scheduling. Let *N* be the total number of tasks to be scheduled and *Wi*, where *i* = 1, 2, ··· , *N*, be the workload of each task in number of cycles. The workload of a task can be obtained by analyzing historical data, for example by determining the data size of a waiting task. Let *M* be the total number of computing resources and *CPj*, where *j* = 1, 2, ··· , *M*, be the computing capacity of each resource expressed in number of cycles per unit time. The Expected Execution Time *EETij* of task *Ti* on resource *Rj* is defined by the following formula:

$$EET_{ij} = \frac{W_i}{CP_j} \tag{1}$$
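As an illustration of Equation (1), the following minimal sketch builds the *N* × *M* matrix of expected execution times from a list of workloads and a list of computing capacities. The function name `expected_execution_times` and the numeric values are hypothetical and chosen only for the example; they do not come from the experiments in this chapter.

```python
def expected_execution_times(workloads, capacities):
    """Return the N x M matrix with EET[i][j] = W_i / CP_j (Equation 1).

    workloads  -- W_i, workload of each task in cycles
    capacities -- CP_j, capacity of each resource in cycles per unit time
    """
    if any(cp <= 0 for cp in capacities):
        raise ValueError("computing capacities must be positive")
    return [[w / cp for cp in capacities] for w in workloads]


# Hypothetical instance: 3 tasks and 2 resources (values are assumptions).
workloads = [400.0, 100.0, 250.0]
capacities = [50.0, 25.0]
eet = expected_execution_times(workloads, capacities)
```

Each row of `eet` corresponds to one task and each column to one resource, so a faster resource (larger *CPj*) yields a smaller entry in its column.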

Task Scheduling in Grid Environment Using Simulated Annealing and Genetic Algorithm 95

 

### **5. Rank Genetic Scheduler for Grid Computing Systems (RGSGCS) algorithm**

GA is a robust search technique that allows a high-quality solution to be derived from a large search space in polynomial time by applying the principle of evolution. In other words, GA solves optimization problems by imitating the genetic process of biological organisms (Goldberg, 1989). In GA, a potential solution to a specific problem is represented as a chromosome containing a series of genes. A set of chromosomes makes up the population. GA evolves the population to generate an optimal solution using selection, crossover and mutation operators.

Therefore, GA combines the exploitation of the best solutions from past searches with the exploration of new regions of the solution space. Any solution in the search space of the problem is represented by a chromosome.

RGSGCS is a GA for solving task scheduling in a grid environment. It is presented in Algorithm 2 and in the flowchart of Figure 5 (Wael et al., 2010; Wael et al., 2011).

In order to successfully apply RGSGCS to the problem at hand, one needs to determine the following:

1. The representation of possible solutions, or the chromosomal encoding.
2. The fitness function, which accurately represents the value of the solution.
3. The genetic operators (i.e., selection, crossover, mutation and elitism) to be used, and suitable parameter values (population size, probability of applying operators, maximum number of generations, etc.).

The main steps in RGSGCS are as follows:

#### **5.1 Chromosome representation of RGSGCS algorithm**

In GA, a chromosome is represented by a series of genes. Each gene, in turn, represents the index of a computing resource *Rj*, as shown below:

$$\text{Chromosome} = \text{gene}_i(R_j) \tag{2}$$

where *i* = 1, 2, ··· , *N* and *j* = 1, 2, ··· , *M*. Figure 4 shows an example of a chromosome representation consisting of three resources and thirteen tasks.

$$\begin{array}{l|ccccccccccccc} \text{Task No.} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 \\ \hline \text{Resource No.} & 1 & 3 & 1 & 3 & 2 & 2 & 3 & 1 & 2 & 1 & 3 & 2 & 2 \end{array}$$

Fig. 4. Task-Resource Representation for the Grid Task Scheduling Problem

Fig. 5. Flow Chart of RGSGCS Algorithm

#### **5.2 Population initialization of RGSGCS algorithm**

One of the important steps in GA is the initialization of the population. This initialization helps GA to find the best solutions within the available search space. If the solutions generated randomly in this step are bad, the algorithm provides bad or locally optimal solutions. To overcome this problem, individuals should be generated using well-known heuristics in the initial step of the algorithm. These heuristics generate near-optimal solutions, and the meta-heuristic algorithm combines these solutions in order to obtain better final solutions. Scheduling heuristics such as Min-Min, Minimum Completion Time (MCT) and Minimum Execution Time (MET) (Braun et al., 2001) have been proposed for independent tasks. Most of these heuristics are based on the following two assumptions.

First, the expected execution time *EETij* is deterministic and does not vary with time.

Second, each task has exclusive use of the resource.

The traditional MCT heuristic assigns each task to the resource that completes it earliest. The new algorithm, Random-MCT, works as follows. The first *M* tasks in the grid, i.e., as many tasks as there are resources, are assigned to resources randomly. The remaining tasks are assigned according to the earliest finishing time, where *RTj* denotes the ready time of resource *j*. The time complexity of the Random-MCT heuristic is O(*M*·*N*). After completion of RGSGCS's initialization, the evaluation phase is introduced.
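The chromosome encoding of subsection 5.1 and the Random-MCT initialization of subsection 5.2 can be sketched together as follows. This is a minimal illustration under stated assumptions, not the chapter's Algorithm 2: the function names `random_mct` and `makespan` are hypothetical, resources are indexed from 0 rather than 1, the random assignment of the first *M* tasks is assumed to draw resources independently (so a resource may be chosen more than once), and the finishing time of task *i* on resource *j* is assumed to be *RTj* + *EETij*.

```python
import random


def random_mct(eet, rng=None):
    """Build one chromosome; chromosome[i] is the resource index of task i.

    eet[i][j] is the expected execution time of task i on resource j.
    """
    rng = rng or random.Random()
    n_tasks, n_resources = len(eet), len(eet[0])
    ready = [0.0] * n_resources  # RT_j: time at which resource j becomes free
    chromosome = []
    for i in range(n_tasks):
        if i < n_resources:
            # First M tasks: pick a resource at random (assumption: with repetition).
            j = rng.randrange(n_resources)
        else:
            # Remaining tasks: resource with the earliest finishing time RT_j + EET_ij.
            j = min(range(n_resources), key=lambda r: ready[r] + eet[i][r])
        ready[j] += eet[i][j]  # the task occupies resource j exclusively
        chromosome.append(j)
    return chromosome


def makespan(chromosome, eet):
    """Schedule length: the largest total execution time over all resources."""
    loads = [0.0] * len(eet[0])
    for task, resource in enumerate(chromosome):
        loads[resource] += eet[task][resource]
    return max(loads)


# Hypothetical EET matrix for 4 tasks and 2 resources (values are assumptions).
eet_example = [[2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [1.0, 2.0]]
chromosome = random_mct(eet_example, random.Random(0))
length = makespan(chromosome, eet_example)
```

Each of the *N* tasks scans at most *M* resources, which matches the O(*M*·*N*) complexity stated for the heuristic. Calling `random_mct` repeatedly with different seeds would give different individuals for an initial population, since the first *M* genes are random.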
