**1. Introduction**

88 Grid Computing – Technology and Applications, Widespread Coverage and New Horizons

Grid computing enables access to geographically and administratively dispersed networked resources and delivers the functionality of those resources to individual users. Grid computing systems are about sharing computational resources, software and data at a large scale. The main issue in a grid system is to achieve high performance of grid resources. This requires techniques that efficiently and adaptively allocate tasks and applications to available resources in a large-scale, highly heterogeneous and dynamic environment.

In order to understand grid systems, three terms are reviewed as shown below:

1. **Virtualization**: In grids, virtualization refers to the seamless integration of geographically distributed and heterogeneous systems, which enables users to use grid services transparently. Users therefore need not be aware of the location of computing resources and have to submit their service requests at just one point of entry to the grid system. Foster introduced the concept of the Virtual Organization (VO) (Foster et al., 2001). He defines a VO as a *"dynamic collection of multiple organizations providing flexible, secure, coordinated resource sharing"*. Figure 1 shows three actual organizations with both computational and data resources to share across organizational boundaries. The same figure shows two VOs, A and B, each of which can access a subset of the resources in each of the organizations (Moallem, 2009). Virtualization is a mechanism that improves the usability of grid computing systems by providing environment customization to users.

Fig. 1. Two virtual organizations are formed by combining three real organizations

Grid systems provide the ability to perform higher throughput computing by using many networked computers to distribute process execution across a parallel infrastructure. Nowadays, organizations around the world are utilizing grid computing in such diverse areas as collaborative scientific research, drug discovery, financial risk analysis, product design and 3-D seismic imaging in the oil and gas industry (Dimitri et al., 2005).

Task scheduling in grids has received a lot of attention over the past few years. The main goal of task scheduling is to allocate tasks efficiently, and as fast as possible, to available resources in a global, heterogeneous and dynamic environment. Kousalya pointed out that grid scheduling consists of three stages: first, resource discovery and filtering; second, resource selection and scheduling according to a certain objective; third, task submission. The third stage includes file staging and cleanup (Kousalya & Balasubramanie, 2009; 2008). High performance computing and high throughput computing are the two different goals of a grid scheduling algorithm. The main aim of high performance computing is to minimize the execution time of the application. Allocating resources to a large number of tasks in a grid computing environment is more difficult than in conventional computational environments.

The scheduling problem is well known to be NP-complete (Garey & Johnson, 1979). It is a combinatorial optimization problem by nature. Many algorithms have been proposed for task scheduling in grid environments. In general, the existing heuristic mappings can be divided into two categories (Jinquan et al., 2005):

First, online mode, in which the scheduler is always in ready mode. Whenever a new task arrives at the scheduler, it is immediately allocated to one of the existing resources required by that task. Each task is considered only once for matching and scheduling.

Second, batch mode, in which the tasks and resources are collected and mapped at a prescheduled time. This mode makes better decisions because the scheduler knows the full details of the available tasks and resources. This chapter proposes a heuristic algorithm that falls in batch mode (Jinquan et al., 2005).

This chapter studies the problem of minimizing makespan, i.e., the total execution time of the schedule in a grid environment. The proposed Mutation-based Simulated Annealing (MSA) algorithm is proved to be a high performance computing scheduling algorithm. The MSA algorithm will be studied for random and Expected Time to Compute (ETC) models.

**2. Related works**

One salient issue in grids is the design of efficient schedulers, which are used as part of the middleware services to provide efficient planning of users' tasks onto grid resources. Various scheduling approaches suggested in the classical parallel systems literature have been adopted for grid systems with appropriate modifications. Although these modifications made them suitable for execution in a grid environment, these approaches failed to deliver on the performance factor. For this reason, the Genetic Algorithm (GA) and the Simulated Annealing (SA) algorithm, among others, have been used to address the difficulties of task scheduling in grid environments. They gave reasonable solutions compared with classical scheduling algorithms. GA solutions for grid scheduling are addressed in several works (Abraham et al., 2000; Carretero & Xhafa, 2006; Abraham et al., 2008; Martino & Mililotti, 2002; Martino, 2003; Y. Gao et al., 2005). These studies ignored how to speed up convergence and shorten the search time of GA.

Task Scheduling in Grid Environment Using Simulated Annealing and Genetic Algorithm 91

Furthermore, the SA algorithm was studied in previous works (Fidanova, 2006; Manal et al., 2011). These works report important results and high-quality solutions, indicating that SA is a powerful technique that can be used to solve the grid scheduling problem. Moreover, Jadaan et al. (2009; 2010; 2011) exposed the importance of rank in GA.

Wook & Park (2005) proved that the GA and SA algorithms have complementary strengths and weaknesses; accordingly, they proposed a new SA-selection to enhance GA performance in solving combinatorial optimization problems. The population size they use is large, which makes the algorithm time-consuming, especially as the problem size increases. Kazem tried to solve a static task scheduling problem in grid computing using a modified SA (Kazem et al., 2008). Prado proposed a fuzzy scheduler obtained by means of evolving a fuzzy scheduler to improve the overall response time for the entire workflow (Prado et al., 2009). The rules of this evolutionary fuzzy system are obtained using a genetic learning process based on the Pittsburgh approach.

Wael proposed an algorithm that minimizes makespan, flowtime and time to release while maximizing the reliability of grid resources (Wael & Ramachandram, 2011). It takes transmission time and waiting time in the resource queue into account. It uses stochastic universal sampling selection and single exchange mutation to outperform other GAs.

Lee et al. (2011) provided the Hierarchical Load Balanced Algorithm (HLBA) for grid environments. They used the system load as a parameter in determining a balance threshold. The scheduler adapts the balance threshold dynamically when the system load changes. The loads of a resource are CPU utilization, network utilization and memory utilization.

P.K. Suri & Singh Manpreet (2010) proposed a Dynamic Load Balancing Algorithm (DLBA), which performs intra-cluster and inter-cluster load balancing. Intra-cluster load balancing is performed by the Cluster Manager (CM). The CM decides whether to start local balancing based on the current workload of the cluster, which is estimated from the resources below it. Inter-cluster load balancing is done when some CMs fail to balance their workload. In this case, the tasks of the overloaded cluster are transferred to another cluster that is underloaded. To check for cluster overloading, they introduced a balanced threshold. If the load of a cluster is larger than the balanced threshold, load balancing is executed. The value of the balanced threshold is fixed. Therefore, the balanced threshold is not appropriate for the dynamic characteristics of the grid system.

Chang et al. (2009) introduced the Balanced Ant Colony Optimization (BACO) algorithm to choose suitable resources to execute tasks according to resource status.
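The makespan objective stated in the Introduction can be made concrete with a small sketch. The Python code below is not the MSA algorithm; it only evaluates the makespan of a given batch-mode assignment under an ETC matrix. The matrix layout (`etc[task][resource]`), the function name `makespan`, and the toy values are assumptions made for illustration. Tasks are assumed to be independent, resources are assumed to start idle, and transmission and queueing times are ignored.

```python
from typing import Sequence


def makespan(etc: Sequence[Sequence[float]], assignment: Sequence[int]) -> float:
    """Return the makespan of a schedule.

    etc[t][r] is the expected time to compute task t on resource r
    (an assumed layout); assignment[t] is the resource chosen for task t.
    """
    n_resources = len(etc[0])
    finish = [0.0] * n_resources  # accumulated busy time per resource
    for task, resource in enumerate(assignment):
        finish[resource] += etc[task][resource]
    return max(finish)  # completion time of the latest-finishing resource


# Toy instance: 4 tasks, 2 resources (values are made up for illustration).
etc = [
    [3.0, 5.0],
    [2.0, 4.0],
    [6.0, 1.0],
    [4.0, 4.0],
]
schedule = [0, 0, 1, 1]  # tasks 0 and 1 on resource 0; tasks 2 and 3 on resource 1
print(makespan(etc, schedule))
```

A batch-mode heuristic would search over `assignment` vectors to reduce this value, whereas an online-mode scheduler would fix each entry as the corresponding task arrives.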
