**2.2.1 InteGrade BSP applications**

InteGrade's support for executing BSP applications adheres to the Oxford BSPlib API<sup>2</sup>, targeting the C language. Thus, an application written against the Oxford BSPlib can be executed over InteGrade with little or even no modification of its source code, requiring only recompilation and linkage with the appropriate InteGrade libraries.
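To make the compatibility concrete, a minimal Oxford BSPlib program has the shape below (essentially the classic BSPlib "hello world"). Under the compatibility described above, such code would only need to be recompiled and linked against the InteGrade libraries; the header name and build details shown here depend on the installed toolset:

```c
#include <stdio.h>
#include "bsp.h"   /* Oxford BSP Toolset header */

int main(void) {
    bsp_begin(bsp_nprocs());   /* start SPMD execution on all available processes */
    printf("Hello from process %d of %d\n", bsp_pid(), bsp_nprocs());
    bsp_sync();                /* barrier: ends the current superstep */
    bsp_end();
    return 0;
}
```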

A BSP computation proceeds as a series of global supersteps. Each superstep comprises three ordered stages: (1) *concurrent computation*: computation takes place on every participating process, each using only values stored in its local memory; these computations are independent in the sense that they occur asynchronously of one another; (2) *communication*: at this stage, the processes exchange data among themselves; (3) *barrier synchronization*: when a process reaches this point (the barrier), it waits until all other processes have finished their communication actions. The synchronization barrier marks the end of one superstep and the beginning of the next.

InteGrade's implementation of the BSP model uses CORBA (OMG, 2011) for inter-process communication. CORBA has the advantage of being an easier and cleaner communication environment, shortening development and maintenance time and facilitating system evolution. Also, since it is based on a binary protocol, the performance of CORBA-based communication is an order of magnitude faster than that of technologies based on XML, requiring less network bandwidth and processing power. On the shared grid machines, InteGrade uses OiL (Maia et al., 2006), a lightweight CORBA ORB that imposes a small memory footprint. Nevertheless, CORBA usage is completely transparent to the InteGrade application developer, who only uses the BSP interface (Goldchleger et al., 2005).

InteGrade's BSPLib associates a *BspProxy* with each process of a parallel application. The *BspProxy* is a CORBA servant responsible for receiving communications from other processes, such as reads or writes to the virtual shared address space, or messages signaling the end of a synchronization barrier. The creation of *BspProxies* is handled entirely by the library and is totally transparent to users.

The first created process of a parallel application is called *Process Zero*. Process Zero is responsible for assigning a unique identifier to each application process, broadcasting the CORBA IORs of each process to allow them to communicate directly, and coordinating synchronization barriers. Moreover, *Process Zero* executes its normal computation on behalf of the parallel application.

On InteGrade, the synchronization barriers of the BSP model are used to store checkpoints during execution, since they provide global, consistent points for application recovery. In this way, in the case of failures, it is possible to recover application execution from a previous checkpoint, which can be stored in a distributed way as described in Section 4.3.1. Application recovery is also available for sequential, bag-of-tasks, and MPI applications.

<sup>2</sup> The Oxford BSP Toolset http://www.bsp-worldwide.org/implmnts/oxtool

Efficient Parallel Application Execution on Opportunistic Desktop Grids 119

non-dedicated resources in an opportunistic way further contributes to scale up the amount of available resources. In addition, MPICH-IG enables legacy MPI applications to be transparently deployed on an InteGrade grid, without the need to modify their source code.

**3. Application scheduling and execution management**

Grid scheduling is a decision-making process involving resources belonging to multiple administrative domains. As usual, this process includes a search for resources on which to run applications. However, unlike traditional schedulers for distributed and parallel systems, grid schedulers have no control over the resources and applications in the system. Thus, it is necessary to have components that allow, among other features, resource discovery, monitoring and storage of information regarding resources and applications, mechanisms to allow access to different administrative domains and, depending on the adopted scheduling strategy, an approach for estimating resource performance and predicting applications' execution times.

According to Schopf (2004), a grid scheduling process can be broadly divided into three stages: resource discovery and filtering, resource selection, and preparation of the environment for application execution. In the first stage, the grid scheduler creates a filter to select resources based on the restrictions and preferences provided by users during the application submission process. An information system usually provides a set of static and dynamic data about the available resources in the grid, such as their CPU capacity, the amount of available memory, and the network delay for delivering packets. Depending on the adopted scheduling heuristic, a cost estimator can sort the resources according to their efficiency in executing a certain type of code by using an analytic benchmark. In the second stage of the scheduling process, the scheduler generates the mapping of applications to resources in accordance with the system objectives, such as minimizing the response time of applications or maximizing the number of applications completed per time unit (throughput) (Dong & Akl, 2006; Zhu, 2003). In the third stage, a component running on the selected grid resource receives the application sent by the grid scheduler and prepares the environment for its execution by, for example, transferring the files containing the application input data.

A grid scheduling system can be organized in different schemes, according to different interests regarding performance and scalability (Subramani et al., 2002):

• Centralized: in this model, the scheduler maintains information about all administrative domains. All applications are submitted to the scheduler which, based on the queue of submitted applications and the information from all administrative domains, makes the scheduling decisions.

• Hierarchical: in this scheme, each domain has its own local scheduler, and these schedulers are interconnected in a hierarchical structure. A request for an application execution that cannot be handled with the locally available resources is sent up the hierarchy, reaching a scheduler that has a broader view of the grid resources.

• Distributed: in this model, each domain has its own scheduler and the schedulers regularly consult each other to collect updated information about local loads. An application submitted to a given domain can then be transferred for execution in another domain that is less burdened.
