**3.3.3 The integrated service interface**


Fig. 7. The Layer Model of MPICH that enables the Multi-Device Support of MetaMPICH

MetaMPICH (the all-to-all approach as well as the router-based approach) relies on the so-called *multi-device* feature of MPICH. This feature allows the simultaneous use of multiple *abstract communication devices*, which are data structures representing the actual interfaces to lower-level communication layers. That way, for example, communication via both TCP and shared memory becomes possible within one MPI session. MetaMPICH uses this feature to directly access the interfaces of cluster-internal high-speed interconnects like SCI, Myrinet or InfiniBand via customized devices, while other devices link the clusters via TCP, UDT<sup>10</sup> or SCTP. However, when running a router-based configuration, certain cluster nodes need to act as routers. That means that messages to remote clusters are first forwarded via the cluster-native interconnect (and thus by means of a customized communication device) to a router node. The router node then sends the message to a corresponding router node at the remote site, which finally tunnels the message via that cluster-native interconnect to the actual receiver.

Fig. 6. Example for a Mixed Configuration Supported by MetaMPICH
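The routing logic described above can be sketched as follows. This is an illustrative model only, not MetaMPICH source code: the cluster membership table, the router assignment and the device names (`ch_smi`, `ch_mx`, `ch_usock`) are hypothetical stand-ins for the customized intra-cluster devices and the inter-cluster tunnel device.

```python
CLUSTER_OF = {0: "A", 1: "A", 2: "B", 3: "B"}   # rank -> cluster (assumed layout)
ROUTER_OF  = {"A": 1, "B": 2}                    # cluster -> its router rank
NATIVE_DEV = {"A": "ch_smi", "B": "ch_mx"}       # hypothetical native device names

def next_hop(src, dst):
    """Return (next_rank, device) for a message travelling from src to dst."""
    if CLUSTER_OF[src] == CLUSTER_OF[dst]:
        # Intra-cluster: deliver directly via the native high-speed device.
        return dst, NATIVE_DEV[CLUSTER_OF[src]]
    router = ROUTER_OF[CLUSTER_OF[src]]
    if src != router:
        # First hop: forward to the local router node over the native interconnect.
        return router, NATIVE_DEV[CLUSTER_OF[src]]
    # The router tunnels the message to the remote router via TCP/UDT/SCTP;
    # "ch_usock" stands in for whichever inter-cluster device is configured.
    return ROUTER_OF[CLUSTER_OF[dst]], "ch_usock"
```

For instance, a message from rank 0 (cluster A) to rank 3 (cluster B) first hops to rank 1, cluster A's router, over the native device, and is then tunneled to cluster B's router.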

<sup>10</sup> UDT: a UDP-based Data Transfer Protocol, see Section 3.1.1.


A further key strength of MetaMPICH is an integrated service interface that can be accessed within the Grid environment via *remote procedure calls* (RPC). Although several approaches for implementing RPC facilities in Grid environments exist, we have decided to base our implementation on the raw XML-RPC specification (Winer, 1999). All service queries are therefore handled via XML-coded remote method invocations. Simple services just provide the caller with status information about the current session, such as whether a certain connection has already been established, which transport protocol is in use, or how many bytes of payload have already been transferred on this connection. Quality-of-service metrics like the latency and bandwidth of a connection can also be queried. All this information can then be evaluated by an external entity like a Grid monitoring daemon in order to detect bottlenecks or deadlocks in communication. Besides such query-related services, MetaMPICH also offers RPC interfaces that allow external entities to actually control session-related settings. In doing so, external monitoring or scheduling instances are given the ability to reconfigure an already established session even at runtime. In addition to such external control capabilities, MetaMPICH also supports *self-referring* monitoring services. These services react automatically to session-internal events, such as the detection of a bottleneck or a cleanup required after a timeout (Clauss et al., 2008).
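A minimal sketch of this query pattern, using Python's standard `xmlrpc` modules. The method name `metampich.connectionStatus` and the returned status fields are assumptions chosen for illustration; they are not MetaMPICH's actual service names, and the in-process server merely stands in for the session-side RPC endpoint.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# Stand-in for the session-side service endpoint; the method name and the
# returned fields are illustrative assumptions, not MetaMPICH's actual API.
def connection_status(conn_id):
    return {"connection": conn_id, "established": True,
            "protocol": "UDT", "bytes_transferred": 1048576}

server = SimpleXMLRPCServer(("localhost", 0), logRequests=False)
server.register_function(connection_status, "metampich.connectionStatus")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# An external entity (e.g. a Grid monitoring daemon) queries the session
# via an XML-coded remote method invocation:
proxy = xmlrpc.client.ServerProxy(f"http://localhost:{port}")
status = proxy.metampich.connectionStatus(0)
server.shutdown()
```

Because the payload is plain XML over HTTP, any Grid monitoring or scheduling component with an XML-RPC client library can issue such queries, independent of the language MetaMPICH itself is written in.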
