**3.4 Computational ElectroMagnetics description**

A complete discussion of the electromagnetic modeling and of the mathematical aspects of the formulation goes beyond the scope of this chapter; only a short summary of the problem is presented in the following. To simplify the mathematical model, we restrict the analysis to the electromagnetic behavior of Perfectly Electric Conducting (PEC) objects in a free-space environment. Nevertheless, we point out that the described approach is not limited to PEC objects in free space: it is applicable to other formulations as well (for instance, dielectric objects or layered-media problems); in other terms, it is a kernel-free method. Since the focus of this chapter is on a Grid Computing approach applied to computationally demanding electromagnetic problems, rather than considering more complicated formulations, which would divert attention from the subject of the chapter, we introduce and apply the method to PEC objects in a homogeneous background, stressing again that it carries over to other formulations. The Electric Field Integral Equation (EFIE) is a very versatile approach to the full-wave analysis of complex electromagnetic problems. For PEC objects the EFIE is obtained by enforcing the boundary condition on the surface of the object, i.e., that the tangential component of the total electric field vanishes on the surface *S*:

$$
\hat{n} \times \underline{E}\big|_{S} = \hat{n} \times \left( \underline{E}^{\text{scat}} + \underline{E}^{\text{inc}} \right)\big|_{S} = \underline{0} \tag{1}
$$

The surface *S* is discretized by a mesh with triangular cells, over which the usual set of Rao-Wilton-Glisson (RWG) functions *f*<sub>*n*</sub> is defined. The unknown surface current *J* is approximated by this set of RWG basis functions:

$$\underline{J}\left(\underline{\mathbf{r}}\right) \approx \sum_{n=1}^{N} I_n \, \underline{f}_n\left(\underline{\mathbf{r}}\right) \tag{2}$$


Galerkin testing is used to convert the EFIE into the Method of Moments (MoM) linear system; hence we obtain the matrix equation

$$[Z] \cdot [I] = \left( [Z^{\Phi}] + [Z^{A}] \right) \cdot [I] = [V] \tag{3}$$

where a generic element of the scalar potential and of the vector potential matrices, *Z*<sup>Φ</sup> and *Z*<sup>A</sup> respectively, is expressed as

$$Z_{m,n}^{\Phi} = \frac{1}{4\pi j\omega\epsilon_0} \iint_{S_m} dS \ \nabla_s \cdot \underline{f}_m(\underline{\mathbf{r}}) \iint_{S_n} dS' \ G(\underline{\mathbf{r}}, \underline{\mathbf{r}}') \ \nabla'_s \cdot \underline{f}_n(\underline{\mathbf{r}}') \tag{4}$$

$$Z_{m,n}^{A} = \frac{j\omega\mu_0}{4\pi} \iint_{S_m} dS \ \underline{f}_m(\underline{\mathbf{r}}) \cdot \iint_{S_n} dS' \ G(\underline{\mathbf{r}}, \underline{\mathbf{r}}') \ \underline{f}_n(\underline{\mathbf{r}}') \tag{5}$$

where $G(\underline{\mathbf{r}}, \underline{\mathbf{r}}') = \frac{e^{-jk_0 R}}{R}$, $k_0 = \omega\sqrt{\epsilon_0 \mu_0}$, $R = |\underline{\mathbf{r}} - \underline{\mathbf{r}}'|$, and *S*<sub>*m*</sub> is the definition domain of the function *f*<sub>*m*</sub>. The coefficients *I*<sub>*n*</sub> in (2) are collected in the vector [*I*], and the *m*-th element of [*V*] is equal to

$$V_m = -\iint_{S_m} dS \ \underline{f}_m(\underline{\mathbf{r}}) \cdot \underline{E}^i(\underline{\mathbf{r}}) \tag{6}$$

where *E*<sup>i</sup> is the impressed field in the absence of bodies.
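To make the discrete system concrete, the sketch below fills the two matrix contributions of (3)-(6) with a crude one-point (centroid) quadrature and sums them into [*Z*]. It is a minimal illustration under stated assumptions: the data layout, the quadrature, the operating frequency, and all function names are ours, and a real MoM code needs accurate quadrature and singularity extraction for the self and near terms.

```python
import numpy as np

# Minimal sketch of the MoM fill of Eqs. (3)-(6) with one-point (centroid)
# quadrature per basis-function domain S_n. Illustrative only.
eps0, mu0 = 8.854e-12, 4e-7 * np.pi
omega = 2 * np.pi * 3e8                      # assumed operating frequency
k0 = omega * np.sqrt(eps0 * mu0)

def green(r, rp):
    """Free-space kernel G(r, r') = exp(-j k0 R) / R of Eqs. (4)-(5)."""
    R = np.linalg.norm(r - rp)
    return np.exp(-1j * k0 * R) / R

def fill_mom(centroids, f_vals, div_vals, areas):
    """centroids[n]: centroid of S_n; f_vals[n]: vector value of f_n there;
    div_vals[n]: its surface divergence; areas[n]: area of S_n."""
    N = len(centroids)
    Z = np.zeros((N, N), dtype=complex)
    for m in range(N):
        for n in range(N):
            if m == n:
                continue                     # singular self term omitted here
            G = green(centroids[m], centroids[n])
            w = areas[m] * areas[n]          # one-point quadrature weights
            z_phi = div_vals[m] * G * div_vals[n] * w / (4j * np.pi * omega * eps0)
            z_a = 1j * omega * mu0 / (4 * np.pi) * np.dot(f_vals[m], f_vals[n]) * G * w
            Z[m, n] = z_phi + z_a            # [Z] = [Z^Phi] + [Z^A], Eq. (3)
    return Z
```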

The first step of the Domain Decomposition approach is a subdivision of the overall geometry, breaking the scatterer surface *S* down into *N*<sub>*B*</sub> sub-scatterers, which will be referred to as *blocks* in the following. An example of the splitting of a sphere into 8 blocks is shown in Figure 5.

To subdivide the structure into blocks, on which entire-domain synthetic functions will be generated, we use a fully automatic procedure. We stress this aspect of our implementation, as the ability to properly generate the subdomains is vital for the synthetic-function approach, and it is especially critical for arbitrary and complex structures. The block generation algorithm is based on the multi-level cell grouping described in Andriulli et al. (2008); Vipiana et al. (2009) for the generation of Multi-Resolution basis functions. The starting point is the usual mesh for the analysis of the considered structure, without any constraint on the mesh properties or on the topology of the structure. A nested family of meshes, with non-simplex cells, is then generated through subsequent groupings of the initial triangular mesh cells. We denote this initial mesh by *M*<sub>0</sub> and call it the level-0 mesh. All other meshes will be composed of groups of adjacent cells of the initial mesh. We start by forming groupings of adjacent cells of *M*<sub>0</sub> such that their average area is about four times the average area of the cells in *M*<sub>0</sub>. This covering will be called the level-1 mesh, *M*<sub>1</sub>. The same procedure applied to *M*<sub>1</sub> generates the generalized mesh *M*<sub>2</sub>, and so forth. We stop grouping a cell with its adjacent cells when its size reaches a chosen threshold Δ.


Fig. 5. Example of geometric block partitioning and definitions; a sphere is split into 8 blocks.

The algorithm stops when the last level *L* contains non-simplex cells with (linear) dimension around Δ. The non-overlapping blocks, on which the synthetic functions will be generated, are then the cells of the last-level mesh *M*<sub>*L*</sub>. The only parameter the user has to control is the threshold Δ at which the grouping stops. We underline that the grouping procedure used to find the domains where the Synthetic Functions are defined has *O*(*N* log *N*) complexity, where *N* is the number of initial mesh cells.
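The following sketch conveys the flavor of this multi-level grouping: each pass greedily merges a cell with up to three unmerged neighbors, so the average cell area roughly quadruples per level, and the recursion stops at the threshold Δ. The greedy merge rule and the data layout are our assumptions for illustration; the actual algorithm of Andriulli et al. (2008); Vipiana et al. (2009) is more elaborate.

```python
import numpy as np

def build_blocks(areas, adjacency, delta):
    """Sketch of the multi-level grouping producing the blocks (cells of M_L).
    areas[i]: area of triangle i of M_0; adjacency[i]: indices of the
    triangles sharing an edge with i; delta: target linear cell size."""
    groups = [[i] for i in range(len(areas))]        # level-0 mesh M_0
    g_area = list(areas)
    g_adj = [set(a) for a in adjacency]
    while np.sqrt(np.mean(g_area)) < delta:          # crude linear dimension
        merged, new_groups, new_area, owner = set(), [], [], {}
        for i in range(len(groups)):
            if i in merged:
                continue
            # merge i with up to 3 unmerged neighbours (~4x area growth)
            mates = [j for j in g_adj[i] if j not in merged][:3]
            for j in [i, *mates]:
                merged.add(j)
                owner[j] = len(new_groups)
            new_groups.append(sum((groups[j] for j in [i, *mates]), []))
            new_area.append(sum(g_area[j] for j in [i, *mates]))
        if len(new_groups) == len(groups):
            break                                    # nothing left to merge
        # rebuild adjacency between the new, coarser (non-simplex) cells
        new_adj = [set() for _ in new_groups]
        for i in range(len(groups)):
            for j in g_adj[i]:
                if owner[i] != owner[j]:
                    new_adj[owner[i]].add(owner[j])
        groups, g_area, g_adj = new_groups, new_area, new_adj
    return groups                                    # blocks = cells of M_L
```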

The next step is the generation of the basis functions that model the current distribution over each block; these will be referred to as *synthetic functions* (SFs), and their support extends over the entire block. The synthetic functions are chosen in the set of responses to the required incident field, and to other sources placed at the borders of the block and around it, so as to span the space of all (rigorous) solutions restricted to that block. Once the set of solutions to all these sources is computed, the minimum number of necessary responses has to be determined. This is done through a combination of a Singular Value Decomposition (SVD) and a Gram-Schmidt (GS) procedure applied to the matrix that collects the responses to all the sources: the functions are then selected by thresholding the associated singular values. Finally, the set of RWG functions defined over the connections of contacting blocks is added to the set of SFs in order to guarantee current continuity across blocks. Since the SFs are expressed as linear combinations of the initial RWG functions, the scheme can be seen as a purely multiplicative algebraic compression of the standard MoM matrix.
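A minimal sketch of the selection step follows, assuming the responses are already collected column-wise; it keeps only the SVD thresholding (the left singular vectors are already orthonormal), whereas the actual scheme combines the SVD with a Gram-Schmidt procedure. The tolerance value is a placeholder.

```python
import numpy as np

def select_synthetic_functions(responses, tol=1e-4):
    """responses: N_rwg x N_src matrix; column k holds the RWG coefficients
    of the block current solved in isolation for source k (the incident
    field, or one of the auxiliary sources on/around the block border).
    Returns the retained SFs as combinations of the block's RWG functions."""
    U, s, _ = np.linalg.svd(responses, full_matrices=False)
    keep = s > tol * s[0]       # threshold on the normalized singular values
    return U[:, keep]           # each column is one synthetic function
```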

It should be clear at this point that the generation of the synthetic functions is carried out independently on the different blocks, i.e., the set of SFs is generated by solving the electromagnetic problem on each block in *isolation*. As a consequence, a Grid Computing approach is particularly well suited to the generation of the SFs: each block can be processed by a different node of the grid.

Finally, after the complete set of SFs is generated, the original system matrix is compressed, and the compressed system is inverted to yield the solution of the full problem. We underline that the goal of the synthetic-function approach is to accelerate the solution of the problem when a large number of excitation vectors (right-hand sides, RHSs) is present, which is typical of Radar Cross Section (RCS) calculations. On the other hand, when a single RHS is present, approaches based on the combination of a fast factorization scheme and an iterative solver are more efficient.

In order to exploit the Grid Computing approach, each node of the grid has to sequentially solve a number of blocks. Since the grid is in general heterogeneous, i.e., the nodes have different processing and memory characteristics, the blocks should be properly dimensioned and assigned to the computing nodes. When the block dimensions are properly taken into account, the computation of the full MoM matrix and its handling do not pose a limitation within a single block.
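As an illustration of such a dimensioning rule, the sketch below assigns each block to the least-loaded worker whose memory can hold the block's dense MoM matrix; the cost model, the 16-bytes-per-complex-entry estimate, and all names are our assumptions, not the GridCEM scheduler.

```python
def assign_blocks(blocks, nodes):
    """blocks: block_id -> number of RWG unknowns in the block.
    nodes: node_id -> {"mem_bytes": ..., "load": 0.0} (heterogeneous grid).
    Returns node_id -> list of blocks that node will solve sequentially."""
    schedule = {nid: [] for nid in nodes}
    for bid, n in sorted(blocks.items(), key=lambda kv: -kv[1]):
        need = 16 * n * n                    # dense complex block matrix
        fits = [nid for nid in nodes if nodes[nid]["mem_bytes"] >= need]
        if not fits:
            raise RuntimeError(f"block {bid} exceeds every node's memory")
        target = min(fits, key=lambda nid: nodes[nid]["load"])
        nodes[target]["load"] += float(n) ** 3   # ~cost of a direct solve
        schedule[target].append(bid)
    return schedule
```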

However, once the full set of synthetic functions is generated, one needs to apply the basis change to the full system matrix of the original structure, in order to compress the system and invert it. For practical problems this is not feasible directly, due to the total dimension of the problem. Therefore we perform the compression within a fast scheme, which avoids computing the full system matrix and makes the solution of large problems possible. The compressed matrix in the new basis can be written as:

$$[Z_{\rm SF}] = [T] \, [Z] \, [T]^H \tag{7}$$

where [*Z*<sub>SF</sub>] is the compressed matrix, [*T*] is the change-of-basis matrix, whose dimensions are *N*<sub>SF</sub> × *N*<sub>RWG</sub> (the total number of synthetic functions and the total number of RWG functions, respectively), [*Z*] is the MoM matrix in the RWG basis, and [·]<sup>*H*</sup> denotes the Hermitian (conjugate) transpose. The same change of basis is performed on the RHS, namely:

$$[V_{\rm SF}] = [T] \, [V] \tag{8}$$


where [*V*<sub>SF</sub>] is the compressed RHS in the SF basis. Finally, the compressed system is solved. At the present stage of the work, however, the compression and the solution of the complete system are not carried out in the grid; this will be the object of future research.
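In dense form, (7)-(8) and the final solve reduce to a few lines; the sketch below is only a conceptual illustration, since for real problem sizes [*Z*] is never formed explicitly and the products are carried out within a fast scheme. The names and the direct-solver choice are our assumptions.

```python
import numpy as np

def compress_and_solve(Z, T, V):
    """Z: N_RWG x N_RWG MoM matrix; T: N_SF x N_RWG change of basis;
    V: N_RWG x N_rhs excitation matrix (many RHSs, as in RCS sweeps)."""
    Z_sf = T @ Z @ T.conj().T            # Eq. (7): [Z_SF] = [T][Z][T]^H
    V_sf = T @ V                         # Eq. (8): [V_SF] = [T][V]
    I_sf = np.linalg.solve(Z_sf, V_sf)   # one factorization serves all RHSs
    return T.conj().T @ I_sf             # back to RWG coefficients of Eq. (2)
```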

**4. GridCEM architecture**

The GridCEM project aims to improve the performance of CEM applications. This objective is achieved through the adoption of a grid architecture that exploits the benefits of parallel computation. The proposed system manages data automatically and transparently to the users. Furthermore, virtualized resources make the system flexible, simplify the underlying infrastructure, and improve scalability. To this end a hybrid grid infrastructure was built, and tests were performed in order to compare the grid results with those of the same task executed on a single machine. The grid is composed of two types of nodes, a Master Node (MN) and Worker Nodes (WNs). The Master Node is in charge of managing and controlling the entire infrastructure and of guaranteeing its security; in addition, it coordinates the operations of the other nodes. The middleware used to create the grid infrastructure is the Globus Toolkit (Globus (2010)). The toolkit, as explained in Section 3.1.1, includes functionalities for security, information systems, resource and data management, node communication, fault detection, and portability, i.e., all the features needed to safely deploy a grid. The input data is a file that contains the mesh of the structure to analyze. It is split by the MN into smaller files, called blocks, which are distributed to the WNs. Since these blocks are generated according to the Domain Decomposition technique and are independent, they can be executed in parallel.

Fig. 6. GridCEM Architecture.


The size of the blocks is not uniform but depends on the geometry of the structure. Moreover, the MN hosts the Global Scheduler (GS), which represents the smart component of the grid, and it holds the information related to the grid status, sent by each Worker Node and stored in a database. A WN, instead, receives blocks, processes them, and sends the results back to the MN. The execution phase is managed by the Local Scheduler (LS), and the node status is monitored by an always-active agent. The agent monitors the availability of each service on the node and periodically sends its status to the database on the MN. The agent's role is crucial, since the decision to send a job to a node depends on the information it gathers: if all services are available, the node is in a condition to receive a job. All Worker Nodes run the same operating system and are equipped with the same software: the middleware, the monitoring agent, the Local Scheduler, and the code for execution.

**4.1 Global Scheduler**

As mentioned before, the Master Node hosts the brain of the grid, represented by the Global Scheduler. It is a software module, developed in Java, responsible for the distribution of the smaller parts of the input data (the blocks) to the Worker Nodes. It communicates with the other nodes through the grid services provided by the Globus Toolkit. The files are sent by the scheduler so as to balance the execution on the nodes: this is achieved by checking the WNs' availability and the number of files still to be sent. In order to know the overall status of the grid, the GS queries the database and transfers files only to nodes that have communicated their status within a specific time period. It is worth noting that this monitoring system, in which each node periodically pushes data into the database instead of the scheduler gathering information from each machine, reduces time and computational waste (Gradwell (2003)).
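A minimal sketch of this push-based eligibility check follows; the database schema, the 60-second window, and the function names are our illustrative assumptions, not the actual GridCEM code.

```python
import time

HEARTBEAT_WINDOW = 60.0   # seconds; placeholder value

def eligible_nodes(status_db, now=None):
    """status_db: node_id -> (last_heartbeat_ts, all_services_ok), as pushed
    by each worker's monitoring agent. A node may receive a job only if it
    reported all services available within the last window."""
    now = time.time() if now is None else now
    return [nid for nid, (ts, ok) in status_db.items()
            if ok and now - ts <= HEARTBEAT_WINDOW]
```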
