**2. Mapping graphs of parallel programs onto graphs of distributed computer systems by recurrent neural networks**

Let us consider a matrix *v* of neurons of size $n \times n$, where each row of the matrix corresponds to a branch of a parallel program and each column corresponds to an EC. Each row and each column of the matrix *v* must contain exactly one nonzero entry equal to one; all other entries must be zero. Let the distance between neighboring nodes of the CS graph be taken as a unit distance, and let $d_{ij}$ be the length of the shortest path between nodes *i* and *j* in the CS graph. Then we define the energy of the corresponding Hopfield neural network by the Lyapunov function

$$\begin{aligned} L &= C \cdot L_c + D \cdot L_d, \\ L_c &= \frac{1}{2} \left[ \sum_{x} \left( \sum_{j} v_{xj} - 1 \right)^2 + \sum_{i} \left( \sum_{y} v_{yi} - 1 \right)^2 \right], \\ L_d &= \frac{1}{2} \sum_{x} \sum_{i} \sum_{y \in \mathrm{Nb}_p(x)} \sum_{j \neq i} v_{xi} v_{yj} d_{ij}. \end{aligned} \tag{1}$$

Here $d_{ij}$ is the distance between nodes *i* and *j* of the system graph corresponding to adjacent nodes of the program graph (a "dilation" of an edge of the program graph on the system graph), and $\mathrm{Nb}_p(x)$ is the neighborhood of the node *x* in the program graph.
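For illustration, the distances $d_{ij}$ can be precomputed once from the CS graph by breadth-first search; the following is a minimal sketch assuming the CS graph is given as an adjacency list (the helper names and the torus generator are illustrative, not from the chapter):

```python
from collections import deque

def shortest_path_distances(adj):
    """All-pairs shortest-path lengths (in hops) via BFS from every node.

    adj: dict mapping a CS-graph node to the list of its neighbours.
    Returns d with d[i][j] = length of the shortest path between i and j.
    """
    nodes = list(adj)
    d = {i: {} for i in nodes}
    for s in nodes:
        d[s][s] = 0
        queue = deque([s])
        while queue:
            i = queue.popleft()
            for j in adj[i]:
                if j not in d[s]:
                    d[s][j] = d[s][i] + 1
                    queue.append(j)
    return d

def torus_2d(l):
    """Adjacency list of an l x l 2D torus, used here as an example CS graph."""
    return {r * l + c: [((r + 1) % l) * l + c, ((r - 1) % l) * l + c,
                        r * l + (c + 1) % l, r * l + (c - 1) % l]
            for r in range(l) for c in range(l)}
```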

The value $v_{xi}$ is the state of the neuron in row *x* and column *i* of the matrix *v*; *C* and *D* are parameters of the Lyapunov function. $L_c$ is minimal when each row and each column of *v* contains exactly one unity entry (all other entries are zero). Such a matrix *v* is a correct solution of the mapping problem (Fig. 3).


Fig. 3. Example of correct matrix of neuron states

The minimum of $L_d$ provides the minimum of the sum of distances between adjacent nodes of $G_p$ mapped onto nodes of the system graph $G_s$ (Fig. 4).
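For concreteness, the energy (1) can be evaluated directly for a candidate state matrix; this is a minimal sketch assuming a numpy state matrix, a distance matrix, and a program-graph neighbour list (the function name is an assumption):

```python
import numpy as np

def lyapunov_energy(v, d, nb_p, C, D):
    """Energy (1): L = C*Lc + D*Ld.

    v    : v[x, i] - state of the neuron in row x, column i (n x n array)
    d    : d[i, j] - shortest-path distance between system-graph nodes i and j
    nb_p : nb_p[x] - list of program-graph neighbours of branch x
    """
    n = v.shape[0]
    Lc = 0.5 * (np.sum((v.sum(axis=1) - 1.0) ** 2) +
                np.sum((v.sum(axis=0) - 1.0) ** 2))
    Ld = 0.0
    for x in range(n):
        for i in range(n):
            for y in nb_p[x]:
                for j in range(n):
                    if j != i:
                        Ld += v[x, i] * v[y, j] * d[i, j]
    return C * Lc + D * 0.5 * Ld
```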

The Hopfield network minimizing the function (1) is described by the equation

$$\frac{\partial u_{xi}}{\partial t} = -\frac{\partial L}{\partial v_{xi}} \tag{2}$$

where $u_{xi}$ is the activation of the neuron with indices *x*, *i* ($x, i = 1, \ldots, n$),

$$v_{xi} = \frac{1}{1 + \exp\left(-\beta u_{xi}\right)}$$

is the neuron state (output signal), and $\beta$ is the activation parameter.

Fig. 4. Example of optimal mapping of "line"-graph onto torus (the mapping is distinguished by bold lines; the line-graph's node numbers are shown in brackets)

From (1) and (2) we have


$$\frac{\partial u_{xi}}{\partial t} = -C \left(\sum_{j} v_{xj} + \sum_{y} v_{yi} - 2\right) - D \sum_{y \in \mathrm{Nb}_p(x)} \sum_{j \neq i} v_{yj} d_{ij}. \tag{3}$$

A difference approximation of Equation (3) yields

$$u_{xi}^{t+1} = u_{xi}^{t} - \Delta t \cdot \left[ C \left( \sum_{j} v_{xj} + \sum_{y} v_{yi} - 2 \right) + D \sum_{y \in \mathrm{Nb}_p(x)} \sum_{j \neq i} v_{yj} d_{ij} \right], \tag{4}$$

where $\Delta t$ is the time step. The initial values $u_{xi}^{0}$ ($x, i = 1, \ldots, n$) are set randomly.

The choice of the parameters $\beta$, $\Delta t$, *C*, *D* determines the quality of the solution *v* of Equation (4). In accordance with (Feng & Douligeris, 2001), for the problem (1)–(4) a necessary condition of convergence is

$$C > D \frac{f_{\min}}{2(1 - \alpha)}, \tag{5}$$

where $f_{\min} = \min\limits_{x,i} \sum_{y \in \mathrm{Nb}_p(x)} \sum_{j \neq i} v_{yj} d_{ij}$ and $\alpha \in [0,1)$ is a value close to 1. For a parallel program graph of the line type, $f_{\min} = 1$. For example, taking $\alpha = 0.995$ for the line we obtain $C > 100D$.

From (4) and (5) it follows that the parameters $\Delta t$ and *D* influence the solution of Equation (4) equally. Therefore we set $\Delta t = 1$ and obtain the equation

$$u_{xi}^{t+1} = u_{xi}^{t} - C \cdot \left(\sum_{j} v_{xj} + \sum_{y} v_{yi} - 2\right) - D \cdot \sum_{y \in \mathrm{Nb}_p(x)} \sum_{j \neq i} v_{yj} d_{ij}. \tag{6}$$

Let $\beta = 0.1$ (this value was chosen experimentally). We will try to choose the value of *D* that provides the absence of incorrect solutions.
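A minimal sketch of the iteration (6) might look as follows. The random initialization range is an assumption; the stopping rule reuses the convergence criterion quoted later in this section ($\max_{x,i}|u_{xi}^{t+1} - u_{xi}^{t}| < 0.01$ within $t_{\max} = 10000$ iterations):

```python
import numpy as np

def hopfield_mapping(d, nb_p, C, D, beta=0.1, t_max=10000, eps=0.01, seed=0):
    """Iterate Equation (6); rounding the returned v gives the candidate mapping."""
    n = d.shape[0]
    rng = np.random.default_rng(seed)
    u = rng.uniform(-0.1, 0.1, size=(n, n))          # u_xi^0 set randomly
    for _ in range(t_max):
        v = 1.0 / (1.0 + np.exp(-beta * u))          # neuron states
        # constraint term: row sum for branch x plus column sum for node i, minus 2
        constr = v.sum(axis=1, keepdims=True) + v.sum(axis=0, keepdims=True) - 2.0
        # distance term: program-graph neighbours y of x, columns j != i
        dist = np.zeros((n, n))
        for x in range(n):
            for y in nb_p[x]:
                dist[x] += v[y] @ d.T - v[y] * np.diag(d)
        u_new = u - C * constr - D * dist
        if np.max(np.abs(u_new - u)) < eps:
            u = u_new
            break
        u = u_new
    return 1.0 / (1.0 + np.exp(-beta * u))
```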

#### **2.1 Mapping by the Hopfield network**

Let us evaluate the mapping quality by the number of coincidences of program graph edges with edges of the system graph. We call this number the mapping rank. The mapping rank is an approximate evaluation of the mapping quality because mappings with different dilations of the edges of the program graph may have the same rank. Nevertheless, the maximum rank value, which equals the number $E_p$ of edges of the program graph, corresponds to an optimal mapping, i.e. to a global minimum of $L_d$ in (1). Our objective is to determine the mapping algorithm parameters providing the maximum probability of an optimal mapping.
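A minimal sketch of this rank computation, assuming the mapping has already been read off the rounded matrix *v* as an assignment of program nodes to system nodes (the function and argument names are illustrative):

```python
def mapping_rank(program_edges, system_edges, assign):
    """Number of program-graph edges whose endpoints land on a system-graph edge.

    program_edges : iterable of pairs (x, y) of adjacent program-graph nodes
    system_edges  : iterable of pairs (i, j) of adjacent system-graph nodes
    assign        : assign[x] = i, the column of the single 1 in row x of v
    """
    sys_edges = {frozenset(e) for e in system_edges}
    return sum(frozenset((assign[x], assign[y])) in sys_edges
               for x, y in program_edges)
```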


As an example for the investigation of the mapping algorithm we consider the mapping of a line-type program graph onto a 2D-torus. The maximal value of the mapping rank for a line with *n* nodes is obviously $n - 1$.

For the experimental investigation of the mapping quality, histograms of the mapping rank frequencies over 100 experiments are used. The experiments map the line onto 2D-tori with the number of nodes $n = l^2$, $l = 3, 4$, where *l* is the order of the cyclic subgroup.

For $D = 8$ correct solutions are obtained for $n = 9$ and $n = 16$, but, as follows from Fig. 5a and Fig. 5b, for $D = 8$ the number of solutions with an optimal mapping, corresponding to the maximal mapping rank, is small.

Fig. 5. Histograms of mappings for the neural network (6): a) $n = 9$; b) $n = 16$

To increase the frequency of optimal solutions of Equation (6), we replace the distance values $d_{ij}$ by the values

$$\mathbf{c}\_{ij} = \begin{cases} d\_{ij} & d\_{ij} = 1 \\ p \cdot d\_{ij} & d\_{ij} > 1 \end{cases} \tag{7}$$

where *p* is a penalty coefficient for a distance $d_{ij}$ exceeding the value 1, i.e. for non-coincidence of an edge of the program graph with an edge of the system graph. So we obtain the equation

$$u_{xi}^{t+1} = u_{xi}^{t} - C \cdot \left(\sum_{j} v_{xj} + \sum_{y} v_{yi} - 2\right) - D \cdot \sum_{y \in \mathrm{Nb}_p(x)} \sum_{j \neq i} v_{yj} c_{ij}. \tag{8}$$
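The penalty (7) is a one-line transformation of the precomputed distance matrix; a minimal numpy sketch (the function name is an assumption, and the choice $p = n$ follows the experiments below):

```python
import numpy as np

def penalized_distances(d, p):
    """Equation (7): keep unit distances, multiply larger ones by the penalty p."""
    return np.where(d > 1, p * d, d).astype(float)

# usage sketch: c = penalized_distances(d, p=n); then iterate (8) with c in place of d
```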

For the above mappings with $p = n$ we obtain the histograms shown in Fig. 6a and Fig. 6b. These histograms indicate an improvement of the mapping quality, but for $n = 16$ suboptimal solutions with rank 13 have the maximal frequency.

Fig. 6. Histograms of mappings for the neural network (8): a) $n = 9$; b) $n = 16$

#### **2.2 Splitting method**

To decrease the number of local extrema of Function (1), we partition the set $\{1, 2, \ldots, n\}$ of subscripts *x* and *i* of the variables $v_{xi}$ into *K* sets

$$I_k = \left\{ (k-1)q, \ (k-1)q + 1, \ldots, k \cdot q \right\}, \quad q = n / K, \quad k = 1, 2, \ldots, K,$$

and map the subscripts $x \in I_k$ only to the subscripts $i \in I_k$, i.e. we reduce the solution matrix *v* to a block-diagonal form (Fig. 7); the Hopfield network is then described by the equation

$$\begin{split} u_{xi}^{t+1} &= u_{xi}^{t} - C \cdot \left( \sum_{j \in I_k} v_{xj} + \sum_{y \in I_k} v_{yi} - 2 \right) - D \cdot \sum_{y \in \mathrm{Nb}_p(x)} \sum_{j \neq i} v_{yj} c_{ij}, \\ v_{xi} &= \frac{1}{1 + \exp\left( -\beta u_{xi} \right)}, \quad x, i \in I_k, \ k = 1, 2, \dots, K. \end{split} \tag{9}$$

In this case $v_{xi} = 0$ for $x \in I_k$, $i \notin I_k$, $k = 1, 2, \ldots, K$.


$$v = \begin{pmatrix} v_{00} & v_{01} & v_{02} & 0 & 0 & 0 \\ v_{10} & v_{11} & v_{12} & 0 & 0 & 0 \\ v_{20} & v_{21} & v_{22} & 0 & 0 & 0 \\ 0 & 0 & 0 & v_{33} & v_{34} & v_{35} \\ 0 & 0 & 0 & v_{43} & v_{44} & v_{45} \\ 0 & 0 & 0 & v_{53} & v_{54} & v_{55} \end{pmatrix}$$

Fig. 7. Example of block-diagonal solution matrix for $K = 2$

In this approach, which we call splitting, for mapping a line with $n = 16$ nodes onto the 2D-torus we obtain, for $K = 2$, the histogram presented in Fig. 8a.
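A minimal sketch of the splitting restriction, assuming equal-sized blocks and the numpy representation used above (the masking approach is an illustration, not the authors' code):

```python
import numpy as np

def block_mask(n, K):
    """Boolean mask selecting the K diagonal blocks of an n x n matrix."""
    q = n // K
    mask = np.zeros((n, n), dtype=bool)
    for k in range(K):
        mask[k * q:(k + 1) * q, k * q:(k + 1) * q] = True
    return mask

# usage sketch: inside the iteration (9),
#   mask = block_mask(n, K)
#   v = (1.0 / (1.0 + np.exp(-beta * u))) * mask   # forces v_xi = 0 across blocks
```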

From Fig. 6b and Fig. 8a we see that the splitting method essentially increases the frequency of optimal mappings. Increasing the parameter *D* up to the value $D = 32$ results in an additional increase of the frequency of optimal mappings (Fig. 8b).

Fig. 8. Histograms of mappings for the neural network (9): a) $n = 16$, $K = 2$, $D = 8$; b) $n = 16$, $K = 2$, $D = 32$

#### **2.3 Mapping by the Wang network**

In a recurrent Wang neural network (Wang, 1993; Hung & Wang, 2003), $L_d$ in Expression (1) is multiplied by the factor $\exp(-t/\tau)$, where $\tau$ is a parameter. For the Wang network, Equation (9) is reduced to

$$\begin{split} u_{xi}^{t+1} &= u_{xi}^{t} - C \cdot \left( \sum_{j \in I_k} v_{xj} + \sum_{y \in I_k} v_{yi} - 2 \right) - D \cdot \sum_{y \in \mathrm{Nb}_p(x)} \sum_{j \neq i} v_{yj} c_{ij} \exp\left(-\frac{t}{\tau}\right), \\ v_{xi} &= \frac{1}{1 + \exp\left( -\beta u_{xi} \right)}, \quad x, i \in I_k, \ k = 1, 2, \dots, K. \end{split} \tag{10}$$
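Relative to the Hopfield iteration (9), the only change is the decaying factor on the distance term; a minimal sketch of that modification ($\tau$ is left as a parameter, since its value is not fixed here):

```python
import numpy as np

def wang_distance_term(dist, t, tau):
    """Distance term of Equation (10): the term of (9) damped by exp(-t/tau)."""
    return dist * np.exp(-t / tau)

# usage sketch: at iteration t of the loop,
#   u = u - C * constr - D * wang_distance_term(dist, t, tau)
```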

We note that in the experiments we frequently obtain incorrect solutions if, for a given maximal number of iterations $t_{\max}$ (for example, $t_{\max} = 10000$), the condition of convergence $\max_{x,i} \left| u_{xi}^{t+1} - u_{xi}^{t} \right| < \varepsilon$, $\varepsilon = 0.01$, is not satisfied. The introduction of the factor $\exp(-t/\tau)$ accelerates the convergence of the recurrent neural network, and the number of incorrect solutions is reduced.

So, for the three-dimensional torus with $n = 3^3 = 27$ nodes and $p = n$, $K = 3$, $D = 4096$, $\beta = 0.1$, in 100 experiments we have the following results:

Fig. 9. Histogram of mappings for the Hopfield network ($n = 3^3 = 27$)

Fig. 10. Histogram of mappings for the Wang network ($n = 3^3 = 27$)

As a result we have a high frequency of optimal solutions (for 100 experiments):

1. more than 80% for the two-dimensional tori ($n = 3^2 = 9$ and $n = 4^2 = 16$);
2. more than 70% for the three-dimensional torus ($n = 3^3 = 27$).

Further investigations must be directed to increasing the probability of obtaining optimal solutions of the mapping problem when the number of parallel program nodes is increased.

**3. Construction of Hamiltonian cycles in graphs of computer systems**

In this section, we consider algorithms for nesting ring structures of parallel programs into distributed CSs, based on recurrent neural networks, under the condition $n = |V_p| = |V_s|$. Such nesting reduces to constructing a Hamiltonian cycle in the CS graph and is based on solving the traveling salesman problem using the matrix of distances $d_{ij}$ ($i, j = 1, \ldots, n$) between the CS graph nodes, with the distance between neighboring nodes of the CS graph taken as a unit distance.

The traveling salesman problem can be formulated as an assignment problem (Wang, 1993; Siqueira, Steiner & Scheer, 2007, 2010)

$$\min \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{ij} \tag{11}$$

under the constraints

$$\sum_{i=1}^{n} x_{ij} = 1, \ j = 1, \ldots, n, \qquad \sum_{j=1}^{n} x_{ij} = 1, \ i = 1, \ldots, n, \qquad x_{ij} \in \{0, 1\}. \tag{12}$$

Here $c_{ij}$, $i \neq j$, is the cost of assignment of the element *i* to the position *j*, which corresponds to the motion of the traveling salesman from the city *i* to the city *j*; $x_{ij}$ is the decision variable: if the element *i* is assigned to the position *j*, then $x_{ij} = 1$, otherwise $x_{ij} = 0$.

For solving problem (11)–(12), J. Wang proposed a recurrent neural network that is described by the differential equation

$$\frac{d u_{ij}(t)}{d t} = -C \left( \sum_{k} x_{ik}(t) + \sum_{l} x_{lj}(t) - 2 \right) - D c_{ij} \exp(-t/\tau), \tag{13}$$

where $x_{ij} = g(u_{ij}(t))$ and $g(u) = 1 / \left( 1 + \exp(-\beta u) \right)$. A difference approximation of Equation (13) yields
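The chapter's own difference scheme for (13) continues beyond this excerpt; purely as an illustration, one possible Euler step of (13) with the output function *g*, together with a tour read-out, is sketched below (the step size, $\tau$ and the read-out rule are assumptions):

```python
import numpy as np

def wang_tsp_step(u, c, C, D, t, tau, dt=1.0, beta=1.0):
    """One Euler step of Equation (13) for the assignment variables x = g(u)."""
    x = 1.0 / (1.0 + np.exp(-beta * u))                      # x_ij = g(u_ij)
    constr = x.sum(axis=1, keepdims=True) + x.sum(axis=0, keepdims=True) - 2.0
    return u + dt * (-C * constr - D * c * np.exp(-t / tau))

def tour_from_assignment(x):
    """Read a tour from the assignment matrix: position j gets city argmax_i x_ij."""
    return list(np.argmax(x, axis=0))
```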
