**3. Construction of Hamiltonian cycles in graphs of computer systems**

In this section, we consider algorithms for embedding the ring structures of parallel programs into distributed CSs that are based on recurrent neural networks, under the condition $n = |V_p| = |V_s|$. Such an embedding reduces to constructing a Hamiltonian cycle in the CS graph and is based on solving the traveling salesman problem over the matrix of distances $d_{ij}$, $i,j = 1,\dots,n$, between the CS graph nodes, with the distance between neighboring nodes of the CS graph taken as the unit distance.
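Throughout this section, $d_{ij}$ is the hop count between nodes $i$ and $j$, so neighboring nodes are at unit distance. Below is a minimal sketch of computing such a distance matrix by breadth-first search; the 2D-torus generator and all names are ours, given only for illustration:

```python
from collections import deque

def torus_2d(m):
    """Adjacency lists of an m x m 2D torus; node (x, y) has index x * m + y."""
    adj = {v: set() for v in range(m * m)}
    for x in range(m):
        for y in range(m):
            v = x * m + y
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                adj[v].add(((x + dx) % m) * m + (y + dy) % m)
    return adj

def distance_matrix(adj):
    """d[i][j] = number of hops between nodes i and j, found by BFS from each node."""
    n = len(adj)
    d = [[0] * n for _ in range(n)]
    for s in range(n):
        seen, queue = {s}, deque([(s, 0)])
        while queue:
            v, dist = queue.popleft()
            d[s][v] = dist
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append((w, dist + 1))
    return d
```

For example, `distance_matrix(torus_2d(4))` yields the 16 × 16 matrix of hop counts for the 2D-torus with $n = 16$.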

The traveling salesman problem can be formulated as an assignment problem (Wang, 1993; Siqueira, Steiner & Scheer, 2007, 2010)

$$\min \sum_{i=1}^{n} \sum_{j \neq i} c_{ij} x_{ij} \tag{11}$$

under the constraints


$$x_{ij} \in \{0,1\}, \qquad \sum_{i=1}^{n} x_{ij} = 1, \; j = 1,\dots,n, \qquad \sum_{j=1}^{n} x_{ij} = 1, \; i = 1,\dots,n. \tag{12}$$

Here, $c_{ij}$, $i \neq j$, is the cost of assigning element $i$ to position $j$, which corresponds to the traveling salesman moving from city $i$ to city $j$; $x_{ij}$ is the decision variable: $x_{ij} = 1$ if element $i$ is assigned to position $j$, and $x_{ij} = 0$ otherwise.
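As a small aid to reading (11)−(12), the sketch below evaluates the objective and tests feasibility; the function names are ours, and the diagonal of $c$ is assumed to be excluded from the search (e.g., by assigning $c_{ii}$ a prohibitively large value):

```python
import numpy as np

def tour_cost(c, x):
    """Objective (11): total assignment cost under the 0/1 decision matrix x."""
    return float((np.asarray(c) * np.asarray(x)).sum())

def is_assignment(x):
    """Constraints (12): x is 0/1 with exactly one 1 in every row and column."""
    x = np.asarray(x)
    return (np.isin(x, (0, 1)).all()
            and (x.sum(axis=0) == 1).all()
            and (x.sum(axis=1) == 1).all())
```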

For solving problem (11) − (12), J. Wang proposed a recurrent neural network that is described by the differential equation

$$\frac{du_{ij}(t)}{dt} = -C\left(\sum_{k=1}^{n} x_{ik}(t) + \sum_{l=1}^{n} x_{lj}(t) - 2\right) - D\,c_{ij}\exp(-t/\tau), \tag{13}$$

where $x_{ij} = g(u_{ij}(t))$ and $g(u) = 1/\left(1+\exp(-\beta u)\right)$. A difference approximation of Equation (13) yields

$$u_{ij}^{t+1} = u_{ij}^{t} - \Delta t \left[ C\left(\sum_{k=1}^{n} x_{ik}(t) + \sum_{l=1}^{n} x_{lj}(t) - 2\right) + D\,c_{ij}\exp\left(-\frac{t}{\tau}\right) \right] \tag{14}$$

Here $C$, $D$, $\Delta t$, $\tau$, and $\beta$ are parameters of the neural network.

We note that in experiments we frequently obtained incorrect solutions when, for a given maximal number of iterations $t_{\max}$ (for example, $t_{\max} = 10000$), the condition of convergence

$$\max_{x,i}\left|u_{xi}^{t+1} - u_{xi}^{t}\right| \leq \varepsilon, \qquad \varepsilon = 0.01,$$

is not satisfied. The introduction of the factor $\exp(-t/\tau)$ accelerates the convergence of the recurrent neural network, and the number of incorrect solutions is reduced. Thus, for the three-dimensional torus with $n = 3^3 = 27$ nodes and $p = n$, $K = 3$, $D = 4096$, $\Delta t = 0.1$, in 100 experiments we have the following results:

1. On the Hopfield network (9) we have 23 incorrect solutions, 43 solutions with Rank 25, and 34 optimal solutions (with Rank 26) (Fig. 9).
2. On the Wang network (10) with the same parameters and $\tau = 500$ we have all (100) correct solutions, of which 27 solutions have Rank 25 and 73 are optimal (with Rank 26) (Fig. 10).

Fig. 9. Histogram of mappings for the Hopfield network ($n = 3^3 = 27$)

Fig. 10. Histogram of mappings for the Wang network ($n = 3^3 = 27$)
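One Euler step of system (14) takes a few lines of NumPy. This is a sketch under our own naming; the default parameter values are the order-of-magnitude choices discussed below, and $\beta = 1$ is an assumption:

```python
import numpy as np

def wang_step(u, c, t, dt=0.1, C=10.0, D=1.0, tau=1000.0, beta=1.0):
    """One Euler step (14) of Wang's network; u and c are n x n arrays."""
    x = 1.0 / (1.0 + np.exp(-beta * u))                 # x_ij = g(u_ij)
    # Entry (i, j) of `viol` is the constraint term: row sum + column sum - 2.
    viol = x.sum(axis=1, keepdims=True) + x.sum(axis=0, keepdims=True) - 2.0
    return u + dt * (-C * viol - D * c * np.exp(-t / tau))
```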

Siqueira et al. proposed a method of accelerating the solution of the system (14), which is based on the WTA ("Winner takes all") principle. The algorithm proposed below was developed on the basis of this method.

1. A matrix $x_{ij}(0)$ of random values $x_{ij}(0) \in [0,1]$ is generated. Iterations (14) are performed until the following inequality is satisfied for all $i,j = 1,\dots,n$:

$$\left|\sum_{k=1}^{n} x_{ik}(t) + \sum_{l=1}^{n} x_{lj}(t) - 2\right| \leq \delta,$$

where $\delta$ is the specified accuracy of satisfying constraints (12).

2. Transformation of the resultant decision matrix $x_{ij}$ is performed:

2.1. Set $i = 1$.

2.2. The maximum element $x_{i,j_{\max}}$ is sought in the $i$th row of the matrix ($j_{\max}$ is the number of the column with the maximum element).

2.3. The transformation $x_{i,j_{\max}} = 1$ is performed. All the remaining elements of the $i$th row and of the column numbered $j_{\max}$ are set to zero. Then, there follows a transition to the row numbered $j_{\max}$. Steps 2.2 and 2.3 are repeated until the cycle returns to the first row, which means that the cycle construction is finalized.

3. If the cycle returns to row 1 before the value 1 has been assigned to $n$ elements of the matrix $x_{ij}$, the length of the constructed cycle is smaller than $n$. In this case, steps 1 and 2 are repeated (step 2 is sketched in code after this list).
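A sketch of the decoding in steps 2.1−2.3 and the length check of step 3 (the function name and the convention of returning `None` on an early-closing cycle are ours):

```python
import numpy as np

def wta_decode(x):
    """Extract a cycle from the relaxed matrix x by following row maxima.
    Returns the visiting order of length n, or None if the cycle closes early."""
    x = np.array(x, dtype=float)
    n = len(x)
    order, i = [0], 0                # step 2.1: start from the first row
    for _ in range(n):
        j = int(np.argmax(x[i]))     # step 2.2: column j_max of the row maximum
        x[i, :] = 0.0                # step 2.3: zero the i-th row and the
        x[:, j] = 0.0                #   j_max-th column (x[i, j_max] = 1 is
        if j == 0:                   #   recorded implicitly in `order`)
            return order if len(order) == n else None   # step 3: restart if short
        order.append(j)
        i = j                        # transition to the row numbered j_max
    return None
```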

To ensure effective operation of the algorithm of Hamiltonian cycle construction, the following values of the parameters of system (14) were chosen experimentally (by the order of magnitude): $D = 1$, $C = 10$, $\tau = 1000$, $\Delta t = 0.1$. Significant deviations of these parameters from the above-indicated values deteriorate algorithm operation, namely:

1. Deviations of the parameter $C$ from the indicated value (at a fixed value of $D$) deteriorate the solution quality (the cycle length increases).
2. A decrease in $\tau$ deteriorates the solution quality. A decrease in $\Delta t$ increases the number of iterations (14).
3. An increase in $\Delta t$ increases the number of non-Hamiltonian ring-shaped routes.
It follows from (Feng & Douligeris, 2001) that $\Delta t \leq \Delta t_{\max}$, where $\Delta t_{\max} = C^{-1} = 10^{-1} = 0.1$.

The experiments show that it is not always possible to construct a Hamiltonian cycle at $\Delta t = 1$, but cycle construction is successfully finalized if the step $\Delta t$ is reduced. We halved the step ($\Delta t \leftarrow \Delta t/2$) if a correct cycle could not be constructed after ten attempts.
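A sketch of this retry policy; `run_attempt` stands for one full run of iterations (14) followed by the WTA decoding, and the driver's name and arguments are ours:

```python
def construct_with_halving(run_attempt, dt=1.0, max_halvings=20):
    """Halve the step whenever ten attempts at the current dt all fail."""
    for _ in range(max_halvings):
        for _ in range(10):
            cycle = run_attempt(dt)
            if cycle is not None:    # a correct cycle was constructed
                return cycle, dt
        dt /= 2.0                    # dt <- dt / 2
    return None, dt
```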

The parameters $c_{ij}$, $i \neq j$, are calculated by formula (7), where $d_{ij}$ is the distance between the nodes $i$ and $j$ of the graph, and $p > 1$ is the penalty coefficient applied if the distance $d_{ij}$ exceeds 1. The penalty coefficient was introduced to ensure that transitions in the travelling salesman cycle coincide with edges of the CS graph.
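Formula (7) itself is given in an earlier section and is not reproduced here; the sketch below only illustrates the role of $p$, assuming a simple multiplicative penalty on non-unit distances:

```python
def penalized_costs(d, p):
    """Costs c_ij from hop distances d_ij: unit hops are kept cheap, longer
    hops are penalized by the factor p > 1 (an illustrative form, not (7))."""
    n = len(d)
    return [[d[i][j] * (p if d[i][j] > 1 else 1.0) for j in range(n)]
            for i in range(n)]
```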

We studied the use of iterative methods (Jacobi, Gauss–Seidel, and successive overrelaxation (SOR) methods (Ortega, 1988)) in solving Wang's system of equations. With the notation

$$r_{ij}^{t} = -C\left(\sum_{k=1}^{n} x_{ik}^{t} + \sum_{l=1}^{n} x_{lj}^{t} - 2\right) - D\,c_{ij}\exp\left(-t/\tau\right),$$

the Jacobi method (method of simple iterations) of solving system (14) has the form

$$\begin{aligned} 1.\quad & u_{ij}^{t+1} = u_{ij}^{t} + \Delta t \cdot r_{ij}^{t}, \quad i,j = 1,\dots,n; \\ 2.\quad & x_{ij}^{t+1} = g\left(u_{ij}^{t+1}\right), \quad g(u) = \frac{1}{1+\exp(-\beta u)}, \quad i,j = 1,\dots,n. \end{aligned}$$

According to this method, new values $x_{ij}^{t+1}$, $i,j = 1,\dots,n$, are calculated only after all values $u_{ij}^{t+1}$, $i,j = 1,\dots,n$, are found. In contrast to the method of simple iterations, the new value of $x_{ij}^{t+1}$ in the Gauss–Seidel method is calculated immediately after finding the corresponding value of $u_{ij}^{t+1}$:

$$u_{ij}^{t+1} = u_{ij}^{t} + \Delta t \cdot r_{ij}^{t}, \qquad x_{ij}^{t+1} = g\left(u_{ij}^{t+1}\right), \quad i,j = 1,\dots,n.$$

In the SOR method, the calculations are performed by the formulas

$$\begin{aligned} u_{\mathrm{new}} &= u_{ij}^{t} + \Delta t \cdot r_{ij}^{t}, \\ u_{ij}^{t+1} &= \omega \cdot u_{\mathrm{new}} + (1-\omega)\cdot u_{ij}^{t}, \quad \omega \in (0,2), \\ x_{ij}^{t+1} &= g\left(u_{ij}^{t+1}\right), \quad i,j = 1,\dots,n. \end{aligned}$$

With $\omega = 1$, the SOR method turns into the Gauss–Seidel method.
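A sketch of one SOR sweep under these formulas (naming is ours; $\omega = 1$ reproduces Gauss–Seidel, whereas a Jacobi step would refresh the whole matrix $x$ only after the sweep):

```python
import numpy as np

def sor_sweep(u, c, t, dt, omega, C=10.0, D=1.0, tau=1000.0, beta=1.0):
    """One SOR sweep over all neurons; x_ij is refreshed right after u_ij."""
    x = 1.0 / (1.0 + np.exp(-beta * u))
    n = u.shape[0]
    for i in range(n):
        for j in range(n):
            r = (-C * (x[i, :].sum() + x[:, j].sum() - 2.0)
                 - D * c[i, j] * np.exp(-t / tau))
            u_new = u[i, j] + dt * r
            u[i, j] = omega * u_new + (1.0 - omega) * u[i, j]
            x[i, j] = 1.0 / (1.0 + np.exp(-beta * u[i, j]))  # immediate update
    return u, x
```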

Experiments on 2D-tori with the group of automorphisms $E = C_m \times C_m$, $n = m^2$, show that the Jacobi method can only be used for tori with a small number of nodes ($m \in \{3, 4\}$).

The SOR method can be used for tori with $m \in \{3, 4, 6\}$ with appropriate selection of the parameter $\omega < 1$. For $m \geq 8$, it is reasonable to use the Gauss–Seidel method ($\omega = 1$). Figure 11 shows an example of a Hamiltonian cycle constructed by a neural network in a 2D-mesh with $n = 16$ (the cycle is indicated by the bold line).

Fig. 11. Example of a Hamiltonian cycle in a 2D-mesh

In our experiments (100 runs per configuration), we obtained Hamiltonian cycles (of length $L = n$) in 2D-meshes and 2D-tori with up to $n = 1024$ nodes for $m = 2k$, and suboptimal cycles of length $L = n + 1$ at $m = 2k + 1$, $k = 2, 3, \dots, 16$. The penalty coefficients $p$ and the values of $\Delta t$ with which the Hamiltonian cycles were constructed for $n = 16, 64, 256$, and 1024, and also the times of algorithm execution on a Pentium(R) Dual-Core CPU E5200, 2.5 GHz (a time equal to zero means that the standard procedures could not register times shorter than 0.015 s), are listed in Tables 1 and 2.

| $n$ | $4^2 = 16$ | $8^2 = 64$ | $16^2 = 256$ | $32^2 = 1024$ |
|---|---|---|---|---|
| $\Delta t$ | 1 | 1 | 1 | 0.5 |
| $p = n/2$ | 8 | 32 | 128 | 512 |
| Time, s | 0 | 0.015 | 0.75 | 73.36 |

Table 1. 2D-mesh

| $n$ | $4^2 = 16$ | $8^2 = 64$ | $16^2 = 256$ | $32^2 = 1024$ |
|---|---|---|---|---|
| $\Delta t$ | 1 | 1 | 1 | 0.5 |
| $p = n$ | 16 | 64 | 256 | 1024 |
| Time, s | 0 | 0 | 4.36 | 73.14 |

Table 2. 2D-torus

In addition to the quantities listed in Tables 1 and 2, Tables 3 and 4 give the relative increase $\varepsilon = (L - n)/n$ of the travelling salesman cycle length over the Hamiltonian cycle length for a 3D-torus and a hypercube.

| $n$ | $4^3 = 64$ | $6^3 = 216$ | $8^3 = 512$ | $10^3 = 1000$ |
|---|---|---|---|---|
| $\Delta t$ | 0.012 | 0.1 | 0.1 | 0.1 |
| $p$ | 64 | 32 | 32 | 32 |
| $L$ | 64 | 218 | 520 | 1010 |
| $\varepsilon$ | 0 | 0.01 | 0.016 | 0.01 |
| Time, s | 0.625 | 0.313 | 12.36 | 97.81 |

Table 3. 3D-torus

| $n$ | 16 | 64 | 256 | 1024 |
|---|---|---|---|---|
| $\Delta t$ | 0.1 | 0.1 | 0.1 | 0.1 |
| $p$ | 32 | 32 | 32 | 32 |
| $L$ | 16 | 64 | 262 | 1034 |
| $\varepsilon$ | 0 | 0 | 0.016 | 0.023 |
| Time, s | 0 | 0.062 | 99.34 | 1147 |

Table 4. Hypercube

It follows from Tables 3 and 4 that:

1. In 3D-tori, the Hamiltonian cycle was constructed for $n = 64$. With $n = 216$, 512, and 1000, suboptimal cycles were constructed, which were longer than the Hamiltonian cycles by no more than 1.6%.
2. In hypercubes, the Hamiltonian cycles were constructed for $n = 16$ and 64 (it should be noted that the hypercube with $n = 16$ is isomorphic to the 2D-torus with $n = 16$). For $n = 256$ and $n = 1024$, suboptimal cycles were constructed, which were longer than $n$ by no more than 2.3%.

**3.1 Construction of Hamiltonian cycles in toroidal graphs with edge defects**

The capability of recurrent neural networks to converge to stable states can be used for mapping program graphs onto CS graphs with violations of regularity caused by deletion of edges and/or nodes. Such violations of regularity are called defects. In this work, we study the construction of Hamiltonian cycles in toroidal graphs with edge defects. Experiments were conducted in 2D-tori with a deleted edge and with $n = 9$ to $n = 256$ nodes for $p = n$. The experiments show that the construction of Hamiltonian cycles in these graphs by the above-described algorithm is possible, but the value of the step $\Delta t$ at which the cycle can be constructed depends on the choice of the deleted edge. The method of automatic selection of the step $\Delta t$ is described above. Table 5 illustrates the dependence of the step $\Delta t$ on the choice of the deleted edge in constructing an optimal Hamiltonian cycle by the SOR method in a 2D-torus with $n = 16$ nodes for $\omega = 0.125$.

Examples of Hamiltonian cycles constructed by the SOR method in a 2D-torus with $n = 16$ nodes are given in Figs. 12a and 12b. Figure 12a shows the cycle constructed in the torus without edge defects for $\omega = 0.5$ and $\Delta t = 0.25$. Figure 12b shows the cycle constructed in the torus with a deleted edge (0, 12) for $\omega = 0.125$ and $\Delta t = 0.008$.


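A possible way to set up such an experiment, reusing `torus_2d` and `distance_matrix` from the sketch at the start of this section (the edge-removal helper and the example are ours):

```python
def delete_edge(adj, a, b):
    """Introduce an edge defect: remove the edge (a, b) from the adjacency
    lists; all pairwise distances must then be recomputed (e.g., by BFS)."""
    adj[a].discard(b)
    adj[b].discard(a)
    return adj

# Example: a 4 x 4 torus (n = 16) with the edge (0, 12) deleted.
# adj = delete_edge(torus_2d(4), 0, 12)
# d = distance_matrix(adj)  # then obtain the costs c_ij via formula (7)
```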