**2. Relevant related work**

Because of the importance of constructing (near) optimal CAs, much research has been devoted to developing effective methods for constructing them. Several kinds of methods have been reported for constructing these combinatorial models, among them: (a) direct methods, (b) recursive methods, (c) greedy methods, and (d) metaheuristic methods. In this section we describe the related work most relevant to the construction of CAs.

Direct methods construct CAs in polynomial time, and some of them employ graph or algebraic properties. The covering array number can be found with polynomial-order algorithms only in some special cases. Bush (1952) reported a direct method for constructing optimal CAs that uses Galois finite fields, obtaining all *CA*(*q<sup>t</sup>*; *t*, *q* + 1, *q*) where *q* is a prime or a prime power and *q* ≥ *t*. Rényi (1971) determined sizes of CAs for the case *t* = *v* = 2 when *N* is even. Kleitman & Spencer (1973) and Katona (1973) independently determined covering array numbers for all *N* when *t* = *v* = 2. Williams & Probert (1996) proposed a method for constructing CAs based on algebraic methods and combinatorial theory. Sherwood (2008) described some algebraic constructions for strength-2 CAs developed from index-1 orthogonal arrays, ordered designs and CAs. Another direct method that can construct some optimal CAs is named zero-sum (Sherwood, 2011). Zero-sum leads to *CA*(*v<sup>t</sup>*; *t*, *t* + 1, *v*) for any *t* > 2; note that the degree is a function of the strength. Recently, cyclotomic classes based on Galois finite fields have been shown to provide examples of binary CAs, and more generally examples are provided by certain Hadamard matrices (Colbourn & Kéri, 2009).

Recursive methods build larger CAs from smaller ones. Williams (2000) presented a tool called TConfig to construct CAs. TConfig constructs CAs using recursive functions that concatenate small CAs to create CAs with a larger number of columns. Moura et al. (2003) introduced a set of recursive algorithms for constructing CAs based on CAs of small sizes. Some recursive methods are product constructions (Colbourn & Ling, 2009; Colbourn et al., 2006; Martirosyan & Colbourn, 2005). Colbourn & Torres-Jimenez (2010) presented a recursive method to construct CAs using *perfect hash families*. The advantage of the recursive algorithms is that they construct almost minimal arrays for particular cases in a reasonable time. Their basic disadvantages are a narrow application domain and the impossibility of specifying constraints.

The majority of commercial and open source test data generating tools use greedy algorithms for CAs construction (AETG (Cohen et al., 1996), TCG (Tung & Aldiwan, 2000), IPOG (Lei et al., 2007), DDA (Bryce & Colbourn, 2007) and All-Pairs (McDowell, 2011)). AETG popularized greedy methods that generate one row of a covering array at a time, attempting to select the best possible next row; since that time, TCG and DDA algorithms have developed useful variants of this approach. IPOG instead adds a factor (column) at a time, adding rows as needed to ensure coverage. The greedy algorithms provide the fastest solving method.

A few Grid approaches have been reported in the literature. Torres-Jimenez et al. (2004) reported a mutation-selection algorithm over Grid Computing for constructing ternary CAs. Younis et al. (2008) presented a Grid implementation of the modified IPOG algorithm (MIPOG).


Calvagna et al. (2009) proposed a solution for executing the reduction algorithm over a set of Grid resources.

Metaheuristic algorithms are capable of solving a wide range of combinatorial problems effectively, using generalized heuristics which can be tailored to suit the problem at hand. Heuristic search algorithms try to solve an optimization problem by the use of heuristics. A heuristic search is a method of performing a minor modification of a given solution in order to obtain a different solution.

Some metaheuristic algorithms, such as TS (Tabu Search) (Gonzalez-Hernandez et al., 2010; Nurmela, 2004), SA (Simulated Annealing) (Cohen et al., 2003; Martinez-Pena et al., 2010; Torres-Jimenez & Rodriguez-Tello, 2012), GA (Genetic Algorithm) and ACA (Ant Colony Optimization Algorithm) (Shiba et al., 2004), provide an effective way to find approximate solutions. In particular, an SA metaheuristic was applied by Cohen et al. (2003) for constructing CAs. Their SA implementation starts with a randomly generated initial solution *M* whose cost *E*(*M*) is measured as the number of uncovered *t*-tuples. A series of iterations is then carried out to visit the search space according to a neighborhood. At each iteration, a neighboring solution *M*′ is generated by replacing the value of an element *a*<sub>*i*,*j*</sub> of the current solution *M* with a different legal member of the alphabet. The cost of this move is evaluated as Δ*E* = *E*(*M*′) − *E*(*M*). If Δ*E* is negative or equal to zero, the neighboring solution *M*′ is accepted. Otherwise, it is accepted with probability *P*(Δ*E*) = *e*<sup>−Δ*E*/*T<sub>n</sub>*</sup>, where *T<sub>n</sub>* is determined by a cooling schedule. In their implementation, Cohen et al. use a simple linear function *T<sub>n</sub>* = 0.9998 *T*<sub>*n*−1</sub> with an initial temperature fixed at *T<sub>i</sub>* = 0.20. At each temperature, 2000 neighboring solutions are generated. The algorithm stops either if a valid covering array is found, or if no change in the cost of the current solution is observed after 500 trials. The authors justify their choice of these parameter values based on some experimental tuning. They conclude that their SA implementation is able to produce smaller CAs than other computational methods, sometimes improving upon algebraic constructions. However, they also indicate that their SA algorithm fails to match the algebraic constructions for larger problems, especially when *t* = 3.
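The acceptance rule and the cooling schedule described in this paragraph can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation of Cohen et al.: the function names `accept` and `cool` are hypothetical, and the random source is passed in explicitly so that runs are reproducible.

```python
import math
import random


def accept(delta_e: float, temperature: float, rng: random.Random) -> bool:
    """Decide whether a neighboring solution is accepted.

    Non-worsening moves (delta_e <= 0) are always accepted; worsening
    moves are accepted with probability exp(-delta_e / temperature).
    """
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def cool(temperature: float, alpha: float = 0.9998) -> float:
    """One step of the schedule T_n = alpha * T_{n-1}."""
    return alpha * temperature


rng = random.Random(0)          # fixed seed, an arbitrary choice
temperature = 0.20              # initial temperature quoted in the text
print(accept(-1, temperature, rng))   # an improving move
print(cool(temperature))              # temperature after one cooling step
```

With a very small temperature the probability of accepting a worsening move tends to zero, which is why the search behaves almost greedily near the end of the schedule.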

Some of these approximate strategies must verify that the matrix they are building is a CA. If the matrix is of size *N* × *k* and the interaction strength is *t*, there are $\binom{k}{t}$ different combinations of columns, which implies a verification cost of *O*(*N* × $\binom{k}{t}$) (given that the verification cost per combination is *O*(*N*)). For small values of *t* and *v*, the verification of CAs can be handled with sequential approaches; however, when we try to construct CAs with moderate values of *t*, *v* and *k*, the time spent by those approaches becomes impractical. For example, when *t* = 5, *k* = 256 and *v* = 2 there are 8,809,549,056 different combinations of columns, which require days to verify. This scenario shows the need for grid strategies to solve the verification of CAs.
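The figure quoted above can be reproduced directly with the binomial coefficient. The short sketch below assumes Python 3.8 or later (for `math.comb`); the helper names `column_combinations` and `verification_cost` are hypothetical and only illustrate the cost model stated in the text.

```python
import math


def column_combinations(k: int, t: int) -> int:
    """Number of different t-subsets of the k columns."""
    return math.comb(k, t)


def verification_cost(n_rows: int, k: int, t: int) -> int:
    """Row visits needed when every column combination scans all N rows."""
    return n_rows * math.comb(k, t)


print(column_combinations(256, 5))   # 8809549056
print(verification_cost(9, 4, 2))    # 54 row visits for a 9 x 4 array with t = 2
```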

The next section presents an algorithm for verifying whether a given matrix is a CA. The algorithm is designed for implementation on grid architectures.

#### **3. An algorithm for the verification of covering arrays**

In this section we describe a grid approach for the problem of verification of CAs. See (Avila-George et al., 2010) for more details.

A matrix M of size *N* × *k* is a *CA*(*N*; *t*, *k*, *v*) *iff* every *t*-tuple of columns contains the whole set of combinations of symbols described by {0, 1, ..., *v* − 1}<sup>*t*</sup>. We propose a strategy that uses two data structures called *P* and *J*, and two injections between the sets of *t*-tuples and combinations of symbols, and the set of integer numbers, to verify that M is a CA.

Let C = {*c*<sub>1</sub>, *c*<sub>2</sub>, ..., *c*<sub>|C|</sub>}, where |C| = $\binom{k}{t}$, be the set of the different *t*-tuples. A *t*-tuple *c<sub>i</sub>* = {*c*<sub>*i*,1</sub>, *c*<sub>*i*,2</sub>, ..., *c*<sub>*i*,*t*</sub>} is formed by *t* numbers, where each number *c*<sub>*i*,*j*</sub> denotes a column of the matrix M. The set C can be managed using an injective function *f*(*c<sub>i</sub>*) : C → I from C to the integer numbers; this function is defined in Eq. 1.

$$f(c\_i) = \sum\_{j=1}^{t} \binom{c\_{i,j} - 1}{j} \tag{1}$$
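A small sketch of the injection *f* follows. It assumes that the lower index of the binomial is the position *j* of the column inside the *t*-tuple, which is the reading that reproduces the values listed in Table 4(a); the name `rank_tuple` is hypothetical.

```python
from itertools import combinations
from math import comb


def rank_tuple(cols) -> int:
    """Map an increasing tuple of 1-based column indices to an integer.

    The j-th column of the tuple (j = 1..t) contributes C(c_j - 1, j).
    """
    return sum(comb(c - 1, j) for j, c in enumerate(cols, start=1))


# All 2-tuples of a 4-column array, as in the CA(9; 2, 4, 3) example.
for cols in combinations(range(1, 5), 2):
    print(cols, rank_tuple(cols))
```

Note that the resulting integers are all different but do not follow the lexicographic order of the tuples: {1, 4} maps to 3 while {2, 3} maps to 2.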

(a) Applying *f*(*c<sub>i</sub>*).

| *c<sub>i</sub>* index | *t*-tuple | *f*(*c<sub>i</sub>*) |
|---|---|---|
| *c*<sub>1</sub> | {1, 2} | 0 |
| *c*<sub>2</sub> | {1, 3} | 1 |
| *c*<sub>3</sub> | {1, 4} | 3 |
| *c*<sub>4</sub> | {2, 3} | 2 |
| *c*<sub>5</sub> | {2, 4} | 4 |
| *c*<sub>6</sub> | {3, 4} | 5 |

(b) Matrix *P*.

| *f*(*c<sub>i</sub>*) \ *g*(*w<sub>j</sub>*) | {0,0} | {0,1} | {0,2} | {1,0} | {1,1} | {1,2} | {2,0} | {2,1} | {2,2} |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 4 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 5 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |

Table 4. Example of the matrix *P* resulting from *CA*(9; 2, 4, 3) presented in Fig. 1.

Now, let W = {*w*<sub>1</sub>, *w*<sub>2</sub>, ..., *w*<sub>*v<sup>t</sup>*</sub>} be the set of the different combinations of symbols, where *w<sub>i</sub>* ∈ {0, 1, ..., *v* − 1}<sup>*t*</sup>. The injective function *g*(*w<sub>i</sub>*) : W → I is defined in Eq. 2. The function *g*(*w<sub>i</sub>*) is equivalent to the conversion of a *v*-ary number to the decimal system.

$$g(w\_i) = \sum\_{j=1}^{t} w\_{i,j} \cdot v^{t-j} \tag{2}$$
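The injection *g* is a plain base-*v* conversion, which the following sketch makes explicit. It assumes that the exponent of *v* decreases with the position *j* of the symbol (most significant symbol first), as in the worked values of Table 3; the name `rank_symbols` is hypothetical.

```python
def rank_symbols(w, v: int) -> int:
    """Read the combination of symbols w as a base-v number.

    The j-th symbol (j = 1..t) is weighted by v ** (t - j).
    """
    t = len(w)
    return sum(symbol * v ** (t - j) for j, symbol in enumerate(w, start=1))


# Ternary alphabet, strength 2: the nine combinations map to 0..8.
for a in range(3):
    for b in range(3):
        print((a, b), rank_symbols((a, b), 3))
```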

The use of the injections provides an efficient way to manipulate the information stored in the data structures *P* and *J* used to verify that M is a CA. The matrix *P* is of size $\binom{k}{t}$ × *v<sup>t</sup>* and it counts the number of times that each combination appears in M in the different *t*-tuples. Each row of *P* represents a different *t*-tuple, while each column corresponds to a different combination of symbols. The cells *p*<sub>*i*,*j*</sub> ∈ *P* are addressed through the functions *f*(*c<sub>i</sub>*) and *g*(*w<sub>j</sub>*): while *f*(*c<sub>i</sub>*) retrieves the row related to the *t*-tuple *c<sub>i</sub>*, the function *g*(*w<sub>j</sub>*) returns the column that corresponds to the combination of symbols *w<sub>j</sub>*. The vector *J* is of size *t* and it helps in the enumeration of all the *t*-tuples *c<sub>i</sub>* ∈ C.

Table 3 shows an example of the use of the function *g*(*w<sub>j</sub>*) for the covering array *CA*(9; 2, 4, 3) (shown in Fig. 1). Column 1 shows the different combinations of symbols. Column 2 contains the operation from which the equivalence is derived. Column 3 presents the integer number associated with that combination.

| W | *g*(*w<sub>i</sub>*) | I |
|---|---|---|
| *w*<sub>1</sub> = {0,0} | 0 · 3<sup>1</sup> + 0 · 3<sup>0</sup> | 0 |
| *w*<sub>2</sub> = {0,1} | 0 · 3<sup>1</sup> + 1 · 3<sup>0</sup> | 1 |
| *w*<sub>3</sub> = {0,2} | 0 · 3<sup>1</sup> + 2 · 3<sup>0</sup> | 2 |
| *w*<sub>4</sub> = {1,0} | 1 · 3<sup>1</sup> + 0 · 3<sup>0</sup> | 3 |
| *w*<sub>5</sub> = {1,1} | 1 · 3<sup>1</sup> + 1 · 3<sup>0</sup> | 4 |
| *w*<sub>6</sub> = {1,2} | 1 · 3<sup>1</sup> + 2 · 3<sup>0</sup> | 5 |
| *w*<sub>7</sub> = {2,0} | 2 · 3<sup>1</sup> + 0 · 3<sup>0</sup> | 6 |
| *w*<sub>8</sub> = {2,1} | 2 · 3<sup>1</sup> + 1 · 3<sup>0</sup> | 7 |
| *w*<sub>9</sub> = {2,2} | 2 · 3<sup>1</sup> + 2 · 3<sup>0</sup> | 8 |

Table 3. Mapping of the set W to the set of integers using the function *g*(*w<sub>j</sub>*) in *CA*(9; 2, 4, 3) shown in Fig. 1.

The matrix *P* is initialized to zero. The construction of matrix *P* follows directly from the definitions of *f*(*c<sub>i</sub>*) and *g*(*w<sub>j</sub>*); it counts the number of times that a combination of symbols *w<sub>j</sub>* ∈ W appears in each subset of columns corresponding to a *t*-tuple *c<sub>i</sub>*, and increases the value of the cell *p*<sub>*f*(*c<sub>i</sub>*),*g*(*w<sub>j</sub>*)</sub> ∈ *P* by that number.

Table 4(a) shows the use of the injective function *f*(*c<sub>i</sub>*). Table 4(b) presents the matrix *P* of *CA*(9; 2, 4, 3). The different combinations of symbols *w<sub>j</sub>* ∈ W are in the first row. The number appearing in each cell referenced by a pair (*c<sub>i</sub>*, *w<sub>j</sub>*) is the number of times that the combination *w<sub>j</sub>* appears in the set of columns *c<sub>i</sub>* of the matrix *CA*(9; 2, 4, 3).



In summary, to determine whether a matrix M is a CA, the number of different combinations of symbols per *t*-tuple is counted using the matrix *P*. The matrix M is a CA *iff* the matrix *P* contains no zero.
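The criterion above can be sketched sequentially as follows. This is a minimal illustration, not the authors' code: columns are 0-based, `itertools.combinations` replaces the explicit management of *J*, only one row of *P* is kept in memory at a time, and the 9 × 4 ternary array used as input is an assumed orthogonal array built from modular sums, which is not necessarily the array of Fig. 1.

```python
from itertools import combinations


def missing_combinations(M, v: int, t: int) -> int:
    """Count the combinations of symbols not covered by M.

    For every t-subset of columns one row of P (v**t counters) is
    filled; each counter left at zero is a missing combination.
    """
    k = len(M[0])
    missing = 0
    for cols in combinations(range(k), t):
        counts = [0] * (v ** t)            # one row of the matrix P
        for row in M:
            index = 0
            for c in cols:                 # base-v value of the symbols
                index = index * v + row[c]
            counts[index] += 1
        missing += counts.count(0)
    return missing


# Assumed example: rows (a, b, a+b, a+2b) mod 3 form a strength-2 array.
M = [[a, b, (a + b) % 3, (a + 2 * b) % 3] for a in range(3) for b in range(3)]
print(missing_combinations(M, v=3, t=2))        # 0, so M is a CA
print(missing_combinations(M[:-1], v=3, t=2))   # 6, one per pair of columns
```

Removing a single row from this array leaves one uncovered combination in each of the six pairs of columns, because every combination appears exactly once in the full array.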

The grid approach takes as input a matrix M and the parameters *N*, *k*, *v*, *t* that describe the CA that M is expected to be. The algorithm also requires the sets C and W. It outputs the total number of combinations that are missing for the matrix M to be a CA. Algorithm 1 shows the pseudocode of the grid approach for the verification of CAs; in particular, it shows the process performed by each core involved in the verification. The strategy followed by Algorithm 1 is simple: it uses a block distribution model of the set of *t*-tuples. The set C is divided into *n* blocks, where *n* is the number of processors; the size of each block B is equal to ⌈|C| / *n*⌉. The block distribution model keeps the code simple; it allows each block to be assigned to a different core, so that the sequential approach (SA) can be applied to verify the blocks.

**Algorithm 1:** Grid approach to verify CAs. This algorithm assigns the set of *t*-tuples C to *size* different cores.

**Input**: A covering array file, the number of processors (*size*) and the current processor id (*rank*).

**Result**: A file with the number of missing combinations of symbols.

1. **begin**
2. *readInputFile*() /\* Read M, *N*, *k*, *v* and *t* parameters. \*/
3. B ← ⌈$\binom{k}{t}$ / *size*⌉ /\* *t*-tuples per processor. \*/
4. K<sub>*l*</sub> ← *rank* · B /\* The scalar corresponding to the first *t*-tuple. \*/
5. K<sub>*u*</sub> ← (*rank* + 1) · B − 1 /\* The scalar corresponding to the last *t*-tuple. \*/
6. *Miss* ← *t\_wise*(M, *N*, *k*, *v*, *t*, K<sub>*l*</sub>, K<sub>*u*</sub>) /\* Number of missing *t*-tuples. \*/
7. **end**
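The block distribution of Algorithm 1 can be illustrated with the sketch below. It is an assumption-laden example rather than the authors' code: positions are 0-based, the function name `block_bounds` is hypothetical, and the upper bound is clamped to the last existing *t*-tuple, a detail that Algorithm 1 does not state explicitly.

```python
from math import comb


def block_bounds(k: int, t: int, size: int, rank: int):
    """Return (K_l, K_u), the first and last tuple positions of a block.

    B = ceil(C(k, t) / size) tuples are given to each processor; the
    last blocks may be shorter or empty (then K_l > K_u).
    """
    total = comb(k, t)
    block = -(-total // size)              # integer ceiling division
    lower = rank * block
    upper = min((rank + 1) * block - 1, total - 1)
    return lower, upper


# Six 2-tuples of a 4-column array shared among 4 processors.
for rank in range(4):
    print(rank, block_bounds(4, 2, 4, rank))
```

In this small case each block holds two *t*-tuples, so the fourth processor receives an empty block; with realistic sizes the imbalance is at most B − 1 tuples.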


The *t\_wise* function first counts, for each different *t*-tuple *c<sub>i</sub>*, the number of times that a combination *w<sub>j</sub>* ∈ W is found in the columns of M corresponding to *c<sub>i</sub>*. After that, it calculates the combinations *w<sub>j</sub>* ∈ W that are missing in *c<sub>i</sub>*. Finally, it transforms *c<sub>i</sub>* into *c*<sub>*i*+1</sub>, i.e. it determines the next *t*-tuple to be evaluated.

The pseudocode for the *t\_wise* function is presented in Algorithm 2. For each different *t*-tuple (lines 5 to 28) the function performs the following actions: it counts the number of times a combination *w<sub>j</sub>* appears in the set of columns indicated by *J* (lines 6 to 14, where the combination *w<sub>j</sub>* is the one appearing in M<sub>*n*,*J*</sub>, i.e. in row *n* and *t*-tuple *J*); then, the counter *covered* is increased by the number of different combinations with a number of repetitions greater than zero (lines 10 to 12). After that, the function calculates the number of missing combinations (line 15). The last step of each iteration is the calculation of the next *t*-tuple to be analyzed (lines 16 to 27). The function ends when all the *t*-tuples have been analyzed (line 5).

#### **Algorithm 2:** Function to verify a CA.

```
Output: Number of missing t-tuples.
 1 t_wise(M, N, k, v, t, K_l, K_u)
 2 begin
 3   Miss ← 0, iMax ← t − 1, P ← 0
 4   J ← getInitialTuple(k, t, K_l)
 5   while J_t ≤ k and f(J) ≤ K_u do
 6     covered ← 0
 7     for n ← 1 to N do
 8       P[f(J), g(M[n, J])] ← P[f(J), g(M[n, J])] + 1
 9     end for
10     for j ← 1 to v^t do
11       if P[f(J), j] > 0 then
12         covered ← covered + 1
13       end if
14     end for
15     Miss ← Miss + v^t − covered
       /* Calculates the next t-tuple */
16     J_t ← J_t + 1
17     if J_t > k and iMax > 0 then
18       J_iMax ← J_iMax + 1
19       for i ← iMax + 1 to t do
20         J_i ← J_(i−1) + 1
21       end for
22       if J_iMax > k − t + iMax then
23         iMax ← iMax − 1
24       else
25         iMax ← t
26       end if
27     end if
28   end while
29   return Miss
30 end
```
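The step that moves from one *t*-tuple to the next can also be written with the standard lexicographic successor rule, sketched below. This is a swapped-in textbook technique, not a transcription of lines 16 to 27 of Algorithm 2; the name `next_tuple` is hypothetical and column indices are 1-based.

```python
def next_tuple(J, k: int):
    """Return the lexicographic successor of the increasing tuple J.

    J holds t distinct 1-based column indices in increasing order.
    None is returned when J is already the last tuple {k-t+1, ..., k}.
    """
    t = len(J)
    J = list(J)
    i = t - 1
    # Find the rightmost position that has not reached its maximum value.
    while i >= 0 and J[i] == k - t + i + 1:
        i -= 1
    if i < 0:
        return None
    J[i] += 1
    # Reset every later position to the smallest admissible value.
    for m in range(i + 1, t):
        J[m] = J[m - 1] + 1
    return J


# Enumerate all 3-tuples over 5 columns starting from the first one.
sequence = []
current = [1, 2, 3]
while current is not None:
    sequence.append(tuple(current))
    current = next_tuple(current, 5)
print(len(sequence), sequence[0], sequence[-1])
```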

To distribute the work, it is necessary to calculate the initial *t*-tuple for each core according to its ID (denoted by *rank*); the position of this *t*-tuple is given by the scalar *F* = *rank* · B. A method is therefore needed to convert the scalar *F* to the equivalent *t*-tuple *c<sub>F</sub>* ∈ C. Generating sequentially every *t*-tuple *c<sub>i</sub>* that precedes *c<sub>F</sub>* can be a time-consuming task. This is where the main contribution of our grid approach lies: its simplicity is combined with an efficient strategy for computing the initial *t*-tuple of each block.

We propose the *getInitialTuple* function as a method that generates *c<sub>F</sub>* (see Algorithm 3), according to a lexicographical order, without generating the previous *t*-tuples *c<sub>i</sub>*, where *i* < *F*. To explain the purpose of the *getInitialTuple* function, let us consider the *CA*(9; 2, 4, 3) shown in Fig. 1. The set C of this CA consists of the elements found in column 1 of Table 4(a). The *getInitialTuple* function with input *k* = 4, *t* = 2, *F* = 3 must return *J* = {1, 4}, i.e. the values of the *t*-tuple *c*<sub>3</sub>. The *getInitialTuple* function is optimized to find the vector *J* = {*J*<sub>1</sub>, *J*<sub>2</sub>, ..., *J<sub>t</sub>*} that corresponds to *F*. The value *J<sub>i</sub>* is calculated according to

$$J\_i = \min\_{j \ge 1} \left\{ j : \Delta\_i \le \sum\_{l=J\_{i-1}+1}^{j} \binom{k-l}{t-i} \right\},$$

where

$$\Delta\_i = F - \sum\_{m=1}^{i-1} \sum\_{l=J\_{m-1}+1}^{J\_m-1} \binom{k-l}{t-m}$$

and

$$J\_0 = 0.$$
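As a check, the example with *k* = 4, *t* = 2 and *F* = 3 can be worked through by hand. The derivation below is only an illustrative sketch; it reads the lower summation limits as *J*<sub>*i*−1</sub> + 1 and *J*<sub>*m*−1</sub> + 1 and treats *F* as a 1-based position, which is the reading that reproduces *J* = {1, 4}.

$$\Delta\_1 = F = 3, \qquad \sum\_{l=1}^{1} \binom{4-l}{1} = 3 \ge \Delta\_1 \;\Rightarrow\; J\_1 = 1$$

$$\Delta\_2 = 3 - \sum\_{l=1}^{0} \binom{4-l}{1} = 3, \qquad \sum\_{l=2}^{j} \binom{4-l}{0} = j - 1 \ge \Delta\_2 \;\Rightarrow\; J\_2 = 4$$

The inner sum in the second line is empty because *J*<sub>1</sub> − 1 = 0, so no *t*-tuples are skipped at the first position.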

#### **Algorithm 3:** Get initial *t*-tuple to PA.

**Input**: Parameters *k* and *t*; the scalar corresponding to the first *t*-tuple (K*l*). **Output**: The initial *t*-tuple.

```
getInitialTuple(k, t, K_l)
begin
  Θ ← K_l, iK ← 1, iT ← 1
  kint ← C(k − iK, t − iT)        /* C(n, r): binomial coefficient */
  for i ← 1 to t do
    while Θ > kint do
      Θ ← Θ − kint
      iK ← iK + 1
      kint ← C(k − iK, t − iT)
    end while
    J_i ← iK
    iK ← iK + 1
    iT ← iT + 1
    kint ← C(k − iK, t − iT)
  end for
  return J
end
```

In summary, Algorithm 3 only requires the computation of *O*(*t* × *k*) binomials to compute the *n* initial *t*-tuples of the PA. This represents a great improvement over the naive approach, which would require the generation of all the $\binom{k}{t}$ *t*-tuples, as done in the SA.
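A direct Python rendering of the idea behind *getInitialTuple* is sketched below. Two points are assumptions of this sketch rather than statements of the chapter: the position is taken as 1-based (so that position 3 yields {1, 4}, as in the example), and the binomial is only recomputed while a position of *J* remains to be filled, which avoids evaluating a binomial with a negative lower index. The name `get_initial_tuple` is hypothetical.

```python
from math import comb


def get_initial_tuple(k: int, t: int, position: int):
    """Return the t-tuple at a 1-based lexicographic position.

    At each coordinate, whole groups of tuples sharing the same leading
    column are skipped by subtracting their size C(k - iK, t - iT).
    """
    if not 1 <= position <= comb(k, t):
        raise ValueError("position must lie in 1..C(k, t)")
    theta, i_k, i_t = position, 1, 1
    k_int = comb(k - i_k, t - i_t)
    J = []
    for _ in range(t):
        while theta > k_int:
            theta -= k_int
            i_k += 1
            k_int = comb(k - i_k, t - i_t)
        J.append(i_k)
        i_k += 1
        i_t += 1
        if i_t <= t:                       # nothing left to size otherwise
            k_int = comb(k - i_k, t - i_t)
    return J


print(get_initial_tuple(4, 2, 3))          # [1, 4]
all_tuples = [get_initial_tuple(4, 2, p) for p in range(1, comb(4, 2) + 1)]
print(all_tuples)
```

Each call performs at most *k* subtractions and binomial evaluations, independently of how many *t*-tuples precede the requested one.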

The next three sections present a simulated annealing approach to construct CAs. Section 4 describes in depth the components of our algorithm. Section 5 presents a method for parallelizing our SA algorithm. Section 6 describes how to implement our algorithm on a grid architecture.

1. Generate the first row *r*<sub>1</sub> at random.
2. Generate two rows *c*<sub>1</sub>, *c*<sub>2</sub> at random, which will be candidate rows.
3. Select the candidate row *c<sub>i</sub>* that maximizes the Hamming distance according to Eq. 4 and add it as the *i*<sup>th</sup> row of the matrix *M*.
4. Repeat from step 2 until *M* is completed.

$$g(r\_i) = \sum\_{s=1}^{i-1} \sum\_{v=1}^{k} d(m\_{s,v}, m\_{i,v}), \quad \text{where} \quad d(m\_{s,v}, m\_{i,v}) = \begin{cases} 1 & \text{if } m\_{s,v} \neq m\_{i,v} \\ 0 & \text{otherwise} \end{cases} \tag{4}$$

An example is shown in Fig. 2; the number of symbols that differ between rows *r*<sub>1</sub> and *c*<sub>1</sub> is 4, and between *r*<sub>2</sub> and *c*<sub>1</sub> it is 3, summing up to 7. Then, the Hamming distance for the candidate row *c*<sub>1</sub> is 7.

| *Rows* | *Distances* |
|---|---|
| *r*<sub>1</sub> = (2 1 0 1) | *d*(*r*<sub>1</sub>, *c*<sub>1</sub>) = 4 |
| *r*<sub>2</sub> = (1 2 0 1) | *d*(*r*<sub>2</sub>, *c*<sub>1</sub>) = 3 |
| *c*<sub>1</sub> = (0 2 1 0) | *g*(*c*<sub>1</sub>) = 7 |

Fig. 2. Example of the Hamming distance between two rows *r*<sub>1</sub>, *r*<sub>2</sub> that are already in the matrix M and a candidate row *c*<sub>1</sub>.

**4.3 Evaluation function**

The *evaluation function* *E*(*M*) is used to estimate the goodness of a candidate solution. Previously reported metaheuristic algorithms for constructing CAs have commonly evaluated the quality of a potential solution (covering array) as the number of combinations of symbols missing in the matrix *M* (Cohen et al., 2003; Nurmela, 2004; Shiba et al., 2004). Then, the expected solution will have zero missing combinations.

In the proposed SA implementation this evaluation function definition was used. Its computational complexity is equivalent to *O*(*N* $\binom{k}{t}$).

**4.4 Neighborhood function**

Given that our SA implementation is based on Local Search (LS), a neighborhood function must be defined. The main objective of the neighborhood function is to identify the set of potential solutions which can be reached from the current solution in an LS algorithm. In case two or more neighborhoods present complementary characteristics, it is then possible and interesting to create more powerful compound neighborhoods. The advantage of such an approach is well documented in (Cavique et al., 1999). Following this idea, and based on the results of our preliminary experiments, a neighborhood structure composed of two different functions is proposed for this SA algorithm implementation.

Two *neighborhood functions* were implemented to guide the local search of our SA algorithm. The neighborhood function N<sub>1</sub>(*s*) makes a random search of a missing *t*-tuple, then tries by setting the *j*<sup>th</sup> combination of symbols in every row of *M*. The neighborhood function N<sub>2</sub>(*s*) randomly chooses a position (*i*, *j*) of the matrix *M* and makes all possible changes of symbol. During the search process a combination of both N<sub>1</sub>(*s*) and N<sub>2</sub>(*s*) neighborhood functions is employed by our SA algorithm. The former is applied with probability *P*, while the latter
