**1. Introduction**


220 Grid Computing – Technology and Applications, Widespread Coverage and New Horizons


To remain at the forefront in this fast-paced and competitive world, companies have to be highly adaptable, and customized software solutions play a key role in meeting such changing needs. To support this customization, software systems must provide numerous configurable options. While this flexibility promotes customization, it also creates many potential system configurations, which may require extensive quality assurance.

A good strategy for testing a software component involves generating the whole set of value combinations that participate in its operation. Testing only individual values may not be enough, but exhaustive testing of all possible combinations is not always feasible. An alternative technique is combinatorial testing, a method that can reduce cost and increase the effectiveness of software testing for many applications. It is based on constructing economically sized test suites that provide coverage of the most prevalent configurations. Covering arrays (CAs) are combinatorial structures that can be used to represent these test suites.

A covering array, denoted by *CA*(*N*; *t*, *k*, *v*), is a combinatorial object that can be described as an *N* × *k* matrix in which every *N* × *t* subarray contains all possible combinations of *v*<sup>*t*</sup> symbols at least once. Here *N* is the number of rows of the matrix, *k* is the number of parameters, each of which takes *v* possible values, and *t* is the strength, i.e., the degree of controlled interaction.

To illustrate the CA approach applied to the design of software testing, consider the Web-based system example shown in Table 1, which involves four parameters, each with three possible values. A full experimental design (*t* = 4) would have to cover 3<sup>4</sup> = 81 possibilities; however, if the interaction is relaxed to *t* = 2 (pair-wise), the number of required combinations is reduced to 9 test cases.



|       | Browser  | OS           | DBMS       | Connections |
|-------|----------|--------------|------------|-------------|
| **0** | Firefox  | Windows 7    | MySQL      | ISDN        |
| **1** | Chromium | Ubuntu 10.10 | PostgreSQL | ADSL        |
| **2** | Netscape | Red Hat 5    | MaxDB      | Cable       |

Table 1. Parameters of the Web-based system example.

Fig. 1 shows the CA corresponding to *CA*(9; 2, 4, 3); given that its strength and alphabet are *t* = 2 and *v* = 3, respectively, the combinations that must appear at least once in each subset of size *N* × 2 are {0, 0}, {0, 1}, {0, 2}, {1, 0}, {1, 1}, {1, 2}, {2, 0}, {2, 1}, {2, 2}.

```
0 0 0 0
0 1 1 1
0 2 2 2
1 0 1 2
1 1 2 0
1 2 0 1
2 0 2 1
2 1 0 2
2 2 1 0
```
Fig. 1. A combinatorial design, *CA*(9; 2, 4, 3).
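The covering property of the design in Fig. 1 can be checked mechanically. The following sketch (Python, written purely for illustration; it is not the verification algorithm presented in Section 3) enumerates every *N* × *t* subarray and tests that all *v*<sup>*t*</sup> symbol tuples appear:

```python
from itertools import combinations, product

# The CA(9; 2, 4, 3) from Fig. 1, one row per test case.
CA_9_2_4_3 = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 2, 2, 2],
    [1, 0, 1, 2],
    [1, 1, 2, 0],
    [1, 2, 0, 1],
    [2, 0, 2, 1],
    [2, 1, 0, 2],
    [2, 2, 1, 0],
]

def is_covering_array(matrix, t, v):
    """Check that every N x t subarray covers all v**t symbol tuples."""
    k = len(matrix[0])
    required = set(product(range(v), repeat=t))   # the v**t tuples to cover
    for cols in combinations(range(k), t):        # all C(k, t) column subsets
        seen = {tuple(row[c] for c in cols) for row in matrix}
        if seen != required:
            return False
    return True

print(is_covering_array(CA_9_2_4_3, t=2, v=3))    # True
```

Changing any single cell of this particular array breaks the property, since with *N* = 9 = *v*<sup>*t*</sup> every pair may appear exactly once.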

Finally, to map the CA onto the Web-based system, every possible value of each parameter in Table 1 is labeled with its row number. Table 2 shows the corresponding pair-wise test suite; each of its nine experiments corresponds to one row of the CA shown in Fig. 1.


| Experiment | Browser  | OS           | DBMS       | Connections |
|------------|----------|--------------|------------|-------------|
| 1          | Firefox  | Windows 7    | MySQL      | ISDN        |
| 2          | Firefox  | Ubuntu 10.10 | PostgreSQL | ADSL        |
| 3          | Firefox  | Red Hat 5    | MaxDB      | Cable       |
| 4          | Chromium | Windows 7    | PostgreSQL | Cable       |
| 5          | Chromium | Ubuntu 10.10 | MaxDB      | ISDN        |
| 6          | Chromium | Red Hat 5    | MySQL      | ADSL       |
| 7          | Netscape | Windows 7    | MaxDB      | ADSL        |
| 8          | Netscape | Ubuntu 10.10 | MySQL      | Cable       |
| 9          | Netscape | Red Hat 5    | PostgreSQL | ISDN        |

Table 2. Test-suite covering all 2-way interactions, *CA*(9; 2, 4, 3).
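This mapping can itself be mechanized. In the sketch below (Python; the parameter and value names are those of Table 1, while the function and variable names are illustrative scaffolding), each row of the CA in Fig. 1 is translated into one concrete experiment of Table 2:

```python
# Value labels from Table 1: symbol i of a column maps to the i-th option.
PARAMETERS = {
    "Browser":     ["Firefox", "Chromium", "Netscape"],
    "OS":          ["Windows 7", "Ubuntu 10.10", "Red Hat 5"],
    "DBMS":        ["MySQL", "PostgreSQL", "MaxDB"],
    "Connections": ["ISDN", "ADSL", "Cable"],
}

# Rows of the CA(9; 2, 4, 3) shown in Fig. 1.
CA_ROWS = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]

def to_test_suite(rows, parameters):
    """Translate each abstract CA row into a concrete test case (Table 2)."""
    names = list(parameters)
    return [
        {name: parameters[name][symbol] for name, symbol in zip(names, row)}
        for row in rows
    ]

suite = to_test_suite(CA_ROWS, PARAMETERS)
print(suite[3])  # experiment 4: Chromium / Windows 7 / PostgreSQL / Cable
```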

When a *CA* contains the minimum possible number of rows, it is optimal and its size is called the *Covering Array Number* (*CAN*). The *CAN* is defined according to

$$CAN(t,k,v) = \min_{N \in \mathbb{N}} \{ N : \exists \, CA(N;t,k,v) \}.$$
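The definition can be read as a (hopelessly inefficient) search procedure: try *N* = *v*<sup>*t*</sup>, *v*<sup>*t*</sup> + 1, ... and stop at the first *N* for which some *N* × *k* matrix is a CA. The sketch below (Python, pure illustration; it enumerates all *v*<sup>*Nk*</sup> matrices and is only usable for tiny instances) computes *CAN*(2, 3, 2) this way:

```python
from itertools import combinations, product

def is_ca(matrix, t, v):
    """True iff every N x t subarray covers all v**t symbol tuples."""
    k = len(matrix[0])
    required = set(product(range(v), repeat=t))
    return all(
        {tuple(row[c] for c in cols) for row in matrix} == required
        for cols in combinations(range(k), t)
    )

def can_brute_force(t, k, v, n_max=8):
    """Smallest N admitting a CA(N; t, k, v), starting at the bound v**t."""
    for n in range(v ** t, n_max + 1):
        for cells in product(range(v), repeat=n * k):   # all v**(n*k) matrices
            matrix = [list(cells[i * k:(i + 1) * k]) for i in range(n)]
            if is_ca(matrix, t, v):
                return n
    return None

print(can_brute_force(2, 3, 2))  # 4, i.e. CAN(2, 3, 2) = 4
```

Here the trivial lower bound *v*<sup>*t*</sup> = 4 is already achievable (e.g. by the rows 000, 011, 101, 110), so the search succeeds at the first value of *N* it tries.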


The trivial mathematical *lower bound* for a covering array is *v*<sup>*t*</sup> ≤ *CAN*(*t*, *k*, *v*); however, this bound is rarely achieved. Therefore, determining achievable bounds is one of the main research lines for CAs. Given the values of *t*, *k*, and *v*, the optimal CA construction problem (CAC) consists in constructing a *CA*(*N*; *t*, *k*, *v*) such that the value of *N* is minimized.
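The design of Fig. 1 is one of the rare cases that meets this bound: with *t* = 2 and *v* = 3 the bound gives *N* ≥ 9, and *CA*(9; 2, 4, 3) has exactly 9 rows, so it is optimal. The bound for the ternary strengths targeted later in this chapter can be tabulated directly (an illustrative computation only):

```python
# Trivial lower bound v**t <= CAN(t, k, v), for the ternary case v = 3.
v = 3
for t in (2, 3, 4):
    print(f"t = {t}: any CA(N; {t}, k, 3) needs N >= {v ** t}")
```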

The construction of *CAN*(2, *k*, 2) can be done efficiently according to Kleitman & Spencer (1973); the same is possible for *CA*(2, *k*, *v*) when the cardinality of the alphabet is *v* = *p*<sup>*n*</sup>, where *p* is a prime number and *n* a positive integer (Bush, 1952). However, in the general case determining the *covering array number* is known to be a hard combinatorial problem (Colbourn, 2004; Lei & Tai, 1998). This means that there is no known efficient algorithm to find an optimal CA for an arbitrary level of interaction *t* or alphabet *v*. For the values of *t* and *v* for which no efficient algorithm is known, approximate algorithms are used to construct CAs. Some of these approximate strategies must verify that the matrix they are building is a CA. If the matrix is of size *N* × *k* and the interaction is *t*, there are $\binom{k}{t}$ different column combinations, which implies a cost of $O(N \times \binom{k}{t})$ for the verification (when the matrix has *N* ≥ *v*<sup>*t*</sup> rows; otherwise it can never be a CA and its verification is pointless). For small values of *t* and *v*, the verification of CAs can be carried out with sequential approaches; however, when we try to construct CAs with moderate values of *t*, *v*, and *k*, the time spent by those approaches is impractical. This scenario shows the necessity of Grid strategies to construct and verify CAs.
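The growth of $\binom{k}{t}$ is what makes verification expensive. A small sketch (Python; the helper name and sample parameters are only illustrative) computes the number of column combinations and the resulting amount of work:

```python
from math import comb

def verification_work(N, k, t):
    """Total cost N * C(k, t): each of the C(k, t) column combinations
    must be scanned over all N rows of the matrix."""
    return N * comb(k, t)

print(comb(4, 2))                  # 6 column pairs for the CA(9; 2, 4, 3)
print(verification_work(9, 4, 2))  # 54 row scans for the array of Fig. 1
print(comb(256, 5))                # 8809549056 combinations when t = 5, k = 256
```

The last value, roughly 8.8 × 10⁹ column combinations for *t* = 5, *k* = 256, *v* = 2, is the example used later in this chapter to motivate distributing the verification.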

Grid Computing is a technology that allows sharing resources between different administrative domains in a transparent, efficient, and secure way. The resources comprise computation hardware (supercomputers or clusters) and storage systems, although it is also possible to share information sources, such as databases, or scientific equipment. The main concept behind the Grid paradigm is to offer a homogeneous, standard interface for accessing these resources. In that sense, the evolution of Grid middlewares has enabled the deployment of Grid e-Science infrastructures delivering large computational and data-storage capabilities. Current infrastructures rely mainly on Globus Toolkit (Globus Alliance, 2011), UNICORE (Almond & Snelling, 1999), GRIA (Surridge et al., 2005), or gLite (gLite, 2011) as core middleware supporting several central services dedicated to user management, job metascheduling, data indexing (cataloguing), and the information system, providing a consolidated virtual view of the whole infrastructure or large parts of it. The availability of hundreds of thousands of processing elements (PEs) and the efficient storage of Petabytes of data is expanding knowledge in areas such as particle physics, astronomy, genetics, and software testing. Thus, Grid Computing infrastructures are a cornerstone of current scientific research.

This work reports the use of Grid Computing by means of the European production infrastructure provided by the European Grid Infrastructure (EGI) project (EGI, 2011). The availability of this kind of computing platform makes feasible the execution of computing-intensive applications, such as the construction and verification of CAs. In this work we focus on the construction of ternary CAs with 5 ≤ *k* ≤ 100 and 2 ≤ *t* ≤ 4.

The chapter is structured as follows. First, Section 2 offers a review of the relevant related work. Then, the algorithm for the verification of CAs is presented in Section 3. Section 4 details the algorithm for the construction of CAs using a simulated annealing algorithm. Next, Section 5 explains how to parallelize the previous algorithm using a master-slave approach. Building on that parallelization, Section 6 describes how to develop a Grid implementation of the construction of CAs. The results obtained in the experiments performed in the Grid infrastructure are shown in Section 7. Finally, Section 8 presents the conclusions derived from the research presented in this work.

Calvagna et al. (2009) proposed a solution for executing the reduction algorithm over a set of Grid resources.

Metaheuristic algorithms are capable of solving a wide range of combinatorial problems effectively, using generalized heuristics that can be tailored to suit the problem at hand. Heuristic search algorithms try to solve an optimization problem by the use of heuristics: a heuristic search is a method of performing a minor modification of a given solution in order to obtain a different solution.

Some metaheuristic algorithms, such as TS (Tabu Search) (Gonzalez-Hernandez et al., 2010; Nurmela, 2004), SA (Simulated Annealing) (Cohen et al., 2003; Martinez-Pena et al., 2010; Torres-Jimenez & Rodriguez-Tello, 2012), GA (Genetic Algorithms), and ACA (Ant Colony Optimization Algorithms) (Shiba et al., 2004), provide an effective way to find approximate solutions. Indeed, an SA metaheuristic was applied by Cohen et al. (2003) for constructing CAs. Their SA implementation starts with a randomly generated initial solution *M* whose cost *E*(*M*) is measured as the number of uncovered *t*-tuples. A series of iterations is then carried out to visit the search space according to a neighborhood. At each iteration, a neighboring solution *M*′ is generated by changing the value of an element *a*<sub>*i*,*j*</sub> of the current solution *M* to a different legal member of the alphabet. The cost of this move is evaluated as Δ*E* = *E*(*M*′) − *E*(*M*). If Δ*E* is negative or equal to zero, the neighboring solution *M*′ is accepted. Otherwise, it is accepted with probability *P*(Δ*E*) = *e*<sup>−Δ*E*/*T*<sub>*n*</sub></sup>, where *T*<sub>*n*</sub> is determined by a cooling schedule. In their implementation, Cohen et al. use a simple linear function *T*<sub>*n*</sub> = 0.9998 *T*<sub>*n*−1</sub> with an initial temperature fixed at *T*<sub>*i*</sub> = 0.20. At each temperature, 2000 neighboring solutions are generated. The algorithm stops either if a valid covering array is found, or if no change in the cost of the current solution is observed after 500 trials. The authors justify their choice of these parameter values based on some experimental tuning. They conclude that their SA implementation is able to produce smaller CAs than other computational methods, sometimes improving upon algebraic constructions. However, they also indicate that their SA algorithm fails to match the algebraic constructions for larger problems, especially when *t* = 3.

Some of these approximate strategies must verify that the matrix they are building is a CA. The next section presents an algorithm for verifying that a given matrix is a CA; the design of the algorithm is presented for its implementation in Grid architectures.

**3. An algorithm for the verification of covering arrays**

In this section we describe a Grid approach to the problem of the verification of CAs.

A matrix *M* of size *N* × *k* is a *CA*(*N*; *t*, *k*, *v*) *iff* every *t*-tuple of columns contains the set of combinations of symbols described by {0, 1, ..., *v* − 1}<sup>*t*</sup>. If the matrix is of size *N* × *k* and the interaction is *t*, there are $\binom{k}{t}$ different column combinations, which implies a cost of $O(N \times \binom{k}{t})$ (given that the verification cost per combination is *O*(*N*)). For small values of *t* and *v*, the verification of CAs can be carried out with sequential approaches; however, when we try to construct CAs with moderate values of *t*, *v*, and *k*, the time spent by those approaches is impractical: for example, when *t* = 5, *k* = 256, *v* = 2, there are 8,809,549,056 different combinations of columns, which require days for their verification. This scenario shows the necessity of Grid strategies to solve the verification of CAs. We propose a strategy that uses two data structures; see (Avila-George et al., 2010) for more details.
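The per-combination independence of this check is what makes a Grid (or any parallel) strategy natural: the $\binom{k}{t}$ column combinations can be partitioned into chunks and verified independently. The sketch below is a shared-memory stand-in that uses Python's `multiprocessing` in place of Grid middleware; the chunking scheme and chunk size are illustrative assumptions, not the data structures of Avila-George et al. (2010):

```python
from itertools import combinations, islice, product
from multiprocessing import Pool

# The CA(9; 2, 4, 3) of Fig. 1, used here as the matrix under verification.
MATRIX = [
    [0, 0, 0, 0], [0, 1, 1, 1], [0, 2, 2, 2],
    [1, 0, 1, 2], [1, 1, 2, 0], [1, 2, 0, 1],
    [2, 0, 2, 1], [2, 1, 0, 2], [2, 2, 1, 0],
]
T, V = 2, 3

def check_chunk(cols_chunk):
    """Verify coverage for one chunk of the C(k, t) column combinations."""
    required = set(product(range(V), repeat=T))
    return all(
        {tuple(row[c] for c in cols) for row in MATRIX} == required
        for cols in cols_chunk
    )

def chunked(iterable, size):
    """Split an iterable into lists of at most `size` items."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

if __name__ == "__main__":
    k = len(MATRIX[0])
    chunks = list(chunked(combinations(range(k), T), 2))  # 6 pairs -> 3 chunks
    with Pool() as pool:             # stands in for one Grid job per chunk
        results = pool.map(check_chunk, chunks)
    print(all(results))              # True: MATRIX is a CA(9; 2, 4, 3)
```

On an actual Grid, each chunk would be submitted as an independent job and the master would aggregate the per-chunk verdicts, which is the shape of the master-slave scheme developed in the later sections.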
