92 Genetic Programming – New Approaches and Successful Applications

| Measurement Results | Set A | Set B | Set C |
|---|---|---|---|
| *Mann-Whitney-Wilcoxon* test (p-value) | 53.05% | 69.11% | 59.25% |
| Mean global error | 4.12% | 4.15% | 4.75% |
| Maximum error | 11.11% | 14.21% | 9.23% |
| Minimum error | 4.905e-05% | 9.171e-02% | 8.27e-06% |

**Table 6.** Test of fitness to the data of the test set, and the global mean, maximum and minimum errors.

Still analyzing the measurement results presented in Table 6, we notice that the indexes obtained for the three sets are comparatively very close. Such results may be explained by the use of the training-set selection technique, which returns samples with high representative power.

In general, the approach proposed in this work, which adds methods for evaluating the LRMs selected by the GP algorithm and a technique for selecting the elements of the training sets, yields solutions capable of providing precise estimates, even with the use of small samples.

**6. Conclusions**

This work has described an approach for obtaining and formally validating LRMs by combining genetic programming with statistical models. Our approach used the Audze-Eglais Uniform Latin Hypercube technique to select samples with high representative power to form the training set. To evaluate the LRMs found with the introduced technique, we used statistical hypothesis tests and residual analysis, aiming to verify the assumptions about the structures of the errors of these models.

To validate the proposed approach, we used a case study: the prediction of performance in embedded systems. The problem of the case study consisted in exploring the configurations of a data bus in order to optimize the performance of an embedded application that sorts a set of integers by radix. With the proposed technique, we generated LRMs capable of estimating the performance of all the bus configurations. The validation stages showed that the LRMs found are adequate for predicting the performance of the application, since all the assumptions about the structures of the errors were verified. Thus, the final LRMs were able to estimate the performances accurately, presenting mean global errors below 5%.

**Author details**

Guilherme Esmeraldo1,3,\*, Robson Feitosa1, Dilza Esmeraldo2, Edna Barros3

\* Corresponding Author

*1Federal Institute of Ceará, Crato, 2Catholic College of Cariri, Crato,*
*3Federal University of Pernambuco, Recife, Brazil*

**7. References**

- [17] Antony J (2003) Design of Experiments for Engineers and Scientists. Butterworth-Heinemann.

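The validation metrics reported in Table 6 (a Mann-Whitney-Wilcoxon test plus mean, maximum and minimum relative errors) follow a generic recipe. The sketch below is illustrative only: the arrays `measured` and `predicted` are synthetic stand-ins, since the chapter's actual measurements are not reproduced here, and it assumes NumPy and SciPy are available.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical data standing in for measured vs. LRM-predicted performance
# values; the chapter's real test-set values are not reproduced here.
rng = np.random.default_rng(42)
measured = rng.uniform(1e4, 1e5, size=30)
predicted = measured * (1 + rng.normal(0, 0.03, size=30))  # ~3% noise

# Two-sided Mann-Whitney-Wilcoxon test: a large p-value means the measured
# and predicted distributions cannot be distinguished.
stat, p_value = mannwhitneyu(measured, predicted, alternative="two-sided")

# Relative errors, reported in Table 6 as percentages.
rel_err = np.abs(predicted - measured) / measured * 100
print(f"p-value          : {p_value:.4f}")
print(f"mean global error: {rel_err.mean():.2f}%")
print(f"maximum error    : {rel_err.max():.2f}%")
print(f"minimum error    : {rel_err.min():.2e}%")
```

A model would be judged adequate when, as in Table 6, the p-value is large and the mean relative error stays within a few percent.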
**Chapter 5**

**Parallel Genetic Programming on Graphics Processing Units**

Douglas A. Augusto, Heder S. Bernardino and Helio J.C. Barbosa

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48364

> ©2012 Augusto et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**1. Introduction**

Genetic programming (GP) is well-known for being a computationally demanding technique, which is a consequence of its ambitious goal: to automatically generate computer programs—in an arbitrary language—using virtually no domain knowledge. For instance, evolving a classifier, a program that takes a set of attributes and predicts the class they belong to, may be significantly costly depending on the size of the training dataset, that is, the amount of data needed to estimate the prediction accuracy of a single candidate classifier.

In program inference, the evaluation of how well a candidate solution solves a certain task is usually a computationally intensive procedure. Most of the time, the evaluation involves either submitting the program to a simulation process or testing its behavior on many input arguments; both situations may turn out to be very time-consuming. Things get worse when the optimization algorithm needs to evaluate a population of programs for several iterations, which is the case of genetic programming.

Fortunately, GP is an inherently parallel paradigm, making it possible to easily exploit any amount of available computational units, no matter whether they are just a few or many thousands. Also, it usually does not matter whether the underlying hardware architecture can process simultaneously instructions and data ("MIMD") or only data ("SIMD").<sup>1</sup> Basically, GP exhibits three levels of parallelism: (i) *population-level* parallelism, when many populations evolve simultaneously; (ii) *program-level* parallelism, when programs are evaluated in parallel; and finally (iii) *data-level* parallelism, in which individual training points for a single program are evaluated simultaneously.

<sup>1</sup> MIMD stands for *Multiple Instructions Multiple Data* whereas SIMD means *Single Instruction Multiple Data*.

Until recently, the only way to leverage the parallelism of GP in order to tackle complex problems was to run it on large high-performance computational installations, which are normally a privilege of a select group of researchers. Although the multi-core era has emerged and popularized the parallel machines, the architectural change that is probably going to

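To make the evaluation-cost argument above concrete, here is a minimal sketch with entirely hypothetical names (none come from the chapter): the fitness of a candidate classifier is its accuracy over the whole training set, so the cost of a single evaluation grows linearly with the amount of data, and a full generation costs population size times that.

```python
# Hypothetical sketch: the dominant cost of GP is fitness evaluation.
# A candidate "program" is represented here as a plain Python callable;
# its fitness is classification accuracy over the entire training set.

def fitness(program, dataset):
    """Accuracy of `program` on `dataset`; cost is linear in the data size."""
    hits = sum(1 for attrs, label in dataset if program(attrs) == label)
    return hits / len(dataset)

# Toy training set: the class of a point is the sign of the sum of its attributes.
dataset = [((x, y), int(x + y > 0)) for x in range(-5, 6) for y in range(-5, 6)]

# Two toy candidates standing in for evolved expression trees.
good = lambda a: int(a[0] + a[1] > 0)
bad = lambda a: int(a[0] > 0)

print(fitness(good, dataset))  # -> 1.0 (perfect on this toy data)
print(fitness(bad, dataset))
```

Every candidate in every generation pays this per-datum cost, which is exactly what the parallelism discussed next amortizes.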

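The program-level parallelism described above can be sketched on a multi-core CPU with Python's standard library. This is a toy illustration under assumed names (`prog_*`, `fitness`), not the chapter's GPU implementation: each candidate program's fitness is computed in a separate worker process.

```python
import math
from concurrent.futures import ProcessPoolExecutor

# Toy symbolic-regression setting: fitness is the mean squared error of a
# candidate against a target curve, here sin(x) sampled on [0, 1.57).
XS = [i / 100.0 for i in range(157)]
TARGET = [math.sin(x) for x in XS]

# Stand-ins for evolved programs (top-level defs so they can be pickled
# and shipped to worker processes).
def prog_linear(x):
    return x

def prog_cubic(x):
    return x - x**3 / 6.0  # cubic Taylor approximation of sin(x)

def prog_constant(x):
    return 0.0

def fitness(prog):
    """Mean squared error of `prog` against the target curve."""
    return sum((prog(x) - t) ** 2 for x, t in zip(XS, TARGET)) / len(XS)

def main():
    population = [prog_linear, prog_cubic, prog_constant]
    # Program-level parallelism: one fitness evaluation per worker process.
    with ProcessPoolExecutor() as pool:
        scores = list(pool.map(fitness, population))
    best = population[scores.index(min(scores))]
    print(best.__name__)  # -> prog_cubic (the Taylor cubic fits sin best here)

if __name__ == "__main__":
    main()
```

Data-level parallelism would instead split the sum inside `fitness` across processing units, which is the granularity GPUs are best at; the chapter develops that direction on graphics hardware.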