**6.1 Building of the ARMA models**

The most general models used in this work are the autoregressive moving average models ARMA(p, q), which contain autoregressive components both in the observable variable $Z_t$ and in the error $a_t$:

$$F_t = \delta + \sum_{i=1}^{p} \phi_i Z_{t-i} + \sum_{j=1}^{q} \gamma_j a_{t-j}$$

The procedure to build the ARMA models is carried out in two stages. First an AR model is built for the original series; afterwards the error series

$$a_t = Z_t - \left(\delta + \sum_{i=1}^{p} \phi_i Z_{t-i}\right)$$

between the original series and its AR model is considered, and another AR model is found for it. Figure 10 shows an example of such an error series. Since AR models with the capability to adequately model the error can be built for these error series, adding to the AR model a component with the autoregressive terms of the error yields the complete ARMA model, and considering the two models together gives a greater forecasting capability.

Fig. 9. Example 106.

Fig. 10. Example of a TS corresponding to the error.
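As a concrete illustration of how $F_t$ and the error series $a_t$ are computed, the sketch below evaluates a model whose AR part uses only a few selected lags, as the models of this chapter do. The sparse dict-of-lags representation and the function name `arma_fitted` are illustrative assumptions, not part of the chapter:

```python
import numpy as np

def arma_fitted(z, delta, phi, gamma=None):
    """Fitted values F_t = delta + sum_i phi[i]*Z_{t-i} + sum_j gamma[j]*a_{t-j}.

    z     -- observed series Z_t (1-D array)
    phi   -- {lag: coefficient} for the AR terms (only the lags the model uses)
    gamma -- {lag: coefficient} for the MA terms on the error a_t (may be empty)
    Returns (F, a), where a_t = Z_t - F_t is the error series.
    """
    gamma = gamma or {}
    z = np.asarray(z, dtype=float)
    n = len(z)
    start = max(list(phi) + list(gamma))   # first t for which all lags exist
    F = np.full(n, np.nan)
    a = np.zeros(n)
    for t in range(start, n):
        F[t] = (delta
                + sum(c * z[t - i] for i, c in phi.items())
                + sum(c * a[t - j] for j, c in gamma.items()))
        a[t] = z[t] - F[t]
    return F, a

# Coefficients taken from the first row of Table 2, applied here to
# synthetic data only to exercise the function:
rng = np.random.default_rng(0)
z = rng.normal(size=100).cumsum()
F, a = arma_fitted(z, 0.9968, {1: 0.8129, 12: 0.6594, 13: -0.4799})
```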

### **7. The forecasting delay phenomenon**

Analyzing the graphs of the models built with this methodology for the examples of the NN3-Complete, a phenomenon was detected that visually appears as if the graph of the model were almost the same as that of the original series, but displaced one unit to the right. This phenomenon was observed in 20 examples of the NN3-Complete: 51, 64, 66, 74, 80, 82, 83, 84, 85, 86, 88, 89, 90, 91, 92, 95, 100, 105, 107 and 109.

Given that the first 50 examples of the competition corresponded to series of 50 values (apparently built by experts) and the last 61 examples were series of 150 terms (seemingly from real phenomena), it was estimated that 34% of the real examples of the NN3 present this behavior. From this it can be assumed that the phenomenon appears in a large percentage of the models built with this methodology and that, for this reason, the models built with it will give better results when applied to these series. Fig. 11 shows an example of this phenomenon, corresponding to the AR model of example 74 obtained with the methodology of this work.

Fig. 11. Example 74 of the NN3-Complete.

This phenomenon was named *forecasting delay* (FD) in this work, since it is equivalent to forecasting at a given moment what happened at the previous moment.
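The chapter gives no explicit detection rule for FD; one simple check consistent with the description is to ask whether the fitted series tracks the original better after being shifted one unit to the left. The sketch below is a hypothetical heuristic, not the authors' stated procedure:

```python
import numpy as np

def presents_fd(z, F, start):
    """Heuristic FD check: does the fitted series F track Z better when
    shifted one unit to the left (i.e., when F_{t+1} is compared with Z_t)?"""
    z, F = np.asarray(z, dtype=float), np.asarray(F, dtype=float)
    rss_plain = np.nansum((z[start:] - F[start:]) ** 2)
    rss_shift = np.nansum((z[start:-1] - F[start + 1:]) ** 2)
    return rss_shift < rss_plain
```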

### **8. The procedure of advancement of forecasting**

The FD phenomenon can be exploited by modifying the graph of the obtained linear models, applying a displacement of one unit to the left. This procedure was defined as the *advancement of forecasting* (AF) and is formalized next.

Definition: Let a time series have an AR or ARMA model

$$F_t = \delta + \sum_{i=1}^{p} \phi_i Z_{t-i} + \sum_{j=1}^{q} \gamma_j a_{t-j}$$

The advancement of the forecasting was defined as the operation

$$F_t = F_{t-1}, \quad \text{for } t > \max(p, q) \tag{9}$$


When this operation is applied to an AR or ARMA model, the result is said to be a linear AR or linear ARMA model with AF, respectively. Figure 12 shows the linear model of example 74 with AF.

Fig. 12. Example 74 of NN3-Complete to which the advancement of forecasting was applied.

A first result is that if the AF is applied to a series that presents FD, the RSS of the resulting model is smaller than the error of the original ARMA model. This happens because displacing the graph of the model one unit to the left, which is what operation (9) means, almost superimposes it on the graph of the original series. Extrapolating this behavior to the forecasting region, the same effect is expected, so the values of the linear model with AF should be a better approximation than those of the plain linear model. For this reason the linear models with AF are expected to have a better forecasting capacity. As an example, Table 2 shows the improvement of the linear models with AF for 10 examples of the NN3 that present FD.

The improvement (imp) in the models presented here ranges from 10.28% to 97.27%, with an average of 48.48%, and it is expected that the greater this percentage, the more the forecasting ability of the model increases, in a similar proportion. It should be noted that once an AR model has four terms, it is very difficult to improve the RSS substantially by adding more AR terms or by including moving-average terms.
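A sketch of the AF operation as written in (9), together with the RSS measure, is given below; the interpretation of the `imp` column of Table 2 as the relative RSS reduction, 100 (RSS − RSS AF)/RSS, is an assumption that matches the tabulated rows:

```python
import numpy as np

def apply_af(F, pq_max):
    """Advancement of forecasting, eq. (9): the value at t becomes F_{t-1}
    for every t > max(p, q)."""
    F = np.asarray(F, dtype=float)
    F_af = F.copy()
    F_af[pq_max + 1:] = F[pq_max:-1]
    return F_af

def rss(z, F, start):
    """Residual sum of squares from position `start` onwards."""
    z, F = np.asarray(z, dtype=float), np.asarray(F, dtype=float)
    return float(np.nansum((z[start:] - F[start:]) ** 2))

# Assumed reading of the "imp %" column of Table 2:
#   imp = 100 * (RSS - RSS_AF) / RSS
```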

To evaluate the performance of a model on a TS, the data is divided into two sets, called the training set and the test set. The training set has the first values of the series (approximately 90% of the total) and the test set the last 10%. The information of the training set is used to choose the model and to estimate its parameters. Once chosen, the model is evaluated on its ability to forecast the test set, and when there are several candidate models it is common to choose the one with the best result on the test set. Several measures of performance can be used for this assessment (Hyndman & Koehler, 2006); in this work the RSS is preferred.

To build the models with the methodology of this work, the procedure is as follows (a sketch of stage (a) appears after the three steps):

(a) In this first stage the AR part of the model is calculated. Starting from $K = 2$, AR models with $K$ terms are built and their performance is tested on the test set. As soon as the first value of $K$ is found whose model RSS is less than the values obtained for $K - 1$ and $K + 1$, the AR part of the model is considered to have the $K$ terms already found, and the procedure passes to the second stage.

(b) The error series between the original series and the values calculated by the model of the previous stage is computed. The same procedure as above is applied to this new series, which yields the moving average (MA) component of the ARMA model. It may happen that including the MA components gives worse approximations on the test set than those obtained with the AR part alone; in that case the model keeps only the AR component.

(c) It is checked whether the AR model obtained in stage (a) presents the FD phenomenon, and if so the displacement of the graph one unit to the left according to (9) is applied, as long as this procedure improves the result.
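A minimal sketch of stage (a), assuming the 90%/10% split described above and an ordinary least-squares fit of AR models with consecutive lags; the chapter itself selects lags and coefficients with a self-adaptive genetic algorithm, so `fit_ar_ols`, `stage_a` and the `k_max` search cap are simplifying assumptions:

```python
import numpy as np

def fit_ar_ols(train, K):
    """Least-squares fit of Z_t = delta + sum_{i=1..K} phi_i Z_{t-i} on the training set."""
    rows = len(train) - K
    X = np.column_stack([np.ones(rows)]
                        + [train[K - i:len(train) - i] for i in range(1, K + 1)])
    coef, *_ = np.linalg.lstsq(X, train[K:], rcond=None)
    return coef                                   # [delta, phi_1, ..., phi_K]

def test_rss(z, n_train, coef):
    """RSS of one-step forecasts over the test set (last 10% of the series)."""
    K = len(coef) - 1
    preds = [coef[0] + sum(coef[i] * z[t - i] for i in range(1, K + 1))
             for t in range(n_train, len(z))]
    return float(np.sum((z[n_train:] - np.array(preds)) ** 2))

def stage_a(z, k_max=15):
    """Stage (a): grow K from 2 until the first K whose test RSS beats both K-1 and K+1."""
    z = np.asarray(z, dtype=float)
    n_train = int(0.9 * len(z))                   # 90% training / 10% test split
    score = {K: test_rss(z, n_train, fit_ar_ols(z[:n_train], K))
             for K in range(1, k_max + 2)}
    for K in range(2, k_max + 1):
        if score[K] < score[K - 1] and score[K] < score[K + 1]:
            return K, fit_ar_ols(z[:n_train], K)
    return k_max, fit_ar_ols(z[:n_train], k_max)  # fallback if no local minimum is found
```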

| Example | AR model | RSS | RSS AF | imp % |
|---|---|---|---|---|
| | 0.9968 + 0.8129 $Z_{t-1}$ + 0.6594 $Z_{t-12}$ − 0.4799 $Z_{t-13}$ | 3767.009 | 3379.4502 | 10.28 |
| | 0.3894 + 0.9199 $Z_{t-1}$ + 0.6778 $Z_{t-12}$ − 0.5999 $Z_{t-13}$ | 3899.114 | 3495.0649 | 10.36 |
| | 0.9984 + 0.9202 $Z_{t-1}$ + 0.5858 $Z_{t-12}$ − 0.5103 $Z_{t-13}$ | 3893.0544 | 2803.1406 | 27.99 |
| | 0.9993 + 0.9448 $Z_{t-1}$ + 0.5226 $Z_{t-12}$ − 0.4800 $Z_{t-13}$ | 4894.1655 | 3340.5911 | 31.74 |
| | −0.9991 + 0.6235 $Z_{t-1}$ + 0.2161 $Z_{t-2}$ + 0.1907 $Z_{t-17}$ | 4114.1499 | 1917.1523 | 53.40 |
| | 0.9979 + 0.7001 $Z_{t-1}$ + 0.1531 $Z_{t-11}$ + 0.1438 $Z_{t-18}$ | 2449.5383 | 1265.7606 | 48.32 |
| | 0.9995 + 0.8914 $Z_{t-1}$ + 0.2169 $Z_{t-12}$ − 0.1079 $Z_{t-13}$ | 1247.8290 | 339.8757 | 97.27 |
| | 1.9980 + 0.9099 $Z_{t-1}$ + 0.3104 $Z_{t-11}$ − 0.2225 $Z_{t-13}$ | 1513.984 | 664.8109 | 56.08 |

Table 2. Comparison of RSS for linear and linear with AF models.



### **9. Comparisons with other methodologies**

To test the performance of our models of (8), the series A, B, C, D, E and F appearing in (Box & Jenkins, 1976), already used and presented in chapter 3, were employed.

In (Hansen et al., 1999) the results of building several linear models for these series are shown. The first is the classic BJ model; the others apply when the BJ model does not satisfy the postulate that the error is white noise. In (McDonald & Yexiao, 1994) it is indicated that the use of these latter models improves the prediction capability of the model by 8% to 13% when the error is not white noise. These models are the following:



• Standard ARIMA model. This applies the traditional BJ methodology, whose main components are autoregressive models with moving averages that are linear in the time series $\{Z_t\}$ and white noise $\{a_t\}$ (Box & Jenkins, 1976).

• Ordinary least squares (OLS). Used when the distribution of the error presents the leptokurtosis problem; it allows diminishing the forecasting error (Huber, 2004).

• Least absolute deviation (LAD). Minimizes the sum of absolute values rather than the sum of squares, in order to reduce the influence of extreme errors (Huber, 2004).

• Generalized t-distribution (GT). The objective function is minimized with respect to the parameters under the assumption that the error has a t-distribution (McDonald & Newey, 1988).

• Exponential generalized beta distribution of the second kind (EGB2). The errors are assumed to have a distribution of this kind (McDonald & Newey, 1988).

Additionally, in (Hansen et al., 1999) the results of two neural network models are presented, one heuristic (Heuristic NN) and another based on genetic algorithms (GA NN), both included in the commercial software BioComp System's NeuroGenetic Optimizer®.

To compare against the models described above, the same sizes of training and test sets as in (Hansen et al., 1999) are used: if the number of elements of the series is greater than 100, the test set has 10 elements; if it is less than or equal to 100, the test set has five. The training set has the original size of the series minus the number of elements of the test set.
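A sketch of this split rule (the helper name is an assumption):

```python
def hansen_split(n):
    """Test-set size rule used for the (Hansen et al., 1999) comparisons:
    10 test points if the series has more than 100 elements, else 5."""
    n_test = 10 if n > 100 else 5
    return n - n_test, n_test      # (training size, test size)
```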

With the methodology of this work the models of Table 3 were obtained, where for each example the AR component and, when necessary, the MA component are presented. When "AF" is shown in the last column, the displacement presented in (9) was applied.

| Series | AR | MA | AF |
|---|---|---|---|
| A | 1.1035 + 0.5648 $Z_{t-1}$ + 0.1919 $Z_{t-6}$ + 0.1245 $Z_{t-9}$ | −0.2271 $a_{t-1}$ + 0.1046 $a_{t-2}$ + 0.0514 $a_{t-3}$ | No |
| B | 0.8302 + 1.1274 $Z_{t-1}$ − 0.1685 $Z_{t-2}$ + 0.0644 $Z_{t-4}$ − 0.0258 $Z_{t-6}$ + 0.0588 $Z_{t-7}$ | 0.0460 $a_{t-2}$ − 0.0576 $a_{t-5}$ + 0.1381 $a_{t-6}$ | Yes |
| C | 0.8425 $Z_{t-1}$ − 0.8488 $Z_{t-2}$ | | No |
| D | 0.7609 + 0.8997 $Z_{t-1}$ + 0.0511 $Z_{t-12}$ + 0.0544 $Z_{t-13}$ − 0.0335 $Z_{t-16}$ | | Yes |
| E | 1.9993 + 1.0051 $Z_{t-1}$ − 0.2590 $Z_{t-3}$ + 0.1538 $Z_{t-10}$ | | Yes |
| F | 1.9996 + 0.6555 $Z_{t-2}$ + 0.2938 $Z_{t-3}$ | | No |

Table 3. Solution to the Box Jenkins problems.

Table 4 shows the results of the different methodologies presented in (Hansen et al., 1999) and those obtained with the algorithm proposed in this work, using as comparison criterion the sum of the absolute values of the errors. The results of our model are presented in the row called "Linear AF", and the place obtained when confronted with the other models is in the row called "Place". It should be noted that in each group of comparisons, except in one instance, the results obtained with our methodology are better than those obtained with the confronted statistical methods, and they are also good when compared with those obtained by the neural networks.

| Method | Series A | Series B | Series C | Series D | Series E | Series F | Series G |
|---|---|---|---|---|---|---|---|
| Heuristic NN | 4.519 | 88.312 | 9.138 | 2.942 | 98.873 | 43.966 | |
| GA NN | 3.705 | 72.398 | 6.684 | 2.952 | 69.536 | 36.4 | |
| ARIMA ML | 4.005 | 78.855 | 11.247 | 3.114 | 3.114 | 49.161 | |
| OLS | 3.937 | 83.17 | 10.74 | 3.08 | 114.8 | 45.5 | |
| LAD | 3.96 | 79.47 | 10.3 | 3.066 | 117.6 | 44.46 | |
| GT | 3.937 | 80.68 | 10.25 | 3.064 | 106.5 | 44.59 | |
| EGB2 | 4.017 | 81.01 | 10.3 | 3.066 | 111.8 | 44.5 | |
| Linear AF | 3.9 | 75 | 5.9 | 2.87 | 87 | 46 | 173 |
| Place | 1 | 2 | 1 | 1 | 2 | 6 | |

Table 4. Comparison of the models with regard to the sum of absolute values of the errors.
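For reference, a minimal sketch of the Table 4 criterion under this reading (the helper name is illustrative):

```python
import numpy as np

def sum_abs_errors(z_test, forecasts):
    """Comparison criterion of Table 4: sum of the absolute one-step
    forecast errors over the test set."""
    return float(np.sum(np.abs(np.asarray(z_test) - np.asarray(forecasts))))
```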

Table 5 presents the results of comparing the method proposed in this work with those reported in (Cortez et al., 2004). In that paper the following methodologies are confronted:


• Holt-Winters methodology. Widely used due to its simplicity and the accuracy of its forecasts, especially with periodic time series. It is based on four basic equations that represent the regularity, trend, periodicity and forecast of the series (Chatfield, 2000).

• Box-Jenkins methodology, already commented at length in previous sections (Box & Jenkins, 1976).

• Evolutionary forecasting method. A methodology based on evolutionary programming (Cortez et al., 2004).

• Evolutionary meta algorithms. A metaheuristic that uses two architecture levels: in the first the ARMA model in question is chosen, and in the second the corresponding parameters are estimated (Cortez et al., 2004).

To test the performance of the models, some of the series in (Hyndman, 2003) were used, known as: Passengers, a series (144 data) that represents the number of monthly passengers of an airline; Paper, a series (120 data) that represents the monthly paper sales in France; Deaths, a series (169 data) that represents the deaths and injuries on roads in Germany; Maxtemp, which represents the maximum temperatures (240 data) in Melbourne, Australia; and Chemical, a series (198 data) of readings of the concentrations in a chemical reactor. The training sets of these series contain 90% of the data and the remaining 10% are in the test set.



Using the method proposed in this work, the models shown in Table 5 were obtained. Note that none of these examples presents FD.


| Series | AR | MA |
|---|---|---|
| Passengers | 1.3400 + 0.9087 $Z_{t-1}$ + 1.0612 $Z_{t-12}$ − 0.9633 $Z_{t-13}$ | +0.2910 $a_{t-12}$ |
| Paper | 6.2323 + 0.9583 $Z_{t-12}$ | −0.1930 $a_{t-12}$ |
| Deaths | 1.9941 + 0.9053 $Z_{t-12}$ + 0.0832 $Z_{t-14}$ | 4.3636 + 0.405 $a_{t-2}$ |
| Maxtemp | 0.7046 + 0.3362 $Z_{t-1}$ − 0.0668 $Z_{t-7}$ + 0.4060 $Z_{t-11}$ | 18.662 + 0.1045 $a_{t-5}$ − 0.1857 $a_{t-11}$ |
| Chemical | 0.5419 + 0.6081 $Z_{t-1}$ + 0.3753 $Z_{t-7}$ − 0.0144 $Z_{t-15}$ | |

Table 5. Solutions with the methodology proposed in this work.

Table 6 confronts the results for these TS. The results of our models are shown in the column called "Linear AF", and the place obtained when comparing with the other models is shown in the column "Place".

| Series | Holt-Winters | Box-Jenkins | Heuristic Evolutionary | Meta Evolutionary | Linear AF | Place |
|---|---|---|---|---|---|---|
| Passengers | 16.5 | 17.8 | 21.9 ± 1.2 | 17.2 ± 0.2 | 16.3 | 1 |
| Paper | 49.2 | 61 | 60.2 ± 2.2 | 52.5 ± 0.1 | 5.59 | 1 |
| Deaths | 135 | 144 | 135.9 ± 1.7 | 137 ± 2 | 140 | 3 |
| Maxtemp | 0.72 | 1.07 | 0.95 ± 0.02 | 0.93 ± 0.4 | 0.94 | 2 |
| Chemical | 0.35 | 0.36 | 0.36 ± 0.0 | 0.34 ± 0.0 | 0.34 | 1 |

Table 6. Comparison with other methodologies.

From the results presented in the tables of this section it can be concluded that the models built with our methodology outperform all the models obtained with statistical methods and are competitive with the non-linear methods presented here. In addition, this methodology is fully automated and allows modelling TS that other traditional methodologies cannot.



### **10. Conclusions**

Several conclusions can be drawn from the above. The first is that the methodology developed here, based on posing the building of linear models as an optimization problem whose construction is guided by classical TS theory, is sound, because it allows building better models than those obtained with the traditional methods.

Another conclusion is that choosing the SAGA as the means to solve the problems posed here is very important, since it allows exploring the solution space of the problem and finding the most significant variables for solving it. In addition, the SAGA version developed has proved to be very robust, solving many different problems without parameter adjustment.

A result not contemplated at the outset was the discovery of the FD phenomenon, which allowed us to construct new linear models for TS that in some cases are better alternatives than other linear and nonlinear models. These new models also have great potential for application in areas such as industrial control, economics and finance. In particular, we think that FD is a characteristic of the phenomenon in question, but that it is only detected if the model is built with an appropriate methodology, particularly in the selection of variables and the setting of their limits.

Finally, it should be noted that having a fully automated methodology with the ability to model phenomena that other methodologies cannot opens a whole world of possibilities in the development of computer systems for modelling and process control.

### **11. References**

Alberto I. & Beamonte A. & Gargallo P. & Mateo P. & Salvador M. (2010). Variable selection in STAR models with neighbourhood effects using genetic algorithms. *Journal of Forecasting*, Vol 29, Issue 8, page numbers (728-750), ISSN 0277-6693.

Bäck T. (1992). The interaction of mutation rate, selection, and self-adaptation within a genetic algorithm. *Proc. 2nd Conf. on Parallel Problem Solving from Nature*, Brussels, 1992. ISBN 0444897305, Elsevier, Amsterdam.

Bäck T. (1992). Self-adaptation in genetic algorithms. *Proc. 1st Eur. Conf. on Artificial Life*. MIT Press, Cambridge, MA.

Battaglia F. & Protopapas M. (2011). Time-varying multi-regime models fitting by genetic algorithms. *Journal of Time Series Analysis*, Vol 32, Issue 3, page numbers (237-252), ISSN 1467-9892.

Box G. & Jenkins G. (1976). *Time Series Analysis: Forecasting and Control*. Holden-Day, Inc. ISBN 0-13-060774-6, Oakland, California, USA.

Chatfield C. (2000). *Time-Series Forecasting*. CRC Press. ISBN 1584880635, USA.

Chiogna M. & Gaetan C. & Masarotto G. (2008). Automatic identification of seasonal transfer function models by means of iterative stepwise and genetic algorithms. *Journal of Time Series Analysis*, Vol 29, Issue 1, page numbers (37-50), ISSN 1467-9892.

Cortez P. & Rocha M. & Neves J. (2004). Evolving Time Series Forecasting ARMA Models. *Journal of Heuristics*, Vol 10, No 4, page numbers (415-429), ISSN 1381-1231.

Eiben Á. E. & Hinterding R. & Michalewicz Z. (1999). Parameter Control in Evolutionary Algorithms. *IEEE Transactions on Evolutionary Computation*, Vol 3, No 2, page numbers (124-141), ISSN 1089-778X.

Flores P. & Garduño R. & Morales L. & Valdez M. (1999). Prediction of Met-enkephalin Conformation using a Sieve Self Adaptive Genetic Algorithm. *Proceedings of the Second International Symposium on Artificial Intelligence: Adaptive Systems ISSAS'99*, page numbers (186-190).

Garduño R. & Morales L. & Flores P. (2000). Dinámica de Procesos Biológicos no Covalentes a Nivel Molecular. *Revista Mexicana de Física*, Vol 46, Suplemento 2, page numbers (135-141), ISSN 0035-001X.

Garduño R. & Morales L. B. & Flores P. (2001). About Singularities at the Global Minimum of Empiric Force Fields for Peptides. *Journal of Molecular Structure (Theochem)*, page numbers (277-284), ISSN 0166-1280.

Guerrero V. (2003). *Análisis Estadístico de Series de Tiempo Económicas*. Thomson Editores, ISBN 9706863265, México DF.

Hansen J. & McDonald J. & Nelson R. (1999). Time series prediction with genetic algorithm designed neural networks, an empirical comparison with modern statistical models. *Computational Intelligence*, Vol 15, No 3, page numbers (171-184), ISSN 0824-7935.

Huber P. (2004). *Robust Statistics*. Wiley-IEEE. ISBN 978-0-521-88068-8, USA.

Hyndman R. & Koehler A. (2006). Another look at measures of forecast accuracy. *International Journal of Forecasting*, Vol 22, Issue 4, page numbers (679-688), ISSN 0169-2070.

Hyndman R. (2003). Time Series Data Library. Available at http://www.robjhyndman.com/TSDL

Mateo F. & Sovilj D. & Gadea R. (2010). Approximate k-NN delta test minimization method using genetic algorithms: application to time series. *Neurocomputing*, Vol 73, Issue 10-12, page numbers (2017-2029), ISSN 0925-2312.

McDonald J. & Newey W. (1988). Partially adaptive estimation of regression models via the generalized t distribution. *Econometric Theory*, Vol 4, page numbers (428-457), ISSN 0266-4666.

McDonald J. & Yexiao X. (1994). Some forecasting applications of partially adaptive estimators of ARIMA models. *Economics Letters*, Vol 45, Issue 4, page numbers (155-160), ISSN 0165-1765.



