Introduction to Evolutionary Algorithms

*S. Tamilselvi*

#### **Abstract**

Real-world optimization scenarios often involve multiple constraints and objective functions that are discontinuous, nonlinear, non-convex, and multi-modal in nature. These problems are also multi-dimensional, with mixed variable types (integer, real, discrete, binary) whose differing ranges demand normalization. Hence, the search space of such problems is not smooth. Evolutionary algorithms have gained attention and are now routinely employed as computational processes for solving complex engineering problems, having become an instrument for research scientists and engineers who wish to apply the power of the theory of evolution to shape optimization-based research problems. This chapter gives a comprehensive introduction to the optimization field together with the state of the art in evolutionary computation. Although many books have described individual branches of this area, such as evolution strategies, genetic programming, genetic algorithms, and evolutionary programming, evolutionary algorithms, that is, evolutionary computation, merit a detailed discussion as a general class.

**Keywords:** evolutionary algorithms, genetic operators, non-convex, multi-modal, optimization process

#### **1. Introduction**

Darwin's principle of evolution states that the existence of any creature is governed by the law "the strongest creature survives." The idea of applying Darwinian principles to automated problem solving was conceived in the 1950s, before computers were widely available. Darwin also showed that the survival of any organism is maintained through genetic inheritance mechanisms such as reproduction, crossover, and mutation. Thus, Darwin's theory of evolution has been deployed in computational optimization algorithms to search for solutions to real-world optimization problems in a natural way [1].

In the 1960s, three different interpretations of this idea were introduced in different places. Evolutionary programming was developed by Lawrence J. Fogel in the USA, while John Henry Holland devised his methodology as the genetic algorithm,

**Figure 1.** *Evolutionary algorithms and their subtypes.*

stimulated by Darwin's evolutionary concepts. Similarly, Ingo Rechenberg and Hans-Paul Schwefel invented evolution strategies in Germany. Following these, a fourth branch, genetic programming, emerged in the early 90s. These four terminologies are seen as different representatives of one technology called evolutionary algorithms (EAs), which denotes the whole field, with evolutionary programming, evolution strategies, genetic algorithms, and genetic programming as its sub-areas, as depicted in **Figure 1** [1, 2].

#### **2. Need for evolutionary algorithms**

The real world presents many optimization scenarios. Optimization, by definition, is the methodology of making a decision as fully perfect as possible so as to achieve the maximum possible goal in an engineering system. Nature is a very good optimizer. An optimization problem can be stated as follows [2, 3].

$$\begin{aligned}
&\text{Find } x = \{x_1, x_2, \ldots, x_n\}, \text{ which minimizes/maximizes } f(x)\\
&\text{subject to } g_j(x) \le 0, \quad j = 1, 2, \ldots, m\\
&\phantom{\text{subject to }} h_j(x) = 0, \quad j = 1, 2, \ldots, p
\end{aligned}$$

Any engineering system can be represented by a set of quantities. Certain quantities are usually fixed and are called pre-assigned constants. The remaining quantities are treated as decision variables in the optimization process, *x* = {*x*1, *x*2, … *xn*}. '*f*(*x*)' is the objective function or goal to be attained, '*gj*(*x*)' represents the '*m*' inequality constraints, and '*hj*(*x*)' represents the '*p*' equality constraints that must be satisfied for feasibility [3].

In real-world engineering problems, the objective function is discontinuous, nonlinear, non-convex, and multi-modal. The problems are also multi-dimensional, since the number of design variables is large and the variables are mixed in type (integer, real, discrete, binary). Hence, the search space is not smooth. Objective function evaluation may require accessing lookup table data. The constraint functions are very complex, and the amounts of violation of the different constraints do not cover the same range, which requires normalization [4].

In general, the optimization problems are categorized based on the existence of constraints, nature of the decision variables, permissible values of the design variables, nature of the equations involved, deterministic nature of the variables, separability of the functions, number of objective functions, etc. Some of them are static optimization problem, dynamic optimization problem, linear programming problem, convex programming problem, nonlinear programming problem, geometric programming, quadratic programming problem, separable programming problem, multiobjective optimization problem, single-variate optimization problem, multi-variate optimization problem [5].

Two different classes of techniques for solving optimization problems are mathematical programming techniques and meta-heuristic techniques. When derivative-based mathematical programming methods are applied to nonlinear programming problems, several shortcomings arise. Traditional optimization methods:


For a real-world optimization problem, the surface plot obtained even when optimizing only two design variables exhibits a large number of local minima/maxima. **Figure 2** depicts the surface plot obtained for the design optimization of a distribution transformer with two decision variables, the width of the core leg and the height of the core window, minimizing the transformer lifetime cost (TLTC). It is

**Figure 2.** *Search space for minimization of TLTC objective optimizing two decision variables.*

very clear from **Figure 2** that the real-world problem is very complex with its multimodal search space [3].

Of course, for a multi-dimensional engineering problem requiring the optimization of more decision variables, the multi-modality cannot even be visualized. Conventional derivative-based mathematical programming techniques cannot accurately handle such complex optimization problems. They may yield a feasible design; however, it will not be an optimal solution. If there is little or no knowledge about the behavior of the objective function, the presence of local minima, or the location of feasible and infeasible regions in the multi-dimensional parameter space, it is advisable to resort to a meta-heuristic technique, which is a stochastic strategy [2, 3].

Of all the meta-heuristic techniques, evolutionary algorithms (EAs) are especially effective for high-complexity, non-convex, nonlinear, multi-dimensional, mixed-variable, multi-objective, constrained optimization problems, for which a traditional mathematical model is difficult to build and where the nature of the input/output relationship is neither well defined nor easily computable. The stages of EAs have rarely been examined in detailed, illustrated steps, even though EAs perform well in terms of convergence, consistency in obtaining the solution, and computational speed when solving multi-modal problems.

Hence, this chapter discusses in detail the step-by-step evolutionary process that happens behind the optimization algorithm. This greatly helps in finding solutions to multi-modal real-world engineering optimization problems, optimizing a design objective while simultaneously satisfying the various constraints imposed by international standards and user specifications.

#### **3. Known optimization problems**

#### *Introduction to Evolutionary Algorithms DOI: http://dx.doi.org/10.5772/intechopen.104198*

Evolutionary optimization algorithms are search algorithms that minimize or maximize an objective function. The algorithm searches through a large search space of possible solutions for the optimal best solution. In day-to-day practical life as well as professional life, there are numerous activities that seek optimization. Some common well-known real-world optimization problems are the traveling salesman problem, the classification problem, economic power dispatch, the base station location problem, antenna design, the scheduling problem, etc.

In the traveling salesman problem, a salesman wants to visit all the towns, given a list of towns and the distances between them. The constraint is that each town must be visited only once. The optimization problem is to search for the shortest route by which the salesman visits each town exactly once and returns to the place where he started [6].

The base station location problem concerns siting radio base stations to maximize coverage. Given a set of candidate spots for installing base stations and the feasible configurations for every base station (antenna tilt, maximum power, antenna height, sector orientation, etc.), along with information on traffic and signal propagation strength, the optimization algorithm must choose the right location and an appropriate configuration for each base station such that the installation cost is minimized while the traffic demand is met [7].

The job of the optimal generator maintenance scheduling problem is to find the optimum periods during which the generator units must be taken offline for maintenance over the stipulated time horizon, so that the operating costs involved are minimized while maintenance constraints such as load demand, maintenance window, maintenance duration, and manpower availability are met during the considered time period [8].

Previous research applied only machine learning techniques for the prediction and classification of diseases/tumors. Nowadays, however, owing to the capability of evolutionary algorithms, such classification problems are stated as optimization problems and solved using integrated machine learning-optimization techniques. Thus, the optimal classification problem aims to select the optimum elite features from the huge data sets of intelligent liver and kidney cancer diagnostic systems, filtering out redundant features and minimizing the error rate, in order to improve the quality of disease classification [9].

Economic power dispatch is a vital optimization problem in power system planning. Its aim is to schedule the optimum power output of the available generator units of the power system such that the production cost is minimized and the power demand is met [10].

#### **4. Optimization process of simple evolutionary algorithm**

An EA handles a population of possible random solutions. Each solution is represented by a chromosome. The fitness of each chromosome is calculated to set up a competition among the chromosomes. The competition results in the selection of the better chromosomes/solutions (those with high fitness values) that are suited to the environment. This first level of filtration based on fitness value is called parent selection [3]. The selected individuals, that is, the parents, act as seeds for creating children through genetic inheritance, that is, recombination and mutation. These genetic operators provide the necessary diversity. A few pairs of chromosomes from the parent pool are chosen with random probability to undergo crossover and form offspring. The offspring obtained after crossover then undergo random mutation. Exploring different regions of the search space for possible optimal individuals through the "recombination and mutation" operations is known as "exploration." The

**Figure 3.** *Working of evolutionary algorithms.*

new individuals/children thus formed have their fitness evaluated so that they compete for survival into the next generation. At the end of an iteration, in the replacement stage, the worst 80% of the initial random population is substituted by the best offspring, filtered by a survival selection process based on the evaluated fitness values. Over several iterations, the "natural selection" operation, called exploitation, leads to the identification of one individual in the population as the global optimum. The complete working of the evolutionary algorithm is pictured in **Figure 3** [11]. The major steps involved in the optimization process of an evolutionary algorithm are as follows [1, 3].

1. Population generation: initial solution set
2. Function evaluation
3. Parent selection
4. Crossover
5. Mutation
6. Survival selection and replacement
7. Stopping criterion check

#### **5. Iterative process behind evolutionary algorithms**

To define problem statement:

Consider the equality function *x* + 2*y* + 3*z* + 4*u* = 30. We shall apply the evolutionary algorithm to find appropriate values for *x*, *y*, *z*, *u* such that the equality equation is satisfied [12].

#### **5.1 Formulation of optimization problem**

a. Formulate objective function: *f*(*k*)

The objective/aim is to **Minimize** *f*(*k*) = [(*x* + 2*y* + 3*z* + 4*u*) - 30].


b. Select decision variables: the four unknowns [*x*, *y*, *z*, *u*].

c. Fix problem dimension: the total number of decision variables is the problem dimension = 4.

d. Representation:

A solution generated by an evolutionary algorithm is called a chromosome/individual, which is made up of genes [1]. After selecting the decision variables and the problem dimension, the choice of a suitable type for these variables is another important task. The nature of the decision variables is completely problem dependent; in this example, [*x*, *y*, *z*, *u*] are integers. However, the genes can be of mixed type (binary, real, integer, discrete, etc.), depending on the needs of the problem under consideration. The chromosome/solution is thus represented with integer variables as under:


e. Impose boundary constraint:

The range selection that sets the search space is more often done on a trial basis when the problem dimension is high. On the contrary, if the objective function is very simple and clear and possesses a direct mathematical relationship with a small number of decision variables, then the search space can be decided by inspection [2]. For this example, it is clear that the integer values of the decision variables [*x*, *y*, *z*, *u*] can be restricted to vary between 1 and 30 in order to speed up the computational search.

#### **5.2 Different stages in optimization process**

To illustrate solving a minimization-type optimization problem using an EA, the following are assumed: integer-type decision variables, a population size of six, the single-point method for crossover, and roulette wheel selection. The various stages involved in the optimization process are given for one iteration in this section [13–15].

**Stage 1**: *Population generation: Initial solution set*

Four genes [*x*, *y*, *z*, *u*] are generated randomly, satisfying the lower and upper limits of the boundary constraint. A chromosome thus generated is a vector comprising four genes. A chromosome refers to a solution/individual of the formulated optimization problem, while a collection of such chromosomes is referred to as a population. For example, Solution [1] = [*x*; *y*; *z*; *u*] = [12, 05, 23, 08]. The initial population is then an array of size [population size, problem dimension], that is, (6, 4), as shown in **Table 1**.
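Stage 1 can be sketched in a few lines of Python. The population size (6), dimension (4), and gene bounds follow the example above; the seed value and all names are illustrative choices of this sketch, not part of the original description:

```python
import random

random.seed(42)            # arbitrary seed, only for reproducibility

POP_SIZE, DIM = 6, 4       # population size and problem dimension
LOW, HIGH = 0, 30          # boundary constraint on every gene

def random_chromosome():
    # A chromosome is a vector of DIM integer genes [x, y, z, u]
    return [random.randint(LOW, HIGH) for _ in range(DIM)]

# The initial population is an array of size (POP_SIZE, DIM)
population = [random_chromosome() for _ in range(POP_SIZE)]
print(population)
```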

**Stage 2**: *Function evaluation: Feval*

All the chromosomes of the random population then go through a process known as fitness evaluation to measure the quality of the solutions created by the EA. Evaluation


**Table 1.**

*Functional evaluation of initial population.*

of the fitness value of a chromosome/solution is carried out by calculating the objective function value as Feval = Modulus[*f*(*k*)].

$$f(k) = (x + 2y + 3z + 4u) - 30$$

$$\text{For Solution [1]: } \text{Feval}[1] = \text{Modulus}[f(\text{Solution [1]})]$$

$$= \text{mod}\left[(12 + 2 \times 5 + 3 \times 23 + 4 \times 8) - 30\right]$$

Feval [1] = mod [(12 + 10 + 69 + 32) - 30] = 93. Similarly, Feval of every solution in the population can be calculated and tabulated as in **Table 1**. It is found that Chromosome 4 has the least objective function value, 46.
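The fitness evaluation of Stage 2 amounts to computing the absolute deviation of *x* + 2*y* + 3*z* + 4*u* from 30. A minimal sketch (the function names are my own):

```python
def f(chrom):
    # Objective: f(k) = (x + 2y + 3z + 4u) - 30
    x, y, z, u = chrom
    return (x + 2 * y + 3 * z + 4 * u) - 30

def feval(chrom):
    # Feval = Modulus[f(k)]: zero when the equality is satisfied exactly
    return abs(f(chrom))

print(feval([12, 5, 23, 8]))  # Solution [1] from Table 1 → 93
```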

**Stage 3**: *Parent selection*

Chromosomes are selected from the initial population to act as parents for reproduction, based on the fitness of each solution/individual. The selection procedure determines how to choose the individuals in the population that will create offspring for the next generation. The fittest solutions have a higher probability of being selected as parents. The two-step selection process is discussed as follows [13, 15].

A. To compute the probability of selection: Prob[*i*]

$$\text{Prob}[i] = \frac{\text{Fit}[i]}{\sum_{i=1}^{6} \text{Fit}[i]}$$

where,

$$\text{Fit}[i] = \frac{1}{1 + \text{Feval}[i]}$$

(to avoid an undefined divide-by-zero error, which may be encountered at the optimal solution, it is advisable to add 1 to Feval).

Fit [1] = 1/(1 + Feval [1]) = 1/94 = 0.0106 and so on, till *i* = 6.

Total fitness = 0.0845 (refer **Table 2**).

Prob [1] = 0.0106/0.0845 = 0.1254
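The two computations above translate directly into code. The helper names and the small example population are illustrative; only Solution [1] = [12, 5, 23, 8] comes from Table 1:

```python
def feval(chrom):
    x, y, z, u = chrom
    return abs((x + 2 * y + 3 * z + 4 * u) - 30)

def selection_probabilities(population):
    # Fit[i] = 1 / (1 + Feval[i]); Prob[i] = Fit[i] / total fitness
    fits = [1.0 / (1.0 + feval(c)) for c in population]
    total = sum(fits)
    return [fit / total for fit in fits]

population = [[12, 5, 23, 8], [1, 2, 3, 4], [10, 3, 2, 1]]  # illustrative
probs = selection_probabilities(population)
print(probs)   # probabilities sum to 1; the fittest solution gets the most
```

Note that Fit[0] here is 1/(1 + 93) = 1/94 ≈ 0.0106, matching the hand calculation above.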

B. To select the parent pool: Roulette-wheel (RW) selection process:

Parent selection is vital for the convergence of an optimization algorithm, as efficient parents force solutions/individuals toward optimality. There are different methods in the



#### **Table 2.**

*Selection probability computation.*

process of selecting parents such as stochastic universal sampling, fitness proportionate selection, tournament selection, rank selection, and random selection. In this chapter, roulette wheel selection has been implemented for identifying the right parent pool.

Consider a wheel that is split into '6' pies. A pie refers to an individual in the population. Each solution occupies a portion of the wheel proportional to its fitness value. Clearly, a fitter solution takes a larger pie on the wheel and has a larger chance of being selected as a parent when the wheel is spun '6' times. Hence, the probability of choosing a chromosome depends only on its fitness. The steps involved in the roulette wheel selection process are:

• Calculate the cumulative probability Cum[*i*] of every chromosome.

• Spin the wheel '6' times, that is, generate a random number between 0 and 1 for each place of the mating pool.
rand [place 1] = 0.2; rand [place 2] = 0.285; rand [place 3] = 0.098;

rand [place 4] = 0.812; rand [place 5] = 0.397; rand [place 6] = 0.50.

• Fit the random number of each place in the respective range of cumulative probabilities and fetch the color of the pie. For example, rand [place 2] = 0.285, which is between Cum [2] and Cum [3]. So, the gray pie which is individual/ chromosome [3] will occur in place2 of the parent/mating pool.
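The procedure can be sketched with cumulative probabilities. The `rng` hook lets the example reuse the six spin values quoted above; the probability vector and dummy chromosomes are illustrative (the actual Table 2 values are not reproduced here):

```python
from itertools import accumulate
import random

def roulette_wheel(population, probs, rng=random.random):
    # Cum[i] is the running total of the selection probabilities
    cum = list(accumulate(probs))
    pool = []
    for _ in range(len(population)):       # spin the wheel once per place
        r = rng()
        for chrom, c in zip(population, cum):
            if r <= c:                     # first pie whose Cum covers r
                pool.append(chrom)
                break
    return pool

population = [[i] * 4 for i in range(1, 7)]            # 6 dummy chromosomes
probs = [0.125, 0.15, 0.20, 0.25, 0.15, 0.125]         # illustrative
spins = iter([0.2, 0.285, 0.098, 0.812, 0.397, 0.50])  # the 6 rand values
pool = roulette_wheel(population, probs, rng=lambda: next(spins))
print(pool)
```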

#### **Stage 4**: *Crossover*

The crossover operation involves three steps: (A) selecting mating chromosomes, (B) determining cut point for crossover, and (C) updating the population.

#### **Figure 4.** *Roulette wheel—Parent selection process.*


#### **Table 3.**

*Place and position of solution—Before RW selection process.*

In this operation, pairs of parents are chosen and many children/offspring are generated using the information available in the genes of the parents. Usually, the crossover operation is deployed in an EA with a high probability (*pc*). Some commonly used crossover operators are whole arithmetic recombination, one-point crossover, uniform crossover, multi-point crossover, Davis' order crossover, etc. In the equality problem chosen for illustration, one-point crossover has been used for offspring creation. In one-point crossover, a crossover point is chosen at random and the tail ends of the parent pair are swapped to produce new children. The process is evident in **Table 5**.

A. To select chromosome:

The parent chromosomes from the parent pool that undergo the mating process are selected randomly, and the number of mating solutions is decided by the crossover rate, *pc*. Solution '*i*' becomes a parent if the random number rand[*i*] falls below the crossover rate. Let us assume *pc* = 25% for solving the problem. Generate a random number below 1, '6' times (once for each member of the population).



#### **Table 4.**

*Chromosomes in the MATING POOL after RW selection process.*


**Table 5.**

*Population after and before crossover operation.*

rand [1] = 0.19; rand [2] = 0.249; rand [3] = 0.750; rand [4] = 0.005 rand [5] = 0.149; rand [6] = 0.320

Thus, for the generated random numbers, three chromosomes/solutions [1, 4, 5] are selected for crossover operation. Hence, the number of crossovers becomes 3, that is, 3 pairs.

Solution [1] >< Solution [4] — First Crossover Solution [4] >< Solution [5] — Second Crossover Solution [5] >< Solution [1] — Third Crossover

B. To determine cut point:

Following mating chromosome selection, the next phase is to determine the position of the crossover point. The steps involved are:


First Crossover: New Chromosome [1] = Solution [1] >< Solution [4]

Second Crossover: New Chromosome [4] = Solution [4] >< Solution [5]


Third Crossover: New Chromosome [5] = Solution [5] >< Solution [1]


C. To update the population after crossover:
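Steps A–C of the crossover stage can be sketched as follows. The parent values here are illustrative, and note one simplification: the chapter's variant keeps a single child per mating pair, whereas the classical operator below returns both:

```python
import random

def one_point_crossover(parent_a, parent_b, cut=None):
    # Choose a random cut point between genes, then swap the tails
    if cut is None:
        cut = random.randint(1, len(parent_a) - 1)
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b

a, b = [12, 5, 23, 8], [2, 21, 18, 3]      # illustrative parents
print(one_point_crossover(a, b, cut=2))
# → ([12, 5, 18, 3], [2, 21, 23, 8])
```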

#### **Stage 5**: *Mutation*

The mutation operation is a small random change in a chromosome, necessary to obtain a new solution. Mutation is used to sustain population diversity and generally has a low probability, *pm*. Mutation in an EA drives "exploration" of the search space. It has been proven that the mutation operation is crucial for the convergence of the algorithm, whereas the crossover operation is not. Some commonly used mutation operators are bit flip mutation, swap mutation, scramble mutation, random resetting, inversion mutation, etc. As with the crossover operators, this is not an exhaustive list, since an EA designer may deploy a hybrid approach combining these operators or may prefer problem-specific mutation operators as more practical.


The number of solutions that undergo mutation in a population is decided by the mutation rate *pm*. Let *pm* = 10%; the population carries 6 × 4 = 24 genes in total, so the number of mutations = 0.1 × 24 = 2.4 ≈ 2.

• Calculate gene positions.

Generate two random numbers below 24, say 12 and 18. The mutable genes and chromosomes are Chromosome [3], gene 4 and Chromosome [5], gene 2. This process is shown in **Table 6**.

• The value of each mutable gene at the mutation point is substituted with a random number satisfying the boundary constraint of the decision variables/genes, that is, between 0 and 30; say 02 and 05.
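The bookkeeping above (flattened gene positions, random resetting within bounds) can be sketched as follows; the function name and the `rng` parameter are my own:

```python
import random

def mutate(population, pm=0.1, low=0, high=30, rng=random):
    # Random-resetting mutation over the flattened list of genes
    pop = [chrom[:] for chrom in population]   # keep the input intact
    dim = len(pop[0])
    total_genes = len(pop) * dim               # e.g. 6 x 4 = 24 genes
    n_mutations = int(pm * total_genes)        # e.g. 0.1 x 24 = 2
    for _ in range(n_mutations):
        g = rng.randrange(total_genes)         # flattened gene position
        pop[g // dim][g % dim] = rng.randint(low, high)
    return pop

pop = [[12, 5, 23, 8], [2, 21, 18, 3], [10, 4, 2, 7],
       [20, 1, 10, 4], [1, 4, 13, 5], [3, 9, 2, 11]]  # illustrative
print(mutate(pop))   # two genes redrawn at random positions
```

With *pm* = 10% and a 6 × 4 population, exactly two gene positions are redrawn, matching the worked example.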

#### **Stage 6**: *Survival selection and replacement mechanism*

After the mutation operation, one iteration/generation of the EA is over. Functional evaluation is again performed on the offspring for survival selection. From the functional evaluation of the population after mutation, it is evident that the objective function value


#### **Table 6.**

*Population after and before mutation operation.*

of the best solution has reduced to 37, compared with the minimum objective value of 47 in the initial random population, as shown in the table. Hence the minimization objective, *f*(*k*) = [(*x* + 2*y* + 3*z* + 4*u*) - 30], is being met. This means that the solutions obtained by the EA at the end of the first iteration are better than the solutions of the random population.

To execute the iterative process continuously, the population has to be revised at the end of each iteration as a final step, which is referred to as the replacement mechanism. At the end of each iteration, 80–90% of the best solutions from the offspring population (the 4–5 best children) and 20–10% of the best solutions from the initial population (the 2–1 best random solutions) are selected to form the new population for the next generation [15]. The chromosomes of the next generation then become as shown in **Tables 7** and **8**.
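The replacement mechanism, keeping the best 4–5 children and topping up with the best solutions of the old population, might be sketched as follows (all names and the example populations are mine):

```python
def replace(old_pop, offspring, feval, n_children=5):
    # Best `n_children` offspring plus the best of the old population
    best_children = sorted(offspring, key=feval)[:n_children]
    best_old = sorted(old_pop, key=feval)[:len(old_pop) - n_children]
    return best_children + best_old

feval = lambda c: abs((c[0] + 2*c[1] + 3*c[2] + 4*c[3]) - 30)
old = [[12, 5, 23, 8], [0, 0, 0, 0], [7, 7, 7, 7], [1, 1, 1, 1],
       [30, 30, 30, 30], [3, 3, 3, 3]]
kids = [[2, 2, 4, 3], [1, 2, 3, 4], [9, 9, 0, 0], [0, 15, 0, 0],
        [5, 5, 0, 0], [20, 5, 0, 0]]
new_pop = replace(old, kids, feval)
print([feval(c) for c in new_pop])
```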




#### **Table 8.**

*Replacement-population for the next iteration.*

#### **Stage 7**: *Stopping the iteration*

The optimization process is repeated until the objective function value or the decision variable values become stagnant, that is, show no or very little change over a large number of iterations. Thus, over time, the solution converges to the final best optimal solution, and the optimization process is stopped based on a stopping criterion such as the maximum number of iterations or the maximum number of functional evaluations.
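Putting Stages 1–7 together, a compact end-to-end sketch of the whole loop for the example problem follows. It simplifies some details of the worked example (mating pairs are formed with ring neighbors, and elitism keeps only the single best old solution); the seed and parameter values are illustrative choices:

```python
import random

LOW, HIGH, DIM, POP = 0, 30, 4, 6

def feval(c):
    x, y, z, u = c
    return abs((x + 2 * y + 3 * z + 4 * u) - 30)

def run_ea(max_iters=500, pc=0.25, pm=0.1, seed=7):
    rng = random.Random(seed)
    # Stage 1: random initial population
    pop = [[rng.randint(LOW, HIGH) for _ in range(DIM)] for _ in range(POP)]
    best = min(pop, key=feval)
    for _ in range(max_iters):
        # Stages 2-3: fitness-proportionate (roulette wheel) selection
        fits = [1.0 / (1.0 + feval(c)) for c in pop]
        pool = rng.choices(pop, weights=fits, k=POP)
        # Stage 4: one-point crossover with probability pc, ring pairing
        children = [c[:] for c in pool]
        for i in range(POP):
            if rng.random() < pc:
                mate = pool[(i + 1) % POP]
                cut = rng.randint(1, DIM - 1)
                children[i] = children[i][:cut] + mate[cut:]
        # Stage 5: random-resetting mutation of pm * total genes
        for _ in range(int(pm * POP * DIM)):
            g = rng.randrange(POP * DIM)
            children[g // DIM][g % DIM] = rng.randint(LOW, HIGH)
        # Stage 6: replacement with elitism (best old solution survives)
        pop = sorted(children, key=feval)[:POP - 1] + [min(pop, key=feval)]
        best = min(pop + [best], key=feval)
        # Stage 7: stop early once the equality is satisfied exactly
        if feval(best) == 0:
            break
    return best

best = run_ea()
print(best, feval(best))
```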

#### **6. Conclusion**

The basic processes that occur behind an evolutionary algorithm have been explained and illustrated in this chapter, with steps covering solution representation, population generation, functional evaluation, parent selection, genetic operations, offspring evaluation, survival selection, and stopping criteria for a simple optimization problem. This knowledge can be extended readily by researchers across any discipline working in the field of optimization and applying evolutionary algorithms to solve complex engineering problems using computers. Although the process behind an EA may appear simple, the details of the optimization process form the base and are very much necessary for applying the learned concepts, modifying existing evolutionary concepts, and evolving better optimization methods at the research level.

### **Author details**

S. Tamilselvi Sri Sivasubramaniya Nadar College of Engineering, Chennai, India

\*Address all correspondence to: tamilselvis@ssn.edu.in

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **References**

[1] Michalewicz Z. Genetic Algorithms + Data Structures = Evolution Programs. 3rd ed. Berlin, Heidelberg: Springer; 1996. p. 387. DOI: 10.1007/978-3-662-03315-9

[2] Gen M, Cheng R. Genetic Algorithms and Engineering Design. Hoboken, New Jersey, United States: John Wiley & Sons, Inc.; 1997

[3] Tamilselvi S, Baskar S, Anandapadmanaban L, Kadhar K, Varshini PR. Chaos-assisted multiobjective evolutionary algorithm to the design of transformer. Soft Computing. 2017;**21**(19):5675-5692. DOI: 10.1007/s00500-016-2145-7

[4] Tamilselvi S, Baskar S, Anandapadmanaban L, Karthikeyan V, Rajasekar S. Multi objective evolutionary algorithm for designing energy efficient distribution transformers. Swarm and Evolutionary Computation. 2018;**1**(42): 109-124

[5] Tamilselvi S, Baskar S, Sivakumar T, Anandapadmanaban L. Evolutionary algorithm-based design optimization for right choice of transformer conductor material and stepped core. Electrical Engineering. 2019;**101**(1):259-277

[6] Agatz N, Bouman P, Schmidt M. Optimization approaches for the traveling salesman problem with drone. Transportation Science. 2018;**4**(52): 965-981

[7] Lakshminarasimman N, Baskar S, Alphones A, Willjuice Iruthayarajan M. Evolutionary multiobjective optimization of cellular base station locations using modified NSGA-II. Wireless Networks. 2011;**17**(3):597-609

[8] Tamil Selvi S, Baskar S, Rajasekar S. An Intelligent Approach Based on Metaheuristic for Generator

Maintenance Scheduling—Classical and Recent Aspects of Power System Optimization. 1st ed. Cambridge, Massachusetts, United States: Academic Press; 2018. pp. 99-136

[9] Gunasundari S, Janakiraman S, Meenambal S. Multiswarm heterogeneous binary PSO using Win-Win approach for improved feature selection in liver and kidney disease diagnosis. Computerized Medical Imaging and Graphics. 2018;**70**:135-154

[10] Bhattacharya A, Chattopadhyay PK. Solving complex economic load dispatch problems using biogeography-based optimization. Expert Systems with Applications. 2010;**37**(5):3605-3615

[11] De Jong KA. Evolutionary Computation—A Unified Approach. 1st ed. Berlin/Heidelberg, Germany: Springer; 2017. p. 268

[12] Hermawanto D. Genetic Algorithm for Solving Simple Mathematical Equality Problem. 2013 [arXiv preprint arXiv:1308.4675]

[13] Available from: https://www.tutorialspoint.com/genetic_algorithms/genetic_algorithms_parent_selection.htm

[14] Goldberg DE. Genetic Algorithms in Search, Optimization, and Machine Learning. 13th ed. Boston, Massachusetts, United States: Addison Wesley; 1989. p. 432

[15] Hancock PJ. An empirical comparison of selection methods in evolutionary algorithms. In: AISB Workshop on Evolutionary Computing 1994 Apr 11. Berlin, Heidelberg: Springer; 1994. pp. 80-94

#### **Chapter 2**

## Application of Genetic Algorithm in Numerous Scientific Fields

*Gautam Garai*

#### **Abstract**

The genetic algorithm (GA) and its variants have been used efficiently by scientists in a wide variety of fields for solving problems. From the pool of evolutionary algorithms, the GA has been chosen by researchers and has become popular as a useful and effective optimizer. It has several advantages and disadvantages. It provides solutions for various kinds of problems, such as space research, economics, market study, geography, remote sensing, agriculture, data mining, cancer detection, and many more. This chapter discusses the utilization of the GA in some of these fields, with a few experimental results, such as data clustering, pattern identification and matching, and shape detection. The results are illustrated and explained with reasons for a better understanding of GA applications in the scientific fields. In addition, the use of the GA in bioinformatics for biological sequence alignment is discussed with examples.

**Keywords:** genetic algorithm, optimization, application, scientific field, stochastic search

#### **1. Introduction**

In the design of scientific problems, the primary goal of researchers is to develop a system that provides an efficient, low-cost solution. In this endeavor, they need a good optimizer to reach their target. Several optimizers are available that work efficiently for various kinds of problems. Among these, the genetic algorithm (GA) has been chosen as a functional tool and a good, useful optimizer. The algorithm has proven to be an effective optimization technique for many applications in engineering, economics, manufacturing, artificial intelligence, operations research, space science, agriculture, physics, chemistry, bioinformatics, medical science, and many more. However, as with other optimization tools, there are advantages and disadvantages to using genetic algorithms as an optimizer. The amount of *a priori* knowledge required to use genetic algorithms is minimal. One can apply an "off the shelf" genetic algorithm to an optimization problem by mapping all of the function inputs to a prespecified representation, such as binary strings. However, one often specializes a genetic algorithm to a specific problem domain by using additional heuristics, specialized representations, and/or operators, which constitute the *a priori* information for obtaining superior performance. If one opts to use a genetic algorithm without

expending effort to customize it to the problem at hand, the trade-off is typically that the non-specialized genetic algorithm may produce poor results, including computationally inefficient optimization or convergence to a suboptimal solution. Scientists have incorporated customization into the conventional GA according to the needs and specifications of their research problems.

This chapter discusses the utilization of the GA in various fields with examples and illustrations. It is shown how the GA recognizes a specific pattern from a group of patterns. A genetic-algorithm-based clustering technique is then described for identifying similar groups of items among several homogeneous or heterogeneous groups. Biological sequence alignment with the GA, and the accuracy of the alignment achieved, are also discussed.

The rest of the chapter is organized as follows. Section 2 briefly describes the history of evolutionary algorithms along with the GA. A survey on genetic algorithms is narrated in Section 3. The next four sections elaborate several utilizations of the GA in various scientific fields. Section 4 shows how the GA identifies a pattern among a group of patterns. The genetically guided clustering technique is elaborated in Section 5. Section 6 describes the application of the genetic algorithm for detecting the polygonal shape of a dot pattern. Biological sequence alignment is discussed in Section 7. Section 8 concludes the chapter.

#### **2. A brief history of evolutionary computation**

In the literature, three primary search methods are identified, namely *calculus-based*, *enumerative*, and *random* search [1]. *Calculus-based* methods are further subdivided into two classes: *indirect* and *direct*. The indirect search method seeks the local extrema by solving the set of equations resulting from setting the derivative of the objective function to zero. The direct search method, on the other hand, seeks the local optima by hopping on the function and moving in a direction related to the local gradient. This is basically the notion of *hill-climbing*. The *enumerative* scheme has been considered in many shapes and sizes. Here, the search algorithm evaluates the objective function value at every point within a finite, or discretized infinite, space. *Random* search is the most popular among researchers, since the shortcomings of the other two techniques are rectified here. The genetic algorithm is an example of a search procedure of this category [1].
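The *hill-climbing* notion behind the direct method can be sketched in a few lines of Python; the one-dimensional objective, the numerical gradient estimate, and the fixed step size below are illustrative assumptions rather than part of the original discussion.

```python
def hill_climb(f, x, step=0.01, iters=1000):
    """Direct search: estimate the local gradient numerically and keep
    moving uphill until no further improvement is possible."""
    for _ in range(iters):
        grad = (f(x + 1e-6) - f(x - 1e-6)) / 2e-6   # local slope of f at x
        x_new = x + step * (1 if grad > 0 else -1)  # move in the uphill direction
        if f(x_new) <= f(x):                        # stuck at a local extremum
            break
        x = x_new
    return x

# A single-peaked objective: hill-climbing reaches the maximum at x = 2.
f = lambda x: -(x - 2.0) ** 2 + 3.0
x_star = hill_climb(f, x=0.0)
```

As the text notes, such a climber only finds the local extremum nearest its starting point, which is exactly the weakness that random search methods such as the GA address.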

From the 1950s onward, many computer researchers independently studied evolutionary systems with the idea that evolution could one day be used as an important optimization tool for solving various problems in engineering. The idea was to evolve a population of candidate solutions to a given problem using operators inspired by natural selection and natural genetic variation. Rechenberg [2] introduced "evolution strategies" (*Evolutionsstrategie* in the original German), a method used to optimize real-valued parameters for devices such as airfoils. This idea was further developed by Schwefel [3]. Fogel et al. [4] developed a technique called "evolutionary programming," in which candidate solutions to the given tasks were represented as finite-state machines, evolved by randomly mutating their state-transition diagrams and selecting the fittest among them. Evolution strategies, evolutionary programming, and genetic algorithms together form the backbone of evolutionary computation. Several other researchers also developed evolution-inspired algorithms for optimization and machine learning. Box [5], Friedman [6], Bremermann [7], and Reed et al. [8] worked in this area, though their studies had little impact on the field. Evolutionary biologists also used computers to simulate evolution for controlled experiments [9–11].

#### *Application of Genetic Algorithm in Numerous Scientific Fields DOI: http://dx.doi.org/10.5772/intechopen.105740*

Finally, in the 1960s and 1970s, at the University of Michigan, Holland invented the genetic algorithm, which he later improved together with his students and colleagues [1]. Unlike the earlier evolution strategies, Holland's intention was to formally study the phenomenon of adaptation as it occurs in nature, rather than to design algorithms for solving specific problems. He also developed ways of importing the mechanisms of natural adaptation into computer systems. Holland presented the genetic algorithm as an abstraction of biological evolution and also gave a theoretical framework for adaptation. His GA was a method for moving from one population of "chromosomes" (strings of 1's and 0's) to a new population using a kind of "natural selection" together with the genetic operators of crossover, mutation, and inversion. Each chromosome was a combination of "genes" (bits), and each gene was a particular "allele" (either 0 or 1). The selection operator chose those chromosomes in the population that would be allowed to reproduce, so that fitter chromosomes produced more offspring than less fit ones. In crossover, subparts of two chromosomes were exchanged, mimicking biological recombination between two single-chromosome (haploid) organisms. The allele values at some locations of the chromosome were randomly changed by mutation and reversed in order by inversion.
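Holland's scheme can be illustrated with a minimal sketch. Everything problem-specific here is an assumption for demonstration: fitness is the classic OneMax (count of 1-alleles), a two-way tournament stands in for Holland's fitness-proportionate selection, and the inversion operator is omitted.

```python
import random

def one_max(chrom):          # toy fitness: the number of 1-alleles
    return sum(chrom)

def evolve(pop_size=20, length=16, generations=60, pc=0.7, pm=0.01, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        def pick():          # selection: fitter of two random chromosomes
            a, b = rng.sample(pop, 2)
            return list(a if one_max(a) >= one_max(b) else b)
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < pc:                  # one-point crossover
                cut = rng.randrange(1, length)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                for i in range(length):            # bit-flip mutation
                    if rng.random() < pm:
                        child[i] = 1 - child[i]
                nxt.append(child)
        pop = nxt[:pop_size]
    return max(pop, key=one_max)

best = evolve()
```

Running `evolve()` drives the population toward the all-ones chromosome purely through selection, crossover, and mutation, with no gradient information at all.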

Over the years, researchers studying the different evolutionary computation methods have interacted widely, and this has broken down, to some extent, the boundaries between GAs, evolutionary programming, evolution strategies, and other evolutionary approaches. Today, the term "genetic algorithm" is used by researchers to describe methods very far from Holland's original conception. There are at least three overlapping meanings of *search* in genetic algorithms and other search procedures, as discussed below.

*Search for stored data* – Here the problem is to efficiently retrieve information stored in computer memory. For example, given a large database of names and addresses stored in some ordered way, we may want to search for the record corresponding to a given last name using a *binary search* procedure.

*Search for paths to goal* – Here the problem is to efficiently find a set of actions that will reach a given goal starting from a given initial state. This form of search is central to many techniques in artificial intelligence. **Figure 1** illustrates the simple example of the *8-puzzle*: a set of tiles numbered 1–8 is placed in a square, leaving one space empty, and sliding an adjacent tile into the blank space is termed a *move*. Typical algorithms are *branch and bound search*, *depth-first search*, etc.

*Search for solutions* – This is a more general class of search than *search for paths to goal*. Here, the concept is to efficiently find a final solution to a problem in a huge space of candidate solutions. Among these candidates, one may or may not be the goal, and a solution may or may not improve as the search progresses. These are the kinds of search problems for which genetic algorithms are used.

**Figure 1.** *Eight-puzzle problem with the initial state and the final state.*

Although the genetic algorithm performs a *search for paths to goal* or a *search for solutions*, it cannot be guaranteed to always reach the goal, owing to its stochastic, or random, search nature. On the other hand, this same characteristic enables a solution to be determined rapidly. As the search progresses, GAs follow certain specific steps that differ from those of traditional search procedures.

#### **3. A survey on genetic algorithms**

GAs have been employed in a wide variety of problems. Some of the studies involving medical image registration, image segmentation, and contour recognition are available in [12–14]. In addition, the classification of endothelial cells in tissue sections, normalization of Chinese handwriting, and evaluation of earthquake risk for geological structures are examples of practical applications [15–17]. GAs have also been used in the optimization of feature extraction chains, error-correcting graph isomorphism, and dot pattern matching [18–20]. Moreover, techniques for generating fuzzy rules for target tracking, constrained placement problems, process planning for job shop machining, and blind channel identification with higher-order cumulant fitting have employed the GA [21–24]. In the field of information retrieval, the GA is used to retrieve dynamic web-based content [25]. Raidl et al. worked with biased mutation in evolutionary algorithms for solving subset-selection problems on complete graphs [26].

Although the simple genetic algorithm has been used for solving various problems, researchers have always tried to enhance the performance of the GA by modifying the algorithm. Among the several approaches to improving performance, in 1970 Cavicchio developed a technique to conserve the best individuals by substituting an offspring for an inferior parent whenever the offspring's fitness exceeded that of the parent [27]. The *crowding* scheme of De Jong [28] intended to retain diversity and the best individuals in the population by exchanging maximally similar strings. Fogel [29] as well as Bäck and Schwefel [30] tested their algorithms on optimization functions and showed that an evolutionary method with self-adaptive mutation performs better than one without it. Beyond simple genetic approaches, Davis suggested that hybridizing genetic algorithms with the most successful optimization methods for particular problems provides the best performance [31]. The hybrid genetic scheme aims at hill-climbing from several points scattered in the search space. For a multimodal objective function, some chromosomes/strings (offspring) will lie in the basin of attraction of the global solution, where hill-climbing is an effective and fast form of search. Such a hybrid approach degrades hyperplane sampling, but does not destroy it entirely.

Some more variants of GA are also found in the literature [1, 32–35]. However, the implementation of a modified genetic algorithm often increases the computation time for solving complex problems, and researchers have tried to increase the speed of the algorithm using parallel/distributed GAs when the computation time for a particular problem is large. Various parallel implementations of GAs are discussed in [36, 37]. Multiobjective optimization problems are also solved by parallel genetic algorithms; the major parallel multiobjective genetic algorithms are discussed, with some observations, in [38]. Furthermore, two approaches to parallel GAs, namely the *island* model and the *diffusion* model, are discussed in [39, 40], respectively. In the island model, the population is divided into a number of smaller populations. Among the subpopulations, the migration of individuals happens occasionally during the progress of the search process. However, the selection of individuals for migration and the frequency of migration are significant debatable issues [36]. In contrast, each individual in the diffusion model is associated with a spatial location on a low-dimensional grid. The entire population is regarded as a group of active individuals that interact only with their neighbors. Another important application of distributed GA is in the performance-driven VLSI routing problem, where it can handle both the electrical and topological constraints of the problem [41].

In addition, the well-known NP-hard Traveling Salesman Problem has been solved on a cluster of nodes by the application of a parallel genetic algorithm [42]. A *master-slave* technique is implemented in the parallel/distributed genetic approach for obtaining the optimal and/or suboptimal traveling path(s). Moreover, the distributed genetic algorithm has been employed in labor scheduling problems to ascertain the number of employees and their work schedules, with the objective of minimizing labor expenses and expected opportunity costs [43].

Researchers have also successfully employed GAs in solving various kinds of problems in other scientific fields. Notredame and Higgins [44] developed popular software for aligning sequences with two objective functions. Several GA-based approaches were developed for multisequence alignment [45–49]. In chemistry, GAs have been used in studying water oxidation [50], magnetic storage [51], catalysis [52], and the stability of boron wheels [53]. Sathya et al. developed a genetic-based algorithm to classify genes for the identification of biomarker genes, so as to suggest individualized treatment [54]. As a robust heuristic search method, the GA has helped in mining information from large datasets. It has been applied to discover interesting and useful relations between data elements, with abstraction, in the domain of association rules [55]. Another GA-based classification technique was used to improve feature selection with a support vector machine (SVM) classifier; feature selection was used to classify a breast cancer dataset of 699 instances, each with 11 attributes, into two classes, namely benign and malignant [56]. A genetic approach has also been applied in machine learning for selecting the proper and best move in chess playing; the technique helped in deciding the effective move by classifying the set of rules for a particular solution [57]. In addition, several GAs and hybrid GAs have been proposed by medical practitioners as well as scientists to detect cancer through different approaches, such as feature selection, categorization and classification, registering temperature differences, and many more [58]. A hybrid genetic approach was developed for the early diagnosis of breast cancer using thermography; this technique measured the temperature difference between cancerous and healthy tissues with a high success rate [59].

In the following sections, some applications of genetic algorithm with several illustrations are discussed. The parameters of experimental setup are provided to help implementation of the GAs in solving similar or different kinds of problems. Some exceptional results are also elaborated with illustrations to understand the strength of GA for finding solution in numerous scientific fields.

#### **4. Dot pattern matching identification**

In a 2-D or 3-D feature space, a dot pattern represents a physical object or class of objects with a set of dots or points. Such dot patterns are used in geographic and cartographic data, spatial information systems, astronomy and astrophysics, remote sensing, image texture analysis, biomedical imaging, and some other areas [60–63]. Studies involving dot patterns include set estimation and shape identification, identification of point process parameters, and clustering and classification.

The dot pattern matching problem is defined as follows. *T* is a given test dot pattern, which is to be identified in an unknown scene of dot patterns *S* (a set of dot patterns or objects). To find the position of the best match in *S*, *T* must be translated and rotated. A reference point is therefore required to translate *T*; this is taken as the centroid of the coordinates of the dots of *T*. Let this centroid be *O*. In the space of *S*, the translation of a dot pattern to a point *P* means that its centroid is translated to *P*. Similarly, its rotation by *θ* is the rotation of the pattern with respect to *O* as origin.

The matching score of two dot patterns *S* and *T* is evaluated as follows. Once *T* is translated and rotated with respect to the origin *O*, the distance $d(t_i, s_j)$ between a point $t_i$ of *T* and each of the points $s_j$ of *S* is calculated, and the minimum distance is taken. If the number of points of *T* is *α*, then the sum of the minimum distances $D_{\min}$ over all *α* points is as follows.

$$D_{\min} = \sum_{i=1}^{\alpha} \min_{j=1,2,\dots,\beta} d\left(t_i, s_j\right) \tag{1}$$

where $s_1, s_2, \ldots, s_\beta$ and $t_1, t_2, \ldots, t_\alpha$ are the points of *S* and *T*, respectively, and $\alpha \le \beta$. $D_{\min}$ indicates the best matching score of the two dot patterns or objects.
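Eq. (1) translates directly into code. The helper name and the toy coordinates below are assumptions for illustration, and *T* is taken as already translated and rotated into position.

```python
import math

def matching_score(T, S):
    """D_min of Eq. (1): for each point t_i of the test pattern T, take the
    distance to its nearest point in the scene S, and sum over all points."""
    return sum(min(math.dist(t, s) for s in S) for t in T)

# Toy example: every point of T coincides with some point of S,
# so the best matching score is zero.
S = [(0, 0), (1, 0), (0, 1), (5, 5)]
T = [(0, 0), (1, 0), (0, 1)]
score = matching_score(T, S)
```

A score of zero indicates a perfect match; in practice the GA searches over the translation and rotation of *T* that minimizes this score.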

In the pattern recognition problem, the performance of the CGA (conventional genetic algorithm) [1] has been compared with that of two GA-based modified methods in both sequential and distributed forms, namely the cascaded genetic algorithm (CAGA) [64] and the distributed CAGA (DCAGA) [65].

In the dot pattern or object matching problem, *S* is a set of dot patterns of different shapes, as depicted in **Figures 2** and **3**; one of them is taken as a test pattern *T* and matched with *S*. The matching scores between *T* and *S* for the genetic methods are shown in **Tables 1**–**3**. Here, the successful matching score is expressed as the percentage of trials in which the solution was reached. Two types of matching, namely

**Figure 2.** *A scene of multiple dot patterns in 2-D space [64].*


#### **Figure 3.**

*A scene of multiple objects with edge map in 2-D space [64].*


#### **Table 1.**

*Matching results of CAGA and CGA on the dot patterns in Figure 2. The following data have been summarized over 50 runs for each DP.*


#### **Table 2.**

*Matching results of DCAGA and DCGA on the dot patterns in Figure 2. The following data have been summarized over 50 runs for each DP.*


#### **Table 3.**

*Matching results of DCAGA and DCGA on the objects with edge map in Figure 3. The following data have been summarized over 50 runs for each DP.*

*perfect matching* and *imperfect* but *visual matching*, are considered. In the case of *visually matched patterns*, *T* is superimposed over *S* without any visible error, but the matching error is not negligible. By contrast, in the *perfectly matched* situation, the calculated matching error between *T* and *S* is close to zero; such a pattern or object is naturally also visually matched.

The dot/point patterns depicted in **Figure 3** are basically *gray-tone* digital images, and the pattern recognition experiment performed on these patterns is the matching of edge maps. The *gray-tone* digital image in **Figure 3** is a combination of four different objects, with 512 × 512 pixels and 256 possible gray levels. The *Sobel operator* is used to convert the digital image (**Figure 3**) into a two-tone edge map [66]. The edge map is considered a discrete dot pattern, with a dot represented digitally by 1 and a blank (white space) by 0. In **Figure 3**, the object P2 is separated from the other three objects (P1, P3, and P4), which are very close to and touching each other.

Consider first the results of the GAs for **Figure 2**. From **Table 1**, the best-case success rate of the CAGA is 96%, achieved for DP3, where the pattern is perfectly matched 73% of the time and visually matched 23% of the time. The success rate of the CAGA for DP4 is 97%, but in this case the pattern is perfectly matched only 17% of the time. The performance of the CAGA is worst for DP2, with a failure rate of 30%. On the other hand, the best performance of the CGA is achieved for DP3, where the pattern is perfectly matched 3% of the time and visually matched 53% of the time. For the CGA, the successful matching score for DP2 is only 7% (visually matched), which is the worst among all four patterns.

From **Table 2**, it is noted that for the DCAGA, the success rate is 100% for all four patterns, counting both perfect and visual matching, with the best performance achieved for DP3. For the DCGA, the failure rate in the worst case is at most 23%; however, for the pattern DP4, a success rate of 100% is reached.

**Table 3** tabulates the experimental results of object matching for the GA-based distributed approaches. It is observed that the DCAGA does not always achieve the 100% success rate noticed in the earlier dot pattern matching experiment of **Table 2**. The success rate of the DCAGA is 100% for P1 and P2, and its failure rate is at most 6%, for P3 and P4. Both techniques perform best for the isolated object P2, and the DCGA does not perform well for the objects that are close to each other.

#### **5. Genetically guided clustering algorithm**

In general, the clustering problem can be presented as follows:

For the set of *n* data $X = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n\}$ to be clustered, each $\mathbf{x}_i \in \mathcal{R}^p$ is an attribute vector consisting of *p* real measurements describing the *i*-th object. The objects corresponding to the data are to be clustered into non-overlapping groups $C = \{C_1, C_2, \ldots, C_k\}$, where *k* is the number of clusters, $C_1 \cup C_2 \cup \cdots \cup C_k = X$, $C_i \neq \phi$, and $C_i \cap C_j = \phi$ for $i \neq j$. The objects within each group should be more similar to each other than to the objects in any other group, and the value of *k* may or may not be known.

Clustering approaches are semi-optimal ways of solving the grouping problem. The number of ways of sorting *n* objects into *k* groups is enormous, and is given by

$$N(n,k) = \frac{1}{k!} \sum_{i=0}^{k} (-1)^{i} \binom{k}{i} (k-i)^{n} \tag{2}$$

For 25 objects with five clusters, this is of the order of $10^{15}$. Clearly, it is impractical to search the solution space exhaustively to obtain the optimal solution.
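Eq. (2) evaluates to a Stirling number of the second kind and is easy to check numerically; the helper name below is hypothetical.

```python
from math import comb, factorial

def n_partitions(n, k):
    """Eq. (2): the number of ways of sorting n objects into k
    non-empty groups (a Stirling number of the second kind)."""
    total = sum((-1) ** i * comb(k, i) * (k - i) ** n for i in range(k + 1))
    return total // factorial(k)   # the sum is always divisible by k!

count = n_partitions(25, 5)   # 2,436,684,974,110,751, about 2.4e15
```

Even for this modest problem size, exhaustive enumeration of all partitions is hopeless, which motivates the genetically guided search.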

#### **5.1 Measures of similarity**

Since similarity is fundamental to the definition of a cluster, a measure of similarity between two objects/patterns/entities drawn from the same feature space is essential to most clustering procedures. The proximity of individuals (i.e., objects) is usually expressed as a distance during the clustering of data units. The clustering of variables generally involves a correlation or other such measure of association. The smaller the distance between the objects, the higher is the similarity. Because of the variety of feature types and scales, the distance measure (or measures) must be chosen carefully. Several distance measures are employed for clustering [67, 68]. The most popular and commonly used one is the Euclidean distance,

$$d\left(\mathbf{X}\_{i},\mathbf{X}\_{j}\right) = \sqrt{\left(\mathbf{X}\_{i} - \mathbf{X}\_{j}\right)'\left(\mathbf{X}\_{i} - \mathbf{X}\_{j}\right)} = \left[\sum\_{l=1}^{p} \left(\mathbf{x}\_{il} - \mathbf{x}\_{jl}\right)^{2}\right]^{1/2} \tag{3}$$

which is the line of sight distance between two points representing the objects.

#### **5.2 Genetic-algorithm-based clustering**

Let us consider a set of *n* vectors $X = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n\}$, which will be clustered into *k* groups of homogeneous data. Each $\mathbf{x}_i \in R^d$ is a vector in a feature space of *d* real-valued measurements defining the features of the object denoted by $\mathbf{x}_i$. The features can be color, breadth, length, etc.

We now discuss data clustering with a GA-based algorithm that relies on splitting and merging techniques [69]. Initially, the data are split into several sub-clusters, and in the next step those sub-clusters are merged to find the required clusters. Various types of datasets are considered to show how the data are grouped so as to properly isolate the required number of clusters.
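The split-and-merge algorithm of [69] is more elaborate than can be reproduced here, but the genetically guided flavor can be illustrated with a minimal center-based sketch: a chromosome encodes *k* candidate cluster centers, and fitness is the within-cluster sum of squared Euclidean distances (cf. Eq. (3)). The encoding, the operators, the parameters, and the toy data are all assumptions for illustration, not the algorithm of [69].

```python
import random

def d2(a, b):
    """Squared Euclidean distance (Eq. (3) without the square root)."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: d2(p, centers[c])) for p in points]

def sse(points, centers):
    """Within-cluster sum of squared distances -- lower means fitter."""
    return sum(d2(p, centers[l]) for p, l in zip(points, assign(points, centers)))

def ga_cluster(points, k, pop_size=30, generations=100, seed=7):
    rng = random.Random(seed)
    # a chromosome is a list of k candidate centers drawn from the data
    pop = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ch: sse(points, ch))
        survivors = pop[: pop_size // 2]           # selection: keep the fitter half
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, k) if k > 1 else 0
            child = p1[:cut] + p2[cut:]            # one-point crossover on centers
            if rng.random() < 0.2:                 # mutation: re-seed one center
                child[rng.randrange(k)] = rng.choice(points)
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda ch: sse(points, ch))

# Two well-separated blobs in R^2: the GA should place one center in each.
pts = [(0.0, 0.0), (0.1, 0.2), (-0.1, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels = assign(pts, ga_cluster(pts, k=2))
```

On such trivially separable data the evolved centers end up one per blob, so the labels partition the points into the two visually obvious groups.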

#### *5.2.1 Cluster partitioning in R*<sup>2</sup> *feature space*

We have studied seven datasets in the *R*<sup>2</sup> feature space, depicted in **Figures 4**–**10**. Among them, **Figures 4** and **6** each consist of three clusters, although the nature of the clusters is different. In both figures, two clusters are located close to each other, whereas the third is placed far from those two. In **Figure 4**, however, the density of the closely placed clusters is nonuniform, and the third cluster is three times the size of the other two. The data points are denser at the cluster center, and the density gradually reduces toward the boundary. **Figure 4(a)** shows the original dataset, and **Figure 4(b)** illustrates how the clusters are isolated after applying the genetically guided method.

The dataset in **Figure 5(a)** comprises five clusters. All of them are denser near the center and sparser toward the boundary. Three clusters are equal in size, and two of them are close to each other. The five clusters have been correctly isolated using the algorithm, as shown in **Figure 5(b)**.

**Figure 4.**

*(a) The original dataset with three clusters before the application of GA-based algorithm. (b) Isolated three clusters after completion of the algorithm [69].*

#### **Figure 5.**

*(a) The original dataset with five clusters before the application of algorithm. (b) Isolated five clusters after the completion of algorithm [69].*


#### **Figure 6.**

*(a) The original dataset with three clusters before the application of algorithm. (b) Isolated three clusters after the completion of algorithm [69].*

#### **Figure 7.**

*(a) The original dataset with two clusters before the application of algorithm. (b) Isolated two clusters after the completion of algorithm [69].*

**Figure 6**, in contrast, contains two closely located clusters of different densities: one has uniformly dense data, while the density of the other tapers away from the center. The data density of the third cluster is also uniform, and all three clusters are equal in size. **Figure 6(a)** and **(b)** illustrate the scenario before and after the application of the algorithm, respectively. All three clusters are properly isolated.

**Figure 7** depicts two almost overlapping clusters of equal size and density. The distribution of the data points in both clusters is interesting, since they follow a Gaussian distribution in *R*<sup>2</sup> space. The clusters are so closely placed that they visually appear to be a single cluster. However, the genetic algorithm can correctly identify the clusters, as illustrated in **Figure 7(b)**.

The dataset of **Figure 8** differs in an interesting way from the previous four datasets: here, one cluster is completely enclosed by the other. The data point density of both clusters is almost uniform. For this dataset, a special feature of the algorithm is used. It

#### **Figure 8.**

*(a) The original dataset with two clusters before the application of algorithm. (b) Isolated two clusters after the completion of algorithm [69].*

#### **Figure 9.**

*(a) The original dataset with six clusters before the application of algorithm. (b) Isolated six clusters after the completion of algorithm [69].*

is to check the adjacency condition of the sub-clusters, which are merged to ultimately isolate the clusters, as depicted in **Figure 8(b)**.

**Figures 9** and **10** consist of special types of clusters. **Figure 9(a)** is a combination of six clusters, which differ in density as well as size. Among the six clusters, two are identical and elliptical in shape, and they are connected by a string of highly dense data points. Although disjoint, they appear visually as a single cluster; shape-wise, this set is treated as three clusters, with the string of points defined as the third cluster connecting the two elliptical ones. Of the remaining three clusters in **Figure 9(a)**, two are almost equal in shape and density, while the largest cluster has a lower density than the other two. All six clusters, including the string of points, are correctly isolated after application of the genetic algorithm, as shown in **Figure 9(b)**.


#### **Figure 10.**

*(a) The original dataset with two clusters and noise before the application of algorithm. (b) Isolated two genuine clusters and separated noise after the completion of algorithm [69].*

The data shown in **Figure 10(a)** are quite different from all the other datasets. This dataset is also special in nature, because it consists of two identically shaped clusters of different densities in the presence of random noise scattered all over the 2-D feature space. The two clusters are perfectly identified, and the noise is identified as a third cluster, as shown in **Figure 10(b)**. After application of the algorithm, the noise is seen to be clustered locally into smaller arbitrary shapes across the space, owing to the inherent characteristics of random noise. The third cluster is then formed by merging all the smaller clusters of noisy data, as depicted in **Figure 10(b)**.

#### *5.2.2 Cluster separation in R<sup>n</sup> feature space*

We now discuss cluster identification for a dataset with more than two features. The Iris dataset has four features [70] and is one of the most popular databases in the pattern recognition literature. The dataset contains three classes, named Iris Setosa (Class A), Iris Versicolor (Class B), and Iris Virginica (Class C). Each class has 50 instances and refers to a kind of Iris plant. The four feature attributes of the data are *petal length*, *petal width*, *sepal length*, and *sepal width*. Among the three classes, two are not linearly separable from each other, whereas the third can be identified separately from the other two.

One cluster is clearly separated from the other two after invoking the algorithm. However, the other two clusters are not perfectly separable (see **Table 4**).


#### **Table 4.**

*Confusion matrix for Iris flower classification.*

#### **6. Shape recognition with genetic algorithm**

The processing of dot patterns (DPs) or point sets in a plane is useful and important in pattern recognition problems. Dot patterns are encountered in various problems, including points in a feature space [71], pixels in a digital image [72], physical objects such as stars in a galaxy [73], and spatial data [74–76].

An attribute of a dot pattern is the visual shape it generates. One simple way of defining the external shape is the convex hull [77]. In most cases, however, the underlying shape from which the points emerge is not convex. To detect a nonconvex shape, the GA-based split-and-merge procedure [78] starts with the initial convex hull polygon. The *splitting* process first divides the sides of the polygon: consider a side *AB* and evaluate the average distance of the *r* neighboring points of *A* and *B*. If this average distance is smaller than the length of *AB*, the side *AB* is considered a candidate for splitting. Such splitting eventually takes care of the concavities of the DP by generating a zigzag polygonal border, as depicted in **Figure 11(a)**. The *merging* algorithm is then applied to obtain a smoother border: splitting adds edges, while merging does just the opposite, making the border look less zigzag, as in **Figure 11(b)**.
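The splitting criterion can be paraphrased in code. The function name, the choice of *r*, and the exclusion of the endpoints themselves are assumptions for illustration, not the exact procedure of [78].

```python
import math

def is_split_candidate(A, B, points, r=3):
    """Mark side AB of the current polygon for splitting when the average
    distance from A and B to their r nearest dots is smaller than |AB|,
    which suggests a concavity between the two vertices.
    Assumes `points` contains at least r dots besides A and B."""
    def nearest(v):
        return sorted(math.dist(v, p) for p in points if p not in (A, B))[:r]
    avg = sum(nearest(A) + nearest(B)) / (2 * r)
    return avg < math.dist(A, B)

# A long hull edge with dots well inside it is split; a short edge far
# from the dots is not.
long_edge = is_split_candidate((0, 0), (10, 0), [(4, 3), (5, 3), (6, 3)])
short_edge = is_split_candidate((0, 0), (1, 0), [(4, 3), (5, 3), (6, 3)])
```

Repeating this test on every side of the evolving polygon is what carves the zigzag border into the concavities of the dot pattern.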

A DP may also be the union of two or more disjoint smaller DP regions, as shown in **Figure 11(c)**. The isolation process separates such regions by simultaneously removing two lines of the resulting polygon that act as a bridge between two disjoint regions. The removal of the lines is possible if the region between them does not contain any dot.

Now, consider a set *S* of *n* points in the plane together with the convex hull of the points of *S*. If the dot pattern does not contain any concavity, the underlying shape of the pattern is the convex hull itself; otherwise, the splitting of

#### **Figure 11.**

*(a) Edge (represented by a dashed line) marked for the splitting process (edge AB before splitting; edges AC and BC created by splitting). (b) Edge marked for the merging process (over region XYZ and region PQRS). (c) Edges (represented by dashed lines) marked for the isolation process [78].*

edges is required. **Figure 11** shows the splitting and merging of the edge/regions and the region for isolation.

#### **6.1 Polygonal shape detection of dot pattern**

The algorithm has been studied on different dot pattern sets in *R*<sup>2</sup> space. The datasets are broadly classified into three groups. One group consists of dot patterns with almost uniform density, as in **Figures 12** and **13**. **Figure 14** represents a dot pattern with variable-density data points, and data pooled from a Gaussian distribution is shown in **Figure 15**. A special dot pattern with multiple distinct components is illustrated in **Figure 16**.

In the group of datasets with nearly uniform density data points, **Figures 12** and **13** show a DP with a concavity of spiral shape and a star-shaped DP, respectively. In **Figure 12**, the splitting process starts with the convex hull of 35 edges and ends with a zigzag border polygon of 187 sides. After the splitting process, the merging algorithm is applied to the polygon, since the DP does not contain any isolated component. Finally, a polygon of 68 edges with a smooth border is generated, as in **Figure 12(b)**. **Figure 13(a)** shows a DP with five concavities, and the splitting process is executed on each side of the five-side convex hull. The resulting polygon of 181 edges is formed with a zigzag border. The GA-based procedure is applied to this polygon to produce a polygon (of 26 sides) with a star-like perceptual border, as in **Figure 13(b)**.

**Figure 12.** *(a) A spiral-shaped DP and its convex hull. (b) Resulting perceptual border of the DP [78].*

#### **Figure 13.**

*(a) A star-shaped DP and its convex hull. (b) Resulting perceptual border of the DP with the GA-based merging process [78].*

The dot patterns of **Figures 14** and **15** are chosen to show the performance of the algorithm on variable-density data points. Each DP consists of two concavities. In **Figure 14**, the upper half of the DP (the denser region) has one concavity, and the other one is in the lower half of the DP (the less dense region). The density of dots in the dot pattern is gradually reduced toward the lower half. The DP of **Figure 15** also contains two concavities, located in the middle region and opposite to each other. Its density decreases gradually from the center of each half toward the boundary. We have employed the GA-based merging technique on this dataset at the end of the splitting process. In **Figure 14**, the splitting procedure starts with a convex hull polygon of 24 sides (see **Figure 14(a)**), and a polygon of 102 sides is generated. Next, the merging process is executed over it, and a 52-edge polygon with a smoother border is obtained, as in **Figure 14(b)**. In **Figure 15(a)**, the splitting process starts on a 22-side convex hull polygon to create a polygon of 40 edges. The application of the merging process finally generates a 32-side smoother-border polygon, as shown in **Figure 15(b)**.

**Figure 14.** *(a) A DP with nonuniform density and its convex hull. (b) Resulting perceptual border of the DP [78].*

**Figure 15.** *(a) A DP with two Gaussian distributions and its convex hull. (b) Resulting perceptual border of the DP applying the GA-based merging process [78].*

*Application of Genetic Algorithm in Numerous Scientific Fields DOI: http://dx.doi.org/10.5772/intechopen.105740*

#### **Figure 16.**

*(a) A multicomponent chromosome-like dot pattern and its convex hull. (b) Resulting perceptual border of the components [78].*

The ultimate polygons with smoother borders, as in **Figure 17(b)** and **(c)**, are generated from the starting polygon of **Figure 17(a)** if the user specifies the number of sides. **Figure 17(a)** shows a *C*-shaped DP with its convex hull. In both cases, the splitting process is continued as long as there is no violation of the density-length condition. The ultimate polygonal shape, as in **Figure 17(b)**, is obtained when the user requests, say, a 36-side polygon. Similarly, the polygon of **Figure 17(c)** is produced from **Figure 17(a)** when the user specifies, say, 23 sides.

**Figure 16** shows a special dot pattern of three components, each containing nearly uniformly dense data points. The convex hull of the DP is a 17-side polygon. The splitting process, followed by the isolation process, separates the DP into three distinct regions with zigzag polygonal boundaries. Finally, for each isolated polygon, the merging process generates three separated polygons with smoother borders, as in **Figure 16(b)**.

#### **7. Biological sequence alignment with genetic algorithm**

An alignment at the sequence level operates on the primary structure of biological macromolecules, which is represented as a linear chain of residues. Residues are the building blocks of macromolecules: DNA and RNA are made of nucleotide residues, and proteins of amino acid residues. The building blocks encode the history of molecular evolution. During the process of evolution, residues accumulate many random changes and diverge over time. Some of the changes are accepted and retained by natural selection, and some are not. Residues that are involved in important functional and/or structural roles tend to be preserved by natural selection and do not accumulate random changes [79]. These parts of sequences help to infer the evolutionary history and identify the common ancestor of two or more related sequences. The sequences that share a common evolutionary origin are therefore called homologous sequences. Homology is the conclusion drawn by looking at the degree of similarity between two or more sequences of different species.

#### **Figure 17.**

*(a) A C-shaped DP and its convex hull. (b) Perceptual border of the DP with 36 edges (user-specified). (c) Perceptual border of the DP with 23 edges (user-specified) [78].*

#### **7.1 Scoring schemes in sequence alignment**

The similarity between two or more sequences is quantified by measuring the alignment score. To choose the best alignment among a set of sequences, quantification of the alignment is required; it reflects the evolutionary distances between the sequences. The best alignment is always associated with the maximum shared similarity or identity and the lowest number of mutational events between the sequences. Therefore, scoring systems always score positively when there are identical or similar residues in the alignment.

On the other hand, mutational events are always associated with zero or negative scores. Different scoring schemes are used in sequence alignment to determine the level of identity or similarity. Evaluating the score of a pairwise sequence alignment is more straightforward than scoring the alignment of multiple sequences together. In two aligned sequences, the total number of identical residues yields a percentage identity between them. This identity count is used mostly for nucleotide sequence alignment, as the nucleotides A, T/U, G, and C play equivalent roles in the structure and function of the DNA or RNA molecule. Therefore, the nucleotides are either identical, which is a match in an alignment, or nonidentical, which is a mismatch. In contrast, for protein sequences, a similarity score is calculated along with the identity score, reflecting amino acids with similar physicochemical properties. Substitution matrices are normally consulted for protein sequence alignment.
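For nucleotide alignments, the identity count described above reduces to a simple match percentage. The sketch below is illustrative and assumes two already aligned, equal-length sequences:

```python
def percent_identity(a, b):
    """Percentage of positions where two aligned nucleotide
    sequences carry an identical residue (match vs. mismatch)."""
    assert len(a) == len(b), "sequences must be aligned to equal length"
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)
```

For protein sequences, the per-position score would instead be looked up in a substitution matrix such as BLOSUM62.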

#### **7.2 Sequence alignment scheme without gaps**

The GA-based sequence alignment [80] is a simple method to align a pair of sequences (a query sequence and a known database sequence) without gaps for finding similarity. Given a biological sequence (DNA/RNA/protein), called the query sequence, one usually performs a similarity search within the databases that consist of all available genomic sequences and known protein sequences. Eventually, the search yields many sequences with varying degrees of similarity. It is then up to the user to identify those that are associated with higher scores and are homologous. In the alignment technique, we consider one sequence as a database sequence, denoted by D, and the other as a query sequence, denoted by Q.

In the initialization step, an initial population of size *N* is randomly generated as the probable alignment solutions for the input Q. Each individual **P**<sub>*i*</sub>, ∀*i* ∈ {1, 2, … , *N*}, or chromosome, is a string of bits **P**<sub>*i*</sub> = *b*<sub>1</sub>*b*<sub>2</sub> … *b*<sub>*L*</sub>, where *L* is the length of each individual/chromosome, equal to the length of the given Q, and each *b*<sub>*j*</sub> ∈ {0, 1}, ∀*j* ∈ {1, 2, … , *L*}. The *1*'s in the chromosome represent the presence of the residues in the corresponding positions, and the *0*'s indicate the absence of the residues in Q.
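This encoding can be illustrated with a short sketch (the helper name `decode` is ours, not from [80]): a chromosome keeps the residue of Q wherever it carries a 1 and a gap marker wherever it carries a 0:

```python
def decode(bits, query):
    """Map a bit-string chromosome onto the query sequence Q:
    1 keeps the residue at that position, 0 replaces it with '_'."""
    assert len(bits) == len(query)
    return "".join(res if b == 1 else "_" for b, res in zip(bits, query))
```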

In this approach, all three operators of the conventional GA, namely tournament selection, one-point crossover, and bit-flip mutation, are used [1]. In each iteration, the algorithm advances the search toward a better solution by producing a better population. After the selection of two chromosomes from the population pool, the crossover and mutation operations are performed on them.
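The three operators can be sketched as follows; the parameter choices (tournament size, mutation rate) are illustrative assumptions, not values from [80]:

```python
import random

def tournament_select(pop, fitness, k=2):
    """Tournament selection: pick the fittest of k randomly sampled chromosomes."""
    return max(random.sample(pop, k), key=fitness)

def one_point_crossover(p1, p2):
    """One-point crossover: swap the tails of two parents at a random cut point."""
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def bit_flip_mutation(bits, rate=0.01):
    """Bit-flip mutation: invert each bit independently with probability rate."""
    return [1 - b if random.random() < rate else b for b in bits]
```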

The fitness score of a chromosome is evaluated by the following fitness function *F*.

$$F = \sum\_{i=1}^{n} w\_i \tag{4}$$

where *n* is the total number of residues to be aligned, and *w*<sub>*i*</sub> is the weight (alignment score) contributed by the *i*-th aligned residue.

For protein sequence alignment, the BLOSUM62 matrix is consulted; for DNA sequences, an identical match gets a +1 score and a mismatch gets 0. The fitness score determines the similarity or identity level between the two sequences D and Q. The fitness score is calculated for the pairwise alignment between D and Q in three ways: for the straight alignment and for the alignments with a left or right shift.

For a straight alignment, let us consider the following two nucleotide sequences, *D* and *Q*.

$$\begin{aligned} D &= A \ T \ G \ C \ T \ T \ A \ G \ T \ C \\ Q &= A \ C \ G \ C \ A \ T \ A \ G \ A \ C \end{aligned}$$

Let us now consider the following binary coded chromosome from the population of an intermediate generation as a probable solution of query sequence *Qp*.

$$Q\_p = 0 \,\, 1 \,\, 1 \,\, 0 \,\, 1 \,\, 0 \,\, 1 \,\, 1 \,\, 0 \,\, 1$$

The equivalent decoded value of *Qp* by replacing 1 with the corresponding residue of *Q* will be as follows.

$$Q\_p = \\_ \ C \ G \ \\_ \ A \ \\_ \ A \ G \ \\_ \ C$$

where "\_" denotes the presence of 0 in *Qp*.

Therefore, the alignment score or the fitness of the chromosome would be 4 and the alignment structure looks like:

$$D = A \ T \ G \ C \ T \ T \ A \ G \ T \ C$$

$$Q\_p = \\_ \ C \ G \ \\_ \ A \ \\_ \ A \ G \ \\_ \ C$$
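The score of this worked example can be checked with a one-line sketch of the straight-alignment scoring (+1 per identical residue, 0 for a mismatch or a gap "_"):

```python
def straight_score(d, qp):
    """Count positions where the decoded chromosome matches the
    database sequence; gap markers ('_') never score."""
    return sum(x == y and x != "_" for x, y in zip(d, qp))
```

Applied to the pair above, `straight_score("ATGCTTAGTC", "_CG_A_AG_C")` returns 4, matching the fitness stated in the text.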

However, alignment by shifting is necessary for a different representation of Q to achieve the best alignment score. For example, if Q is ACTTAGTAAC, then shifting to the right (here by two positions) is required to obtain the optimum alignment with score 6, as illustrated below.

$$D = A \ T \ G \ C \ T \ T \ A \ G \ T \ C$$

$$Q = A \ C \ T \ T \ A \ G \ T \ A \ A \ C$$

Similarly, for Q: GCGTTGCTTAGA, a left-shift alignment (here by three positions) is necessary to obtain the best alignment with score 7, as shown below.

$$D = A \ T \ G \ C \ T \ T \ A \ G \ T \ C$$

$$Q = G \ C \ G \ T \ T \ G \ C \ T \ T \ A \ G \ A$$

For the evaluation of the final fitness score of an individual, the scores of all three alignment techniques (straight, shift right, and shift left) are considered. However, the position at which an alignment starts depends on a randomly chosen position number. Since a minimum of 30% matching between two sequences is necessary to establish a homologous relationship [81, 82], random numbers between 1 and 70% of the sequence length are generated. For example, if a sequence is 100 residues long, we identify a position randomly between the 1st and 70th residues of *D* or *Q*, depending on the alignment type. The length of *D* is considered for the straight as well as the right-shift alignments. The length of *D* is also considered for the left-shift alignment if it is smaller than or equal to that of *Q*; however, the length of *Q* is taken when *D* is larger than *Q*. The number of times the position numbers are chosen for these three types of alignments depends on the average length of *D* and *Q*; in this case, it is 30% of the average length. For example, if the average length of *D* and *Q* is 200, the alignment techniques will be repeated 60 times with different random starting positions. The best matching score is then extracted from the 60 results given by the three alignment techniques.
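The trial-count rule of this paragraph can be sketched as follows; the function name and the uniform sampling of positions are our assumptions:

```python
import random

def alignment_trials(d, q):
    """Draw random start positions: each lies within the first 70% of the
    reference length, and the number of trials is 30% of the average
    length of D and Q (e.g., 60 trials for an average length of 200)."""
    trials = int(0.3 * (len(d) + len(q)) / 2)
    limit = int(0.7 * len(d))  # reference length: D for straight/right-shift
    return [random.randint(1, limit) for _ in range(trials)]
```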

#### **8. Conclusions**

This chapter describes the application of the genetic algorithm in various scientific fields as a reliable optimizer. It narrates how the simple/conventional genetic algorithm has been advanced over time. In addition, some real applications of the modified genetic algorithm have been elaborated in brief with illustrations on synthetic data (references are provided for detailed descriptions). It is shown how the GA identifies a pattern (in a single/multiple dot or edge-map scene) in 2-D space. Reasoning is given to explain the differences in the matching scores of the two genetic sequential and distributed schemes (CGA and CAGA, or DCGA and DCAGA). A genetic-based clustering approach with results demonstrates the efficient use of the GA for grouping synthetic as well as real data. The role of the GA in complex shape detection and biological sequence alignment on synthetic data further shows its application as a useful and essential optimizer.

#### **Abbreviations**



#### **Author details**

Gautam Garai Saha Institute of Nuclear Physics, Kolkata, India

\*Address all correspondence to: gautam.garai.16@gmail.com

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **References**

[1] Goldberg DE. Genetic Algorithms in Search, Optimization and Machine Learning. Reading, MA: Addison-Wesley Publishing; 1989

[2] Rechenberg I. Cybernetic solution path of an experimental problem. Royal Aircraft Establishment (U.K.): Ministry of Aviation; 1965

[3] Schwefel HP. Evolutionsstrategie und Numerische Optimierung. Ph.D. Thesis. Berlin: Technische Universität Berlin; 1975

[4] Fogel LJ, Owens AJ, Walsh MJ. Artificial Intelligence Through Simulated Evolution. New York: John Wiley; 1966

[5] Box GEP. Evolutionary operation: A method for increasing industrial productivity. Journal of the Royal Statistical Society, Series C. 1957;**6**(2):81-101

[6] Friedman GJ. Digital simulation of an evolutionary process. General Systems Yearbook. 1959;**4**:171-184

[7] Bremermann HJ. Optimization through evolution and recombination. In: Yovits MC, Jacobi GT, Goldstein GD, editors. Self-Organizing Systems. Washington D. C.: Spartan Books; 1962

[8] Reed J, Toombs R, Barricelli NA. Simulation of biological evolution and machine learning. Journal of Theoretical Biology. 1967;**17**:319-342

[9] Barricelli NA. Numerical testing of evolution theories. Acta Biotheoretica. 1962;**16**:69-126

[10] Fraser AS. Simulation of genetic systems by automatic digital computers: I introduction. Australian Journal of Biological Sciences. 1957;**10**:484-491

[11] Martin GG, Cockerham CC. High speed selection studies. In: Kempthorne O, editor. Biometrical Genetics. USA: Pergamon; 1960

[12] Hill A, Taylor CJ. Model-based image interpretation using genetic algorithm. Image and Vision Computing. 1992;**10**: 295-300

[13] Roth G, Levine MD. Geometric primitive extraction using a genetic algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1994; **16**(9):901-905

[14] Toet A, Hajema WP. Genetic contour matching. Pattern Recognition Letters. 1995;**16**:849-856

[15] Lin DS, Leou JJ. A genetic algorithm approach to Chinese handwriting normalization. IEEE Trans. Systems, Man and Cybernetics-Part B. 1997;**27**(6): 999-1007

[16] Yamany SM, Khiani KJ, Farag AA. Application of neural networks and genetic algorithms in the classification of endothelial cells. Pattern Recognition Letters. 1997;**18**:1205-1210

[17] Giacinto G, Paolucci P, Roli F. Application of neural networks and statistical pattern recognition algorithms to earthquake risk evaluation. Pattern Recognition Letters. 1997;**18**:1353-1362

[18] Wang YK, Fan KC, Horng JT. Genetic-based search for errorcorrecting graph isomorphism. IEEE Transactions on Systems, Man and Cybernetics-Part B. 1997b;**27**(4):588-596

[19] Ansari N, Chen MH and Hou ESH. Point pattern matching by a Genetic Algorithm. In: Proc. of the 16th Annual Conf. of IEEE Industrial Electronic Society (IECON'90), Vol. II. Pacific Grove: 1990. pp. 1233-1238

[20] Mirmehdi M, Palmer PL, Kittler J. Genetic optimization of the image feature extraction process. Pattern Recognition Letters. 1997;**18**:355-365

[21] Chan KCC, Lee V, Leung H. Generating fuzzy rules for target tracking using a steady-state genetic algorithm. IEEE Transactions on Evolutionary Computing. 1997;**1**(3):189-200

[22] Schnecke V, Vornberger O. Hybrid genetic algorithms for constrained placement problems. IEEE Transactions Evolutionary Computing. 1997;**1**(4): 266-277

[23] Zhang F, Zhang YF, Nee AYC. Using genetic algorithms in process planning for job shop machining. IEEE Transactions on Evolutionary Computing. 1997;**1**(4):278-289

[24] Chen S, Wu Y, McLanghlin S. Genetic algorithm optimization for blind channel identification with higher order cumulant fitting. IEEE Transactions on Evolutionary Computing. 1997;**1**(4): 259-265

[25] Kushchu I. Web-based evolutionary and adaptive information retrieval. IEEE Transactions on Evolutionary Computation. 2005;**9**(2):117-125

[26] Raidl GR, Koller G, Julstrom BA. Biased mutation operators for subgraphselection problem. IEEE Transactions on Evolutionary Computation. 2006;**10**(2): 145-156

[27] Cavicchio DJ. Adaptive Search using Simulated Evolution. Ph.D. dissertation. Ann Arbor, Michigan: University of Michigan; 1970

[28] De Jong KA. An Analysis of Behavior of a Class of Genetic Adaptive System. Doctoral dissertation. Michigan: University of Michigan; 1975

[29] Fogel DB. Evolutionary Computation: Toward a New Philosophy of Machine Intelligence. Piscataway, NJ: IEEE Press; 1995

[30] Back T, Schwefel HP. An overview of evolutionary algorithm for parameter optimization. Evolutionary Computation. 1993;**1**:1-23

[31] Davis LD. Handbook of Genetic Algorithms. New York: Van Nostrand Reinhold; 1991

[32] Harp SA, Samad T. Genetic synthesis of neural network architecture. In: Davis L, editor. Handbook of Genetic Algorithms. New York: University of Chicago Press; 1992. pp. 202-221

[33] Rizzi S. Genetic operators for hierarchical graph clustering. Pattern Recognition Letters. 1998;**19**:1293-1300

[34] Kim D, Ahu S. A MS-GS VQ codebook design for wireless image communication using genetic algorithms. IEEE Transactions on Evolutionary Computation. 1999;**3**(1):35-52

[35] Maniezzo V. Genetic evolution of the topology and weight distribution of neural networks. IEEE Transactions on Neural Networks. 1994;**5**:39-53

[36] Cantú-Paz E. A summary of research on parallel genetic algorithms. Tech. Rep. 950076. Urbana, IL: Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign; 1995

[37] Tomassini M. Parallel and distributed evolutionary algorithms: A review. In: Miettinen K, Mäkelä M, Neittaanmäki P, Periaux J, editors. Evolutionary Algorithms in Engineering and Computer Science. New York: Wiley; 1999. pp. 113-133

[38] Veldhuize DAV, Zydallis JB, Lamont GB. Considerations in engineering parallel multiobjective evolutionary algorithms. IEEE Transactions on Evolutionary Computation. 2003;**7**:144-173

[39] Hao JK, Dome R. A new populationbased method for satisfiability problems. In: Proc. of the 11th European Conference on Artificial Intelligence. New York: Wiley; 1994. pp. 135-139

[40] Holland JH. Adaptation in Natural and Artificial Systems. Ann Arbor, MI: Univ Michigan Press; 1975

[41] Lienig J. A parallel genetic algorithm for performance-driven VLSI routing. IEEE Transactions on Evolutionary Computation. 1997;**1**:29-39

[42] Sena GA, Megherlu D, Isern G. Implementation of a parallel genetic algorithm on a cluster of workstations: Traveling salesman problem, a case study. Future Generation Computer Systems. 2001;**17**:477-488

[43] Easton FF, Mansour N. A distributed genetic algorithm for deterministic and stochastic labor scheduling problems. European Journal of Operational Research. 1999;**118**:505-523

[44] Notredame C, Higgins DG. SAGA: Sequence alignment by genetic algorithm. Nucleic Acids Research. 1996; **24**(8):1515-1524

[45] Arenas DE, Ochoterena H, Rodriguez VK. Multiple sequence alignment using a genetic algorithm and GLOCSA. Journal of Artificial

Evolution and Applications. 2009;**2009**: 963150

[46] Naznin F, Sarker R, Essam D. Vertical decomposition with genetic algorithm for multiple sequence alignment. BMC Bioinformatics. 2011;**12**: 353

[47] Shyu C, Foster J. Evolving Consensus Sequence for Multiple Sequence Alignment with a Genetic Algorithm. In: Cantú-Paz E et al., editors. Proc. Conf. Genet. and Evol. Comp. (GECCO'03), vol. 2724. Berlin Heidelberg, Chicago, IL, USA: LNCS, Springer-Verlag; 2003. pp. 2313-2324

[48] Michalewicz Z, Fogel DB. How to solve it: modern heuristics. 2nd rev. and extended. ed. Berlin; London: Springer; 2004

[49] Narimani Z, Beigy H, Abolhassani H. A new genetic algorithm for multiple sequence alignment. International Journal of Computational Intelligence and Applications. 2012;**11**(04):1250023

[50] Zhao S, Jin R, Abroshan H, Zeng C, Zhang H, House SD. Gold nanoclusters promote electrocatalytic water oxidation at the nanocluster/CoSe2 interface. Journal of the American Chemical Society. 2017;**139**:1077-1080. DOI: 10.1021/jacs.6b12529

[51] Bader SD. Colloquium: Opportunities in nanomagnetism. Reviews of Modern Physics. 2006;**78**: 1-15. DOI: 10.1103/RevModPhys.78.1

[52] Pelegrini M, Parreira RLT, Ferrão LFA, Caramori GF, Ortolan AO, Silva EH. Hydrazine decomposition on a small platinum cluster: The role of the N2H5 intermediate. Theoretical Chemistry Accounts. 2016;**135**:58. DOI: 10.1007/s00214-016-1816-x

[53] Islas R, Heine T, Ito K, Schleyer PvR, Merino G. Boron rings enclosing planar hypercoordinate group 14 elements. Journal of the American Chemical Society. 2007;**129**:14767-14774. DOI: 10.1021/ja074956m

[54] Sathya M, Jayaselvi M, Joshi S, Pandy E, Pareek PK, Jamal SS, et al. Cancer categorization using genetic algorithm to identify biomarker genes. Journal of Healthcare Engineering. 2022;**2022**:5821938

[55] Xu Y, Zeng M, Liu Q, Wang X. A genetic algorithm based multilevel association rules mining for big datasets. Mathematical Problems in Engineering. 2014;**2014**:867149

[56] Devaraj N. Feature Selection using Genetic Algorithm to Improve SVM Classifier. USA: LAP LAMBERT Academic Publishing; 2019

[57] Bhasin H, Bhatia S. Application of genetic algorithms in machine learning. Intl. Journal of Computer Science and Information Technologies. 2011;**2**(5): 2412-2415 0975-9646

[58] Manszoori TK, Suman A, Mishra AK. Application of genetic algorithm for cancer diagnosis by feature selection. Intl. Journal of Engineering Research and Technology. 2014;**3**(8):1295-1301 2278-0181

[59] Resmini R, Silva L, Aranjo AS, Medeiros P, Muchaluat-Saade D, and Conci A. Combining genetic algorithms and SVM for breast cancer diagnosis using infrared thermography. Sensors. 2021;**21**:4802

[60] Griffin PM, Alexopoulos C. Point pattern matching using centroid bounding. IEEE Transactions on Systems, Man, and Cybernetics. 1989; **19**(5):1274-1276

[61] Lavine D, Lambird BA, Kanal LN. Recognition of spatial point patterns. Pattern Recognition. 1983;**16**(3):289-295

[62] Sprinzak J, Werman M. Affine point matching. Pattern Recognition Letters. 1994;**15**(4):337-339

[63] Zhang L, Xu W. Point-pattern matching using irreducible matrix and relative invariant. Tsinghua Science and Technology. 1999;**4**(4): 1602-1605

[64] Garai G, Chaudhuri BB. A cascaded genetic algorithm for efficient optimization and pattern matching. Image and Vision Computing. 2002;**20**: 265-277

[65] Garai G, Chaudhuri BB. A distributed hierarchical genetic algorithm for efficient optimization and pattern matching. Pattern Recognition. 2007;**40**:212-228

[66] Jain AK. Fundamentals of Digital Image Processing. Englewood Cliffs, N.J: Prentice-Hall; 1989

[67] Everitt BS. Cluster Analysis. London: Edward Arnold; 1993

[68] Johnson RA, Wichern DW. Applied Multivariate Statistical Analysis. New Jersey: Prentice Hall; 1992

[69] Garai G, Chaudhuri BB. A novel genetic algorithm for automatic clustering. Pattern Recognition Letters. 2003;**25**:173-187

[70] Newman DJ, Hettich S, Blake CL, Merz CJ. UCI Repository of Machine Learning Databases. Irvine, CA: University of California, Dept. of Information and Computer Sciences; 1998. Available from: http://www.ics.uci.edu/mlearn/MLRepository.html

*Application of Genetic Algorithm in Numerous Scientific Fields DOI: http://dx.doi.org/10.5772/intechopen.105740*

[71] Duda RO, Hart PE. Pattern Classification and Scene Analysis. New York: John Wiley & Sons, Inc.; 1973

[72] Faugeras O. Three-Dimensional Computer Vision: A geometric Viewpoint. Cambridge: The MIT Press; 1993

[73] Ogawa H. Labeled pattern matching by Delaunay triangulation and maximal cliques. Pattern Recognition. 1986;**19**: 35-40

[74] Taylor PJ. Quantitative Methods in Geography: An Introduction to Spatial Analysis. Boston: Houghton Mifflin Company; 1977

[75] Laurini R. and Thompson D. Fundamental of Spatial Information Systems. The A.P.I.C. Series, No. 37. London: Academic Press; 1992

[76] Okabe A, Boots B, Sugihara K. Spatial tessellations: Concepts and Applications of Voronoi Diagrams. New Jersey: John Wiley and Sons; 1992

[77] Ronse C. A bibliography on digital and computational convexity (1961-1988). IEEE Transactions on Pattern Analysis and Machine Intelligence. 1989;**11**:181-190

[78] Garai G, Chaudhuri BB. A split and merge procedure for polygonal border detection of dot pattern. Image and Vision Computing. 1999;**17**:75-82

[79] Xiong J. Essential bioinformatics. NY: Cambridge University Press; 2006

[80] Garai G, Chowdhury B. A novel genetic approach for optimized biological sequence alignment. Journal of Biophysical Chemistry. 2012;**3**:201-205

[81] Sander C, Schneider R. Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins. 1991;**9**(1):56-68

[82] Rost B. Twilight zone of protein sequence alignments. Protein Engineering. 1999;**12**(2):85-94

