#### **3.2.1 The PSO algorithm**

The algorithm maintains a population of particles, where each particle represents a potential solution to an optimization problem. Let *s* denote the size of the swarm. Each particle *i* can be represented as an object with various characteristics. These characteristics are as follows:

*xi*: the current position of the particle;

*vi*: the current velocity of the particle;

*yi*: the best personal position found by the particle so far.


An Evolutionary Fuzzy Hybrid System for Educational Purposes 407

The startup mentioned in the first step of the algorithm consists of the following:

1. initialize each coordinate *xi,j* with a random value in the range [-*x*max, *x*max], for all *i* = 1 … *s* and *j* = 1 … *n*. This distributes the initial positions of the particles across the search space. Select a good random number generator to obtain a uniform distribution over the search space.

2. initialize each *vi,j* with a value drawn from the range [-*v*max, *v*max], for all *i* = 1 … *s* and *j* = 1 … *n*. Alternatively, the velocities of the particles may be initialized to 0 (zero), provided that the initial positions are initialized randomly.
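The two initialization steps can be sketched in Python as follows; the swarm size, dimensionality, and bounds are illustrative values, not taken from the chapter:

```python
import random

# Hypothetical parameters for illustration (not from the chapter):
s, n = 20, 5             # swarm size and problem dimensionality
x_max, v_max = 10.0, 1.0

# Step 1: positions drawn uniformly from [-x_max, x_max] in every dimension.
x = [[random.uniform(-x_max, x_max) for _ in range(n)] for _ in range(s)]

# Step 2: velocities drawn from [-v_max, v_max]; initializing them to zero
# is also valid, provided the positions themselves are random.
v = [[random.uniform(-v_max, v_max) for _ in range(n)] for _ in range(s)]
```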

The personal best position of particle *i* represents the best position the particle has visited, i.e., the position where it obtained its best evaluation. In a minimization task, for example, the position that earned the lowest function value is considered the best position, or the position with the highest fitness. The symbol *f* denotes the objective function being minimized. The update equation for the personal best position is given by equation (1), with the time *t* shown explicitly.

$$y\_i(t+1) = \begin{cases} y\_i(t) & \text{if } f(y\_i(t)) \le f(\mathbf{x}\_i(t+1)) \\ \mathbf{x}\_i(t+1) & \text{if } f(y\_i(t)) > f(\mathbf{x}\_i(t+1)) \end{cases} \tag{1}$$

There are two versions of PSO, called the *gbest* and *lbest* models *(the global best and the local best)* (Goldberg, 1989). The difference between the two algorithms lies directly in the way a particular particle interacts with its set of neighboring particles. The symbol *y*ˆ will be used to represent this interaction. The details of the two models are discussed in full later. The definition of *y*ˆ as used in the *gbest* model is shown in equation (2).

$$\begin{aligned} \hat{y}(t) & \in \{ y\_0(t), y\_1(t), \dots, y\_s(t) \} \quad | \quad f(\hat{y}(t)) \\ &= \min \{ f(y\_0(t)), f(y\_1(t)), \dots, f(y\_s(t)) \} \end{aligned} \tag{2}$$

Note that this definition states that *y*ˆ is the best position found so far by any of the particles in the swarm of size *s*.

The PSO algorithm makes use of two independent random sequences, *r*1 *~ U*(0,1) and *r*2 *~ U*(0,1). These sequences give the algorithm its stochastic nature, as shown below in equation (3). The values of *r*1 and *r*2 are scaled by constants 0 < *c*1, *c*2 ≤ 2. These constants are called *acceleration coefficients*, and they influence the maximum step size a particle can take in a single iteration. The velocity update step is specified separately for each dimension *j* = 1 … *n*, so that *vi,j* denotes the *j*-th dimension of the velocity vector associated with particle *i*. The velocity update is given by the following equation:

$$\begin{aligned} \upsilon\_{i,j}(t+1) &= \upsilon\_{i,j}(t) + c\_1 r\_{1,j}(t) [y\_{i,j}(t) - x\_{i,j}(t)] + \\ & \quad c\_2 r\_{2,j}(t) [\hat{y}\_j(t) - x\_{i,j}(t)] \end{aligned} \tag{3}$$

In the velocity update equation, the constant *c*2 clearly regulates the maximum step size in the direction of the global best particle, and the constant *c*1 regulates the step size in the direction of the particle's personal best position. The value of *vi,j* is kept within the range [-*v*max, *v*max], reducing the probability that a particle leaves the search space. If the search space is defined by the interval [-*x*max, *x*max], then the value of *v*max is calculated as follows:

$$
\upsilon\_{\text{max}} = k \times x\_{\text{max}}, \quad \text{where} \quad 0.1 \le k \le 1.0
$$

The position of each particle is updated using its new velocity vector:

$$\mathbf{x}\_i(t+1) = \mathbf{x}\_i(t) + \upsilon\_i(t+1) \tag{4}$$
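Equations (3) and (4), together with the velocity clamping just described, can be sketched for a single particle as follows; the coefficient values and the choice of *k* are illustrative assumptions:

```python
import random

# Illustrative parameter choices (not prescribed by the chapter):
c1, c2 = 2.0, 2.0      # acceleration coefficients
x_max = 10.0
k = 0.5                # k chosen from [0.1, 1.0]
v_max = k * x_max

def step(x_i, v_i, y_i, y_hat):
    """One application of equations (3) and (4) for a single particle."""
    new_v, new_x = [], []
    for j in range(len(x_i)):
        r1, r2 = random.random(), random.random()   # r1, r2 ~ U(0, 1)
        v_j = (v_i[j]
               + c1 * r1 * (y_i[j] - x_i[j])        # cognition term
               + c2 * r2 * (y_hat[j] - x_i[j]))     # social term
        v_j = max(-v_max, min(v_max, v_j))          # clamp to [-v_max, v_max]
        new_v.append(v_j)
        new_x.append(x_i[j] + v_j)                  # equation (4)
    return new_x, new_v

new_x, new_v = step(x_i=[0.0], v_i=[0.0], y_i=[1.0], y_hat=[2.0])
```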

The algorithm consists of the repeated application of the update equations above. The basic PSO algorithm pseudocode is shown below.

*Create and initialize S, a PSO of n dimensions with s particles; i denotes the current particle:*

406 Fuzzy Inference System – Theory and Applications


*repeat:*
*for each particle i = [1 .. s]:*
*if f(S.xi) < f(S.yi) then S.yi = S.xi*
*if f(S.yi) < f(S.y*ˆ*) then S.y*ˆ *= S.yi*
*end loop*
*update S using equations (3) and (4)*
*until the stopping condition is True*
*end*



The stopping criterion mentioned in the algorithm depends on the type of problem to be solved. Typically the algorithm is run for a predetermined, fixed number of iterations (a fixed number of function evaluations) or until a specified error value is reached. It is important to realize that the velocity term models the rate of change in the position of the particle. The changes induced by the velocity update equation (3) represent acceleration, which explains why the constants *c*1 and *c*2 are called acceleration coefficients.

A brief description of how the algorithm works is as follows. Initially, one particle is identified as the best particle in the swarm, based on its fitness under the objective function. Then all particles are accelerated in the direction of this particle and, at the same time, in the direction of their own best previously found positions. Occasionally the particles explore the search space around the current best particle. In this way, all particles have the opportunity to change direction and seek a new 'best' particle. Since most functions have some form of continuity, the chances are good of finding good solutions in the space surrounding the best particle. Particles approaching the best solution from different directions in the search space increase the chances of finding the best solutions located near the best particle.
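Putting the pieces together, the following is a minimal Python sketch of the basic *gbest* PSO described above; the sphere objective, the parameter values, and the fixed iteration budget are assumptions for illustration, not taken from the chapter:

```python
import random

def sphere(x):
    # Illustrative objective to minimize (not from the chapter).
    return sum(xi * xi for xi in x)

def pso(f, n=2, s=15, iters=100, x_max=5.0, c1=2.0, c2=2.0, k=0.5, seed=1):
    rng = random.Random(seed)
    v_max = k * x_max
    # Initialization: random positions, zero velocities.
    x = [[rng.uniform(-x_max, x_max) for _ in range(n)] for _ in range(s)]
    v = [[0.0] * n for _ in range(s)]
    y = [xi[:] for xi in x]           # personal bests, updated via eq. (1)
    y_hat = min(y, key=f)[:]          # global best, chosen via eq. (2)
    for _ in range(iters):            # fixed iteration budget as the stop rule
        for i in range(s):
            for j in range(n):
                r1, r2 = rng.random(), rng.random()
                v[i][j] += (c1 * r1 * (y[i][j] - x[i][j])
                            + c2 * r2 * (y_hat[j] - x[i][j]))  # eq. (3)
                v[i][j] = max(-v_max, min(v_max, v[i][j]))     # clamp
                x[i][j] += v[i][j]                             # eq. (4)
            if f(x[i]) < f(y[i]):     # eq. (1): update personal best
                y[i] = x[i][:]
                if f(y[i]) < f(y_hat):
                    y_hat = y[i][:]   # eq. (2): update global best
    return y_hat, f(y_hat)

best, best_val = pso(sphere)
```

In the *lbest* model, *y*ˆ would simply be replaced by each particle's neighborhood best.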

#### **3.2.2 The behavior of the PSO**

Many interpretations have been suggested regarding the operation and behavior of the PSO. Kennedy, in his research, performed experiments to investigate the roles of the different components of the velocity update equation (Kennedy & Eberhart, 1995), reinforcing the socio-biological view of the PSO. The task of training a neural network was used to compare the performance of the different models. Kennedy made use of the *lbest* model (see the *lbest* section for a complete description of this model) instead of the *gbest* model.

For this, two velocity update equations were developed: the first using just the particle's own experience, called the *cognition component*, and the second using only the interaction between the particles, called the *social component*.

Consider the velocity update equation (3) presented earlier. The term *c*1*r*1,j(*t*)[*yi,j*(*t*) - *xi,j*(*t*)] is associated only with cognition, since it takes into account only the particle's own experiences. If a PSO is built using only the cognitive component, the velocity update equation becomes:

$$\upsilon\_{i,j}(t+1) = \upsilon\_{i,j}(t) + c\_1 r\_{1,j}(t) [y\_{i,j}(t) - x\_{i,j}(t)]$$

Kennedy found that the performance of this "cognition-only" model was inferior to that of the original PSO. One of the reasons for the poor performance is attributed to the total absence of interaction between the different particles.

The third term in the velocity update equation, *c*2*r*2,j(*t*)[ *y*ˆ *<sup>j</sup>*(*t*) - *xi,j*(*t*)], represents the social interaction between the particles. A version of PSO with just the social component can be constructed using the following velocity update equation:

$$
\upsilon\_{i,j}(t+1) = \upsilon\_{i,j}(t) + c\_2 r\_{2,j}(t) [\hat{y}\_j(t) - x\_{i,j}(t)],
$$

It was observed that, for the specific problems Kennedy investigated, the performance of this model was superior to that of the original PSO.

In summary, the PSO velocity update term consists of two components: the cognition component and the social component. Currently, little is known about their relative importance, although initial results indicate that the social component is more important for most of the problems studied. This social interaction between the particles develops cooperation between them in solving the problem.
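The two reduced models can be obtained from the full velocity update of equation (3) by zeroing one of the coefficients; the helper below is a sketch, and the sample position/best values are arbitrary:

```python
import random

def velocity_update(v_ij, x_ij, y_ij, yhat_j, c1, c2, rng):
    """Full update of equation (3); setting c2 = 0 gives the cognition-only
    model, and c1 = 0 gives the social-only model."""
    return (v_ij
            + c1 * rng.random() * (y_ij - x_ij)     # cognition component
            + c2 * rng.random() * (yhat_j - x_ij))  # social component

rng = random.Random(0)
# Sample values: current position 1.0, personal best 3.0, global best 7.0.
v_cog = velocity_update(0.0, 1.0, 3.0, 7.0, c1=2.0, c2=0.0, rng=rng)
v_soc = velocity_update(0.0, 1.0, 3.0, 7.0, c1=0.0, c2=2.0, rng=rng)
```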

#### **3.2.3 Model of the best global (***gbest***)**

The *gbest* model allows a faster rate of convergence at the expense of robustness. This model keeps only a single "best solution", called the *global best particle*, across all particles in the swarm. This particle acts as an attractor, pulling all particles toward it. Eventually, all particles will converge to this position; if it is not updated regularly, the swarm can converge prematurely. The update equations for *y*ˆ and *xi* are the same as shown above:

$$\begin{aligned} \hat{y}(t) & \in \left\{ y\_0(t), y\_1(t), \dots, y\_s(t) \right\} \quad \mid \quad f(\hat{y}(t)) \\ &= \min \left\{ f(y\_0(t)), f(y\_1(t)), \dots, f(y\_s(t)) \right\} \end{aligned} \tag{5}$$

$$\begin{aligned} \upsilon\_{i,j}(t+1) &= \upsilon\_{i,j}(t) + c\_1 r\_{1,j}(t) [\mathbf{y}\_{i,j}(t) - \mathbf{x}\_{i,j}(t)] + \\ c\_2 r\_{2,j}(t) [\hat{\mathbf{y}}\_j(t) - \mathbf{x}\_{i,j}(t)] \end{aligned} \tag{6}$$

Note that *y*ˆ is called *the best overall position*, and belongs to the particle called *the best global particle*.

#### **3.2.4 The model of the best location (***lbest***)**


The *lbest* model tries to prevent premature convergence by maintaining multiple attractors. For each particle, a subset of particles is defined, from which *the best local particle*, *y*ˆ *<sup>i</sup>*, is selected. The symbol *y*ˆ *<sup>i</sup>* is called *the local best position* or *the neighborhood best*. Assuming that the particle indices wrap around at *s*, the *lbest* update equations for a neighborhood of size *l* are as follows:

$$\begin{aligned} N\_i &= \{ y\_{i-l}(t), y\_{i-l+1}(t), \dots, y\_{i-1}(t), y\_i(t),\\ y\_{i+1}(t), \dots, y\_{i+l}(t) \} \end{aligned} \tag{7}$$

$$\hat{y}\_i(t+1) \in N\_i \quad | \quad f(\hat{y}\_i(t+1)) = \min \left\{ f(a) \mid a \in N\_i \right\} \tag{8}$$

$$\begin{aligned} \upsilon\_{i,j}(t+1) &= \upsilon\_{i,j}(t) + c\_1 r\_{1,j}(t) [y\_{i,j}(t) - x\_{i,j}(t)] + \\ & \quad c\_2 r\_{2,j}(t) [\hat{y}\_{i,j}(t) - x\_{i,j}(t)] \end{aligned} \tag{9}$$

Note that the particles in the subset *Ni* are selected with no regard to the positions of the other particles in the search space; the selection is based solely on the particle's index. This is done for two main reasons: the computational cost is lower, since no clustering is required, and it also helps promote the spread of information about good solutions to all particles, even though the search remains local.

Finally, observe that the *gbest* model is in fact a special case of the *lbest* model with *l* = *s*, i.e., when the selected set encompasses the entire swarm.
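A minimal sketch of the index-based neighborhood of equations (7) and (8) follows; the four-particle example values are invented for illustration:

```python
def neighborhood(i, l, s):
    """Index-based ring neighborhood N_i of equation (7): the indices
    surrounding particle i, wrapping around at s."""
    return [(i + d) % s for d in range(-l, l + 1)]

def local_best(i, l, y, f):
    """Local best of equation (8): the fittest personal best within N_i."""
    return min((y[j] for j in neighborhood(i, l, len(y))), key=f)

# Illustrative personal bests for a swarm of four 1-D particles:
f = lambda p: p[0] * p[0]
y = [[1.0], [0.0], [2.0], [5.0]]
lb = local_best(0, 1, y, f)       # neighbors of particle 0: indices 3, 0, 1
gb = local_best(0, len(y), y, f)  # l = s recovers the gbest model
```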

#### **3.2.5 Considerations about the similarity between PSO and EAs**

There is a clear relationship between the PSO and evolutionary algorithms (EAs). To some authors, the PSO maintains a population of individuals representing potential solutions, a feature found in all EAs. If the personal best positions (*yi*) are treated as part of the population, then there is clearly a weak form of selection (Lee & El-Sharkawi, 2008). In some evolution strategy (ES) algorithms, the descendants (*offspring*) compete with the parents, replacing them if they are fitter. Equation (1) resembles this mechanism, with the difference that the personal best position (the parent) can only be replaced by the particle's own current position (the descendant), provided that the current position is fitter than the old personal best. Therefore, there appears to be some weak form of selection in the PSO.

The velocity update equation resembles the arithmetic crossover operator (*crossover*) found in GAs. Typically, arithmetic crossover produces two descendants that are blends of the two parents involved in the crossing. The PSO velocity update equation without the *vi,j*(*t*) term (see equation 3) can be interpreted as a form of arithmetic crossover involving two parents and returning a single descendant. Alternatively, the velocity update equation, without the *vi,j*(*t*) term, can be seen as a mutation operator.

The best way to analyze the *vi,j* term is not to think of each iteration as a process of replacing a population with a new one (birth and death), but as a process of continuous adaptation (Eberhart & Kennedy, 2001). In this view, the values of *xi* are not replaced, but continually adapted using the velocity vectors *vi*. This makes the difference between the PSO and the other EAs clearer: the PSO maintains information on both position and velocity (the change of position); in contrast, traditional EAs keep information only on position.

In spite of the opinion that there is some degree of similarity between the PSO and the majority of other EAs, the PSO has a few features that currently are not present in any other EA, especially the fact that the PSO models the velocities of the particles as well as their positions.

In large and complicated systems, fuzzy systems become difficult to adjust when one depends on manual methods involving trial and error. The fuzzy relation matrix representing the relationships between concepts and actions can be unwieldy, and the best values for the parameters needed to describe the membership functions may be difficult to determine. The performance of a fuzzy system can be very sensitive to the specific values of these parameters.

Table 1. Comparison of characteristics of fuzzy logic with meta-heuristic techniques (fuzzy systems and meta-heuristic methods are compared with respect to knowledge saving, learning, optimizing, speed, and non-linear systems)

In general, meta-heuristic methods offer distinct advantages for optimizing membership functions and even for learning fuzzy rules. Meta-heuristic methods result in a more comprehensive search, reducing the chance of ending in a local minimum, by sampling several solution sets simultaneously. Fuzzy logic contributes through the evaluation function, the stage of a genetic algorithm where fitness is determined.

There are several possible ways to use meta-heuristic methods with fuzzy systems. One type of hybrid system involves the use of separate modules as parts of a global system. The modules based on meta-heuristic methods and fuzzy logic can be grouped alone or with other computational-intelligence subsystems or conventional programs that form an application system.

Another use is in the design of systems that are primarily fuzzy-logic applications. Here genetic algorithms are used to improve the design process and the performance of the resulting fuzzy system. The meta-heuristic methods can be used to discover the best values for the membership functions when the manual selection of values is difficult or time-consuming.

There are different types of meta-heuristic methods. Among them are genetic algorithms (GA) and particle swarm optimization (PSO), which are used in this chapter. These two methods, plus a variation of the PSO called hybrid PSO (HPSO), were chosen because of their suitability for integration with other systems. The general procedure for using meta-heuristic methods with fuzzy systems is shown in Figure 6. For example, a possible solution (represented by a chromosome or a bird) can be defined as a concatenation of the values of all the membership functions. When triangular functions are used to represent the membership functions, the parameters are the centers and widths of each fuzzy set. Starting from an initial range of possible parameter values, the fuzzy system is run to determine how well it works. This information is used to determine the fitness of each solution and to establish a new population. The cycle is repeated until the best set of values for the parameters of the membership functions is found.

This process can be expanded so that the population includes information about the conditions and actions corresponding to fuzzy rules. Including these in the meta-heuristic treatment allows the system to learn or refine the fuzzy rules.
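As a rough illustration of the parameter encoding described above, the sketch below treats a particle (or chromosome) as a flat list of (center, width) pairs for triangular membership functions; the function names and the two-set example values are hypothetical, not taken from the chapter:

```python
def triangular(x, center, width):
    """Membership degree of x in a triangular fuzzy set (center, width)."""
    if width <= 0:
        return 0.0
    return max(0.0, 1.0 - abs(x - center) / width)

def decode(params):
    """Interpret a flat particle/chromosome as (center, width) pairs,
    one pair per fuzzy set."""
    return [(params[i], params[i + 1]) for i in range(0, len(params), 2)]

# A hypothetical candidate solution describing two fuzzy sets:
particle = [0.0, 2.0, 5.0, 3.0]
sets = decode(particle)
```

A fitness function would decode each particle this way, run the fuzzy system with the resulting membership functions, and score the outcome, closing the optimization loop described in the text.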
