**2. Background review**

In this section, ANN, PSO, and the learning process in ANNs are reviewed.

#### **2.1 Artificial neural network (ANN)**

ANNs are a type of computational intelligence inspired by biological systems, such as the way the human brain processes information [14]. ANNs learn by example and are configured for a specific type of application or problem through a learning process [15]. One of the most widely applied NN models is the BP Neural Network (**Figure 1**). The framework of a BP Neural Network is made of three kinds of layers: an input layer, hidden layers, and an output layer. The input layer and output layer represent the input and output variables, so the number of neurons in these layers equals the number of input and output variables; depending on the specific problem, there may be one or more hidden layers. An ANN is called a Deep Neural Network when it is made up of more than three layers, that is, an input layer, multiple hidden layers, and an output layer. The neuron connections between layers each have their own weight: each neuron's output is multiplied by the corresponding weight, and the summed result is used as the input to the next neuron. The neurons then generate output signals through computations based on a transfer function, and the gradient descent method is used to minimize the error function so that the network's inferred value is as close as possible to the target output value [16].

*Designing Artificial Neural Network Using Particle Swarm Optimization: A Survey. DOI: http://dx.doi.org/10.5772/intechopen.106139*

**Figure 1.** *Three-layer topological structure of BPNN.*

The learning process in a network consists of two steps: feedforward (FF) and BP. The key principle is to use the gradient descent method to minimize the error function by making small changes to the weights of the network [17].
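The FF/BP loop described above can be sketched minimally as follows. This is an illustrative example, not code from the survey: the one-hidden-layer architecture, sigmoid transfer function, XOR toy data, learning rate, and epoch count are all assumptions chosen for a small, self-contained demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # Sigmoid transfer function
    return 1.0 / (1.0 + np.exp(-z))

# Toy supervised data: the XOR problem (2 inputs -> 1 output).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4))  # input -> hidden weights (assumed sizes)
W2 = rng.normal(size=(4, 1))  # hidden -> output weights
lr = 0.5                      # learning rate (assumed)

def mse():
    # Mean squared error of the current network on the toy data
    return float(np.mean((sigmoid(sigmoid(X @ W1) @ W2) - y) ** 2))

mse_before = mse()
for epoch in range(5000):
    # Feedforward: propagate inputs through hidden and output layers.
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)
    err = out - y
    # Backpropagation: gradients of the squared error w.r.t. weights.
    d_out = err * out * (1 - out)        # through output sigmoid
    d_h = (d_out @ W2.T) * h * (1 - h)   # through hidden layer
    # Gradient descent: small change to the network weights.
    W2 -= lr * h.T @ d_out
    W1 -= lr * X.T @ d_h
mse_after = mse()
print(mse_after < mse_before)
```

Running the loop drives the error function downward, which is exactly the "small change to the weights" principle of gradient descent described above.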

The learning process in ANNs is usually implemented by example, and it comes in three types: supervised learning (SL), unsupervised learning (UL), and semi-supervised learning. The first type, SL, is based on a direct comparison between the expected and actual output. Optimization algorithms based on gradient descent, such as the BP algorithm, can be used to iteratively modify the connection weights and thereby minimize the error. UL, the second type, is based on correlations within the input data. The learning rule is the most important factor in a learning algorithm because it determines the weight update rules; some popular learning rules are the Competitive Learning rule, the Hebbian rule, and the Delta rule [11]. The third type is semi-supervised learning, in which a large amount of unlabeled data is combined with a small amount of labeled data; in fact, semi-supervised learning falls between SL and UL.
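As an illustration of the learning rules mentioned above, the Delta rule for a single linear neuron can be sketched as follows. The toy data, learning rate, and target weights are assumptions made for the example, not values from the survey.

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=3)   # weights of a single linear neuron
lr = 0.1                 # learning rate (assumed)

# Toy supervised data: targets come from a fixed linear function.
true_w = np.array([0.5, -0.2, 0.3])
X = rng.normal(size=(200, 3))
t = X @ true_w

for x, target in zip(X, t):
    y = w @ x                    # actual output of the neuron
    # Delta rule: w <- w + lr * (target - actual) * input
    w += lr * (target - y) * x

print(float(np.max(np.abs(w - true_w))))
```

Each update moves the weights in proportion to the error between the expected and actual output, which is the direct-comparison principle of supervised learning described above.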

#### **2.2 Particle swarm optimization**

The PSO algorithm is used to optimize continuous nonlinear functions. It was proposed by J. Kennedy and R. Eberhart [18] and inspired by observations of collective and social behavior; PSO can be considered a metaphor of social behavior, such as the movement of a bird flock searching for food.

One advantage of PSO is its ability to deal with multi-modal (i.e., multiple local optima) optimization problems and its simple implementation compared with related strategies such as GA. PSO has been used in various fields and has successfully been applied by several researchers to quantitative structure-activity relationship modeling, including kernel regression and k-nearest neighbor [19], minimum spanning tree for partial least squares modeling [20], piecewise modeling, and Neural Network training [21].

At first, the system has a population of randomly created candidate solutions. Each candidate solution, called a particle, is thrown into the problem space and given a random velocity. Each particle has memory and keeps track of its previous best fitness and the corresponding position. The previous best value is called *pbest*; therefore, *pbest* is associated only with a particular particle. The best value among all the particles' *pbest* values in the swarm is called *gbest*. The basic concept of the PSO technique is to accelerate every particle toward its *pbest* and the *gbest* locations at every time step, with random acceleration weights for both the *pbest* and *gbest* locations. **Figure 2** indicates the concept of PSO. In this figure, Pk, Pk+1, Vini, and Vmod are the current position, modified position, initial velocity, and modified velocity, respectively; V*pbest* is the velocity considering *pbest*, and V*gbest* is the velocity considering *gbest*.

**Figure 2.** *Concept of changing a particle's position in PSO [22].*

The PSO algorithm contains the following steps:

1. Initialize a population of particles with random positions and velocities in the search space.

2. Evaluate the fitness of each particle.

3. Compare each particle's fitness with its *pbest*; if the current value is better, update *pbest* with the current position.

4. Compare the best fitness in the swarm with *gbest*; if it is better, update *gbest*.

5. Update the velocity and position of each particle according to Eqs. (1) and (2):

$$V_{id} = \left(V_{id} \ast W\right) + \left(rand_1 \ast c_1 \ast \left(P_{best} - X_{id}\right)\right) + rand_2 \ast c_2 \ast \left(G_{best} - X_{id}\right) \tag{1}$$

$$X_{id} = V_{id} + X_{id} \tag{2}$$

6. Step (2) is repeated until a criterion is met. This criterion is usually a maximum number of iterations or a sufficiently good fitness value.
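The steps above can be sketched as a short program. This is an illustrative implementation of Eqs. (1) and (2) on the sphere function f(x) = Σx²; the swarm size, W, c1, c2, Vmax, search bounds, and iteration count are assumed values for the example, not recommendations from the survey.

```python
import numpy as np

rng = np.random.default_rng(42)

def fitness(x):
    # Sphere function (to be minimized; global optimum at the origin)
    return float(np.sum(x ** 2))

n_particles, dim = 20, 5
w, c1, c2, vmax = 0.7, 1.5, 1.5, 1.0   # assumed control parameters

# Step 1: random initial positions and velocities.
X = rng.uniform(-5, 5, size=(n_particles, dim))
V = rng.uniform(-1, 1, size=(n_particles, dim))
pbest = X.copy()                               # personal best positions
pbest_fit = np.array([fitness(x) for x in X])  # step 2: initial fitness
gbest = pbest[np.argmin(pbest_fit)].copy()     # global best position

for it in range(200):                          # step 6: iterate
    r1 = rng.random((n_particles, dim))        # rand1, rand2 in Eq. (1)
    r2 = rng.random((n_particles, dim))
    # Step 5 / Eq. (1): inertia + cognitive + social velocity update.
    V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)
    V = np.clip(V, -vmax, vmax)                # Vmax clamping
    # Step 5 / Eq. (2): position update.
    X = X + V
    # Steps 2-4: re-evaluate fitness, update pbest and gbest.
    fit = np.array([fitness(x) for x in X])
    improved = fit < pbest_fit
    pbest[improved] = X[improved]
    pbest_fit[improved] = fit[improved]
    gbest = pbest[np.argmin(pbest_fit)].copy()

print(fitness(gbest) < 1e-2)
```

With these assumed settings the swarm contracts toward the origin, showing the acceleration of every particle toward its *pbest* and the *gbest* described earlier.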

PSO has several control parameters. W is the inertia weight, which controls the exploitation and exploration of the search by dynamically adjusting the velocity. Asynchronous updates are less costly than synchronous updates. Vmax is the largest velocity allowed for the particles: if a particle's velocity exceeds Vmax, it is reduced to Vmax. The resolution and fitness of the search are therefore directly affected by Vmax: particles may become trapped in local minima when Vmax is too low, and they may fly past good solutions when Vmax is too high. c1 (the cognitive component) and c2 (the social component) are the acceleration constants; they change a particle's velocity toward *pbest* and *gbest*. Velocity determines the tension in the system. A swarm of particles can search the space globally or locally; in the local version of PSO, the entire procedure is the same except that *gbest* is replaced by *lbest*.
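The Vmax rule described above — reducing any velocity component that exceeds Vmax in magnitude back to Vmax — can be shown in isolation. The velocity values and the Vmax of 2.0 are arbitrary example numbers.

```python
import numpy as np

vmax = 2.0                                   # assumed Vmax
v = np.array([0.5, -3.7, 2.0, 4.1])          # example particle velocity
# Components beyond [-vmax, vmax] are clamped to the boundary.
v_clamped = np.clip(v, -vmax, vmax)
print(v_clamped.tolist())  # → [0.5, -2.0, 2.0, 2.0]
```

A small Vmax restricts step sizes (risking entrapment in local minima), while a large Vmax permits steps that can overshoot good solutions, as noted above.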
