**3. The optimization of ANNs based on PSO algorithm**

Methods for the optimal design of an ANN using PSO fall into two main categories: optimizing weights, and optimizing structure and hyper parameters. Each category is further divided into two subcategories: non-hybrid optimization, in which authors used only PSO to optimize the ANN, and hybrid optimization, in which PSO was combined with other methods. Both subcategories are reviewed for each category in the following subsections (3.1 and 3.2).

#### **3.1 Weights and biases optimization**

Some papers focused on optimizing the weights and biases of ANNs. They can be divided into two categories: those using non-hybrid optimization and those using hybrid optimization.

#### *3.1.1 Non-hybrid optimization*

Some studies used classical PSO to optimize NNs and demonstrated its accuracy by comparing it with conventional training approaches such as BP. The first paper in this category is by Gudise and Venayagamoorthy [23], published in 2003. They compared the computational requirements of BP and PSO as NN training algorithms. They presented results for an FFNN learning a nonlinear function and showed that the FFNN weights converge faster with PSO than with BP. Later, in 2005, Zhao et al. [24] presented a modified PSO that adjusts the velocities and positions of the particles based on the best positions previously visited by the particles themselves and by other particles, and that diversifies the population to prevent premature convergence. In that paper, PSO is compared with conventional BP for training an FFNN to learn a nonlinear function; the question considered is how accurately and how quickly BP and PSO can determine the NN weights needed to learn a common function. Another comparison of PSO and BP for NN optimization was proposed by Ni et al. [25] in 2014. They introduced PSO as a stochastic global optimizer in NN training to overcome the flaws of the traditional BP network in cementing prediction. They showed that their method trains faster than the BP network and achieves high prediction accuracy. Following that, Liu et al. [26] used a BP NN based on the PSO algorithm (PSO-BP) to predict high-speed grinding temperature. They compared their method with gradient-based BP NN training using the Levenberg-Marquardt (LM) algorithm and showed that PSO-BP predicts the grinding temperature better than the other methods. In that paper, the authors used PSO to train the BP NN and obtain a set of weights and biases that minimizes the Mean Square Error (MSE).
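The common idea behind these studies, treating the flattened weights and biases of an FFNN as a particle position and using the MSE as the fitness, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any cited paper: the single tanh hidden layer, the parameter values (`w`, `c1`, `c2`), and the function names `unpack`, `mse`, and `pso_train` are all hypothetical choices.

```python
import numpy as np

def unpack(theta, n_in, n_hid):
    """Split a flat particle position into the parameters of a 1-hidden-layer net."""
    i = 0
    W1 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = theta[i:i + n_hid]; i += n_hid
    W2 = theta[i:i + n_hid]; i += n_hid
    b2 = theta[i]
    return W1, b1, W2, b2

def mse(theta, X, y, n_hid):
    """Fitness of one particle: mean squared error of the network it encodes."""
    W1, b1, W2, b2 = unpack(theta, X.shape[1], n_hid)
    hidden = np.tanh(X @ W1 + b1)          # assumed activation
    pred = hidden @ W2 + b2                # linear output unit
    return float(np.mean((pred - y) ** 2))

def pso_train(X, y, n_hid=5, n_particles=30, n_iter=200,
              w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over the flat weight vector (illustrative settings)."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hid + n_hid + n_hid + 1
    pos = rng.uniform(-1, 1, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()
    pbest_f = np.array([mse(p, X, y, n_hid) for p in pos])
    g = int(np.argmin(pbest_f))
    gbest, gbest_f = pbest[g].copy(), float(pbest_f[g])
    history = [gbest_f]
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Inertia + cognitive (personal best) + social (global best) terms
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        f = np.array([mse(p, X, y, n_hid) for p in pos])
        improved = f < pbest_f
        pbest[improved] = pos[improved]
        pbest_f[improved] = f[improved]
        g = int(np.argmin(pbest_f))
        if pbest_f[g] < gbest_f:
            gbest, gbest_f = pbest[g].copy(), float(pbest_f[g])
        history.append(gbest_f)
    return gbest, history
```

Calling `pso_train(X, y)` on samples of a nonlinear function returns the best weight vector found and the best-so-far MSE per iteration, which by construction never increases. No gradient of the network is required, which is the practical difference from BP.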

In some studies, PSO was first improved and then used to optimize the NN. First, Bai et al. [27] used an improved PSO-BP NN to improve the prediction accuracy of the pest occurrence cycle. Their method used an inertia weight to improve the PSO algorithm. They then used the improved PSO to optimize the thresholds and weights of the BP NN and established a pest prediction model using a rough set and the improved PSO-BP network. Their research showed that the improved PSO algorithm can reduce the number of iterations. Second, Liu and Yin [28] optimized a BP NN using an improved PSO. In the new algorithm, PSO used an enhanced adaptive acceleration factor and an enhanced adaptive inertia weight to adjust the initial weights and biases of the BP NN. Simulation results indicated that the new algorithm improves the convergence rate and prediction precision of the BP NN, which decreases the prediction error. Later on, Nandi and Jana [29] addressed the local-minima problem by formulating a new inertia weight strategy for PSO, called PPSO, which properly balanced exploitation and exploration while training an ANN, and compared their model with four other training algorithms. For all benchmark datasets, PPSO showed better performance in avoiding local minima and in convergence rate, as well as better accuracy. The proposed PPSO reduced the risk of being trapped in local minima while keeping a very good convergence rate.
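To make the role of the inertia weight concrete, the sketch below shows a linearly decreasing schedule, one widely used strategy. It is not the specific strategy of [27], [28], or [29], whose exact formulas are not given here; the function name `linear_inertia` and the default bounds 0.9 and 0.4 are assumptions.

```python
def linear_inertia(t, t_max, w_start=0.9, w_end=0.4):
    """Inertia weight at iteration t, decreasing linearly from w_start to w_end.

    A large early weight favors exploration; a small late weight favors
    exploitation around the best positions found so far.
    """
    frac = min(max(t / t_max, 0.0), 1.0)   # clamp progress to [0, 1]
    return w_start - (w_start - w_end) * frac
```

Inside a PSO loop, the fixed inertia coefficient in the velocity update would simply be replaced by `linear_inertia(t, t_max)` at iteration `t`.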

In some works, PSO was employed to optimize NNs in different fields such as medical imaging, energy consumption, and civil engineering. For example, in medical imaging, Wang et al. [30] applied a relatively recent image enhancement method to improve brain image contrast. They then presented the Predator-Prey PSO (PP-PSO), a modification of traditional PSO, to train the weights of a single-hidden-layer NN, with the MSE as the objective function. Later on, Zhang et al. [31] developed a technique that could automatically establish diagnoses from brain magnetic resonance images. First, the brain images were preprocessed. Second, one axial slice was selected from the volumetric image. Third, a single-hidden-layer NN was used as a classifier. Finally, a predator-prey PSO was proposed to train the weights and biases of the classifier. Their method performed better than human observers and 10 state-of-the-art approaches. In energy consumption, Le et al. [32] proposed four novel AI techniques for predicting the heating load (HL) of buildings' energy efficiency (EEB). Their models combined ANNs with meta-heuristic algorithms, namely the Imperialist Competitive Algorithm (ICA), Artificial Bee Colony optimization (ABC), GA, and PSO. For the PSO-ANN model, the PSO parameters were set before optimizing the ANN: the number of particles, the maximum particle velocity, the individual cognitive and group cognitive coefficients, the inertia weight, and the maximum number of iterations. The PSO algorithm then optimized the biases and weights of the initialized ANN, and the best PSO-ANN model was selected as the one with the lowest Root Mean Squared Error (RMSE). The GA provided the highest performance in optimizing the ANN model to forecast the HL of EEB systems, while the remaining meta-heuristic models (ICA-ANN, PSO-ANN, and ABC-ANN) performed less well. In the civil engineering field, Chatterjee et al. [33] proposed a PSO-based approach to train an NN for predicting structural failure of reinforced concrete buildings. The PSO algorithm was used to find the optimal weights for the NN classifier. In the first phase, NN training, PSO minimizes the RMSE to obtain the optimal input weight vector for the input layer of the ANN. Next, to assess its merit, the NN-PSO model was compared with an NN and with a multilayer perceptron FF network (MLP-FFN) classifier. The experimental results showed the superiority of the presented NN-PSO over the NN and MLP-FFN classifiers.

Besides, some studies have focused on a specific type of NN, such as the random FF NN (RFNN), and used PSO to optimize it. For example, Xu and Shu [34] first considered the advantages of both PSO and non-iterative learning to train RFNNs. Pacifico and Ludermir [35] proposed using PSO and clustering analysis to optimize RFNN input weights and biases. In this study, they employed a local best neighborhood scheme for updating the PSO population, in which each individual follows only members of its immediate neighborhood. Following that, Ling et al. [36] proposed an improved PSO that encodes the input-to-output sensitivity information of the RFNN to optimize the input weights and biases.
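A local best neighborhood scheme replaces the single global best in the velocity update with the best position found within each particle's own neighborhood. The sketch below assumes a ring topology with `k` neighbors on each side, which is one common choice and not necessarily the neighborhood used in [35]; the function name `ring_local_best` is hypothetical.

```python
import numpy as np

def ring_local_best(pbest, pbest_f, k=1):
    """Return, for each particle, the best personal-best position among
    itself and its k neighbors on each side of a ring (lower fitness is better)."""
    n = len(pbest_f)
    offsets = np.arange(-k, k + 1)
    # Row i lists the indices of particle i's neighborhood, wrapping around.
    idx = (np.arange(n)[:, None] + offsets[None, :]) % n
    winner = idx[np.arange(n), np.argmin(pbest_f[idx], axis=1)]
    return pbest[winner]
```

In the velocity update, the returned array takes the place of the global best, so information about a good solution spreads gradually around the ring rather than reaching every particle at once.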

To find better solutions to their problems, some researchers used different types of PSO such as cooperative PSO, Cultural Cooperative Particle Swarm Optimization (CCPSO), and multi-phase PSO. The cooperative PSO is an enhanced PSO presented by Van den Bergh and Engelbrecht [37], who obtained good results by applying it to NN training. In this method, the input vector is divided into several sub-vectors that are optimized cooperatively, each in its own swarm. Splitting the main vector into several sub-vectors improves performance because it yields better credit assignment and decreases the chance of discarding a good solution for a particular component of the vector. Lin et al. [38] proposed a CCPSO approach, in which a collection of multiple swarms interact by exchanging information. They applied CCPSO to optimize a fuzzy NN, and it performed better than BP and GA. Next, Multi-phase PSO (MPPSO) was proposed by Al-kazemi and Mohan [39] in 2002. Training ANNs by MPPSO is another variation, which simultaneously evolves multiple groups of particles that change their search direction in different phases of the algorithm. Each particle is in a specific group and phase at any given time. MPPSO promotes broader exploration of the search space, increases population diversity, and prevents premature convergence. Furthermore, MPPSO uses different update equations from the basic PSO and permits only those changes to a particle's location that lead to some improvement. Many researchers chose a different path and used multiobjective PSO for optimizing NNs. For example, Carlos Coello et al. [40] proposed Multiobjective Particle Swarm Optimization (MOPSO) and used it as a search strategy for improving NNs.
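The splitting idea behind cooperative PSO can be sketched as follows. Each swarm optimizes one sub-vector, and a candidate sub-vector is scored by inserting it into a shared context vector assembled from the best sub-vectors of all swarms. This is a simplified illustration under stated assumptions, not the exact algorithm of [37]: the contiguous equal split, the parameter values, and the function name `cooperative_pso` are hypothetical. For NN training, `f` would be the training error as a function of the flat weight vector.

```python
import numpy as np

def cooperative_pso(f, dim, n_groups=2, n_particles=10, n_iter=100,
                    w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over R^dim with one swarm per sub-vector (illustrative)."""
    rng = np.random.default_rng(seed)
    groups = np.array_split(np.arange(dim), n_groups)   # index sets of sub-vectors
    pos = [rng.uniform(-1, 1, (n_particles, len(g))) for g in groups]
    vel = [np.zeros_like(p) for p in pos]
    pbest = [p.copy() for p in pos]
    context = np.concatenate([p[0] for p in pos])       # shared full solution
    best_f = f(context)
    history = [best_f]
    for _ in range(n_iter):
        for j, g in enumerate(groups):
            def score(sub):
                # Evaluate a sub-vector inside the current context vector.
                trial = context.copy()
                trial[g] = sub
                return f(trial)
            cur_f = np.array([score(s) for s in pos[j]])
            pb_f = np.array([score(s) for s in pbest[j]])  # re-scored in current context
            better = cur_f < pb_f
            pbest[j][better] = pos[j][better]
            pb_f[better] = cur_f[better]
            k = int(np.argmin(pb_f))
            if pb_f[k] < best_f:                         # swarm j improves the context
                context[g] = pbest[j][k]
                best_f = pb_f[k]
            r1 = rng.random(pos[j].shape)
            r2 = rng.random(pos[j].shape)
            vel[j] = (w * vel[j] + c1 * r1 * (pbest[j] - pos[j])
                      + c2 * r2 * (context[g] - pos[j]))
            pos[j] = pos[j] + vel[j]
        history.append(best_f)
    return context, history
```

Because each component group is accepted or rejected on its own, an improvement in one group is not masked by a deterioration in another, which is the credit-assignment benefit described above.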

Some studies utilized PSO for solving large-scale problems. For instance, a novel study on high-dimensional datasets optimized the weights of NNs with PSO and some other Evolutionary Computation (EC) methods. Xue et al. [41] presented a self-adaptive parameter and strategy-based PSO (SPS-PSO) algorithm and used it to optimize FFNNs with feature selection. The authors divided the experiments into two groups. In the first group, they used SPS-PSO and three other evolutionary computation methods (GA, PSO, and biogeography-based optimization) to optimize the FNN weights directly. In the second group, they first applied SPS-PSO-based feature selection to the initial datasets and obtained eight comparatively smaller datasets with the K-Nearest Neighbor (KNN). The new datasets were then used as inputs to the FNN, whose weights were again optimized by SPS-PSO and the three other evolutionary computation methods. The experimental findings showed that SPS-PSO had an advantage over the other EC methods in optimizing the FNN weights. Meanwhile, SPS-PSO-based feature selection, when used to preprocess the datasets for the FNN, can decrease the solution size and computational complexity while maintaining classification accuracy.

#### *3.1.2 Hybrid optimization*

In this subcategory, authors used hybrid methods to optimize the weights of ANNs. Some studies combined GA and PSO for optimizing ANN weights. For instance, in 2018, Anand and Suganthi [42] optimized an ANN using a hybrid algorithm of PSO and GA and used this model to improve the prediction of electricity demand in India. Their model has higher performance and more reliable accuracy than ANN-PSO or ANN-GA, which are single optimization models. They used the hyperbolic tangent and the identity as activation functions in the hidden layer and output layer, respectively, the sum of squares as the error function, and the mean absolute percentage error as an indicator of prediction quality. Using linear and quadratic regression models together, PSO optimized the weights of the socio-economic indicators and searched for the best-fitted members that reduce the error. Also, Ma [43] developed short-time traffic flow prediction software based on a BP NN that could be used for predicting urban short-term traffic flow. A GA-based improved PSO was used to optimize the BP NN weights and thresholds to improve its prediction accuracy. The results showed that this software could accurately and quickly predict the road traffic flow at the next moment, which could greatly reduce urban road traffic pressure. Next, Xiao et al. [44] proposed a new three-stage nonlinear ensemble model. In this model, three different types of NN-based models, namely the Elman network, the generalized regression NN, and the wavelet NN, are built on three non-overlapping training sets. The results of the study showed that the ensemble ANNs-PSO-GA method improved prediction performance over other linear combination models and individual models.

In some works, researchers preferred combining PSO and wavelets to obtain better results. In 2015, Zhang et al. [45] proposed a novel computer-aided diagnosis system that uses Wavelet Entropy (WE) to extract features from Magnetic Resonance (MR) brain images, followed by an FFNN trained with a Hybridization of PSO and Biogeography-based optimization (HBP), which combines the exploration ability of biogeography-based optimization with the exploitation ability of PSO. They used the MSE as the objective function to optimize the weights with PSO. The proposed WE+HBP-FNN method obtained nearly perfect detection of pathological brains in MRI scans. Next, a novel hybrid approach called Switching PSO-Wavelet Neural Network (WNN) was proposed by Yang Lu et al. [46] in 2015 to enhance accuracy in face recognition, one of the important research problems in computer vision. They used the recently proposed Switching PSO (SPSO) algorithm to optimize the weight parameters, translation factors, scale factors, and threshold of the WNN. The proposed method, SPSO-WNN, has a higher learning ability and faster convergence than the conventional WNN. In particular, to overcome the conflict between local search and global search, which helps escape local minima, SPSO uses a mode-dependent velocity-updating equation with Markovian switching parameters. They showed that their method performs much better than PSO-WNN, GA-WNN, and WNN.

Following that, some studies used hybrid models to propose better alternatives to BP. Firstly, in 2008, Chen et al. [47] used a hybrid evolutionary algorithm based on PSO and the Artificial Fish Swarm Algorithm (AFSA), referred to as the AFSA-PSO-parallel-hybrid evolutionary (APPHE) algorithm, in FFNN training. They showed that an FFNN trained by the novel hybrid evolutionary algorithm, compared with an FFNN trained by the Levenberg-Marquardt BP (LMBP) algorithm, shows high stability toward the optimal position, satisfactory performance, good convergence accuracy, and fast convergence. In this research, both the output transfer function and the hidden transfer function were sigmoid functions. Secondly, a hybrid crop classifier for polarimetric synthetic aperture radar images was presented by Zhang and Wu [48] in 2011. The feature sets included the Cloude decomposition (also known as H/A/α decomposition), the span image, and gray-level co-occurrence matrix-based texture features. Principal Component Analysis (PCA) then reduced the features. Lastly, an FNN was built and trained by Adaptive Chaotic PSO (ACPSO). The results on the Flevoland sites showed the superiority of ACPSO over BP and adaptive BP.

Some works combined BP and PSO into a hybrid model for optimizing the weights of NNs. In 2007, Zhang et al. [49] proposed a hybrid algorithm combining BP with PSO. For training the weights of an FFNN, the hybrid algorithm benefits from the strong global search ability of PSO and the strong local search ability of BP. In the PSOBP algorithm, they adopted a heuristic to make the transition from PSO to gradient descent search. They also gave three kinds of particle encoding strategies and the problem areas in which each encoding strategy is used. They showed that, in terms of accuracy and convergence speed, the proposed hybrid PSOBP algorithm performs better than the adaptive PSO and BP algorithms. Following that, in 2011, Yaghini et al. [50] proposed a hybrid improved opposition-based algorithm based on PSO and GA (HIOPGA) and compared it with the BP algorithm on several benchmark problems. Their method combines the abilities of the two algorithms. The algorithm begins training with a particle population. During the iterations, when the improved opposition-based PSO cannot improve the positions of some particles, a subpopulation of such NNs is created and sent to the GA. HIOPGA can then find better NNs to put back into the population by using the GA operators, mutation and crossover. Also, in 2017, Kartheeswaran and Durairaj [51] presented sequential and data-parallel implementations, using decomposition strategies, of PSO-based ANN weight optimization for image reconstruction. They used a hybrid algorithm combining BP with PSO. PSO was used with the BP-ANN to optimize different parameters, including the hidden layer sizes, the number of hidden nodes, and the weights of the network connections. By optimizing the connection weights, this study demonstrated the application of a hybrid model to the reconstruction of the Shepp-Logan head phantom image.
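The general pattern of such PSO and BP hybrids, a global PSO search followed by local gradient refinement, can be sketched as below. The switching rule used here (stop PSO after `patience` iterations without improvement of the global best) is an assumed stand-in, since the heuristic of [49] is not detailed in this section. Plain gradient descent on a caller-supplied gradient `grad` stands in for BP, and the name `pso_then_gradient` is hypothetical.

```python
import numpy as np

def pso_then_gradient(f, grad, dim, n_particles=20, max_pso_iter=200,
                      patience=10, gd_steps=200, lr=0.05,
                      w=0.7, c1=1.5, c2=1.5, seed=0):
    """Phase 1: global-best PSO. Phase 2: gradient descent from the PSO result."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5, 5, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.array([f(p) for p in pos])
    g = int(np.argmin(pbest_f))
    gbest, gbest_f = pbest[g].copy(), float(pbest_f[g])
    stall = 0
    for _ in range(max_pso_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        fx = np.array([f(p) for p in pos])
        better = fx < pbest_f
        pbest[better] = pos[better]
        pbest_f[better] = fx[better]
        g = int(np.argmin(pbest_f))
        if pbest_f[g] < gbest_f - 1e-12:
            gbest, gbest_f, stall = pbest[g].copy(), float(pbest_f[g]), 0
        else:
            stall += 1
        if stall >= patience:          # assumed switching heuristic
            break
    # Local refinement starting from the best PSO solution
    x = gbest.copy()
    for _ in range(gd_steps):
        x = x - lr * grad(x)
    final_f = f(x)
    if final_f > gbest_f:              # keep the PSO result if refinement did not help
        x, final_f = gbest, gbest_f
    return x, gbest_f, final_f
```

For an NN, `f` would be the training error over the flat weight vector and `grad` its backpropagated gradient. The function returns the refined solution together with the error before and after the gradient phase, so the contribution of each phase can be inspected.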

#### **3.2 Optimizing structure and hyper parameters**

In this category, a few papers have focused on optimizing the structure and hyper parameters. There are two subcategories: non-hybrid optimization and hybrid optimization.

#### *3.2.1 Non-hybrid optimization*

In this subcategory, the authors used non-hybrid methods to optimize structure and hyper parameters.

In 2000, Zhang and Shao [52] were the first authors to present a PSONN system for evolving the network architecture and the weights of ANNs alternately. They used the evolved ANNs to model a product quality estimator for a fractionator of the hydrocracking unit in the oil refining industry. Carvalho and Ludermir [53] proposed another study, inspired by Zhang and Shao's methodology, that introduces the weight decay heuristic into the weight adjustment process in an attempt to obtain more control over generalization. They analyzed the use of PSO for optimizing the architectures and weights of NNs, aiming at better generalization through a compromise between low training error and low architectural complexity, and applied it to benchmark classification problems in the medical field. Their results showed that a PSO-PSO based method is an acceptable alternative for optimizing the architectures and weights of MLP NNs. Xue et al. [54], similarly to Carvalho and Ludermir, tried to optimize weights and architecture simultaneously. They developed a variable-length PSO to optimize both the number of hidden nodes and the input weights simultaneously. Particles of different lengths, which represent different network configurations, are handled by a new particle update strategy presented in the study.

Many researchers improved the algorithms themselves to optimize the architecture. For example, Carvalho [55] proposed a PSO-PSO method, in which one PSO, used to optimize the weights, was nested inside another PSO that optimized the architecture of the FNN by deleting or adding hidden nodes. Next, in 2009, Kiranyaz et al. [56] proposed a multidimensional PSO approach to construct FNNs automatically by searching an architectural space. Furthermore, the individuals in the swarm population were designed so that both the weights and the architecture of an individual are optimized in every iteration.

#### *Designing Artificial Neural Network Using Particle Swarm Optimization: A Survey DOI: http://dx.doi.org/10.5772/intechopen.106139*

PSO has been used to optimize NN architectures by researchers in different areas, such as communication theory, civil engineering, and medical engineering. PSO has been widely used to address optimization problems in communication theory. In 2013, Das et al. [57] optimized an ANN using PSO for the problem of channel equalization. In this paper, they used the PSO algorithm to optimize all the variables, including both network parameters and network weights: the number of input neurons, the number of hidden neurons, the type of transfer functions, and the number of layers. The novelty of this paper is that it takes care of finding a suitable network topology. Extensive simulations presented in this research showed that the proposed equalizer performs better in all noise conditions than other ANN-based equalizers and neuro-fuzzy equalizers. Another interesting application area of PSO is civil engineering. Asadnia et al. [58] applied an improved PSO technique to train an ANN to predict water levels for the Heshui Watershed. The results showed that the PSO-based ANNs predicted the peak and low water levels better than the LM-NN model. Additionally, IPSONN converged faster than CPSONN. In medical engineering, an adaptive CPSO was developed by Zhang et al. [59] to train the parameters of an FFNN for accurate classification of magnetic resonance (MR) brain images. The classification accuracy of the presented technique was 98.75% on 160 images.

Many works used basic PSO to optimize the NN architecture. In a study by Chunkai et al. [60] in 2000, the network structure is adjusted adaptively, and the PSO algorithm is applied to evolve the nodes of the NN for a specific generated structure. Techniques such as the combination of partial training and evolving added nodes are used to generate the desired architecture, and PSO is then employed to evolve the nodes of the predefined structure. In another study in 2013, Wang et al. [61] used a BP NN to build a cost estimation model for plastic injection molding parts in order to reduce the complexity of the conventional cost estimation procedures. They built the cost estimation model on the strong forecasting and diagnosis capability of the BP NN, and used the search capability of PSO to obtain the parameters of the BP NN, such as the number of hidden nodes and layers, the initial weights, and the learning rate, so that network learning and training performed better and were more precise. In this study, the sigmoid function was used as the activation and transfer function. In 2018, Qi et al. [62] presented a combination of ANN and PSO for forecasting the unconfined compressive strength (UCS) of Cemented Paste Backfill (CPB). The authors used the ANN to model nonlinear relationships and PSO to tune the ANN architecture, namely the number of neurons and hidden layers. The findings indicated that PSO was efficient for optimizing the ANN architecture. Also, comparing the forecast UCS values with experimental values indicated that the optimal ANN model predicted the strength of CPB very precisely.

#### *3.2.2 Hybrid optimization*

In this subcategory, authors employed hybrid methods to optimize the structure and hyper parameters of NNs.

In one study, J. Yu et al. [63] presented a new evolutionary ANN algorithm called IPSONet, based on an improved PSO. The improved algorithm used a parameter automation strategy, mutation, crossover, and velocity resetting to enhance the performance of classical PSO in global search and in fine-tuning of solutions. IPSONet used the improved PSO to solve the FFNN design problem, evolving the weights and structure of ANNs simultaneously through an evolutionary scheme and a specific individual representation. Next, researchers employed hybrids of GA and PSO to optimize structure and hyper parameters and obtain better results. For example, in 2004, Juang [64] presented a Hybrid of GA and PSO (HGAPSO) method that was employed to design NNs. In this method, the individuals of the next generation are created not only by crossover and mutation operators but also by PSO. The best-performing upper half of the population is enhanced using PSO, and the other half is generated by applying crossover and mutation. Unlike GA, HGAPSO removes the restriction on evolving individuals within the same generation. In this article, the proposed method is another variation of PSO for fixed-structure ANNs where only the weights are adjusted.
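The generation step described for HGAPSO can be illustrated with the rough sketch below. It follows only the outline given above (the best half is enhanced by PSO, the other half is produced by crossover and mutation) and simplifies the rest: personal-best memory is dropped so that the PSO step uses only inertia and the pull toward the current best individual, parents are drawn uniformly from the enhanced elites, and uniform crossover with Gaussian mutation is assumed. The function name `hgapso_generation` and all parameter values are hypothetical.

```python
import numpy as np

def hgapso_generation(pop, vel, f, rng, w=0.7, c2=1.5,
                      mut_rate=0.1, mut_scale=0.1):
    """One simplified HGAPSO-style generation (lower f is better)."""
    n, dim = pop.shape
    half = n // 2
    fit = np.array([f(x) for x in pop])
    order = np.argsort(fit)                    # best individuals first
    elites = pop[order[:half]].copy()
    elite_vel = vel[order[:half]].copy()
    gbest = elites[0].copy()
    # Enhancement: move each elite toward the current best individual.
    r = rng.random(elites.shape)
    elite_vel = w * elite_vel + c2 * r * (gbest - elites)
    enhanced = elites + elite_vel
    # Offspring: uniform crossover of two enhanced elites, then Gaussian mutation.
    children = np.empty((n - half, dim))
    for i in range(n - half):
        a = enhanced[rng.integers(half)]
        b = enhanced[rng.integers(half)]
        child = np.where(rng.random(dim) < 0.5, a, b)
        mutate = rng.random(dim) < mut_rate
        children[i] = child + mutate * rng.normal(0.0, mut_scale, dim)
    new_pop = np.vstack([enhanced, children])
    new_vel = np.vstack([elite_vel, np.zeros_like(children)])
    return new_pop, new_vel
```

Each row of `pop` would hold the flat weight vector of one fixed-structure network, and the function is called once per generation with the population and velocities returned by the previous call.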
