**1. Introduction**

Artificial Neural Networks (ANNs) are widely regarded as intelligent, general-purpose mechanisms for function approximation, optimal design, process estimation and prediction, pattern recognition, and other applications. Because ANNs adapt to a range of problems that involve decision making under uncertainty, they are very attractive and popular among researchers. An ANN with many layers between the input layer and the output layer is called a Deep Neural Network (DNN). A large DNN may have millions of parameters, so its learning process can take several days or even a month and requires powerful hardware. There are also several challenges that must be addressed, such as the selection of the parameters, the structure of the network, the initial values, and the learning samples. An ANN designed with suitable parameters can be a powerful tool, reducing learning time, minimizing the loss function, and making predictions as accurate as possible. This is where optimizers come to our aid: an optimizer helps us build a better model and improve the training process, and some optimizers prevent the search from getting trapped in local optima.

Various methods exist for optimizing NNs. Backpropagation (BP) is one of them and is widely used for training Neural Networks [1–5]. The BP training algorithm has different forms, such as Gradient Descent, Levenberg-Marquardt, Conjugate Gradient Descent, Bayesian Regularization, Resilient Backpropagation, and One-Step Secant [6, 7]. These algorithms differ in their computational and storage requirements; some are suitable for function approximation and others for pattern recognition, but each has disadvantages in one way or another, such as sensitivity to the size of the NN and the storage requirements associated with it.
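To make the gradient-descent form of BP concrete, the minimal sketch below trains a one-hidden-layer network on the XOR problem with NumPy. The layer sizes, learning rate, epoch count, and squared-error loss are illustrative choices of ours, not values taken from the cited algorithms.

```python
# Minimal sketch of gradient-descent backpropagation for a one-hidden-layer
# network with sigmoid activations and a squared-error loss.
# All sizes and hyperparameters are illustrative, not from the survey.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn XOR (4 samples, 2 inputs, 1 output).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weights and biases for the input->hidden and hidden->output layers.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
lr = 0.5  # learning rate (illustrative)

for epoch in range(5000):
    # Forward path.
    h = sigmoid(X @ W1 + b1)    # hidden activations
    out = sigmoid(h @ W2 + b2)  # network output
    # Backward path: gradients of the squared-error loss at each layer.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient-descent update of all weights and biases.
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

print("predictions:", out.ravel().round(3))
```

Whether the loop converges depends on the random initialization and learning rate, which is precisely the kind of sensitivity the BP variants listed above try to mitigate.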

Another approach is meta-heuristic algorithms. The objective of a meta-heuristic algorithm is to discover a global or local optimum at low computational cost. Meta-heuristic algorithms generally rely on populations of agents, such as particles, chromosomes, or fireflies, that search iteratively for the global or a local optimum. Meta-heuristic is a collective term for a family of algorithms that includes evolutionary algorithms such as the Genetic Algorithm (GA) [8], nature-inspired algorithms such as Particle Swarm Optimization (PSO) [9], and trajectory algorithms such as Tabu Search [10].

In this paper, the focus is on PSO, a nature-inspired algorithm for global optimization that can be applied to black-box optimization problems. PSO is based on simulating the behavior of a school of fish or a flock of birds, in which active communication within the school or swarm is a key concept. Like a GA, PSO is a population-based (swarm-based) optimization tool.
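The following minimal sketch shows the canonical PSO velocity and position updates applied to a black-box objective (the sphere function here). The inertia and acceleration coefficients, swarm size, and iteration budget are common illustrative defaults of ours, not values prescribed by any reviewed paper.

```python
# Minimal sketch of Particle Swarm Optimization on a black-box objective.
# Each particle is attracted to its personal best and the swarm's best.
import numpy as np

rng = np.random.default_rng(1)

def objective(x):
    return np.sum(x**2, axis=1)  # sphere function; global minimum at 0

n_particles, dim = 30, 5
w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, and social coefficients

pos = rng.uniform(-5, 5, size=(n_particles, dim))
vel = np.zeros_like(pos)
pbest_pos = pos.copy()                       # each particle's best position
pbest_val = objective(pos)
gbest_pos = pbest_pos[np.argmin(pbest_val)]  # swarm's best position

for it in range(200):
    r1 = rng.random((n_particles, dim))
    r2 = rng.random((n_particles, dim))
    # Velocity update: inertia plus attraction to personal and global bests.
    vel = w * vel + c1 * r1 * (pbest_pos - pos) + c2 * r2 * (gbest_pos - pos)
    pos = pos + vel
    # Update personal and global bests from the new fitness values.
    val = objective(pos)
    improved = val < pbest_val
    pbest_pos[improved], pbest_val[improved] = pos[improved], val[improved]
    gbest_pos = pbest_pos[np.argmin(pbest_val)]

print("best value found:", pbest_val.min())
```

Note that the loop needs only function evaluations, never gradients, which is what makes PSO attractive for the black-box setting described above.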

The goal of this study is to survey the papers that use PSO to optimize ANNs, categorized by whether they optimize weights and biases or hyperparameters. There are other surveys in this field, covering the optimization of NNs with evolutionary algorithms [11, 12] and with conventional and metaheuristic approaches [13], but this study focuses only on the optimization of NNs using PSO. In this survey, we categorize the existing methods for optimizing NNs with PSO and show the role of hybrid and non-hybrid methods. The paper is organized as follows. Section 2, Background Review, explains the architecture of the Artificial Neural Network, including the forward and backward paths of the BP method, followed by a brief overview of PSO and its implementation. Section 3 reviews previous research on optimizing ANNs using PSO according to the two categorizations above. Section 4 reviews challenges and gaps, and finally Section 5 draws the conclusion.
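As a concrete illustration of the first categorization, optimizing weights and biases with PSO, the sketch below shows how a network's parameters can be flattened into a single particle vector whose fitness is the training loss. The layer sizes, helper names `decode` and `fitness`, and the XOR data are our own illustrative choices, not taken from any reviewed paper.

```python
# Minimal sketch of encoding an NN's weights and biases as a PSO particle.
# A PSO loop like the one above can minimize `fitness` directly, since it
# requires only function evaluations and no gradients.
import numpy as np

def decode(particle, n_in=2, n_hid=4, n_out=1):
    """Unflatten a particle vector into (W1, b1, W2, b2)."""
    i = 0
    W1 = particle[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = particle[i:i + n_hid]; i += n_hid
    W2 = particle[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b2 = particle[i:i + n_out]
    return W1, b1, W2, b2

def fitness(particle, X, y):
    """Mean-squared error of the decoded network: the value PSO minimizes."""
    W1, b1, W2, b2 = decode(particle)
    h = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return np.mean((out - y) ** 2)

# Example: evaluate one random particle on the XOR data from the BP sketch.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
particle = np.random.default_rng(2).normal(size=2 * 4 + 4 + 4 * 1 + 1)
print("fitness of random particle:", fitness(particle, X, y))
```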
