2.1. Artificial neural network (NN)

An NN is composed of simple processing units that compute certain mathematical functions (usually nonlinear). These units are interconnected artificial neurons, or nodes, inspired by biological neurons and their behavior; the neurons are connected to one another to form a network that models the relationships between them. An NN has the ability to learn and store experimental knowledge, and it is a computational system whose nodes can operate in parallel.

Each artificial neuron has one or more inputs and one output. The neuron processing is nonlinear and adaptable. Each neuron has a function that defines its output, associated with a learning rule. Each connection between neurons stores a value called a weight. The inputs are multiplied by the weights, and the summed result goes through the activation function, which activates or inhibits the next neuron.

Mathematically, we can describe the ith neuron in the following form:

$$\text{input summation: } u\_i = \sum\_{j=1}^{p} w\_{ij} x\_j \tag{2a}$$

$$\text{neuron output: } y\_i = \varphi(u\_i) \tag{2b}$$

where x1, x2, ⋯, xp are the inputs; wi1, ⋯, wip are the synaptic weights; ui is the output of the linear combination; φ(·) is the activation function; yi is the ith neuron output; and p is the number of inputs (Figure 1(a)).
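As a minimal illustration of Eqs. (2a) and (2b), the Python sketch below computes one neuron's response; the function and variable names are ours, chosen for clarity, and the activation φ is left as an argument since it is only specified in Eq. (3) below.

```python
def neuron_output(inputs, weights, phi):
    """Response of the ith neuron: Eq. (2a) followed by Eq. (2b)."""
    # Eq. (2a): linear combination u_i = sum_j w_ij * x_j over the p inputs
    u = sum(w * x for w, x in zip(weights, inputs))
    # Eq. (2b): the activation function maps u_i to the neuron output y_i
    return phi(u)

# Example with p = 3 inputs and a simple threshold activation
y = neuron_output([0.5, -1.0, 2.0], [0.1, 0.4, 0.3],
                  phi=lambda u: 1.0 if u > 0 else 0.0)
```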

A feed-forward network, which processes in one direction from input to output, has a layered structure. The input layer is the first layer of an NN, where the patterns are presented; the hidden layers are the intermediary layers; and the last layer is called the output layer, where the results are presented. The number of layers and the quantity of neurons in each layer are determined by the nature of the problem. In most applications, a feed-forward NN with a single layer of hidden units is used with a sigmoid activation function, such as the hyperbolic tangent function (Eq. (3)):

$$\varphi(v) = \frac{1 - \exp\left(-av\right)}{1 + \exp\left(-av\right)}\tag{3}$$
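To make the layered structure concrete, here is a hedged sketch of one feed-forward pass through a single-hidden-layer network using the activation of Eq. (3); the 2-3-1 layer sizes and the weight values are illustrative assumptions only.

```python
import math

def phi(v, a=1.0):
    """Hyperbolic tangent sigmoid of Eq. (3); a controls the slope."""
    return (1.0 - math.exp(-a * v)) / (1.0 + math.exp(-a * v))

def feed_forward(x, W_hidden, W_output):
    """Process in one direction: input layer -> hidden layer -> output layer."""
    # Each row of W_hidden holds the synaptic weights of one hidden neuron
    h = [phi(sum(w * xi for w, xi in zip(row, x))) for row in W_hidden]
    # The output layer combines the hidden activations in the same way
    return [phi(sum(w * hi for w, hi in zip(row, h))) for row in W_output]

# Illustrative 2-3-1 network (2 inputs, 3 hidden units, 1 output)
W_h = [[0.2, -0.5], [0.7, 0.1], [-0.3, 0.8]]
W_o = [[0.5, -0.2, 0.9]]
print(feed_forward([1.0, 0.5], W_h, W_o))
```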

The NN has two distinct phases: the training phase (learning process) and the run phase (activation or generalization). In the training phase, an iterative process adjusts the weights so that the NN establishes the mapping between input and target vector pairs with the best performance. This phase uses the learning algorithm, i.e., a set of procedures for adjusting the weights. An "epoch" is one pass of the training set through the iterative network process, with the verification set tested at each epoch; the iterative process continues until a defined criterion is met, such as a minimum mapping error or a predetermined number of epochs. Once the process stops, the weights are fixed, and the NN is ready to receive new inputs (different from the training inputs) for which it calculates the corresponding outputs. This latter phase is called generalization: after training, each connection has an associated weight value that stores the knowledge of the experimental problem and acts on the input received by each neuron of the NN.
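The two phases can be outlined as a simple training loop with the stopping criteria mentioned above; train_one_epoch and verification_error below are hypothetical stand-ins for whatever learning algorithm and error measure are adopted.

```python
MAX_EPOCHS = 1000   # stopping criterion: a determined number of epochs
ERROR_TOL = 1e-4    # stopping criterion: minimum error of the mapping

def train_one_epoch(weights, training_set):
    # Stand-in for one pass of the training set through the network;
    # a real learning algorithm would adjust the weights here.
    return weights

def verification_error(weights, verification_set):
    # Stand-in for the mapping error measured on the verification set.
    return 0.0

weights = [0.0, 0.0, 0.0]
for epoch in range(MAX_EPOCHS):
    weights = train_one_epoch(weights, training_set=[])
    if verification_error(weights, verification_set=[]) < ERROR_TOL:
        break  # minimum error reached; the weights are now fixed
# Run (generalization) phase: the fixed weights map new inputs to outputs.
```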

Neural network designs, or NN architectures, depend on the learning strategy adopted; see Haykin [19].

Figure 1. (a) Artificial neural network components and (b) multilayer perceptron.

The multilayer perceptron (MLP) (Figure 1(b)) is the NN architecture used in this study, in which the interconnections between the inputs and the output layer pass through one intermediate layer of neurons, a hidden layer [14, 20]. NNs can solve nonlinear problems if nonlinear activation functions are used for the hidden and/or the output layers. In this work, during the training phase, the weights are adjusted by the delta rule. Developed by [45], the delta rule is a version of the least mean square (LMS) method. The delta rule algorithm is summarized as follows:
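A minimal sketch of one delta rule (LMS) update for a single linear neuron follows; the learning rate eta and the (input, target) pairs are illustrative assumptions, not values from the chapter.

```python
def delta_rule_step(weights, x, d, eta=0.1):
    """One delta rule (LMS) update: move the weights to reduce (d - y)."""
    y = sum(w * xi for w, xi in zip(weights, x))  # actual neuron output
    error = d - y                                 # target minus actual output
    # Each weight is adjusted along its input direction, scaled by the error
    return [w + eta * error * xi for w, xi in zip(weights, x)]

# Illustrative pass over two (input, target) pairs
w = [0.0, 0.0]
for x, d in [([1.0, 0.5], 0.8), ([0.3, -0.2], -0.1)]:
    w = delta_rule_step(w, x, d)
```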


In the supervised learning process, the functional to be minimized is treated as a function of the weights wij (Eq. (2)) instead of the NN inputs. For a given input vector x, the NN answer $x^a_{NN}$ is compared to the target answer $x^a_{ref}$. If the difference is smaller than a required precision, no learning takes place; otherwise, the weights are adjusted to reduce this difference. The goal is to minimize the error between the actual output yi (or $x^a_{NN}$) and the target output di (or $x^a_{ref}$) of the training data. The set of procedures used to adjust the weights is the back-propagation learning algorithm, which is generally used for MLP training. It performs the delta rule, considering a set of (input, target) pairs of vectors {(x0, d0), (x1, d1), ⋯, (xN, dN)}, where N is the number of patterns (input elements), and one output vector y = [y0, y1, y2, ⋯, yN]^T. The MLP performs a complex mapping y = φ(w, x) parameterized by the synaptic weights w and the functions φ(·) that provide the activation of the neurons. That is, for each (input/output) training pair, the delta rule determines the direction in which the weights need to be adjusted to reduce the error. In the back-propagation supervised algorithm, the weight adjustments are conducted by back-propagating the error, and the target output is considered the supervisor. Ref. [14] includes brief introductions to the MLP and the back-propagation algorithm.
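To connect this notation with an implementation, the sketch below performs one back-propagation update for a 1-hidden-layer MLP; the activation is Eq. (3) with a = 2, which coincides with the standard tanh, and the network shape and learning rate are assumptions for illustration.

```python
import math

def backprop_step(x, d, W_h, W_o, eta=0.05):
    """One back-propagation update for a 1-hidden-layer MLP (tanh units)."""
    # Forward pass; Eq. (3) with a = 2 is exactly the standard tanh
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_h]
    y = [math.tanh(sum(w * hi for w, hi in zip(row, h))) for row in W_o]
    # Output-layer deltas: (target - output) scaled by tanh'(u) = 1 - y^2
    delta_o = [(dk - yk) * (1 - yk * yk) for dk, yk in zip(d, y)]
    # Hidden-layer deltas: the output error is back-propagated through W_o
    delta_h = [(1 - hj * hj) * sum(dk * row[j] for dk, row in zip(delta_o, W_o))
               for j, hj in enumerate(h)]
    # Both layers apply the delta rule: adjust along the presynaptic activity
    new_W_o = [[w + eta * dk * hj for w, hj in zip(row, h)]
               for row, dk in zip(W_o, delta_o)]
    new_W_h = [[w + eta * dj * xi for w, xi in zip(row, x)]
               for row, dj in zip(W_h, delta_h)]
    return new_W_h, new_W_o

# Illustrative 2-3-1 update on one (input, target) pair
W_h = [[0.2, -0.5], [0.7, 0.1], [-0.3, 0.8]]
W_o = [[0.5, -0.2, 0.9]]
W_h, W_o = backprop_step([1.0, 0.5], [0.4], W_h, W_o)
```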

NN applications generally involve function approximation (modeling of nonlinear transfer functions) and pattern classification. Refs. [18, 25] reviewed applications of NNs in the environmental sciences, including the atmospheric sciences. They reviewed some NN concepts and applications; these reviews also covered other estimation methods and their applications. Other reviews of NN applications in the atmospheric sciences, addressing the prediction of air quality, surface ozone concentration, dioxide concentrations, severe weather, etc., and pattern classification applications in remote sensing data to distinguish between clouds and ice or snow, were presented by [14]. Refs. [18, 25] also presented applications on the classification of atmospheric circulation patterns, land cover, and convergence lines from radar imagery, and the classification of remote sensing data using NNs. Data assimilation was not mentioned in such reviews.
