### 4.6 Neural network architectures

The manner in which the neurons of a neural network are structured is intimately linked with the learning algorithm used to train the network, and it gives rise to distinct network architectures. These architectures are classified according to how information flows through the network. The principal architectures are: (a) multilayer perceptrons, (b) recurrent networks, (c) radial basis function (RBF) networks, and (d) the Kohonen self-organizing feature map.

### 4.7 Multilayer perceptrons (MLPs)

MLPs are layered (single- or multi-layered) feed-forward networks typically trained with static back-propagation (Figure 3); they are therefore also called feed-forward back-propagation (FFBP) neural networks. These networks have found their way into countless applications requiring static pattern classification. The architecture consists of an input layer, an output layer, and one or more hidden layers. The input signal moves in the forward direction only, from the input nodes through the hidden nodes to the output nodes. The function of the hidden layers is to perform intermediate computations between the input and output layers through their weights. The major advantages of FFBP networks are that they are easy to handle and can approximate virtually any input-output map [37].
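
As a minimal illustration (a sketch, not the chapter's implementation), the following NumPy code performs one forward pass through a three-layer FFBP network. The sigmoid activation, layer sizes, and random weight initialization are assumptions made for the example; the names `W_ij` and `W_jk` follow the notation of Figure 4.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation, assumed here for both hidden and output layers."""
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W_ij, W_jk):
    """One forward pass through a three-layer (input-hidden-output) MLP.

    x    : input vector, shape (n_inputs,)
    W_ij : input-to-hidden weights, shape (n_hidden, n_inputs)
    W_jk : hidden-to-output weights, shape (n_outputs, n_hidden)
    """
    h = sigmoid(W_ij @ x)   # hidden-layer activations
    y = sigmoid(W_jk @ h)   # output-layer activations
    return h, y

# Example: 3 inputs, 4 hidden neurons, 1 output, random initial weights
rng = np.random.default_rng(0)
W_ij = rng.normal(scale=0.5, size=(4, 3))
W_jk = rng.normal(scale=0.5, size=(1, 4))
_, y = mlp_forward(np.array([0.2, 0.5, 0.1]), W_ij, W_jk)
```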

Figure 3. Types of neural network architectures [37]: (a) multilayer perceptron; (b) recurrent neural network; (c) radial basis function network.

### 4.8 Recurrent neural networks (RNN)

RNNs may be fully recurrent networks (FRNs) or partially recurrent networks (PRNs). An FRN sends the outputs of the hidden layer back to that same layer, whereas a PRN starts from a feed-forward architecture and adds selected feedback connections (Figure 3). A simple RNN can be constructed by modifying a multilayered feed-forward network with the addition of a 'context layer'. At the first epoch, new inputs are presented to the RNN and the contents of the hidden layer are copied to the context layer; at the next epoch, this stored information is fed back to the hidden layer together with the new inputs, and the weights between the hidden and context layers are updated in the same manner as the other weights. An RNN can thus have an effectively unlimited memory depth and can find relationships through time as well as through the instantaneous input space. Recurrent networks are the state of the art in nonlinear time series prediction, system identification, and temporal pattern classification [37–39].
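
A minimal sketch of the context-layer mechanism just described, in the style of an Elman network; the tanh activation, layer sizes, and random weights are assumptions for illustration:

```python
import numpy as np

def elman_step(x, context, W_xh, W_ch, W_hy):
    """One time step of a simple Elman-style recurrent network.

    x       : input vector at the current time step
    context : hidden activations saved from the previous step
    W_xh    : input-to-hidden weights
    W_ch    : context-to-hidden weights (the recurrent feedback)
    W_hy    : hidden-to-output weights
    """
    h = np.tanh(W_xh @ x + W_ch @ context)  # hidden state mixes new input with memory
    y = W_hy @ h                            # linear read-out
    return y, h                             # h becomes the next step's context

# Example: run a short input sequence through the network
rng = np.random.default_rng(1)
n_in, n_hidden, n_out = 2, 5, 1
W_xh = rng.normal(scale=0.3, size=(n_hidden, n_in))
W_ch = rng.normal(scale=0.3, size=(n_hidden, n_hidden))
W_hy = rng.normal(scale=0.3, size=(n_out, n_hidden))

context = np.zeros(n_hidden)
for x in rng.normal(size=(10, n_in)):      # 10-step input sequence
    y, context = elman_step(x, context, W_xh, W_ch, W_hy)
```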

### 4.9 Radial basis function (RBF) networks

The RBF network is a three-layer feed-forward network with a nonlinear Gaussian transfer function between the input and hidden layers and a linear transfer function between the hidden and output layers (Figure 3). The RBF network requires more hidden neurons than a standard FFBP network, but it tends to learn much faster than an MLP [37]. The most common basis function is the Gaussian, given by:

$$R\_i(\mathbf{x}) = \exp\left(-\sum\_{j=1}^{n} \frac{(x\_j - c\_{ij})^2}{2\sigma\_{ij}^2}\right) \tag{9}$$

where Ri = the Gaussian basis function of the ith hidden neuron; cij and σij = the center and width of that Gaussian along the jth component of the input vector x. The centers and widths of the Gaussians are set by unsupervised learning rules, typically clustering techniques, while supervised learning is applied to the output layer: once the centers are determined, the connection weights between the hidden and output layers can be found through ordinary back-propagation (gradient-descent) training. The output layer then performs a simple weighted sum with a linear output:

$$y = \sum\_{i=1}^{n} w\_i R\_i(\mathbf{x}) + w\_0 \tag{10}$$

where wi = the connection weight between the ith hidden neuron and the output neuron, and w0 = the bias.
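
Eqs. (9) and (10) can be sketched directly in NumPy; the toy centers, widths, and weights below are placeholders, whereas the text notes that in practice they would be obtained from clustering and gradient-descent training:

```python
import numpy as np

def rbf_forward(x, centers, widths, w, w0):
    """Evaluate an RBF network for one input vector x.

    centers : (n_hidden, n_inputs) Gaussian centers c_ij
    widths  : (n_hidden, n_inputs) Gaussian widths sigma_ij
    w       : (n_hidden,) hidden-to-output weights w_i
    w0      : scalar bias
    """
    # Eq. (9): Gaussian activation of each hidden neuron
    R = np.exp(-np.sum((x - centers) ** 2 / (2.0 * widths ** 2), axis=1))
    # Eq. (10): linear weighted sum at the output
    return w @ R + w0

# Toy example: 3 hidden neurons on 2-dimensional inputs
centers = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, -0.5]])
widths = np.full((3, 2), 0.6)
w = np.array([0.4, -0.2, 0.7])
print(rbf_forward(np.array([0.3, 0.2]), centers, widths, w, 0.1))
```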

### 4.10 ANN learning paradigms

Broadly speaking, there are two types of learning processes, namely supervised and unsupervised. In supervised learning, the network is presented with examples of known input-output data pairs, after which it starts to mimic the presented input-output behavior or pattern. In unsupervised learning, the network learns on its own, in a kind of self-study without a teacher.

Supervised learning: Also called 'associative learning', it involves providing the network with a set of inputs and the desired outputs; it is like learning with the help of a teacher. The so-called teacher has knowledge of the environment, represented by a set of input-output examples, whereas the environment itself is unknown to the neural network. The network parameters (i.e., the synaptic weights) are adjusted iteratively, in a step-by-step fashion, under the combined influence of the training vector and the error signal. After the completion of training, the neural network is able to deal with the environment completely by itself [32]. The FFBP NN is the most popular supervised network. In FFBP NNs, neurons are organized into layers and information is passed from the input layer to the final output layer in a unidirectional manner. The network consists of 'neurons, nodes, or parallel processing elements' that interconnect the layers through weights (W). A three-layer (input (i), hidden (j), and target/output (k)) FFBP NN with weights Wij and Wjk is shown in Figure 4. During training of the FFBP NN, the initial (randomized) weight values are corrected or adjusted according to the error calculated between the output and target values, and these errors are back-propagated (from right to left in Figure 4) until the minimum error criterion is achieved.
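
The training cycle described above (forward pass, error computation, and backward weight correction) can be sketched as follows; the sigmoid activation, squared-error measure, learning rate, and stopping threshold are illustrative assumptions, not values prescribed by the chapter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, target, W_ij, W_jk, lr=0.5):
    """One FFBP update: forward pass, then back-propagate the error."""
    h = sigmoid(W_ij @ x)                         # hidden activations
    y = sigmoid(W_jk @ h)                         # network output

    delta_k = (y - target) * y * (1.0 - y)        # output-layer error signal
    delta_j = (W_jk.T @ delta_k) * h * (1.0 - h)  # error propagated to hidden layer

    W_jk -= lr * np.outer(delta_k, h)             # adjust hidden-to-output weights
    W_ij -= lr * np.outer(delta_j, x)             # adjust input-to-hidden weights
    return 0.5 * np.sum((y - target) ** 2)        # squared error for the stop test

# Train on one example until the (assumed) minimum-error criterion is met
rng = np.random.default_rng(2)
W_ij, W_jk = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))
x, t = np.array([0.2, 0.7, 0.1]), np.array([0.3])
for _ in range(10_000):
    if train_step(x, t, W_ij, W_jk) < 1e-6:
        break
```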

Figure 4. A three-layer feed-forward ANN model [7].

Unsupervised learning: The network is provided with inputs but not with desired outputs, so the system itself must decide what features it will use to group the input data. This is often referred to as self-organization or adaptation. Provision is made for a task-independent measure of the quality of the representation that the network is required to learn, and the free parameters of the network are optimized with respect to that measure [32]. The most widely used unsupervised neural network is the Kohonen self-organizing map (KSOM).

### 4.11 Kohonen self-organizing map (KSOM)

The KSOM maps input data onto a two-dimensional discrete output map by clustering similar patterns. It consists of two interconnected layers, namely a multidimensional input layer and a competitive output layer with 'w' neurons (Figure 5).
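
A self-contained sketch of competitive KSOM training; the 10 x 10 grid, exponential decay schedules, and Gaussian neighbourhood function are common conventions assumed for this example rather than details specified in the text.

```python
import numpy as np

def train_ksom(data, grid_shape=(10, 10), epochs=100,
               lr0=0.5, radius0=5.0, seed=0):
    """Train a toy Kohonen self-organizing map on 'data'.

    Each node of the 2-D output grid holds a weight vector of the same
    dimension as the inputs. For every sample, the best-matching unit
    (BMU) is found; it and its grid neighbours are pulled towards the
    sample, while the learning rate and neighbourhood shrink over time.
    """
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every node, used by the neighbourhood function
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=-1)
    for t in range(epochs):
        lr = lr0 * np.exp(-t / epochs)           # decaying learning rate
        radius = radius0 * np.exp(-t / epochs)   # shrinking neighbourhood
        for x in data:
            # Competitive step: locate the best-matching unit
            d = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), d.shape)
            # Cooperative step: Gaussian neighbourhood around the BMU
            g = np.linalg.norm(coords - np.array(bmu), axis=-1)
            h = np.exp(-(g ** 2) / (2.0 * radius ** 2))
            # Adaptive step: move weights towards the sample
            weights += lr * h[..., None] * (x - weights)
    return weights

# Example: organize 200 random 3-D points onto a 10 x 10 map
data = np.random.default_rng(1).random((200, 3))
som = train_ksom(data)
```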

