Determination of the Elastic Constants of a Metal-Laminated Composite Material Using Artificial Neural Networks

*Marta Eraña-Díaz and Mario Acosta-Flores*

## **Abstract**

This chapter explores the use of an artificial neural network (ANN) to obtain the elastic constants of the components of a metal-laminated composite material (MLCM). The dataset for training and validating the ANN was generated by applying an analytical model developed for the study of stresses in MLCMs. The dataset covers several MLCM configurations, with variations in the structural presentation of the inputs and outputs. The best configuration found for the ANN models yielded an average relative error below 4% with respect to the constants evaluated and published in a previous article. As this research shows, a clear definition of the problem, an effective selection and preparation of the training data, and the correct application of the ANN are all essential in the constitutive modeling of composite materials.

**Keywords:** elastic constants of laminated composite materials, artificial neural networks, composite materials, constitutive model of composite materials, training dataset

## **1. Introduction**

Artificial neural networks (ANN) are an efficient artificial intelligence (AI) technique applied in several areas such as bioinformatics [1] for classification, function approximation, and knowledge discovery, as well as for data visualization in medical diagnosis [2].

Various numerical models and experimental techniques have been combined with ANN to obtain the mechanical properties of composite materials, such as Young's modulus (*E*), rigidity modulus (*G*), elastic limit, and maximum tensile stress [3, 4]. In [5], the elastic constants of a face-centered cubic austenitic stainless steel are determined. In [6], the elastic parameters of an orthotropic material are obtained from experimental data using the finite element method (FEM) combined with ANN. The method described in [7] combines the FEM and deep neural networks to obtain constitutive relationships from indirect observations. Acosta et al. [8] use a linear constitutive analytical model proposed in [9] for the analysis and determination of the elastic constants of laminated composite materials with metallic layers, where the elastic constants of a laminate's component are obtained through an axial-load experimental test.

Regarding the state of the art of ANN applications in the constitutive modeling of composite materials, [10] exposes the obstacles encountered due to the difficulty of obtaining a large amount of constitutive experimental training data.

This research presents a method to obtain the elastic constants of one of the components of an MLCM using ANN. The amount of data needed for training was obtained using the constitutive models of composite materials proposed in [8].

## **2. Artificial neural networks (ANN)**

ANN are models inspired by the functioning of the human brain. They are made up of a set of connected nodes (artificial neurons) that transmit signals to each other from an input stage to generate an output, and they learn by automatically adjusting their connections. There are several types of neural networks [11, 12], including recurrent neural networks (RNN) and feed-forward neural networks. The latter are networks in which the connections between units do not form a cycle, so information only moves forward.

This research used feed-forward ANNs, made up of neurons grouped in layers: an input layer, one or more hidden layers, and an output layer. Each connection in the network has a weight, a numerical value that modifies the received input. The modified values are output from the neurons: if the output of an individual neuron is above a specified threshold value, the neuron fires and sends data to the next layer of the network; otherwise, the data does not go through. This operation can be appreciated in **Figure 1**.

The *Hj* neuron assigns a weight to each of its inputs (Eq. (1)); the weight of the connection from *Hi* to *Hj* is represented as *wij*. The threshold represents the neuron's degree of inhibition. The neuron's activation *ai(t)* (Eq. (2)) is calculated with an activation function *f(t)*, which can be linear (Eq. (3)), logistic (Eq. (4)), or hyperbolic tangent (Eq. (5)).

$$H\_j = \sum\_{i=1}^{n} w\_{ij} a\_i + B\_j \tag{1}$$

$$a\_i(t) = f\_i(H\_i) \tag{2}$$

#### **Figure 1.**

*Example of a feed-forward ANN configuration with i-inputs (X1 … Xi) in the input layer, bias1 and j neurons in hidden layer one (B1, H1, … , Hj), bias2 and k neurons in hidden layer two (B2, H1, … , Hk), and bias3 in the output layer (B3, Y1, Y2) for the two outputs.*

*Determination of the Elastic Constants of a Metal-Laminated Composite Material Using… DOI: http://dx.doi.org/10.5772/intechopen.108601*

$$f(t) = t \tag{3}$$

$$f(t) = \frac{1}{1 + e^{-t}} \tag{4}$$

$$f(t) = \frac{e^t - e^{-t}}{e^t + e^{-t}} \tag{5}$$

Thus, each ANN neuron, except those in the input layer and the bias neurons, processes all its inputs and provides its own activation as an output.
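Eqs. (1)–(5) condense into a few lines of code. The following standalone Python sketch is illustrative only (the study itself used R's "neuralnet" package, and the weights below are arbitrary values, not trained ones); it propagates an input vector through a small feed-forward network:

```python
import numpy as np

def logistic(t):
    # Eq. (4): logistic (sigmoid) activation function
    return 1.0 / (1.0 + np.exp(-t))

def forward(x, weights, biases, activation=logistic):
    # Each layer computes H_j = sum_i(w_ij * a_i) + B_j (Eq. (1))
    # and outputs its activation a_j = f(H_j) (Eq. (2)).
    a = np.asarray(x, dtype=float)
    for W, b in zip(weights, biases):
        a = activation(W @ a + b)
    return a

# Tiny 2-3-1 network with arbitrary fixed weights, for illustration
W1 = np.array([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]])
b1 = np.zeros(3)
W2 = np.array([[0.7, -0.5, 0.2]])
b2 = np.zeros(1)
y = forward([1.0, 2.0], [W1, W2], [b1, b2])  # logistic output lies in (0, 1)
```

Swapping `logistic` for `np.tanh` gives the hyperbolic tangent variant of Eq. (5).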

Once the ANN has been designed, the training process begins to ensure that the *wij* given by each neuron is set correctly so that the entire network provides an acceptable output.

During this process, the neural network stores knowledge from a subset of data containing information on both the inputs and their corresponding outputs, known as "desired outputs." The outputs obtained by the network are compared with the desired outputs, and the synaptic weights (*wij*) are updated so as to reduce the margin of error in the network results. This procedure is repeated until the network reaches a satisfactory performance. One of the methods used to train an ANN is backpropagation [13, 14], where the *wij* update is done by gradient descent, minimizing the mean squared error (MSE) (Eq. (6)).

$$\text{MSE} = \frac{1}{N} \sum\_{i=1}^{N} \left(\chi\_{pred,i} - \chi\_{act,i}\right)^2 \tag{6}$$

Overfitting, an ANN flaw [15–17], prevents it from obtaining acceptable outputs from unobserved data, that is, those not used in training. Ying X [18] proposes the following strategies to minimize the effects of overfitting: (1) stop training before finding the optimal MSE; (2) exclude any noise in the training set; and (3) expand the training data.
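Strategy (1), stopping before the optimal training MSE, can be sketched as follows. This is a generic Python illustration, not the chapter's R workflow; `step` and `val_error` are hypothetical placeholders for one training epoch and for the error on held-out data:

```python
def train_with_early_stopping(step, val_error, max_epochs=500, patience=20):
    # Stop before the training MSE reaches its optimum: abort as soon as the
    # validation error has not improved for `patience` consecutive epochs.
    best, best_epoch = float("inf"), 0
    for epoch in range(max_epochs):
        step()                # one training epoch (placeholder)
        err = val_error()     # error on data not used for training
        if err < best:
            best, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            break             # no improvement for `patience` epochs: stop early
    return best
```

The returned value is the best validation error seen, which corresponds to the model that would be kept.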

## **3. Methodology for determining the elastic constants of a metal-laminated composite material using artificial neural networks**

Obtaining efficient and consistent results when calculating the elastic constants of an MLCM using an ANN with a constitutive model of composite materials requires a clear and complete understanding of the analytical model presented in [8], an efficient preparation of the training data, and the correct application of the ANN. The methodology used in this work is as follows:


## **4. Linear analytical model of axial load of laminated composite materials**

This study uses a linear analytical model of a composite laminated material made up of layers of metallic material. The laminate components are assumed to be relatively thin and homogeneous, with linear-elastic properties, and the bond between them is assumed to be perfect.

For the global uniaxial stress problem, a homogeneous state of strain is considered throughout the laminate as well as in the layers, and each point of the laminate presents a state of plane stress.

At the local level, the state of stress in each layer is biaxial, and the normal stresses have a constant average distribution through the thickness of the layers. The state of plane stress generated at the internal points of each layer (local analysis) will be referred to as the intralaminar state of stress, while the stress components of layer i in directions 1 and 2 will be called intralaminar stresses (*σxi* and *σyi*).

The linear analytical model allows the application of the superposition principle (SP) considering the general problem as a set of individual problems. Therefore, for

**Figure 2.** *Representation of the stress state, global, and local models [8].*


each load condition, the state of global stresses (average or total) *σGx* and *σGy* are the sum of the states of individual stresses (local) in each layer, see **Figure 2**. The analytical model's global–local equation (Eq. (7)) is as follows:

$$
\sigma\_{G\mathbf{x}} = n\_I \sigma\_{\mathbf{x}\mathbf{I}} + n\_{\text{II}} \sigma\_{\mathbf{x}\mathbf{II}} + n\_{\text{III}} \sigma\_{\mathbf{x}\mathbf{III}} + \dots + n\_{\text{i}} \sigma\_{\mathbf{x}\mathbf{i}}
$$

$$
\sigma\_{G\mathbf{y}} = n\_I \sigma\_{\mathbf{y}\mathbf{I}} + n\_{\text{II}} \sigma\_{\mathbf{y}\mathbf{II}} + n\_{\text{III}} \sigma\_{\mathbf{y}\mathbf{III}} + \dots + n\_{\text{i}} \sigma\_{\mathbf{y}\mathbf{i}}\tag{7}
$$

*σxi* and *σyi* represent intralaminar stresses, and *σGx* and *σGy* are the global averages of the stresses in the *x* and *y* directions. *ni* represents the volumetric fraction of the material layers, *ni* = *hi*/*h*, where

$$1 = n\_I + n\_{II} + n\_{III} + \dots + n\_i \tag{8}$$

The values of *ni* are the volumetric fractions of material with different properties, and *h* and *hi* are the total thickness and the thickness of the layers or layer groups, respectively.
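Eqs. (7) and (8) amount to a volume-fraction-weighted average. A minimal Python sketch (the numbers are illustrative, not taken from [8]):

```python
def global_stress(fractions, sx, sy):
    # Eq. (8): the volumetric fractions n_i must sum to 1
    assert abs(sum(fractions) - 1.0) < 1e-9, "fractions must sum to 1"
    # Eq. (7): global stresses are fraction-weighted sums of the
    # intralaminar stresses of each layer (or layer group)
    sGx = sum(n * s for n, s in zip(fractions, sx))
    sGy = sum(n * s for n, s in zip(fractions, sy))
    return sGx, sGy

# Two components of equal thickness (n1 = n2 = 0.5), stresses in MPa
sGx, sGy = global_stress([0.5, 0.5], [10.0, 14.0], [-1.0, 1.0])
# → sGx = 12.0 MPa, sGy = 0.0 MPa
```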

## **4.1 Definition of the experimental and illustrative example problem and identification of parameters to consider**

The application of the ANN technique requires a data set that helps the network to learn certain patterns related to the analyzed problem. The variables and parameters that will be considered as input and output data during the numerical application of the ANN must be those necessary and sufficient so that the problem is representative. If some key parameters are not considered in the problem, the performed study will be an incomplete and poorly formulated problem, implying a deficient solution.

In a mechanical problem, the state of stress is a function of position, geometry, boundary conditions, and material. For the problem discussed here, the stresses *σGx* and *σGy* applied at the boundaries were uniformly distributed. Since the strain state was considered homogeneous, the state of plane stress at a point was independent of the position within each component.

For the case analyzed in [8], which uses a laminated composite material consisting of metallic layers of two different materials (isotropic, homogeneous, and linear-elastic), the global and local equations, Eqs. (9) and (10) respectively, are as follows:

$$
\sigma\_{G\mathbf{x}} = n\_1 \sigma\_{\mathbf{x}M1} + n\_2 \sigma\_{\mathbf{x}M2}
$$

$$
\sigma\_{G\mathbf{y}} = n\_1 \sigma\_{\mathbf{y}M1} + n\_2 \sigma\_{\mathbf{y}M2} \tag{9}
$$

$$\sigma\_{xM1} = Q\_{11M1} \varepsilon\_{x1} + Q\_{12M1} \varepsilon\_{y1}$$

$$\sigma\_{yM1} = Q\_{21M1} \varepsilon\_{x1} + Q\_{22M1} \varepsilon\_{y1}$$

$$\sigma\_{xM2} = Q\_{11M2} \varepsilon\_{x2} + Q\_{12M2} \varepsilon\_{y2}$$

$$\sigma\_{yM2} = Q\_{21M2} \varepsilon\_{x2} + Q\_{22M2} \varepsilon\_{y2} \tag{10}$$

And considering the engineering constants:

$$Q\_{11M1} = Q\_{22M1} = \frac{E\_{M1}}{1 - \nu\_{M1}^2}$$

$$Q\_{12M1} = Q\_{21M1} = \frac{\nu\_{M1} E\_{M1}}{1 - \nu\_{M1}^2}$$

$$Q\_{11M2} = Q\_{22M2} = \frac{E\_{M2}}{1 - \nu\_{M2}^2}$$

$$Q\_{12M2} = Q\_{21M2} = \frac{\nu\_{M2} E\_{M2}}{1 - \nu\_{M2}^2} \tag{11}$$

Eq. (10) can be represented as follows:

$$\sigma\_{xM1} = \frac{E\_{M1}}{1 - \nu\_{M1}^2} \varepsilon\_{x1} + \frac{\nu\_{M1} E\_{M1}}{1 - \nu\_{M1}^2} \varepsilon\_{y1}$$

$$\sigma\_{yM1} = \frac{\nu\_{M1} E\_{M1}}{1 - \nu\_{M1}^2} \varepsilon\_{x1} + \frac{E\_{M1}}{1 - \nu\_{M1}^2} \varepsilon\_{y1}$$

$$\sigma\_{xM2} = \frac{E\_{M2}}{1 - \nu\_{M2}^2} \varepsilon\_{x2} + \frac{\nu\_{M2} E\_{M2}}{1 - \nu\_{M2}^2} \varepsilon\_{y2}$$

$$\sigma\_{yM2} = \frac{\nu\_{M2} E\_{M2}}{1 - \nu\_{M2}^2} \varepsilon\_{x2} + \frac{E\_{M2}}{1 - \nu\_{M2}^2} \varepsilon\_{y2} \tag{12}$$

Here, *Q*11*<sup>M</sup>*1, *Q*12*<sup>M</sup>*1, *Q*22*<sup>M</sup>*1, *Q*11*<sup>M</sup>*2, *Q*12*<sup>M</sup>*2, and *Q*22*M*<sup>2</sup> represent the material's stiffness constants. The engineering constants for each layer were Young's moduli (*EM*<sup>1</sup> and *EM*2) and Poisson's ratios (*ν<sup>M</sup>*<sup>1</sup> and *νM*2). The deformation states were defined for each layer through their longitudinal strains: *εx*1, *ε<sup>y</sup>*<sup>1</sup> and *εx*2, *εy*2.
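Eqs. (11) and (12) translate directly into code. A Python sketch (with an aluminum-like layer as an assumed example, *E* in GPa):

```python
def stiffness_constants(E, nu):
    # Eq. (11): plane-stress stiffness constants of an isotropic layer
    Q11 = E / (1.0 - nu**2)       # Q11 = Q22
    Q12 = nu * E / (1.0 - nu**2)  # Q12 = Q21
    return Q11, Q12

def layer_stresses(E, nu, ex, ey):
    # Eq. (12): intralaminar stresses from the layer strains
    Q11, Q12 = stiffness_constants(E, nu)
    return Q11 * ex + Q12 * ey, Q12 * ex + Q11 * ey

# Aluminum-like layer: when ey = -nu * ex the transverse stress vanishes,
# recovering the uniaxial case
sx, sy = layer_stresses(70.0, 0.33, 1e-4, -0.33e-4)  # sy ≈ 0
```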

## **5. Neural network training process**

### **5.1 ANN arguments**

As mentioned in Sections 2 and 3, the software used to train the ANNs was R [19] with the "neuralnet" library [20]; the parameters used are shown in **Table 1**.

**Table 1.** *Arguments for the neuralnet function.*

**Figure 3.** *Procedure diagram for the ANN training process.*

The learning algorithm used was resilient backpropagation [21, 22], which modifies the update value for each weight *wij* according to the sign behavior of the partial derivative in each dimension of the weight space; this reduces the number of steps compared to the original gradient-descent backpropagation procedure.
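The idea behind resilient backpropagation can be sketched for a single weight as follows. This is a simplified Python illustration of the sign-based update rule (close to the iRprop⁻ variant); the actual implementation lives inside the "neuralnet" package:

```python
def rprop_update(w, grad, prev_grad, delta,
                 eta_plus=1.2, eta_minus=0.5, delta_max=50.0, delta_min=1e-6):
    # The step size `delta` grows while the partial derivative keeps its sign
    # and shrinks when the sign flips; only the SIGN of the gradient, not its
    # magnitude, determines the direction of the weight change.
    if grad * prev_grad > 0:
        delta = min(delta * eta_plus, delta_max)
    elif grad * prev_grad < 0:
        delta = max(delta * eta_minus, delta_min)
        grad = 0.0  # skip the update after a sign change (iRprop- style)
    if grad > 0:
        w -= delta
    elif grad < 0:
        w += delta
    return w, delta, grad
```

Because the step size adapts per weight, badly scaled gradients slow training far less than in plain gradient descent.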

The procedure to obtain a good ANN begins with the generation of the dataset, followed by a normalization process that scales the data values to improve learning. The process scaled each input over its maximum value, as seen in Eq. (13).

$$x\_i \leftarrow \frac{x\_i}{\max\left(x\_1, \dots, x\_n\right)}, \quad \forall i \in \{1, \dots, n\} \tag{13}$$
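Eq. (13) is a simple max-scaling, applied series by series to the inputs; a minimal Python sketch:

```python
def max_scale(values):
    # Eq. (13): divide every value of a series by the maximum of that series,
    # so all scaled inputs fall in a comparable range before training
    m = max(values)
    return [v / m for v in values]

scaled = max_scale([2.0, 5.0, 10.0])
# → [0.2, 0.5, 1.0]
```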

The best and final dataset built for this study consisted of 253 records: 76% were used for training (data1), 14% for testing (data2), and the remaining 10% for model validation (data3), i.e., 192, 36, and 25 records, respectively. A final dataset (data4) was built for the ANN application, as indicated in Section 7.3, with the results published in [8].

The variants in the arguments for ANN generation in this study were 2 or 3 hidden layers, with either the hyperbolic tangent activation function (Eq. (5)) or the logistic function (Eq. (4)). The number of neurons in each hidden layer was chosen to obtain the lowest RE in both the training and test datasets. All this is depicted in **Figure 3**.

It should be noted that some variations in the structural presentation of the inputs and outputs were made during the elaboration of the dataset; this was necessary because high MSE values were obtained during ANN training.

#### **5.2 Generation of training data from the analytical model for the ANN**

As seen in Eqs. (10) and (12), the necessary and sufficient variables that define the plane stress models, based on the stiffness constants and the engineering constants, are: the material concentration of the components in the laminate (*ni*); the stress components of the global state of stress (*σGx* and *σGy*); the local states of stress in each component (*σxi* and *σyi*); and the elastic constants of the known components (*EM1*, *EM2*, *νM1*, and *νM2*, or *Q11M1*, *Q12M1*, *Q21M1*, and *Q22M1*).

When the objective of the ANN is to directly determine the engineering elastic constants of one of the components of the laminate, based on Eqs. (9) and (12), the input parameters are *n1*, *n2*, *σGx*, *σGy*, *εx1*, *εy1*, *εx2*, *εy2*, *EM1*, and *νM1*, and the outputs are *EM2* and *νM2*; with these, the EvANN was constructed. A QANN was constructed for Eqs. (9) and (10), with input parameters *n1*, *n2*, *σGx*, *σGy*, *εx1*, *εy1*, *εx2*, *εy2*, *Q11M1*, and *Q12M1*, and outputs *Q11M2* and *Q12M2*.

### **5.3 Specification of quantitative ranges of input data**

As described in the methodology, the input data must establish:


$$\varepsilon\_{x} = \frac{\sigma\_{Gx} (Q\_{11M1}n\_1 + Q\_{11M2}n\_2)}{Q\_{11M1}^2 n\_1^2 + 2Q\_{11M1}Q\_{11M2}n\_1 n\_2 + Q\_{11M2}^2 n\_2^2 - Q\_{12M1}^2 n\_1^2 - 2Q\_{12M1}Q\_{12M2}n\_1 n\_2 - Q\_{12M2}^2 n\_2^2}$$

$$\varepsilon\_{y} = -\frac{\sigma\_{Gx} (Q\_{12M1}n\_1 + Q\_{12M2}n\_2)}{Q\_{11M1}^2 n\_1^2 + 2Q\_{11M1}Q\_{11M2}n\_1 n\_2 + Q\_{11M2}^2 n\_2^2 - Q\_{12M1}^2 n\_1^2 - 2Q\_{12M1}Q\_{12M2}n\_1 n\_2 - Q\_{12M2}^2 n\_2^2} \tag{14}$$


and

$$\varepsilon\_{x} = \frac{\sigma\_{Gx} \left( E\_{M1} n\_1 \nu\_{M2}^2 + E\_{M2} n\_2 \nu\_{M1}^2 - E\_{M1} n\_1 - E\_{M2} n\_2 \right)}{E\_{M1}^2 n\_1^2 \nu\_{M2}^2 + 2 E\_{M1} E\_{M2} n\_1 n\_2 \nu\_{M1} \nu\_{M2} + E\_{M2}^2 n\_2^2 \nu\_{M1}^2 - E\_{M1}^2 n\_1^2 - 2 E\_{M1} E\_{M2} n\_1 n\_2 - E\_{M2}^2 n\_2^2}$$

$$\varepsilon\_{y} = -\frac{\sigma\_{Gx} \left( E\_{M1} n\_1 \nu\_{M1} \nu\_{M2}^2 + E\_{M2} n\_2 \nu\_{M1}^2 \nu\_{M2} - E\_{M1} n\_1 \nu\_{M1} - E\_{M2} n\_2 \nu\_{M2} \right)}{E\_{M1}^2 n\_1^2 \nu\_{M2}^2 + 2 E\_{M1} E\_{M2} n\_1 n\_2 \nu\_{M1} \nu\_{M2} + E\_{M2}^2 n\_2^2 \nu\_{M1}^2 - E\_{M1}^2 n\_1^2 - 2 E\_{M1} E\_{M2} n\_1 n\_2 - E\_{M2}^2 n\_2^2} \tag{15}$$
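Both forms of the solution follow the same pattern: with A = n1·Q11M1 + n2·Q11M2 and B = n1·Q12M1 + n2·Q12M2, the strains are εx = σGx·A/(A² − B²) and εy = −σGx·B/(A² − B²); expanding A² − B² reproduces the denominator of Eq. (14). A Python sketch of how such training pairs could be generated (illustrative values, not the dataset of [8]):

```python
def strains_from_Q(sGx, n1, n2, Q11M1, Q12M1, Q11M2, Q12M2):
    # Eq. (14) in compact form: ex = sGx*A/(A^2 - B^2), ey = -sGx*B/(A^2 - B^2)
    A = n1 * Q11M1 + n2 * Q11M2  # fraction-weighted Q11
    B = n1 * Q12M1 + n2 * Q12M2  # fraction-weighted Q12
    return sGx * A / (A**2 - B**2), -sGx * B / (A**2 - B**2)

# Sanity check: a single material (n2 = 0) must recover ex = sGx/E, ey = -nu*ex
Q11 = 70.0 / (1.0 - 0.33**2)  # aluminum-like layer, E = 70 GPa, nu = 0.33
Q12 = 0.33 * Q11
ex, ey = strains_from_Q(7.0, 1.0, 0.0, Q11, Q12, Q11, Q12)
# → ex ≈ 0.1, ey ≈ -0.033
```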

The maximum and minimum quantitative values of the boundary conditions in the training data were established using the values found in [8] as a reference: between 1 and 22 MPa for the global input stress. The component concentrations in the MLCM were bounded between 0 and 1 for 2, 3, 4, 5, and 6 layers of two metallic components assumed to have the same thickness. **Tables B1**–**B4** in Appendix B show various scenarios evaluated during the study.

The considered scenarios were:


The training data obtained from the model were adjusted so that there was not much difference in the order of magnitude of the values: the stresses were given in MPa, the *Q*'s and *E*'s in GPa, and the strains in *με*.

An EvANN model to determine engineering constants and a QANN model to determine stiffness coefficients were presented to contextualize the effect that occurs when an ANN model is trained from simple or general knowledge; their implications can be seen in Eqs. (14) and (15).

The nomenclature used in the analytical model and the ANN network formulas is shown in Appendix A **Table A1**.

### **5.4 EvANN**

As mentioned above, this ANN was trained using the engineering constants and the R software. The settings for the "neuralnet" function are given in **Table 2**, where the output variables are the constants of the second material.

Training was first carried out starting from the first dataset, obtaining an MSE of 1.186e+09 for unnormalized data and 4.391 for normalized data. Because of this, the dataset was extended to consider a larger number of MLCM configurations with variations in the concentrations *n* and in the global stress ranges; in addition, for the same mechanical problem, the request was inverted so that the elastic constants of component M1 were requested in one case and those of M2 in another.

**Table 3** shows the configured EvANNs and specifies the activation function, the number of hidden layers, the number of neurons in each layer, the MSE, the threshold reached, and the number of steps performed. The dataset used can be found in Appendix B.

#### *Artificial Neural Networks - Recent Advances, New Perspectives and Applications*


#### **Table 2.**

*"Neuralnet" Argument functions for training EvANN.*


**Table 3.**

*EvANN Configurations with different activation function, hidden layer, and number of neurons.*

The third EvANN configuration and its graph are shown in **Figure 4**, along with the relative error percentages in **Figures 5** and **6**; the RE was calculated with Eq. (16).

$$RE = \left|\frac{\text{Real Value} - \text{ANN Value}}{\text{Real Value}}\right| \times 100 \tag{16}$$
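Eq. (16) in code form (a trivial Python helper, with made-up example numbers):

```python
def relative_error_pct(real, predicted):
    # Eq. (16): percentage relative error of an ANN output
    # with respect to the reference ("real") value
    return abs((real - predicted) / real) * 100.0

re = relative_error_pct(70.0, 67.2)  # ≈ 4.0 %
```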

A second ANN was generated for the same problem now defined in terms of Q11 and Q12.


**Figure 5.** *% RE training EvANN Configuration 3.*

**Figure 6.** *%RE, test dataset ANN Configuration 3.*

## **5.5 QANN**

A second model, based on the stiffness coefficients, was then developed, with the outputs being the coefficients of the second material, Q11M2 and Q12M2 (**Figure 7**). The settings for the "neuralnet" function are given in **Table 4**.

The generated configurations with their respective achieved values are shown in **Table 5**.

The second QANN configuration and its graph are shown in **Figure 7**, with the relative error percentages in **Figures 8** and **9**; the RE was computed with Eq. (16).

#### **Figure 7.**

*QANN topology image. Input layers and Q11 Material 2 (Q11M2); Q12 Material 2 (Q12M2) in output layer configuration 2.*


#### **Table 4.**

*"Neuralnet" argument functions for training QANN.*



#### **Table 5.**

*QANN Configurations with different activation function, hidden layer, and number of neurons.*

**Figure 8.** *% RE training QANN Configuration 2.*

**Figure 9.** *% RE, test dataset QANN Configuration 2.*

## **6. Neural network validation process**

Once an ANN has been trained and tested, it is evaluated by applying it to an equivalent problem but with different structural values from those used in the training. The results for the different scenarios are in **Table 6** and **Figures 10** and **11**.

Configurations 3 and 2 were selected for the evaluation of the EvANN and QANN, respectively, based on their performance on the test dataset.

**Table 6** and **Figures 10** and **11** depict the configuration and attributes selected for each ANN.

The RE for each ANN is shown in the plots in **Figures 10** and **11**.


#### **Table 6.**

*Configurations selected for EvANN and QANN.*

**Figure 10.**

*% RE validation dataset EvANN Model.*

**Figure 11.** *% RE QANN validation dataset.*

As can be seen in the tables, the maximum RE obtained was up to 58.5% for the *EM2* output and up to 7.5% for *νM2*.

For the QANN, the tables show that the maximum RE obtained was up to 4.48% for the Q11M2 output and up to 3.71% for Q12M2.

## **7. Results of application**

In this section, the contrasting RE results of the EvANN and QANN for data3, and their comparison with the results published in [8] (data4), are presented.

### **7.1 Validation process (Data3)**

The training data was expanded, and a configuration not close to the optimal MSE value was selected to avoid overfitting. For the EvANN, this occurred with configuration 3, which has a higher MSE than configurations 2, 9, and 10, as shown in **Table 3**. For the QANN, configuration 2 was selected, which has two hidden layers with 12 and 8 neurons, respectively, and the tanh (hyperbolic tangent) activation function, as shown in **Table 5**. The selection criterion was to obtain the lowest average RE in the two output variables.

**Table 7.**

*Comparison of results for EvANN and QANN for three materials in dataset 3 for the engineering constants.*

The RE of EvANN and QANN outputs is shown in **Table 7**, where the QANN outputs were converted to engineering constants (Eqs. (17) and (18)).

$$E\_{M2} = \frac{Q\_{11M2}^2 - Q\_{12M2}^2}{Q\_{11M2}} \tag{17}$$

$$\nu\_{\rm M2} = \frac{\mathbf{Q}\_{\rm 12M2}}{\mathbf{Q}\_{\rm 11M2}} \tag{18}$$
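Eqs. (17) and (18) invert Eq. (11): substituting Q11 = E/(1 − ν²) and Q12 = νE/(1 − ν²) shows that (Q11² − Q12²)/Q11 = Q11(1 − ν²) = E. A Python sketch of the conversion, with assumed aluminum-like values as a round-trip check:

```python
def engineering_constants(Q11M2, Q12M2):
    # Eq. (17): E_M2 = (Q11M2^2 - Q12M2^2) / Q11M2
    # Eq. (18): nu_M2 = Q12M2 / Q11M2
    return (Q11M2**2 - Q12M2**2) / Q11M2, Q12M2 / Q11M2

# Round trip: build Q's from E = 70 GPa, nu = 0.33 (Eq. (11)) and recover them
Q11 = 70.0 / (1.0 - 0.33**2)
Q12 = 0.33 * Q11
E, nu = engineering_constants(Q11, Q12)  # E ≈ 70.0, nu ≈ 0.33
```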

**Figure 12.** *MLCM test tubes used in the final ANN application.*


#### **Table 8.**

*Volumetric fractions of materials in the MLCMs.*


**Table 9.**

*Elastic constants of the MLCM components.*

### **7.2 Real case data (Data4)**

The different MLCMs used in this stage (**Figure 12**) were (1) Aluminum-Brass-Aluminum (A-B-A), (2) Brass-Aluminum-Brass (B-A-B), and (3) Copper-Aluminum (C-A). The properties and volumetric fractions of the materials in the MLCMs are given in **Tables 8** and **9**.

## **7.3 Real case data compute in QANN and EvANN**

Continuing with the real case, the contrasted RE of the EvANN and QANN results are presented in **Figure 13**. The QANN model shows better performance.

**Table 10** shows the averages of the RE obtained for each of the outputs when applying the QANN model to the results in [8].

Configuration two, which has two hidden layers with 12 and 8 neurons and the tanh (hyperbolic tangent) activation function, was selected. The selection criterion was to obtain the smallest average percentage of error in the two output variables, Q11M2 and Q12M2, as in Section 7.1, which obtained smaller RE for data3 using the QANN.

**Figure 13.**

*Comparison of EvANN and QANN results based on engineering constants, EM2, vM2.*

#### **Table 10.**

*Means and standard deviations of RE for different configurations of the QANN application compared with the article results.*

#### **Table 11.**

*Average RE percentages for the stiffness and engineering constants.*

**Table 11** shows the results and the contrast in RE average percentage of all the considered outputs from the final results, obtained for the ANN, for the stiffness constants (QANN) and for the engineering constants (EvANN).

The values of the engineering constants as a function of the *Q*'s were determined from the identity equations (Eqs. (17) and (18)), and the average obtained for each of the specimens is shown in **Table 12**. The table also shows that, although the results in each configuration line should be very close, since they were obtained from linear operations, they present variations.

Finally, the QANN configuration that shows the best results during validation was applied to the MLCMs analyzed in [8]. The value of the final average constants is presented in **Table 13**.

## **8. Discussion**

The different conditions described in Section 5 were evaluated in order to obtain a trained and efficient ANN; the most important were the following:

1. The values of all the data used in the training were selected and restricted in such a way that the values corresponding to the MLCMs of [8] (Aluminum-Brass-Aluminum (A-B-A), Brass-Aluminum-Brass (B-A-B), and Copper-Aluminum (C-A)) were located at a mean value. A case showing the need for this step was observed during the training validation: when the input stress values (*σGx* and *σGy*) were out of range compared to those used in the training, the results showed larger differences, between 20 and 80%, than when these values were close to the mean.

**Table 12.**

*Output for each configuration line.*

**Table 13.**

*Average final elastic constants obtained with QANN.*

| Laminate | Material | *E* expt (GPa) | *E* article (GPa) | RE % | *E* evaluated (GPa) | RE % | *ν* expt | *ν* article | RE % | *ν* evaluated | RE % |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A-B-A | Aluminum | 67 | 72 | 7.5 | 65.11 | 2.8 | 0.345 | 0.34 | 1.4 | 0.34 | 1.4 |
| A-B-A | Brass | 101 | 97.6 | 3.4 | 86.8 | 14.1 | 0.313 | 0.318 | 1.6 | 0.3148 | 0.6 |
| B-A-B | Aluminum | 67 | 72 | 7.5 | 65.81 | 1.8 | 0.345 | 0.34 | 1.4 | 0.33 | 4.3 |
| B-A-B | Brass | 101 | 97.6 | 3.4 | 106.33 | 5.3 | 0.313 | 0.318 | 1.6 | 0.311 | 0.6 |
| A-C | Aluminum | 67 | 64.4 | 3.9 | 66.41 | 0.9 | 0.345 | 0.33 | 4.3 | 0.34 | 1.4 |
| A-C | Copper | 109 | 106.2 | 2.6 | 116.8 | 7.2 | 0.33 | 0.32 | 3 | 0.3148 | 4.6 |


were different with variations of up to 30%, when these should be the same. However, uncertainty risks are avoided by averaging the obtained values for each output as shown in **Table 13**.

6. A further important point, which showed the advantage of simplicity in the training setup, was found when evaluating two cases: one requesting the engineering constants as outputs and the other requesting the stiffness constants. In the first case, average RE of 12.54% for *E* and 3.15% for *υ* were obtained; in the second case, the RE for *E* and *υ* were 6.18% and 3.57%, respectively. From the above, and observing Eqs. (14) and (15), it is assumed that the analytical model in terms of the *Q*'s is simpler than the model in terms of the engineering constants.

## **9. Conclusions**

Based on the obtained results, this chapter establishes a method, using ANN, to determine the engineering constants of the layers of a metal-laminated composite material, and shows the importance of adequately defining the problem to be solved, analyzing concepts, establishing scopes and constraints, and selecting sufficient and necessary training parameters. By evaluating several scenarios to generate the ANN dataset, the importance of the following was identified: (a) the quantitative ranges of the parameters in the input data, recommending that the values of the application data lie near the mean of the training data; (b) variations in the structure of the dataset (different outputs for the same MLCM problem); and (c) simplicity in the dataset: the ANN showed better results when the stiffness constants were requested in the output data, since the analytical solution is simpler in terms of stiffness constants than in terms of engineering constants.

Several configurations with different activation functions, numbers of layers, and numbers of neurons per layer were tested in the study, finding better results for this problem with a medium MSE than with the lowest trained MSE. This may be because the medium-MSE configurations avoid overfitting.

Based on this research, it is recommended to use the analytical model applied here to generate an ANN dataset for the study of the constitutive modeling of composite materials in plane stress problems.

## **Nomenclature**



## **Appendix A:**


**Table A1.**

*Nomenclature used for the analytical model and ANN formula.*


## **Appendix B:**

**Table B1.**

*Part of the data used in the training (data1).*



**Table B2.**

*Part of the data used in the training results (data2).*


**Table B3.**

*Part of the data used in validation results (data3).*




**Table B4.**

*Data used in the ANN application (data4).*

## **Author details**

Marta Eraña-Díaz and Mario Acosta-Flores\* Autonomous University of the State of Morelos, Cuernavaca, México

\*Address all correspondence to: mario.acosta@uaem.mx

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


## **References**

[1] Yang ZR. Machine Learning Approaches to Bioinformatics. Exeter, UK: World Scientific; 2010. p. 336

[2] Al-shayea QK. Artificial neural networks in medical diagnosis. International Journal of Computer Science Issues. 2011;**8**(2):150-154

[3] D'Antino T, Papanicolaou C. Mechanical characterization of textile reinforced inorganic-matrix composites. Composites Part B Engineering. 2017; **127**:78-91

[4] Abbud LH, Al-Masoudy MMM, Hussien Omran S, Abed AM. Experimental study the mechanical properties of nano composite materials by using multi-metallic nano powder/ epoxy. Materials Today: Proceedings. 2021. DOI: 10.1016/j.matpr.2021.06.395. ISSN 2214-7853

[5] Benyelloul K, Aourag H. Elastic constants of austenitic stainless steel: Investigation by the first-principles calculations and the artificial neural network approach. Computational Materials Science. 2013;**67**:353-358

[6] Shin HS, Lee SW, Kim CY, Bae GJ. Neural network based identification of nine elastic constants of an orthotropic material from a single structural test. In: Proceedings of the 21st ISARC; Jeju, South Korea; 2004

[7] Huang DZ, Xu K, Farhat C, Darve E. Learning constitutive relations from indirect observations using deep neural networks. Journal of Computational Physics. 2020;**416**:1-28

[8] Acosta-Flores M, Jiménez-López E, Chávez-Castillo M, Molina-Ocampo A, Delfín-Vázquez JJ, Rodríguez-Ramírez JA. Experimental method for obtaining the elastic properties of components of a laminated composite. Results in Physics. 2019;**12**:1500-1505

[9] Acosta-Flores M, Jiménez-López E, Rodríguez-Ramirez JA. Modelo para el análisis experimental de esfuerzos intralaminares en materiales compuestos laminados sujetos a carga axial. DYNA-Ingeniería e Industria. 2016;**91**:216-222

[10] Liu X, Tian S, Tao F, Yu W. A review of artificial neural networks in the constitutive modeling of composite materials. Composites Part B: Engineering. 2021;**224**:1-15

[11] Chen M, Challita U, Saad W, Yin C, Debbah M. Artificial neural networks-based machine learning for wireless networks: A tutorial. IEEE Communication Surveys and Tutorials. 2019;**21**(4):3039-3071

[12] Zhang Z. Artificial neural network. In: Multivariate Time Series Analysis in Climate and Environmental Research. Cham: Springer International Publishing; Springer, 2018. pp. 1-35

[13] Li X, Cheng X, Wu W, Wang Q, Tong Z, Zhang X, et al. Forecasting of bioaerosol concentration by a Back Propagation neural network model. Science of the Total Environment. 2020; **698**:134315

[14] Ye F, Wheeler C, Chen B, Hu J, Chen K, Chen W. Calibration and verification of DEM parameters for dynamic particle flow conditions using a backpropagation neural network. Advanced Powder Technology. 2019; **30**(2):292-301

[15] Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research. 2014;**15**(1):1929-1958

[16] Li Z, Kamnitsas K, Glocker B. Overfitting of neural nets under class imbalance: Analysis and improvements for segmentation. In: Shen D, Liu T, Peters TM, Staib LH, Essert C, Zhou S, et al., editors. Medical Image Computing and Computer Assisted Intervention – MICCAI 2019. Cham: Springer International Publishing; 2019. pp. 402-410

[17] Frei S, Chatterji NS, Bartlett P. Benign overfitting without linearity: Neural network classifiers trained by gradient descent for noisy linear data. In: Proceedings of Thirty Fifth Conference on Learning Theory PMLR; 2022. p. 2668–2703

[18] Ying X. An overview of overfitting and its solutions. Journal of Physics: Conference Series. 2019;**1168**(2)

[19] Storey MA, Singer L, Cleary B, Figueira Filho F, Zagalsky A. The (r) evolution of social media in software engineering. Future of Software Engineering Proceedings. 2014:100-116. DOI: 10.1145/2593882.2593887

[20] Fritsch S, Guenther F. neuralnet: Training of neural networks. The R Journal. 2010;**2**(1):30-38

[21] Riedmiller M, Braun H. Rprop: A fast adaptive learning algorithm. In: Proceedings of the International Symposium on Computer and Information Science VII; 1992

[22] Riedmiller M, Braun H. A direct adaptive method for faster backpropagation learning: The RPROP algorithm. In: IEEE International Conference on Neural Networks; 1993. pp. 586-591

[23] Maplesoft. Maple (Release 18). Waterloo, ON: Maplesoft, a division of Waterloo Maple Inc.; 2014

## **Chapter 6**

## Optical Soliton Neural Networks

*Eugenio Fazio, Alessandro Bile and Hamed Tari*

## **Abstract**

The chapter describes the realization of photonic integrated circuits based on photorefractive solitonic waveguides. In particular, it has been shown that X-junctions formed by soliton waveguides can learn information by switching their state. X-junctions can perform both supervised and unsupervised learning. In doing so, complex networks of interconnected waveguides behave like a biological neural network, where information is stored as preferred trajectories within the network. In this way, it is possible to create "episodic" psycho-memories, able to memorize information bit-by-bit and subsequently use it to recognize unknown data. Using optical systems, it is also possible to create more advanced dense optical networks, capable of recognizing keywords within information packets (procedural psycho-memory) and possibly comparing them with the stored data (semantic psycho-memory). In this chapter, we shall describe how solitonic neural networks work, showing the close parallel between biological and optical systems.

**Keywords:** nonlinear optics, photorefractive soliton, solitonic waveguide, supervised learning, unsupervised learning, machine learning, biological neural network, artificial intelligence, optical psycho-memory, optical neural network, photonics

## **1. Introduction**

Software artificial intelligence (AI) and the neuromorphic approach, both electronic and optical, were born to reproduce the learning capacity of the biological neural system. AI software has proved to be fundamental in many fields, although with the limits imposed by the tools used [1]. These limits are the pretext for developing neuromorphic hardware capable of overcoming them [2]. Neuromorphic optics has shown great versatility. However, current technologies reproduce only some aspects of neural biology without grasping the overall view. Works such as [3] implement fundamental units capable of reproducing excitability, or spiking properties, while others focus on synaptic connections [4]. An overall view is missing. The biology of the brain [5] teaches us that it is a system with local properties that can have global effects. In other words, learning is a process that affects entire regions of the neural network and manifests itself through a structural organization of the connections between neurons. In this way, real neural maps are built, whose development includes learning and memorization of information. Soliton neural networks (SNNs), exploiting the typical plasticity of photorefractive materials, are dynamic entities capable of self-modifying to process, learn, and memorize information. Furthermore, they are able to do so selectively at the information level, exactly as happens in the human brain. By physically combining the processing and memory units, SNN networks functionally approach the biological nervous system: learning and memorization become two events that occur at the same time through modifications of the spatial geometries.

## **2. Photorefractive solitons and solitonic waveguiding**

## **2.1 Spatial solitons**

The possibility of a beam becoming self-confined and propagating without diffraction was first studied in 1964 by R.Y. Chiao, E. Garmire, and C.H. Townes [6]; in that same year, Townes received the Nobel Prize for his studies on the maser and the laser. They interpreted the phenomenon as follows: "*We shall discuss here conditions under which an electromagnetic beam can produce its dielectric waveguide and propagate without spreading.*" Eight years later, V.E. Zakharov and A.B. Shabat formulated the theory of solitons [7].

The first experimental verification of self-confined beams arrived 13 years later, in 1985, by A. Barthelemy et al. [8], exploiting the Kerr-type nonlinearity of a liquid CS2 cell and, 5 years later in 1990, within a glass planar waveguide [9].

It was immediately evident that the applicability of Kerr solitons was not simple: the low values of the Kerr nonlinearity attainable in glass required either very high intensities (GW/m<sup>2</sup>) or very long propagation distances (the effect being cumulative), and allowed only planar geometries (Kerr solitons are stable in 1D but not in 2D geometries). Over the years, it became clear that these nonlinearities could be exploited only to realize temporal solitons (pulses without dispersion) in optical fibers, where long propagation is possible, but not within chips.

However, in those years, and in particular in 1992–1996, the very first theoretical and experimental works on the formation of spatial solitons in photorefractive materials came out [10–21]. Only later, at the beginning of the 2000s, were bright solitons observed in lithium niobate (LN) [22], the most widely used nonlinear material for integrated devices. Since then, spatial solitons in LN have been largely used as waveguides in devices.

However, the first use of solitons as waveguides started earlier: in 1991, De la Fuente et al. [23] used Kerr solitons as waveguides. Almost 9 years later, E. Fazio et al. repeated the same experiment in a glass chip [24] and used spatial soliton interaction for signal processing [25].

### **2.2 Theory of photorefractivity and solitons**

A photorefractive crystal is typically a semiconductor that has a second-order nonlinearity of the electro-optical type, that is, the possibility of varying its refractive index as a function of an applied static electric field. Mathematically this can be represented in terms of the nonlinear polarization intensity vector:

$$\overrightarrow{P}(\omega) = \varepsilon_0 \left[ \overleftrightarrow{\chi}^{(1)} \cdot \overrightarrow{E}(\omega) + \overleftrightarrow{\chi}^{(2)} : \overrightarrow{E}(0)\,\overrightarrow{E}(\omega) \right] \tag{1}$$

where $\overrightarrow{E}(\omega)$ represents the electric field associated with the light and $\overrightarrow{E}(0)$ the static one. Factoring out the light field, we obtain

$$\overrightarrow{P}(\omega) = \varepsilon_0 \left[ \overleftrightarrow{\chi}^{(1)} + \overleftrightarrow{\chi}^{(2)} \cdot \overrightarrow{E}(0) \right] \cdot \overrightarrow{E}(\omega) \tag{2}$$

which shows that the electric susceptibility, and consequently the dielectric tensor, acquires a linear dependence on the static field:

$$\overleftrightarrow{\varepsilon} = \varepsilon_0 \left[ 1 + \overleftrightarrow{\chi}^{(1)} + \overleftrightarrow{\chi}^{(2)} \cdot \overrightarrow{E}(0) \right] \tag{3}$$

For this reason, it is also called the "linear Pockels effect." Typically, the refractive index of crystals is described by an ellipsoid of the type:

$$\frac{x^2}{n_x^2} + \frac{y^2}{n_y^2} + \frac{z^2}{n_z^2} = 1 \tag{4}$$

and, as a consequence, its variation is expressed through the variation of the $1/n_i^2$ terms:

$$\Delta\left(\frac{1}{n_i^2}\right) = \sum_j r_{ij} E_j(0) \tag{5}$$

that corresponds to a decrease of the refractive index:

$$n_i[E(0)] = n_{i,0} - \frac{1}{2} \sum_j n_{i,0}^3 r_{ij} E_j(0) \tag{6}$$

where *i* represents one of the crystallographic directions (x, y, z) and $n_{i,0}$ describes the linear refractive index along the *i*-th direction.
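As a worked order-of-magnitude example of Eq. (6), the index change can be estimated with typical literature values for lithium niobate (the numbers below are illustrative assumptions, not values quoted in this chapter):

```python
# Order-of-magnitude estimate of the Pockels index change of Eq. (6).
# Assumed illustrative values: extraordinary index n_e ~ 2.2 and
# r33 ~ 30.8 pm/V for lithium niobate, with a 40 kV/cm bias field.
n_e = 2.2            # unperturbed extraordinary refractive index
r33 = 30.8e-12       # electro-optic coefficient in m/V
E_bias = 4.0e6       # 40 kV/cm expressed in V/m
delta_n = 0.5 * n_e**3 * r33 * E_bias
print(f"Delta n = {delta_n:.2e}")   # about 6.6e-4
```

An index step of a few parts in 10<sup>4</sup> is the right scale for a weakly guiding, low-contrast channel such as the soliton waveguides discussed later.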

There are two critical points in the discussion that has followed so far: (1) according to Eq. (6), the electro-optic effect produces a *decrease* of the refractive index, while self-confinement requires a local *increase*; (2) the applied field is static and uniform over the whole crystal, so by itself it cannot create the localized index profile needed to guide light.
For these reasons, it is necessary to follow a small procedure, a kind of trick, to achieve a positive variation of the refractive index capable of self-confining the light: a bias field is applied to the whole material, lowering its index everywhere, and is then screened in the small region where the light is, in order to raise the index back there. As a consequence, bright photorefractive spatial solitons are usually called *screening solitons*. Here is how this happens.

Let us consider a photorefractive medium as a semiconductor doped with donor impurities. Donor states (N<sub>D</sub>) are usually localized energetically within the energy gap, which means that light can induce electron transitions from the donor states to the conduction band. Consequently, two charge populations are generated: ionized donors (N<sub>D</sub><sup>+</sup>), which behave as holes but are physically localized, that is, not free to move because they are bound to the positions of the dopant ions, and electrons, which instead can move everywhere, being in delocalized conduction states.

The donor rate equation is:

$$\frac{\partial n_D^+}{\partial t} = \sigma F n_D - \gamma n_D^+ n_e \tag{7}$$

where σ is the absorption cross-section, F the photon flux, and γ the relaxation rate. The electron rate equation follows the donor one, with the inclusion of the diffusion-conduction terms:

$$\frac{\partial n_e}{\partial t} = \frac{\partial n_D^+}{\partial t} - \mu \overrightarrow{\nabla} \cdot \left( n_e \overrightarrow{E} + \frac{k_B T}{q} \overrightarrow{\nabla} n_e \right) \tag{8}$$

where μ is the electron mobility, kB the Boltzmann constant, and T the temperature. Electrons and holes constitute the local charge density ρ:

$$\rho = q \left( n_D^+ - n_e \right) \tag{9}$$

which generates, through Gauss's theorem, a local electric field that screens the applied bias:

$$\varepsilon \overrightarrow{\nabla} \cdot \overrightarrow{E}_{SC} = \rho \;\rightarrow\; \overrightarrow{E}_{local} = \overrightarrow{E}_{bias} + \overrightarrow{E}_{SC} \tag{10}$$
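The coupled Eqs. (7)–(10) are normally solved numerically. As a toy illustration (not the chapter's simulation; parameters are illustrative assumptions in arbitrary units), the donor rate equation alone can be integrated in time under constant illumination, assuming local charge neutrality ($n_e \approx n_D^+$) so that the transport terms of Eq. (8) are ignored; the integration relaxes to the analytic steady state where photoionization balances recombination:

```python
import math

# Explicit-Euler integration of Eq. (7) with n_e ~ n_D+ (toy model,
# arbitrary units; sigma_F stands for the product sigma * F).
sigma_F = 1.0    # photoionization rate
N_D = 1.0        # total donor density
gamma = 10.0     # recombination coefficient
n, dt = 0.0, 1e-3
for _ in range(20000):                       # integrate to t = 20
    n += dt * (sigma_F * (N_D - n) - gamma * n * n)

# analytic steady state of sigma_F*(N_D - n) = gamma*n^2
n_ss = (-sigma_F + math.sqrt(sigma_F**2 + 4*gamma*sigma_F*N_D)) / (2*gamma)
print(n, n_ss)   # the two values agree closely
```

In the full problem, the nonuniform illumination of the soliton beam makes this steady state spatially dependent, which is what generates the screening field of Eq. (10).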

Applying a bias field along the extraordinary crystallographic direction $\hat{c}$ of a uniaxial photorefractive crystal, the refractive indices take the expressions

$$\begin{cases} n_x = n_y = n_o \\ n_z = n_e - \dfrac{1}{2} n_e^3 r_{33} E_{z,local} \end{cases} \tag{11}$$

The nonlinear light propagation is then described by the nonlinear wave equation [13]:

$$\left[ \frac{\partial}{\partial x} - \frac{i}{2k} \left( \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) \right] A(x,y,z) = \frac{ik}{n}\, \delta n(E_{local})\, A(x,y,z) \tag{12}$$

where the field amplitude, in the case of a self-confined solitonic solution, should be factorized into an amplitude independent of x and a propagative term as follows:

$$A(x,y,z) = u(y,z)\, e^{i(\omega t - \Gamma x)} \tag{13}$$

as is done for every kind of soliton, not only the photorefractive ones. Many groups have tried to solve Eq. (12) analytically without real success. Semi-analytical solutions reported in the literature do show that such complex problems can support bright solitons. In order to observe the soliton formation, a numerical integration (*FDTD—finite difference in time domain*) of Eqs. (6)–(11) is performed. Often, an approximated equation is considered, taking into account the saturable behavior of the nonlinear dielectric constant:

$$\left[ \frac{\partial}{\partial x} - \frac{i}{2k} \left( \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) \right] A = -\frac{\epsilon_{NL} E_{bias}}{1 + \dfrac{|A|^2}{|A_{SAT}|^2}}\, A \tag{14}$$
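A scaled, one-transverse-dimension version of this saturable equation can be integrated with a standard split-step Fourier scheme. In the sketch below (a toy model, not the chapter's code: dimensionless units, with the sign and the prefactor $\epsilon_{NL} E_{bias}$ absorbed into a single assumed strength parameter), the saturable nonlinearity keeps most of the beam power in a narrow core, whereas the linear run simply diffracts:

```python
import numpy as np

# Toy 1D split-step Fourier propagation with a saturable nonlinear index.
N = 512
x = np.linspace(-20.0, 20.0, N, endpoint=False)
dx = x[1] - x[0]
kx = 2.0 * np.pi * np.fft.fftfreq(N, dx)
dz, steps = 0.05, 100                        # propagate to z = 5

def core_fraction(nl_strength):
    """Propagate a Gaussian beam; return the fraction of output power
    remaining within |x| < 2 (the initial beam region)."""
    A = np.exp(-x**2).astype(complex)        # input Gaussian beam
    half = np.exp(-1j * kx**2 * dz / 4.0)    # half step of diffraction
    for _ in range(steps):
        A = np.fft.ifft(half * np.fft.fft(A))
        I = np.abs(A) ** 2
        A = A * np.exp(1j * dz * nl_strength * I / (1.0 + I))  # saturable term
        A = np.fft.ifft(half * np.fft.fft(A))
    I = np.abs(A) ** 2
    return float(np.sum(I[np.abs(x) < 2.0]) / np.sum(I))

frac_linear = core_fraction(0.0)    # pure diffraction: power spreads out
frac_trapped = core_fraction(3.0)   # self-focusing balances diffraction
print(frac_linear, frac_trapped)
```

This qualitative behavior, a beam that stops spreading once the nonlinearity is active, is what the FDTD integrations of the full model reproduce quantitatively.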

### **2.3 Experiments on photorefractive solitons**

The experimental set-up for spatial solitons is shown in **Figure 1** [22]. A laser beam (soliton beam) is focused down to about 10–12 μm FWHM onto the input face of a

**Figure 1.** *Experimental set-up for screening photorefractive solitons [22].*

sample. To generate suitable refractive index modulation, the sample must be biased along its optical axis.

The value of the bias electric field depends strongly on the crystal type and its electro-optic coefficient: for example, using strontium barium niobate (SBN) crystals, which have a very high electro-optic coefficient, the electric field ranges from a few hundred V/cm up to some kV/cm [26]; lithium niobate (LN) has a lower electro-optic coefficient and requires several tens of kV/cm [22]; in materials with high optical activity like Bi12SiO20 (BSO), the applied bias must be as high as 55 kV/cm or higher to drive the light into a nonlinear polarization regime and self-confine [27–29]. Chauvet et al. [30, 31] proposed an interesting innovative solution for the bias application: inducing an internal electric field by applying a thermal gradient, taking advantage of the pyroelectric effect that some crystals, for example LN, exhibit. Indeed, this is a major improvement in the technology, as it eliminates any conductive contacts/plates, thus leaving the sample completely free and accessible from all sides for further applications.

Background illumination can also be provided to stabilize the solitonic beam during propagation (i.e., to prevent beam self-deflection [32–35]).

Finally, an optical imaging system is placed after the sample to monitor its output face using a camera. The typical evolution of the soliton formation is shown in **Figure 2**, where the light intensity at the output face is shown.

A key feature of photorefractive solitons is the very low power required for their writing: of the order of microwatts in continuous-wave operation [36]. This means that they can be written both with coherent light from continuous or pulsed lasers at the fundamental or second-harmonic frequency [37–40], even in the femtosecond regime [41, 42], and with incoherent light from fluorescent bulbs [43] or even ion fluorescence [44].

E. Fazio et al. [22] have shown experimentally that the solitonic solution gives a hyperbolic-secant transverse profile, which can be easily identified by plotting the transverse intensity distribution on a semi-log scale (**Figure 3**). A laser beam usually has a Gaussian profile that, in a semi-log plot, takes a negative parabolic shape. As soon as it evolves into a soliton, the Gaussian profile rearranges into a hyperbolic-secant one. This transformation can be monitored in the semi-log graph, where the hyperbolic-secant profile takes a triangular shape (a linear rise and fall joined at the vertex).
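This semi-log diagnostic is easy to reproduce numerically: on a log scale the tail of a Gaussian keeps steepening (a parabola), while the tail of a hyperbolic secant settles onto a constant slope (a straight line). A small sketch (profile widths and the sampled range are illustrative choices):

```python
import numpy as np

x = np.linspace(0.5, 5.0, 200)              # tail region of the profile
log_gauss = -x**2                           # log of exp(-x^2)
log_sech = np.log(1.0 / np.cosh(x))         # log of sech(x), ~ -x + const

slope_gauss = np.gradient(log_gauss, x)     # keeps steepening: -2x
slope_sech = np.gradient(log_sech, x)       # flattens to a constant: -tanh(x)
print(slope_gauss[-1], slope_sech[-1])      # about -10 vs about -1
```

The constant tail slope is exactly the triangular signature mentioned above, so fitting a straight line to the measured log-intensity tails is a simple soliton test.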


#### **Figure 2.**

*Experimental images of the soliton formation and stabilization [22].*

**Figure 3.** *A Gaussian laser beam modifies into a hyperbolic-secant beam when it becomes a spatial soliton [22].*

### **2.4 Photorefractive soliton waveguiding**

Among the possible applications of soliton beams, one of the most important is their use as waveguides. Compared to traditional techniques, writing solitonic waveguides has many advantages in terms of construction costs, 3D geometries, propagation characteristics, and time durations.

Regarding costs, solitonic waveguides can be written with extremely low laser powers and, above all, in continuous mode: therefore, practically at no cost, since they can also be written by laser diodes costing a few euros.

With regard to 3D geometries, a soliton guide can be written in any position within a nonlinear substrate, allowing full exploitation of the entire available volume. This


*<sup>1</sup>Measured at 800 nm with 75 fs pulses within waveguides written at 514 nm. <sup>2</sup>Measured at 800 nm with a CW laser beam.*

#### **Table 1.**

*Performances of typical photorefractive solitonic waveguides.*

was not possible before with traditional waveguide construction techniques, which act mainly on the surface of the substrate or at most by penetrating a few microns.

Regarding the propagative characteristics, the performances of a solitonic guide are amazing, significantly improving the specifications of traditional waveguides.

**Table 1** shows some characteristic values of a soliton guide made of lithium niobate. As you can see, the waveguides are relatively wide and with a rather low refractive index contrast. These factors are related to the applied bias electric field: Low fields originate wide beams with a modest contrast; very high fields can originate narrow beams and, consequently, high refractive index contrasts.

However, the fundamental characteristic of soliton waveguides is their propagation losses, extremely low in the order of 0.04–0.07 dB/cm (the limit of measurability), much lower than commercial waveguides (a guide obtained by ion exchange typically has 1 dB/cm of propagation losses). This factor is related to the nature of solitons: unlike traditional guides, in which the index profile is made artificially, here it is precisely the light that chooses the best index profile to propagate self-confined, that is, without diffraction. This leads to ultra-low losses and low modal dispersion (since the guides are almost single-mode).

Another fundamental characteristic of solitonic guides is their transient, permanent or semi-permanent character: using substrates with a very rapid dielectric relaxation and/or using thin films, as soon as the writing light is turned off the associated guide disappears, with times even of a few nanoseconds. By using substrates with extremely slow dielectric relaxations [45, 46], waveguides can survive for a long time, even months. When writing solitons with very intense femtosecond pulses, the material can undergo permanent changes and the waveguides no longer erase.

## **3. Stigmergy, reinforcement learning, and photorefractive plasticity**

### **3.1 Stigmergy**

Stigmergy was first proposed by the French entomologist Pierre-Paul Grassé in the 1950s when studying the activities of social insects [47]. The word stigmergy is a combination of the Greek words "stigma" (outstanding sign) and "ergon" (work), signifying that some activities of agents are prompted by external traces, which themselves are generated by the agents' activities [48]. Stigmergy allowed Grassé to explain how insects with limited individual intelligence, without obvious communications, can collaboratively engage in complex tasks, such as building a nest, simply by following very naive rules. In general, the paradigm of social insect societies is a distributed system that, despite the lack of sophistication of its individuals, offers a highly structured social organization. For instance, as a result of this organization, ant colonies can carry out complex assignments that in some cases are beyond the capacities of a single ant [49]. A study of their behavior indicates that at the heart of their seemingly chaotic random movements lies a series of behaviors driven by repeated stimulus-response cycles [50]. For example, when searching for food, ants initially explore the area surrounding their nest randomly and, while moving, leave a chemical pheromone trail on the ground (**Figure 4**). Once an ant finds a food source, it evaluates the quantity and quality of the food and carries some of it back to the nest [52].

During the return trip, the quantity of pheromone that an ant leaves on the ground may depend on the quantity and quality of the food. The pheromone trails will guide other ants to the food source and, subsequently, the shortest path to the food source will be reinforced, as it receives feedback with higher probability than the longer paths [51]. This environment-mediated type of communication has captivated researchers in many dissimilar fields. For example, one can mention all those protocols for the optimization of multi-variable problems known as genetic algorithms, which exploit the rules of genetics to solve mathematical problems with many independent variables, or neural networks, mathematical systems that base the calculation on a "learning" database that the system has previously prepared. All these typical problems, which would require smart signal processing, fall under "reinforcement learning" [53]. This expression is commonly used in computer science to describe those algorithms "of machine learning inspired by behaviorist psychology, which is

#### **Figure 4.**

*Basic scheme of the search for food by the ants. The system is based on the two fundamental decision-making principles of following a trace of pheromone and of changing track when a more marked one is met [51].*

connected with how software agents ought to take actions in an environment to amplify some impulse of cumulative reward" [54].
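The pheromone feedback loop of Figure 4 can be captured by a deterministic two-path toy model (all values below are illustrative assumptions): traffic divides in proportion to the pheromone on each path, deposits are inversely proportional to path length, and evaporation removes a fixed fraction per step. The short path ends up carrying almost all the traffic:

```python
# Two-path stigmergy toy model: pheromone tau on a short and a long path.
tau = {"short": 1.0, "long": 1.0}
deposit = {"short": 1.0, "long": 0.5}    # inversely proportional to length
evaporation = 0.99

for _ in range(500):
    total = tau["short"] + tau["long"]
    for path in tau:
        # traffic share tau/total reinforces each path; the rest evaporates
        tau[path] = evaporation * tau[path] + (tau[path] / total) * deposit[path]

p_short = tau["short"] / (tau["short"] + tau["long"])
print(round(p_short, 3))   # close to 1: the short path dominates
```

The positive feedback (more pheromone attracts more traffic, which deposits more pheromone) is exactly the reinforcement mechanism that the next section formalizes.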

## **3.2 Reinforcement learning**

Reinforcement learning concerns neural networks or artificial intelligence protocols that self-adjust by reinforcing specific information, identified by feedback in the system, in order to solve complex problems. This procedure is indeed inspired by nature, adopting its stigmergy to transfer information in decentralized systems, thus realizing distributed cognitive processes through many small, simple elaborations [55]. The basic idea of reinforcement learning is to consider the feedback derived from the dynamic interaction of the learning agent with the surrounding environment. Guiding autonomous agents to act optimally through trial-and-error interaction with the corresponding environment is a primary goal in the field of artificial intelligence and is regarded as one of the most important objectives of reinforcement learning [56]. During the learning process, the adaptive system tries some actions (i.e., output values) on its environment; then, it is reinforced by receiving a scalar evaluation (the reward) of its actions as feedback [57]. As a result, reinforcement learning algorithms selectively retain only the outputs that maximize the received reward, through their higher repetition rate over time [53].
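A minimal sketch of this trial-and-error loop (a two-armed bandit with an epsilon-greedy agent; all numbers are illustrative assumptions, not taken from the cited works): the agent tries actions, receives a scalar reward, and ends up selectively retaining the action that maximizes it:

```python
import random

random.seed(0)
mean_reward = {0: 0.2, 1: 0.8}   # the environment (unknown to the agent)
q = {0: 0.0, 1: 0.0}             # learned action-value estimates
counts = {0: 0, 1: 0}
eps = 0.1                        # exploration rate

for _ in range(2000):
    if random.random() < eps:
        a = random.choice([0, 1])          # explore
    else:
        a = max(q, key=q.get)              # exploit the current estimate
    r = mean_reward[a] + random.gauss(0.0, 0.1)  # noisy scalar reward
    counts[a] += 1
    q[a] += (r - q[a]) / counts[a]         # incremental mean update

print(counts)   # the better-rewarded action dominates the choices
```

The growing repetition rate of the rewarded action is the software analog of the strengthened pheromone trail, and of the strengthened waveguide channel discussed next.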

Unfortunately, software-based protocols need solution times that increase exponentially with the size of the problem; after many years of research, no improved algorithm has been found to solve these problems within a polynomial time using a deterministic Turing machine. For this reason, hardware approaches have been proposed in the past [58, 59]. Among all, optical solutions to supercomputing seem to win for versatility [60] in terms of increased fan-in and fan-out, energy consumption, and recursive preprocessing. However, the proposed optical solutions [61, 62] neither reduce the complexity of the problem nor offer technologically efficient procedures without exponentially increasing the demand for physical resources [63].

Very recently, an alternative approach was proposed to realize photonic hardware able to simulate the stigmergy processes adopted by ants searching for food. This approach was published in the paper by M. Alonzo et al. entitled "All-Optical Reinforcement Learning in Solitonic X-Junctions" [55]. In this work, the pheromone trajectories are represented by paths of the light through a nonlinear photorefractive material, and the trajectory of the light is recorded as a modification of the refractive index of the host material. Such modifications behave as induced waveguides, that is, regions that confine optical information, which can travel inside them without being dispersed (as signals in optical fibers). The refractive contrast between the induced channel and the surrounding medium depends on the intensity of the writing beam. Consequently, it behaves like the pheromone quantity on the ant's path: it can be strengthened or weakened with the writing light intensity. This decision-making process can be represented by a nonlinear modulation of the crossing point between these paths. The strengthening of one path at an X-crossing point corresponds to making it a preferential trajectory, where the light will be conveyed more easily. It behaves like a water channel whose banks have been made deeper and therefore more capacious: when two channels meet, more water flows into the deeper channel than into the shallower one. Such addressable behavior has been induced in a nonlinear optical X-junction. The junction has been realized by injecting two absorbed beams that cross each other in the middle of the host photorefractive medium. Each beam modulates the refractive index of the host medium according to its intensity. A signal beam (unable to modify the host medium) is injected inside one channel and consequently reaches the X-junction. It represents the information that propagates inside the photonic structure. If the writing beams have the same intensities, the junction is perfectly symmetrical, meaning that 50% of the information beam emerges from one channel and 50% from the other (**Figure 5**). When the writing light is unbalanced, or writing feedback is injected from the output, the X-junction switches to an asymmetric behavior, for which 80% of the information beam is conveyed inside the strengthened channel and the remaining 20% stays in the weaker one.

## **3.3 Photorefractive plasticity**

**Figure 5.**

*Numerical simulation (top) and experimental results (below) of a stigmergic photonic X-junction [51].*

In neuroscience, this phenomenon is the basis of the selective memorizing-forgetting process that characterizes the memory of events in the brain [64]: pieces of information that are no longer reinforced will gradually be lost relative to recently reinforced ones. This capability arises from the considerable plasticity of the individual building blocks of the nervous system, which allows animals to adapt to changing internal and external environments. During development, learning, and ongoing behavior, individual neurons, synapses, and circuits undergo short-term and long-term changes as a result of experience. This is the basis of the learning in a neural network that governs neuroplasticity, that is, the ability of a system to modify the synaptic interconnection network according to its own needs, both to carry out "reasoning" and to recover unused areas (e.g., reusing regions that are inhibited due to trauma or injury) [65]. Neuroplasticity occurs at all levels, from the behavior of a single ionic channel to the morphology of neurons and large circuits, and over timescales ranging from milliseconds to years [66]. The elementary units at each level are connected in parallel and perform simple operations of storage and processing of information in successive cascading levels. Similar to the performance of the ants in the colony, the information processed by a group of neurons is sent to the next level of neurons by opening and/or closing specific synaptic interconnections. In this way, memory and subsequent reasoning consist of trajectories within the network, the mapping of which represents the set of stored information, which can be kept over time or deleted as needed. This is the way a biological neural network "learns" and "remembers." Any further information will follow its own path: if the new path coincides with an active trajectory, the information will be recognized; otherwise, the signal will sooner or later be blocked by inactive synaptic interconnections [67].

In this way, the neural network remembers and processes simultaneously, in a spatial coexistence that traditional computers cannot achieve: they are based on the Von Neumann architecture, which provides one or more processors connected to various separate, external peripherals, including memory. Whenever the computer needs information, it must access the memory to bring data back to the processor. This operation costs machine time and energy. The neuromorphic paradigm, on the other hand, aims to unify processing and memory, as happens in the biological domain. Overcoming this dichotomy is possible by creating neuromorphic architectures: by exploiting the functional geometries typical of the nervous system, information can be stored and processed in the same physical location, unifying memory and processor. In 2011, C. David Wright introduced the use of PCM for arithmetic and bio-inspired computation [68] and provided the first experimental proof of principle of a PCM-based "processor", performing the four basic operations of addition, multiplication, division, and subtraction while storing the results at the same time. In the same year, D. Kuzum reported new nanoscale electronic synapses based on PCM for optical data storage and non-volatile storage [69]. Continuous resistance transitions in PCM [70] and saturable-absorber composite materials have been used to emulate the properties of biological synapses and thus realize synaptic learning rules [71]. In 2017, Alexander N. 
Tait of Princeton University published a paper on neuromorphic silicon photonics, introducing the world's first integrated photonic neural network [72]. It uses a neural compiler to program a silicon photonic neural network with 49 nodes; each node operates at a specific wavelength, light from each node is detected and summed before being fed into the laser, and the output is then fed back to create a feedback loop with nonlinear characteristics. Tait et al. simulated traditional neural networks, demonstrated how photonic neural networks can solve differential equations, and found that photonic neural networks on silicon photonic platforms can be connected to ultrafast information-processing environments for radio control and scientific computation.

It should be noted that most of the platforms introduced so far as photonic or electronic neural networks are fixed structures that rigidly perform their computation without the capability of changing their interconnections on demand [73]. This aspect requires modifiable, plastic materials and/or devices, that is, ones capable of assuming different behaviors depending on the information to be stored. In fixed structures, the configuration of the neurons and their interconnections is written and predefined, so they can only perform certain limited functions, whereas biological neurons can dynamically modify their interconnections during training: they can establish new interconnections or, if required, diminish or strengthen the weight of specific interconnections at synaptic points. Recently, inspired by the biological brain, reinforcement learning methods based on the memory of past experiences have been realized in photonic platforms via solitonic interconnections. Thanks to the plasticity of photorefractive materials, a light beam is able to locally vary the refractive index of the host material and create a channel within which it can propagate without diffraction. This solitonic signal changes the refractive index of the medium much like pheromone-mediated indirect communication. The repetition and intensity profile of the incoming signals affect the formation of the waveguide channel by exploiting the nonlinearity of the refractive index. The channel can also be used by other beams, which recognize it as a waveguide. Moreover, these interconnections have a specific lifetime, and the strength of their weights depends on the extent of their use. The interconnection's existence and strength is a self-driven process in which the signal itself reconfigures its pathway through its recurrence. Consequently, any interconnection that is not activated for a long time will be diminished and taken out of the computation cycle, in favor of the highly exploited ones. Depending on the material used, the waveguide will either cancel itself completely when the writing light is switched off (rapid dielectric relaxation) or survive for a shorter or longer time (slow dielectric relaxation).
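The use-it-or-lose-it dynamics just described — reinforcement by each signal passage, exponential decay of unused channels with the dielectric relaxation time — can be caricatured in a few lines. The update rule, rates, and time constant below are illustrative assumptions, not a model taken from the literature:

```python
import math

# Pheromone-like lifetime of a solitonic interconnection (sketch):
# every time step the channel strength relaxes with time constant tau;
# each signal passage reinforces it. All values are illustrative.

def update_weight(w, used, dt, tau=10.0, gain=0.5, w_max=1.0):
    """One time step of channel strength w in [0, w_max]."""
    w *= math.exp(-dt / tau)          # dielectric relaxation (decay)
    if used:
        w = min(w_max, w + gain)      # reinforcement by the signal itself
    return w

# A frequently used channel saturates near w_max,
# while an idle one fades out of the computation cycle.
w_busy, w_idle = 0.0, 0.8
for step in range(50):
    w_busy = update_weight(w_busy, used=True, dt=1.0)
    w_idle = update_weight(w_idle, used=False, dt=1.0)

print(round(w_busy, 3), round(w_idle, 3))  # → 1.0 0.005
```

After fifty steps the exploited channel has saturated while the unused one has effectively vanished, mirroring the selective reinforcement-forgetting behavior of the solitonic waveguides.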

Solitonic guides are, therefore, completely plastic guides, induced by a modification of the material, and can be suitably reshaped by other light passing through them. At present, no artificial neuroplastic hardware exists whose networks can reorganize themselves autonomously, although this is the only way to reproduce artificial systems similar to biological ones. An extremely promising route is represented by soliton optical neural networks, which exploit the plasticity of the refractive index to create circuits whose interconnections can be activated or inhibited as required by the information to be stored or processed. In 2018, a collaboration between Sapienza and Nanyang Technological University in Singapore demonstrated that X-junctions formed by soliton waveguides can learn information [55]. Recently, it has been shown that X-junctions can perform both supervised and unsupervised learning, behaving as neurons that fully exploit the plasticity of the substrate both to write the circuit and to modify it afterwards based on the evolution of the system [74]. By exploiting X-junctions as elementary units, it is possible to create complex neural networks capable of storing information as specific trajectories within the circuit network [75].

## **4. Solitonic X-junctions as photonic neurons: Supervised and unsupervised learning**

The solitonic neuron is a device capable of reproducing the fundamental characteristics of the learning and memorization processes typical of biological neurons. From a biological point of view, the neuron, the fundamental unit of the nervous system, is a dynamic unit capable of self-assembling and self-modifying according to the information that arrives. These structural changes mirror the unfolding of learning and memorization [76]. The capacity for self-organization is not local; that is, it does not affect individual units independently as if they were non-communicating structures. Whenever a certain type of information presents itself at the gates of the nervous system, through the different types of receptors of which it is composed, an enlarged (global) mechanism is set in motion, influencing pathways within the nerve mapping and affecting neurons through connections both in parallel and

## *Optical Soliton Neural Networks DOI: http://dx.doi.org/10.5772/intechopen.107927*

in series. This characteristic interconnectedness underlies the functional complexity of the nervous system and at the same time represents its strength. This is why an event at a precise point on the neural map can trigger a succession of changes culminating in a complete reorganization of entire neural regions. One property we have already discussed, but which deserves renewed emphasis, is plasticity. This is a key feature for several reasons. A good approximation of the concept of plasticity is the expression "dynamic self-organization" [44]. It is typical of systems that do not remain identical to themselves at either the functional or the structural level [77]. More precisely, plastic hardware merges these two aspects: function becomes synonymous with structure. This is one of the fundamental properties of biological neural tissue. In conceiving an artificial hardware neuron that works on the biological model, some characteristics should therefore be kept in mind. First of all, it must have a dynamic structure able to adapt to the evolution of the environment and respond to it in a nonlinear way. Furthermore, it must keep a chronology of the information processed at the same time as it performs analysis and learning. A schematic of the functional blocks that characterize the "*modus conoscendi*" of a neuron is shown in **Figure 6a**.
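As a point of reference for what follows, the elementary operation of the functional blocks in **Figure 6a** — inputs weighted and summed, then compared against a threshold that decides whether the signal propagates — can be sketched as follows. The weights and threshold values are illustrative assumptions, not measured quantities:

```python
# Minimal weighted-sum-and-threshold caricature of a neuron (Figure 6a):
# dendritic inputs are weighted and summed in the soma; the result
# propagates along the axon only if it exceeds a threshold.

def neuron(inputs, weights, threshold=1.0):
    """Return the propagated signal, or 0.0 if sub-threshold (blocked)."""
    soma = sum(i * w for i, w in zip(inputs, weights))
    return soma if soma >= threshold else 0.0

print(neuron([0.9, 0.8], [0.7, 0.9]))  # supra-threshold: propagated
print(neuron([0.2, 0.1], [0.7, 0.9]))  # sub-threshold: blocked
```

The solitonic neuron described below realizes this same scheme physically, with light intensities as signals and refractive-index contrasts as weights.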

We can highlight a tripartite structure: the neuron receives signals through the dendrites, small branches acting as input channels. The information collected there is conducted to the soma, the central body of the neuron, which acts as a true microprocessor. Here the signals are "read" and analyzed through weighting and comparison operations against a threshold value. A signal above the threshold is highly informative, so it must be stored and propagated along the neural mapping. On the contrary, a sub-threshold signal is judged unimportant at the informational level, and its transmission is therefore stopped [78]. The axon is a long channel whose task is to carry the signal out and distribute it to the following neurons through special connections called synapses. These are the basis of communication between different units and are the entities that allow complex neural mappings to be realized. The solitonic photonic neuron, which the research group of the

#### **Figure 6.**

*(a) Fundamental scheme of a biological neuron. (b) the solitonic neuron X-junction structure. (c) Perfectly balanced X-junction.*

Smart and Neuro Photonics Lab has designed and built, has a functional geometry very close to the one just described. A soliton neuron [55] is characterized by an X-junction structure [79], as shown in **Figure 6b**, obtained through the intersection of waveguides self-written by two self-confining, non-diffracting laser beams. Using the technology of spatial solitons obtained through the Pockels effect [13, 80], writing takes place through a local variation of the refractive index induced by the incoherent laser light beams. All materials with a saturating nonlinear electro-optical coefficient can be used. The input channels functionally represent the dendrites that collect the input signals. The soliton soma coincides with the region in which the two laser beams progressively approach until they overlap. It is in this region, which by analogy with ML models we call the solitonic node, that the nonlinear energy transfer between the channels takes place, which, as we will see shortly, enables the learning process. The output channels, which allow a subsequent redistribution of the propagated signal, replicate the functional action of the axon. For the soliton soma to form and be functionally active, the laser beams must arrive at the input face of the crystal at an extremely small angle with respect to the normal, between 0.8° and 1°. For other angles, the node area is too limited, resulting in low coupling between the waveguides. The soliton neuron can perform supervised and unsupervised learning tasks [55, 74]. From a theoretical point of view, supervised learning is performed using a ground truth; in other words, there is prior knowledge of what the output values to be learned should be [81–83]. If the learning is unsupervised, on the contrary, there is no a priori knowledge of the desired output, which is identified at the same time as the learning takes place [84, 85]. 
The substantial difference lies in the way in which the already-written waveguide structure is modified. In the supervised case, indeed, it is necessary to know the target and therefore to guide the learning. The X-junction is modified using a feedback system that locally alters the refractive index contrast, depending on the information received, through successive cycles (**Figure 7**). This mechanism is fully

#### **Figure 7.**

*The X-junction neuron switches from the balanced outputs (a) to the unbalanced behaviors, either due to feedback on the alpha channel (b) or due to feedback on the beta channel (c). Learning dynamics of the solitonic junction: starting from the initial neutral condition 50/50, the junction recognizes the input and switches accordingly.*

explained in [74], where an FDTD numerical code solving the nonlinear equation (Eq. (15)) reported below shows the morphological evolution of the neuron (see **Figure 7**), an index of the learning taking place.

$$\nabla^2 A_i = -\frac{\varepsilon_{NL} E_{bias}}{1 + \frac{\left|A_1\right|^2 + \left|A_2\right|^2}{\left|A_{SAT}\right|^2}} A_i \tag{15}$$

where $\varepsilon_{NL}$ is the nonlinear dielectric constant, $E_{bias}$ is the electrostatic bias field that allows the formation of photorefractive solitons, and $|A_{SAT}|^2$ is the saturation intensity. In this type of learning, only the A1 and A2 beams are able to excite the nonlinearity underlying the index modification. The information signal is indeed a laser at a different wavelength, to which the refractive index is not sensitive. The initial situation is a perfectly balanced X-junction, as shown in **Figure 7a**, characterized by a symmetrical structure obtained using two laser beams with the same input power. The injected signal, having reached the solitonic node, "perceives" the same index on both sides and divides itself equally, 50% into each output channel. By using different power ratios in the writing phase, it is possible to build asymmetrical structures (**Figure 7b** and **c**). In this case, the index begins to differentiate already within the area of the soliton soma, resulting in an unequal division of the input information between the two outputs. However, the soliton neuron is also able to perform unsupervised learning tasks. In this case, the refractive index of the crystal is also sensitive to the wavelength of the signal, which, by propagating within the previously written structure, is able to change it. The information becomes directly responsible for the asymmetrization of the junction. For unsupervised learning, the Helmholtz equation becomes:

$$\nabla^2 A_i = -\frac{\varepsilon_{NL} E_{bias}}{1 + \frac{\left|A_1\right|^2 + \left|A_2\right|^2 + \eta \left|A_3\right|^4 \left(1 - e^{-t/\tau}\right)}{\left|A_{SAT}\right|^2}} A_i \tag{16}$$

where $A_3$ represents the information signal and η is an efficiency coefficient of the nonlinear process that depends on the wavelength and the material used. These structural variations can be the result of numerous successive propagation cycles or of single events characterized by much higher powers. This is another point of similarity with the biological case. The biological signal, called a spike, is propagated toward the axon when the combination of input signals is above threshold. This can occur through the accumulation of numerous inputs (spike trains) in a limited time interval, or by virtue of a single very intense signal.

In the solitonic case, learning is therefore identified with the process of changing the refractive index, and thus has a direct physical translation. What about memory? Many neuromorphic implementations, both in electronics and in optics, have achieved remarkable results in reproducing a neural system; however, it remains difficult to define a memory co-located with the processing unit. The soliton X-junction introduces a new paradigm in neuromorphic research, approaching the nature of biological neurons. The index modification is in general a semi-permanent property, with times that depend on the particular material used. The input information is therefore saved in the particular morphological structure obtained during the learning phase. In [55], the authors show the possibility of building soliton neurons in bulk LiNbO3 crystals. This represents the first supervised realization. The neuron is able to convey information, represented by a

signal at a different wavelength, traveling within the waveguides in the directions set by the local refractive index. Starting from these results and integrating them with the technology of spatial solitons in lithium niobate thin films [86], reference [87] demonstrates the possibility of implementing soliton neurons in 8 μm lithium niobate films. This technology brings numerous new benefits. First of all, its extreme compactness makes these neurons a useful tool for integration into small devices. Furthermore, lithium niobate films show focusing dynamics two orders of magnitude faster than their bulk counterparts. Finally, the films offer greater control over the propagation of the beams within the crystal, ensuring considerable precision, which results in stronger coupling and, ultimately, a better-performing soliton soma. By virtue of their plastic behavior, soliton X-junction neurons can be interfaced in more complex structures to give rise to complex neural mappings capable of functionally replicating biological neural tissue. This perspective represents the great innovation of the solitonic neuromorphic approach, which is not limited to reproducing a single unit or connection, as in previous neuromorphic models, but is able to reach a higher and more complete level of complexity through the realization of an entire neural environment.

## **5. Bit-to-bit data storage and recognition**

Solitonic neurons can be interconnected to form complex neural maps, or soliton neural networks (SNNs) [75]. Their functioning is based on the movement of photogenerated electrical charges, which assume the same role played by neurotransmitters in biological neural networks (BNNs). Both regulate the intensity with which a synaptic connection, solitonic or biological, is built, modified, or destroyed. Furthermore, the solitonic synapse, exactly as in the biological case, is the basis of the memorization processes. The repetition of information results in synaptic strengthening, which is synonymous with information memorization [88]. Therefore, learning and memorization are processes that occur through structural changes. **Figure 8** shows a summary diagram of the functionality of BNNs and SNNs.

### **Figure 8.**

*Functional diagrams on the left of a BNN network and on the right of an SNN network. Both are able to self-modify their structure according to the information signals received, to process and store them in precise neural patterns [75].*

Recently, an SNN able to carry out 4-bit recognition has been studied. It is formed by X-junction channels written with equal-power beams in order to create 50–50 junctions. SNNs are divided into successive layers, as reported in **Figure 9**.

The first layer corresponds to the input face and is characterized by a number N of channels, corresponding to the number of information bits to be processed and to the number of incoming laser sources. The SNN exploits the phenomenon of total reflection at the edges in correspondence with the even layers, which are therefore characterized by N/2 − 2 neural units, while the odd layers are characterized by N/2 fundamental units.

This network is able to learn by switching the propagating signal between the two outputs of each X-junction. By appropriately increasing the size of the matrix, it is possible to obtain the representation of any SNN. Each channel, therefore, has its own weight, which is modified over time based on the information received, as reported in Eq. (17).

$$\vec{Y} = W\left(E_{BIAS}, \vec{X}\right) \cdot \vec{X} \tag{17}$$

For an in-depth analysis of the SNN, we recommend reading [75].

An SNN is, at present, able to perform episodic recognition. This term derives from psychology studies that identified three modes of memory: episodic, procedural, and semantic [89]. Memory is of the episodic type if it records an event photographically, that is, it fails to decontextualize the subjects present [90]. Consider the picture of a dog running in the mountains. The dog is recognized only in that environment (mountains) and in that position (running); if moved, it will be identified as different. Procedural memory, on the other hand, identifies a mechanism and learns its rule. Finally, semantic memory contains these mechanisms within itself, thus reaching abstraction through the analysis of details.

#### **Figure 9.**

*Structure of a 4-bit SNN network. W is the weight of the junction point. In particular, W(1) is the weight relative to the node of the first solitonic neuron in layer 1, W(2) is the weight relative to the node of the second solitonic neuron in layer 2, and so on. The information inputs are represented by xi, while yi represents the processed signals [75].*

Solitonic technology has so far allowed the successful realization of an episodic memory able to save information through precise neural mapping. As information flows into the SNN, it modifies the refractive index of the network, determining precise paths.

Learning in the SNN takes place in two stages. The first phase, called training, consists in presenting the information pattern to be learned to the network several times; the network changes its morphology accordingly. We then try to assess how profound the changes have been, that is, how much of the information has been learned and memorized. This phase is called validation: the network acts as a filter, letting only the saved information propagate.

**Figure 10** was realized starting from the results proposed in [75]. It shows the learning of 1 bit in four different cases corresponding to the four input channels. In **Figure 10a**, the first line shows the network training with the digit 1 in each channel in turn while the others are set to 0. The images below report the SNN recognition process: the network uses the stored information to operate the comparison. Therefore, if digit 1

#### **Figure 10.**

*In (a) training and validation processes of a 4-bit SNN are reported in 1-digit recognition case. The first line is related to the training phase while in the following rows validation steps are reported. In (b) the signal output amplitudes for different training numbers are reported: Only the trained channel is above the threshold (dotted line) [75].*

#### **Figure 11.**

*In (a) training and validation processes of a 4-bit SNN are reported in 2-digit recognition case. The first line is related to the training phase while in the following rows validation steps are reported. In (b) the signal output amplitudes for different training numbers are reported: Only the trained channel is above the threshold (dotted line) [75].*

**Figure 12.**

*In (a) training and validation processes of a 4-bit SNN are reported in 3-digit recognition case. The first line is related to the training phase while in the following rows validation steps are reported. In (b) the signal output amplitudes for different training numbers are reported: Only the trained channel is above the threshold (dotted line) [75].*

of the new number corresponds to digit 1 of the training number, the output of the network is high and the information is recognized. Otherwise, the output is low, meaning no recognition. **Figures 11** and **12** report the learning cases of 2 digits and 3 digits, following the scheme already illustrated in **Figure 10**.

The SNN recognizes through a threshold process. If the output is higher than a threshold, determined experimentally, then recognition has occurred. This procedure can be generalized to N bits according to Eq. (18).

$$I_k^{output_i} \ge \theta I_k^{input_i} \tag{18}$$

where θ is a dimensionless threshold coefficient (here 0.7).
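Equation (18) translates directly into code; the intensities used below are made-up illustrative values, not measured outputs:

```python
# Threshold recognition rule of Eq. (18): bit k is recognized when its
# output intensity is at least theta times its input intensity.

THETA = 0.7   # experimentally determined pure number (from the text)

def recognized(i_out, i_in, theta=THETA):
    return i_out >= theta * i_in

# 4-bit validation: compare outputs with the presented pattern,
# checking the active bits (illustrative intensities).
pattern = [1.0, 1.0, 0.0, 0.0]
good    = [0.9, 0.8, 0.0, 0.0]   # trained network: strong outputs
bad     = [0.9, 0.3, 0.0, 0.0]   # untrained bit 2: weak output

def match(outputs, inputs):
    return all(recognized(o, i) for o, i in zip(outputs, inputs) if i > 0)

print(match(good, pattern), match(bad, pattern))   # → True False
```

A single sub-threshold active bit is enough to reject the pattern, which is exactly the filtering behavior of the validation phase described above.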

Optical soliton neural networks are therefore systems characterized by structural dynamism, based on the plasticity of the refractive index, which can self-modify to recognize previously learned or new signals. Learning and memorization occur simultaneously with the physical evolution of the structure.

## **6. Conclusions**

Artificial intelligence is driving profound innovation in everyday life. To overcome the limitations of AI software, research has developed the neuromorphic approach, which consists in reproducing the functional blocks of the human brain. A first attempt was made in electronics, which, however, suffers from a structural rigidity that does not match neural geometries. One of the fundamental qualities characterizing those geometries is in fact plasticity, that is, the ability of a system to self-modify its units in order to trap learning and memory in its structure. The solitonic optical approach we have described in this chapter bases its effectiveness precisely on the concepts of plasticity and self-assembly. Compared with other optical technologies, which focus on single neural properties (first of all excitability), soliton networks are able to reproduce complex behavior by exploiting local differences in refractive index to build specific trajectories for each piece of information through the propagation of solitons. SNNs are currently able to reproduce a specific type of memory, episodic

memory, in a particularly effective way, that is, with small powers (nW–μW) and with extremely low losses. SNNs capable of reproducing procedural and semantic memories are currently being studied. Once these objectives have been achieved, hardware that is functionally very close to biological neuronal dynamics will be available. In the biological neural system, synaptic connections are created and deleted following changes in neurotransmitter density; in the soliton paradigm that we propose, the birth and modification of X-junction neurons depend on the density of photo-excited electric charges.

## **Acknowledgements**

The authors would like to acknowledge all alumni that have been working previously in the Ultrafast Photonics Lab and then in the Smart and Neuro Photonics of Sapienza Università di Roma: all of them participated in increasing the in-depth knowledge we now have on photorefractive spatial solitons and their enormous application possibilities.

## **Conflict of interest**

The authors declare no conflict of interest.

## **Author details**

Eugenio Fazio\*, Alessandro Bile and Hamed Tari Smart and Neuro Photonics Laboratory, Department of Fundamental and Applied Sciences for Engineering, Sapienza Università di Roma, Roma, Italy

\*Address all correspondence to: eugenio.fazio@uniroma1.it

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


## **References**

[1] Feldmann J, Youngblood N, Wright CD, Bhaskaran H, Pernice WH. All-optical spiking neurosynaptic networks with self-learning capabilities. Nature. 2019;**569**(7755):208-214

[2] De Lima TF, Shastri BJ, Tait AN, Nahmias MA, Prucnal PR. Progress in neuromorphic photonics. Nanophotonics. 2017;**6**(3):577-599

[3] Inagaki T, Inaba K, Leleu T, Honjo T, Ikuta T, Enbutsu K, et al. Collective and synchronous dynamics of photonic spiking neurons. Nature Communications. 2021;**12**(1):1-8

[4] Geng X, Hu L, Zhuge F, Wei X. Retina-inspired two-terminal optoelectronic neuromorphic devices with light-tunable short-term plasticity for self-adjusting sensing. Advanced Intelligent Systems. Jun 2022;**4**(6):2200019

[5] Gerstner W, Kistler WM, Naud R, Paninski L. Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition. Cambridge: Cambridge University Press; 2014

[6] Chiao RY, Garmire E, Townes CH. Self-trapping of optical beams. Physical Review Letters. 1964;**13**(15):479

[7] Shabat A, Zakharov V. Exact theory of two-dimensional self-focusing and one-dimensional self-modulation of waves in nonlinear media. Soviet Physics JETP. 1972;**34**(1):62

[8] Barthelemy A, Maneuf S, Froehly C. Propagation soliton et auto-confinement de faisceaux laser par non linearité optique de Kerr. Optics Communications. 1985;**55**(3):201-206

[9] Aitchison JS, Weiner A, Silberberg Y, Oliver M, Jackel J, Leaird D, et al. Observation of spatial optical solitons in a nonlinear glass waveguide. Optics Letters. 1990;**15**(9):471-473

[10] Segev M, Crosignani B, Yariv A, Fischer B. Spatial solitons in photorefractive media. Physical Review Letters. 1992;**68**(7):923

[11] Crosignani B, Segev M, Engin D, Di Porto P, Yariv A, Salamo G. Self-trapping of optical beams in photorefractive media. JOSA B. 1993; **10**(3):446-453

[12] Valley GC, Segev M, Crosignani B, Yariv A, Fejer M, Bashaw M. Dark and bright photovoltaic spatial solitons. Physical Review A. 1994;**50**(6):R4457

[13] Segev M, Valley GC, Crosignani B, Diporto P, Yariv A. Steady-state spatial screening solitons in photorefractive materials with external applied field. Physical Review Letters. 1994;**73**(24): 3211

[14] Zozulya A, Saffman M, Anderson D. Propagation of light beams in photorefractive media: Fanning, self-bending, and formation of self-pumped four-wave-mixing phase conjugation geometries. Physical Review Letters. 1994;**73**(6):818


## **Chapter 7**

## Application of Artificial Neural Network in Solar Energy

*Bin Du and Peter D. Lund*

## **Abstract**

Accurate prediction of system performance is very important for the optimal planning of solar energy systems. The latest research on artificial neural network (ANN) technology for predicting the efficiency of solar thermal systems and the performance of photovoltaic systems is reported here. The application of ANNs to the performance assessment of solar collectors, including novel all-glass straight-through evacuated tube collectors, is briefly reviewed. An overview of the most recent work on ANNs for combined photovoltaic/thermal (PV/T) panels and concentrating photovoltaic collectors is also provided.

**Keywords:** artificial neural network, solar collector, performance prediction, thermal efficiency, photovoltaic/thermal, concentrating photovoltaics

## **1. Introduction**

The growth of the world's population and industry requires the massive use of fossil fuels [1], resulting in environmental pollution and global warming. Renewable energy is one of the effective ways to alleviate this problem [2], and solar energy is the most rapidly developing and widely used renewable energy technology. Its current applications include solar power generation [3], seawater desalination [4], heating [5], refrigeration [6], etc. To estimate the efficiency of solar thermal systems, experimental studies and theoretical simulation codes are often employed [7, 8]. The traditional algorithms are usually very complex, involving the solution of complicated differential equations, which requires substantial computational resources and time to obtain exact solutions [8]. Moreover, traditional analysis methods often rest on simplifying assumptions, simplified models, and the solution of nonlinear partial differential equations, which reduce prediction accuracy [9–11]. An ANN is a mathematical method that mimics the behavior of the human brain. It has a strong ability to learn and to find nonlinear relationships between the inputs and outputs of a system [12], realizing information processing by adjusting the connections between internal nodes [12]. Unlike the complex laws and mathematical routines of traditional analysis methods, an ANN can learn key information patterns in a multidimensional information domain [8]. ANN technology therefore has clear advantages in speed, organization ability, fault tolerance, and adaptability [8]. In recent years, ANN has found more and more applications in the solar energy field, such as solar radiation prediction [13–17], photovoltaic power generation [18–20], solar drying [21], etc.

## **2. Background**

The solar collector converts solar radiation into heat and transfers it to a heat transfer fluid [7]. The application of ANN technology in energy-engineering systems has recently attracted growing attention, and ANNs have been used by many researchers to model and predict the thermal performance of various solar collectors. Delfani et al. [22] employed an ANN to determine the efficiency of a direct absorption solar collector with nanofluid and investigated the influence of collector depth, collector length, and other important parameters on its performance and Nusselt number. Maria et al. [23] built ANN models to evaluate the efficiency of flat-plate solar collectors with silver/water nanofluid, with results in good agreement with experimental data. Cuma [24] and Kalogirou et al. [25] comparatively studied various methods to predict the performance of flat-plate solar collectors and of a solar water heater with a cylindrical concentrator, respectively; the ANN models clearly improved the prediction accuracy.

Many ANN algorithms have been employed to predict the performance of solar heating systems. Kumar et al. [26] investigated a roughened solar air heater, comparing three ANNs for evaluating its exergetic efficiency; the radial basis function (RBF) model clearly performed best. Abdellah et al. [27] compared the advantages and disadvantages of a traditional theoretical (energy-balance-based) method and an ANN (data-based) model for determining the performance of a heat pipe solar collector; according to the results, the ANN was significantly superior. Kumar et al. [28] used an ANN and a multiple linear regression model to evaluate heat transfer in a solar air heater with a rough absorber and compared their performance using several statistical criteria. Kumar et al. [29] further contrasted ANN models with four training functions for estimating the thermal performance of a uniform-flow porous-bed solar air heater; one of the training functions clearly outperformed the other three. They also analyzed the advantages and disadvantages of three ANN algorithms for predicting the thermal performance of a solar air heater with an unusual physical structure [30]. Liu et al. [31] proposed an evacuated solar water heater with high collector efficiency developed through a technology-based screening method. Sadeghi et al. [32] studied the factors affecting the exergy and energy efficiency of collectors and found that using a copper oxide/water nanofluid in a parabolic concentrator improved the thermal efficiency. Diez et al. [33] employed various methods to evaluate the outlet temperature of the working medium and concluded that a generalized regression neural network gives the best predictions.
ANN models using the above-mentioned inputs to estimate the characteristics of flat-plate collectors have also been presented; comparison with conventional analytical methods again indicated the superiority of the ANN models [34]. Budihardjo et al. [35] modeled and analyzed heat transfer and fluid flow in single evacuated tubes, and Morrison et al. [36] investigated the influence of the circumferential heat distribution on the performance of such tubes.

## **3. Application of ANN in an evacuated tube solar collector**

At present, the most popular evacuated tube solar collector (ETC) on the market is the Dewar tube [37, 38], because it is cheap and easy to manufacture [38]. As the fluid flow in a Dewar tube is driven only by buoyancy [39–41], salt tends to deposit at the bottom of the tube, which worsens heat transfer. In the all-glass straight-through evacuated tube solar collector, by contrast, stronger convection promotes heat transfer, reduces heat losses, and improves water quality, eliminating the salt precipitation and weak convective heat transfer inherent in traditional Dewar tubes [37, 42]. The better performance and higher efficiency [36, 43–45] also lower the cost.

### **3.1 Experimental set-up**

The structure of the all-glass straight-through evacuated tube collector is shown in **Figure 1**. Both ends of the inner tube (absorption tube) and outer tube (cover glass tube) are fused together. The space between the inner and outer tubes is evacuated to a pressure below 0.013 Pa to reduce convective heat loss. A selective absorption coating is applied to the outer surface of the inner tube. The working temperature of the inner tube is therefore higher than that of the outer tube, and the temperature difference leads to thermal stress. For safe and stable operation of the evacuated tube, the outer tube is made of glass with a high thermal expansion coefficient, which can withstand this thermal stress. The detailed structure of the tube is given in **Table 1**.

The heat transfer fluid used in the experiment is water, which flows through the all-glass straight-through evacuated tube solar collector. The collector inlet and outlet temperatures and ambient temperature were measured by thermocouples and recorded by a data logger. The water flow rate is measured by the rotameter placed at the inlet of the tube, as illustrated in **Figure 2**.

During the experiment, water flows through the evacuated tube driven by a pump, and the flow is adjusted and stabilized by a valve connected to the flow meter. The solar radiation intensity is measured by a pyranometer. The experimental site is Nanjing, China. The recorded data include the solar radiation intensity, wind speed, ambient temperature, inlet and outlet water temperatures, and water flow rate. The experiment ran from 10 a.m. to 4 p.m. every day, and the data were recorded every 30 minutes [30].

#### **Figure 1.**

*(a) Overview of the all-glass straight-through evacuated tube (b) Cross-section view of the inlet of the collector tube [51].*

#### *Artificial Neural Networks - Recent Advances, New Perspectives and Applications*


#### **Table 1.**

*Parameters of an all-glass straight through evacuated tube [11].*

**Figure 2.** *Setup of the experimental system [12].*

### **3.2 Methodology**

Most of the solar energy absorbed by the evacuated tube is transferred from the inner wall of the absorber tube to the working fluid flowing through it by convection. Another part of the energy is transmitted to the inner wall of the outer glass tube by radiation and convection and passes through the outer glass tube by conduction. Heat is lost from the outer surface of the outer glass tube to the environment by convection and to the sky by radiation.

Thermal efficiency is the most important criterion for evaluating the performance of an evacuated tube solar collector. Here, the thermal efficiency is defined as the ratio of the heat gained by the heat transfer fluid to the incident solar flux on the tube [30], written as follows [22, 46–48]:

*Application of Artificial Neural Network in Solar Energy DOI: http://dx.doi.org/10.5772/intechopen.106977*

$$\eta_{\mathrm{th}} = \frac{\dot{m}_f \, C_p \left(T_{fo} - T_{fi}\right)}{I A_p} \tag{1}$$

where *IA<sub>p</sub>* represents the incident solar flux on the tube surface, *T<sub>fi</sub>* and *T<sub>fo</sub>* are the inlet and outlet water temperatures, respectively, *ṁ<sub>f</sub>* is the mass flow rate of the heat transfer fluid, and *C<sub>p</sub>* is the specific heat of the heat transfer fluid (J/(kg·K)).
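As a numerical check on Eq. (1), a minimal Python sketch; the function name and the sample values (flow rate, temperature rise, absorber area) are illustrative assumptions, not data from the chapter:

```python
def thermal_efficiency(m_dot, c_p, t_out, t_in, irradiance, area):
    """Eq. (1): heat gained by the fluid over the incident solar flux."""
    q_gain = m_dot * c_p * (t_out - t_in)  # W: useful heat absorbed by the water
    q_solar = irradiance * area            # W: solar flux intercepted by the tube
    return q_gain / q_solar

# Illustrative case: 25 kg/h of water heated by 6 K under 900 W/m^2
# on an assumed 0.3 m^2 absorber area
eta_th = thermal_efficiency(25 / 3600, 4186, 304.0, 298.0, 900.0, 0.3)
```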

### **3.3 ANN modeling**

Multiple linear regression (MLR), support vector regression (SVR), a back-propagation neural network (BP), and a radial basis function network (RBF) are employed here to predict the thermal efficiency of the all-glass straight-through evacuated tube collector. The following variables are used as input parameters of the models: water flow rate *m<sub>f</sub>*, inlet water temperature *T<sub>fi</sub>*, wind speed *w<sub>a</sub>*, ambient temperature *T<sub>a</sub>*, and solar radiation intensity *I*. The thermal efficiency of the solar collector *η<sub>th</sub>* is the output layer. In this work, 70% of the 158 experimental datasets are used for training and the remaining 30% for testing. In the ANN models, the optimum number of neurons in the hidden layer is estimated with the equation recommended by Ghritlahre et al. [7]:

$$H_n = \frac{M + N}{2} + \sqrt{T_n} \tag{2}$$

where *H<sub>n</sub>* is the number of hidden neurons, *M* and *N* are the numbers of input and output neurons, and *T<sub>n</sub>* is the number of training data.
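A quick evaluation of Eq. (2) for the configuration used here (5 inputs, 1 output, roughly 110 training samples, i.e., 70% of the 158 datasets) gives a value close to the 13 neurons found best in Section 3.4:

```python
import math

def hidden_neurons(n_inputs, n_outputs, n_train):
    """Eq. (2): recommended hidden-layer size (round to a nearby integer)."""
    return (n_inputs + n_outputs) / 2 + math.sqrt(n_train)

h_n = hidden_neurons(5, 1, 110)  # about 13.5, consistent with the 10-16 range tested
```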

#### *3.3.1 Data preparation*

The measured variables often differ widely in magnitude and in units, which can seriously degrade prediction performance. Data normalization is therefore essential to eliminate these dimensional effects. Here, the normalization is expressed as:

$$\mathbf{Y\_{norm}} = \frac{\mathbf{Y\_i} - \mathbf{mean}}{\mathbf{std}} \tag{3}$$

where mean and std are the mean and standard deviation of the training samples. The normalized data fall in a consistent range, which is beneficial for further processing and analysis.
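A minimal sketch of Eq. (3) in Python; the essential point is that any sample, including test data, is scaled with the mean and standard deviation of the training set only. Function names and the sample temperatures are illustrative:

```python
import math

def fit_zscore(train):
    """Estimate mean and standard deviation on the training set only."""
    mean = sum(train) / len(train)
    std = math.sqrt(sum((x - mean) ** 2 for x in train) / len(train))
    return mean, std

def apply_zscore(values, mean, std):
    """Eq. (3): normalize a sample with the training-set statistics."""
    return [(x - mean) / std for x in values]

train_temps = [295.0, 298.0, 301.0, 304.0]  # illustrative inlet temperatures, K
mean, std = fit_zscore(train_temps)
normalized = apply_zscore(train_temps, mean, std)
```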

#### *3.3.2 Performance evaluation criteria*

Several criteria can be utilized to assess the accuracy of the proposed models. Their definitions are as follows:

Coefficient of Determination:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(X_{A,i} - X_{P,i}\right)^2}{\sum_{i=1}^{n} \left(X_{A,i} - \overline{X}_A\right)^2} \tag{4}$$

with $\overline{X}_A$ the mean of the actual values.

Root Mean Squared Error:

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum\_{i=1}^{n} \left( \mathbf{X}\_{\mathbf{A},i} - \mathbf{X}\_{\mathbf{P},i} \right)^{2}} \tag{5}$$

Mean Absolute Error:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| X_{A,i} - X_{P,i} \right| \tag{6}$$

where *n* is the total number of data points, *X<sub>A,i</sub>* is the actual efficiency of the collector, and *X<sub>P,i</sub>* is the predicted efficiency value.
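The three criteria can be sketched in Python using the standard definitions (R² measured against the mean of the actual values, MAE on absolute errors); the sample arrays are invented for illustration:

```python
import math

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def rmse(actual, predicted):
    """Eq. (5): root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error of the predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual = [0.55, 0.60, 0.65, 0.70]     # measured efficiencies (illustrative)
predicted = [0.56, 0.59, 0.66, 0.69]  # model outputs (illustrative)
r2 = r_squared(actual, predicted)
err_rms = rmse(actual, predicted)
err_abs = mae(actual, predicted)
```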

### **3.4 Results and discussion**

Based on Eq. (2), hidden layers with 10–16 neurons were tested with the BP algorithm, and the model with 13 neurons clearly performed best.

**Table 2** illustrates the comparison of RBF, BP, MLR and SVR models in predicting the thermal efficiency of all-glass straight-through evacuated tube.

It is evident that the accuracy of RBF is superior to that of the other methods, followed by the BP model, although SVR, BP, and RBF can all carry out the prediction successfully. Handling nonlinear problems is not a strength of the MLR algorithm [49]. For nonlinear problems, SVR first applies a nonlinear mapping that takes the input data into a high-dimensional feature space, where separation is much easier; it then performs regression in that feature space and maps the result back, yielding a nonlinear regression of the original input space. Nevertheless, SVR still uses a linear algorithm within the high-dimensional attribute space. By comparison, the major benefit of the neural network methods is that they efficiently capture complex nonlinear relationships among variables. Thus, the deviation between the neural network predictions of the evacuated tube thermal efficiency and the actual data is the smallest.

The comparison between the actual data and the prediction results of the proposed models is shown in **Figure 3**. It is evident that the results of RBF model are the closest to the actual data among the four models investigated.

#### *3.4.1 Sensitivity analysis*

Sensitivity analysis means finding, among many uncertain factors, those that have a significant impact on the model output, and quantifying their impact on and relative importance to the results. In short, sensitivity


**Table 2.**

*Accuracy of the models in performance prediction [11].*


#### **Figure 3.**

*(a) Comparison of experimental and MLR, SVR, BP and RBF predicted thermal efficiency. (b) Individual error with MLR, SVR, BP and RBF models [11].*

#### **Figure 4.**

*(a) Relative importance of input variables based on solar radiation. (b) Relative importance (%) of the inlet variables on the thermal efficiency of the evacuated tube [11].*

analysis shows which variables the conclusions are most sensitive to [50]. Taking the RBF model as an example, the relative importance of each input parameter to the output is illustrated in **Figure 4**. Clearly, solar radiation has the largest impact on the predicted efficiency of the proposed evacuated tube, followed by the collector inlet temperature and the water flow rate.
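A simple way to approximate such a relative-importance ranking is a one-at-a-time perturbation: nudge each input in turn and compare the resulting output changes. The sketch below uses an invented linear surrogate in place of the trained RBF model, so the coefficients and numbers are purely illustrative:

```python
def relative_importance(model, baseline, delta=0.10):
    """Perturb each input by +10% (others fixed) and normalize the
    resulting output changes to percentages."""
    y0 = model(baseline)
    effects = []
    for j in range(len(baseline)):
        x = list(baseline)
        x[j] *= 1.0 + delta
        effects.append(abs(model(x) - y0))
    total = sum(effects)
    return [100.0 * e / total for e in effects]

def surrogate(x):
    # Invented stand-in for the trained RBF model: efficiency rises with
    # irradiance (x[0]) and flow rate (x[1]) and falls with wind speed (x[2]).
    return 0.001 * x[0] + 0.5 * x[1] - 0.02 * x[2]

# Baseline: 900 W/m^2, 0.01 kg/s, 1.5 m/s (illustrative operating point)
importance = relative_importance(surrogate, [900.0, 0.01, 1.5])
```

For a trained network, a permutation-based importance computed over the test set would be the data-driven analogue of this sketch.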

The efficiency calculated by the RBF model is illustrated in **Figure 5**. As the solar radiation strengthens and the water flow rate increases, convective heat transfer in the tube is promoted and the thermal performance of the evacuated tube rises. **Figure 5(b)** shows how the thermal efficiency of the evacuated tube changes with water flow rate and wind speed at a solar radiation intensity of 900 W/m². Increasing wind speed promotes heat dissipation from the surface of the outer glass tube to the environment, raising the heat loss of the evacuated tube and lowering its thermal efficiency.

### **3.5 Combining CFD and ANN techniques modeling**

The dominant energy equations of the studied all-glass straight-through evacuated tube solar collector, as well as necessary heat and mass transfer and other related

**Figure 5.**

*(a) Efficiency vs. water flow rate (wind speed 1.5 m/s). (b) Efficiency vs. water flow rate (solar intensity 900 W/m²) [11].*

#### **Figure 6.**

*(a) Temperature distribution along the tube. (b) Inlet and outlet temperature. Irradiance is 1000 W/m² [51].*

conditions required for the theoretical analysis, have been explained in detail in [51]. A 3-D model of the proposed evacuated tube based on these equations and conditions was implemented in the computational fluid dynamics (CFD) software ANSYS Fluent [47, 52, 53] to carry out the heat transfer simulation.

**Figure 6** shows the temperature distribution of the evacuated tube obtained by numerical simulation of the tube model. In **Figure 6**, the inlet temperature is 298 K, the mass flow rate is 25 kg/h, and the solar radiation intensity is 1000 W/m².

The MLR, BP, and convolutional neural network (CNN) [54, 55] models were employed to determine the thermal characteristics of the all-glass straight-through evacuated tube solar collector. A total of 243 experimental datasets were used, of which 70% served for training and 30% for testing. The collector inlet water temperature, wind speed, water flow rate, ambient temperature, and solar radiation intensity, together with the values calculated by the theoretical CFD model, were used as inputs; the collector outlet water temperature and the thermal efficiency of the tube were the outputs (see **Figure 7**). The outputs obtained with and without the theoretical model + CFD input were compared with the experimental data.

The prediction accuracies of the studied models are illustrated in **Table 3**. The CNN model that takes the modeled value as one of its input parameters (CFD-CNN) scores best. Comparing the data in **Table 3**, when the modeled value of the collector outlet temperature is taken as one of the inputs, the prediction accuracies of the MLR, BP, and CNN models are all significantly enhanced (**Figure 8**).

#### **Figure 7.**

*Illustration of the integrated models [51].*


#### **Table 3.**

*Prediction accuracies of the different models [51].*

## **4. Application of ANN technique to photovoltaics**

Several previous studies have reviewed the application of ANNs to solar irradiance and photovoltaic (PV) power production forecasting, anomaly detection (fault diagnosis) in PV systems, maximum power point (MPP) tracking, etc. Here, an overview of the most recent work on ANNs for photovoltaic/thermal (PV/T) systems and concentrating PV (CPV) is presented.

Ammar et al. [56] investigated a PV/T-based water pumping and heating system in which the PV/T panel simultaneously delivers electrical power P and thermal power Q. An ANN model was developed to determine the optimal power point (OPOP), defined as the crossing point of the max(P O) curves. The focus was to calculate the optimal water flow rate under varying ambient temperature and solar radiation conditions so as to ensure maximum electrical and thermal power output. The proposed neural network model takes the solar radiation intensity and ambient temperature as the input layer, and the output is the corresponding optimal water flow rate. The normalized mean bias error (NMBE) was used to measure the accuracy of the ANN for ambient temperatures of 5–35°C and solar radiation from 350 to 950 W/m², yielding an NMBE of 13.05%. The collected data were divided into a cold and a hot season according to the weather, and the OPOP was computed for each. The results show that during the hot season, with its relatively stable weather conditions, the ANN estimates are more accurate. The ANN algorithm provides a feasible control strategy for similar PV/T systems.

Al-Waeli et al. [57] studied a photovoltaic/thermal system using a dedicated experimental rig for ANN analysis. Three cooling strategies were employed to verify the effectiveness of the design: PV/T with a water-filled container and water as the working fluid; PV/T with a phase change material (PCM)-filled container and water as the working fluid; and PV/T with a container filled with nanoparticles dispersed in PCM and a nanofluid as the working fluid, with a conventional PV panel as reference. The nano-PCM and nanofluid using SiC nanoparticles yielded the best cooling effect, with a maximum efficiency of 13.3%, compared with only 8.1% for the conventional PV panel. Three ANN models, namely MLP, SOFM, and SVM, were used to evaluate the performance of the investigated PV/T system, showing only slight differences in prediction performance.

Ahmadi et al. [58] developed ANN models, namely a multilayer perceptron (MLP), RBF, least squares support vector machine (LSSVM) and adaptive neuro-fuzzy inference system (ANFIS), to model the efficiency of a PV/T plate that contains a full circle tube as the fluid channel, bonded to the absorber plate by special adhesives. Solar radiation, heat, flow rate and inlet temperature were regarded as inputs, and the electrical efficiency as the output of these models. By comparing the RMSE and correlation coefficient (*R*<sup>2</sup>), the LSSVM approach gave the best accuracy with *R*<sup>2</sup> = 0.9867. A sensitivity analysis showed that the inlet temperature had the greatest impact on the efficiency of the proposed PV/T system.

ANN models were also used to predict the thermal efficiency of a PV/T system that has a serpentine tube connected to the plate and uses water as the cooling fluid [59]. MLP-ANN, ANFIS and LSSVM were employed with inlet temperature, water flow rate and solar irradiance as the input layer and the thermal efficiency of the solar collector as the output. The ANN model provided the best prediction performance when the mean squared error (MSE) and determination coefficient (*R*<sup>2</sup>) were used for the comparison. Here too, the inlet temperature proved to have the greatest impact on the thermal efficiency of the PV/T panel.

*Application of Artificial Neural Network in Solar Energy DOI: http://dx.doi.org/10.5772/intechopen.106977*

Cao et al. [60] explored six AI models, including least-squares support vector regression (LS-SVR), adaptive neuro-fuzzy inference systems (ANFIS), and four ANN methods, i.e., multi-layer perceptron (MLP), cascade feedforward (CFF), radial basis function (RBF) and generalized regression (GR), for evaluating the electrical efficiency of a PV/T system cooled by nanofluids. Through a comprehensive comparison of statistical indices such as the absolute average relative deviation (AARD), mean square error (MSE) and coefficient of determination (*R*<sup>2</sup>), it was found that the ANFIS model had the best prediction accuracy for the electrical efficiency of the studied PV/T system. The theoretical analysis also showed that the SiC-water nanofluid was the best coolant for the PV/T system.

In [61], three ANN methods, including the radial basis function artificial neural network (RBFANN), were employed to predict the performance of a photovoltaic thermal nanofluid (PVT/N) collector system equipped with a copper sheet-and-tube collector and zinc oxide (ZnO)/water nanofluid as coolant. Ten days of experimental data in various weather conditions were used to train and test the proposed AI approaches. Ambient temperature, incident solar radiation and fluid inlet temperature formed the input layer, while fluid outlet temperature and electrical efficiency were set as the output layer. The ANFIS was more accurate for predicting the fluid outlet temperature, but the RBFANN was superior to the other methods in predicting the electrical efficiency of the proposed PVT/N unit.

Renno et al. [62] compared the prediction performance of random forest (RF), ANN and linear regression model (LRM) approaches for predicting the temperature of multijunction solar cells. The studied cells consisted of InGaP/GaAs/Ge and InGaP/InGaAs/Ge under a high-concentration Fresnel lens. The input variables were the local hour, global radiation, concentration factor and environmental temperature, and the cell temperature was used as output. The RF method yielded the best performance, with the lowest values of RMSE, MAE and MAPE. It was observed that the cell temperature increased with increasing ambient temperature, solar radiation, and concentration ratio.

In [63], the power output of a V-trough photovoltaic system was predicted with support vector machine (SVM), ANN, kernel and nearest-neighbor, and deep learning (DL) methods. A comparison of statistical indices showed that the support vector machine gave the best prediction accuracy, although all the presented algorithms predicted the PV module power output satisfactorily. The ANN model was not inferior to the SVM algorithm in evaluating the peak data. The PV power output predicted by DL was higher than the actual data, which was likely due to the limited availability of training data.

## **5. Conclusions**

The application of artificial neural networks for performance prediction of solar energy collectors has been briefly reviewed here, including comparisons to traditional analysis methods.

Back propagation (BP), radial basis function (RBF), support vector regression (SVR) and multiple linear regression (MLR) were used to predict the performance of a novel all-glass straight-through evacuated tube solar collector using experimental datasets. The RBF and BP outperformed the SVR and MLR methods, but the accuracies of the first three models were well within acceptable limits (*R*<sup>2</sup> values of 0.8447, 0.9059 and 0.9658, respectively). The MLR algorithm, however, handled nonlinear problems poorly. The RBF method showed the best performance, with the lowest RMSE (0.0066) and the lowest MAE (0.0043) for the solar collector efficiency prediction.

A novel approach combining mathematical performance simulation (CFD) and neural networks was also investigated for determining the performance of the all-glass straight-through evacuated tube. The results show that using the CFD-modeled output as the ANN input significantly improved the evaluation accuracy of all proposed models, including MLR, BP and a convolutional neural network (CNN). The CFD-CNN model was superior to the other studied models, with the highest *R*<sup>2</sup> (0.9684) and the lowest RMSE (0.0044) (**Table 3**).

Research on applying ANN to photovoltaics was also reviewed, with a focus on the use of neural networks for output power prediction of photovoltaic/thermal systems (PV/T) and concentrating photovoltaics (CPV). The review demonstrated the usefulness of ANN for the PV field as well.

Future work on ANN in solar energy could extend to other design parameters and meteorological data as inputs to the neural network model. New ANN approaches such as recurrent neural networks could also be relevant. Further directions of interest include combining metaheuristic methods such as gray wolf optimization (GWO), genetic algorithms (GA) and particle swarm optimization (PSO) with ANN to optimize the network structure and improve its performance. Extensions of ANN, e.g., the extreme learning machine (ELM) and the adaptive network-based fuzzy inference system, can be used to improve prediction accuracy. Based on the work presented here, it is believed that artificial neural networks will increasingly be applied in the field of solar energy.

## **Acknowledgements**

Part of this work was funded by the National Natural Science Foundation of China (Grant number 51736006). The support of Aalto University is also acknowledged.

## **Author details**

Bin Du1,2\* and Peter D. Lund1,3

1 Key Laboratory of Solar Energy Science and Technology in Jiangsu Province, Institute of Energy and Environment, Southeast University, Nanjing, China

2 Energy Storage Research Center, Southeast University, Nanjing, China

3 School of Science, Aalto University, Espoo, Finland

\*Address all correspondence to: bindubill@hotmail.com

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


## **References**

[1] Can S, Sharp JL, Annick AA. Factors impacting diverging paths of renewable energy: A review. Renewable and Sustainable Energy Reviews. 2018;**81**:2335-2342

[2] Guven G, Sulun Y. Pre-service teacher's knowledge and awareness about renewable energy. Renewable Sustainable Energy Reviews. 2017;**80**: 663-668

[3] Jain S, Kumar Jain N, Jamie VW. Challenges in meeting all of India's electricity from solar: An energetic approach. Renewable Sustainable Energy Reviews. 2018;**82**:1006-1013

[4] Chen L, Huiyao W, Sarada K, Krishna K, Xu P. Low-cost and reusable carbon black based solar evaporator for effective water desalination. Desalination. 2020;**483**:1-15

[5] Pereira Da Cunha J, Eames PC. Compact latent heat storage decarbonization potential for domestic hot water and space heating applications in the UK. Applied Thermal Engineering. 2018;**134**:396-406

[6] Salilih Elias M, Birhane YT. Modelling and performance analysis of directly coupled vapor compression solar refrigeration system. Solar Energy. 2019; **190**:228-238

[7] Ghritlahre HK, Prasad RK. Application of ANN technique to predict the performance of solar collector system—A review. Renewable Sustainable Energy Reviews. 2018;**84**(3): 75-88

[8] Kalogirou SA. Applications of artificial neural-networks for energy systems. Applied Energy. 2000; **67**:17-35

[9] Elsheikh AH, Sharshir SW, Elaziz MA, Kabeel AE, Wang GL, Zhang H. Modeling of solar energy systems using artificial neural network: A comprehensive review. Solar Energy. 2019;**180**:622-639

[10] Bellos E, Tzivanidis C. Development of an analytical model for the daily performance of solar thermal systems with experimental validation. Sustainable Energy Technology Assessments. 2018;**28**:22-29

[11] Du B, Lund PD, Wang J, Kolhe M, Eric H. Comparative study of modelling the thermal efficiency of a novel straight through evacuated tube collector with MLR, SVR, BP and RBF methods. Sustainable Energy Technologies and Assessments. 2021;**44**:1-10

[12] Du B, Peter D, Lund WJ. Improving the accuracy of predicting the performance of solar collectors through clustering analysis with artificial neural network models. Energy Reports. 2022;**8**: 3970-3981

[13] Vakili M, Sabbagh-Yazdi SR, Khosrojerdi S, Kalhor K. Evaluating the effect of particular matter pollution on estimation of daily global solar radiation using artificial neural network modeling based on meteorological data. Journal of Cleaner Production. 2017;**141**:1275-1285

[14] Mghouchi YE, Chham E, Zemmouri EM, Bouardi AEI. Assessment of different combinations of meteorological parameters for predicting daily global solar radiation using artificial neural network. Building and Environment. 2019;**149**:607-622

[15] Bou-Rabee M, Sulaiman SA, Saleh MS, Marafi S. Using artificial neural networks to estimate solar radiation in Kuwait. Renewable and Sustainable Energy Reviews. 2017;**72**:434-438

[16] Shaddel M, Javan DS, Baghernia P. Estimation of hourly global solar irradiation on tilted absorbers from horizontal one using artificial neural network for case study of Mashhad. Renewable and Sustainable Energy Reviews. 2016;**53**:59-67

[17] Kashyap Y, Bansal A, Sao AK. Solar radiation forecasting with multiple parameters neural networks. Renewable and Sustainable Energy Reviews. 2015; **49**:825-835

[18] Hussain M, Dhimish M, Titarenko S, Mather P. Artificial neural network based photovoltaic fault detection algorithm integrating two bi-directional input parameters. Renewable Energy. 2020;**155**:1272-1292

[19] Yadav AK, Sharma V, Malik H, Ghandel SS. Daily array yield prediction of grid-interactive photovoltaic plant using relief attribute evaluator based radial basis function neural network. Renewable and Sustainable Energy Reviews. 2018;**81**:2115-2127

[20] Almonacid F, Fernandez EF, Mellit A, Kalogirou S. Review of techniques based on artificial neural networks for the electrical characterization of concentrator photovoltaic technology. Renewable and Sustainable Energy Reviews. 2017;**75**: 938-953

[21] Prakash O, Laguri V, Pandey A, Kumar A. Review on various modelling techniques for the solar dryers. Renewable and Sustainable Energy Reviews. 2016;**62**:396-417

[22] Shahram D, Mostafa E, Maryam K. Application of artificial neural network for performance prediction of a nanofluidbased direct absorption solar collector. Sustainable Energy Technologies and Assessments. 2019;**36**(12):1-11

[23] Maria TA, Nizar A, Subathra MSP, Godson AL. Analysing the performance of a flat plate solar collector with silver/ water nanofluid using artificial neural network. Procedia Computer Science. 2016;**93**:33-40

[24] Cuma C, Fethi H, Hamit C, Imdat T. Generating hot water by solar energy and application of neural network. Applied Thermal Engineering. 2005;**25**: 1337-1348

[25] Kalogirou SA. Prediction of flat-plate collector performance parameters using artificial neural networks. Solar Energy. 2006;**80**:248-259

[26] Kumar GH, Krishna PR. Exegetic performance prediction of solar air heater using MLP, GRNN and RBF models of artificial neural network technique. Journal of Environmental Management. 2018;**223**:566-575

[27] Abdellah S, Hossein P, Mehdi K. Comparative and performative investigation of various data-based and conventional theoretical methods for modelling heat pipe solar collectors. Solar Energy. 2020;**198**:212-223

[28] Kumar GH, Krishna PR. Prediction of heat transfer of two different types of roughened solar air heater using artificial neural network technique. Thermal Science and Engineering Progress. 2018;**8**: 145-153

[29] Kumar GH, Krishna PR. Prediction of thermal performance of unidirectional flow porous bed solar air heater with optimal training function using artificial neural network. Energy Procedia. 2017; **109**:369-376


[30] Kumar GH, Krishna PR. Investigation of thermal performance of unidirectional flow porous bed solar air heater using MLP, GRNN and RBF models of ANN technique. Thermal Science and Engineering Progress. 2018; **6**(6):226-235

[31] Liu Z, Li H, Liu K, Yu H, Cheng K. Design of high-performance water-in-glass evacuated tube solar water heaters by a high-throughput screening based on machine learning: A combined modelling and experimental study. Solar Energy. 2017;**142**:61-67

[32] Sadeghi G, Nazari S, Ameri M, Shama F. Energy and exergy evaluation of the evacuated tube collector using Cu2O/water nanofluid utilizing ANN methods. Sustainable Energy Technologies and Assessments. 2020;**37**: 1-14

[33] Diez FJ, Navas-Gracia LM, Martinez-Rodriguez A, Correa-Guimaraes A, Chico-Santamarta L. Modelling of a flat-plate solar collector using artificial neural networks for different working fluid(water) flow rates. Solar Energy. 2019;**188**: 1320-1331

[34] Sozen A, Menlik T, Unvar S. Determination of efficiency of flat-plate solar collectors using neural network approach. Expert Systems with Applications. 2008;**35**:1533-1539

[35] Budihardjo I, Morrison GL, Behnia M. Measurement and simulation of flow rate in a water-in-glass evacuated tube solar collectors. Solar Energy. 2007; **81**:1460-1472

[36] Morrison GL, Budihardjo I, Behnia M. Measurement and simulation of flow rate in a water-in-glass evacuated tube solar water heater. Solar Energy. 2005;**78**(2):257-267

[37] Kim Y, Seo T. Thermal performances comparisons of the glass evacuated tube collectors with shapes of absorber tube. Renewable Energy. 2007;**32**: 772-795

[38] Qiu S, Ruth M, Ghosh S. Evacuated tube collectors: A notable driver behind the solar water heater industry in China. Renewable and Sustainable Energy Reviews. 2015;**47**:580-588

[39] Daghigh R, Shafieian A. Theoretical and experimental analysis of thermal performance of a solar water heating system with evacuated tube heat pipe collector. Applied Thermal Engineering. 2016;**103**:1219-1227

[40] Gao Y, Zhang Q, Fan R, Liu X, Yu Y. Effects of thermal mass and flow rate on forced-circulation solar hot-water system: Comparison of water-in-glass and U-pipe evacuated-tube solar collectors. Solar Energy. 2013;**98**:290-301

[41] Ayompe LM, Duffy A. Thermal performance analysis of a solar water heating system with heat pipe evacuated tube collector using data from a field trial. Solar Energy. 2013;**90**:17-28

[42] Salgado-Conrado L, Lopez-Montelongo A. Barriers and solutions of solar water heaters in Mexican household. Solar Energy. 2019;**188**:831-838

[43] Li JR, Li XD, Wang Y, Tu JY. A theoretical model of natural circulation flow and heat transfer within horizontal evacuated tube considering the secondary flow. Renewable Energy. 2020;**147**(3):630-638

[44] Sobhansarbandi S, Martinez PM, Papadimitratos A, Zakhidov A, Hassanipour F. Evacuated tube solar collector with multifunctional absorber layers. Solar Energy. 2017;**146**(4):342-350

[45] Budihardjo I, Morrison GL, Behnia M. Natural circulation flow through water-in-glass evacuated tube solar collectors. Solar Energy. 2007;**81**(12):1460-1472

[46] Esen H, Ozgen F, Esen M, Sengur A. Artificial neural network and wavelet neural network approaches for modelling of a solar air heater. Expert Systems with Applications. 2009;**36**(10):11240-11248

[47] Tagliafico LA, Scarpa F, Rosa MD. Dynamic thermal models and CFD analysis for flat-plate thermal solar collectors-a review. Renewable and Sustainable Energy Reviews. 2014;**30**(2): 526-537

[48] Shafieian A, Osman JJ, Khiadani M, Nosrati A. Enhancing heat pipe solar water heating systems performance using a novel variable mass flow rate technique and different solar working fluids. Solar Energy. 2019;**186**(5): 191-203

[49] Khatib T, Mohamed A, Sopian K. A review of solar energy modelling techniques. Renewable and Sustainable Energy Reviews. 2012;**16**(6):2864-2869

[50] Alvarez ME, Hernandez JA, Bourouis M. Modelling the performance parameters of a horizontal falling film absorber with aqueous (lithium, potassium, sodium) nitrate solution using artificial neural networks. Energy. 2016;**102**(5):313-323

[51] Du B, Lund PD, Wang J. Combining CFD and artificial neural network techniques to predict the thermal performance of all-glass straight evacuated tube solar collector. Energy. 2021;**220**:1-15

[52] Filipovic P, Dovic D, Ranilovic B, Horvat I. Numerical and experimental approach for evaluation of thermal performance of a polymer solar collector. Renewable and Sustainable Energy Reviews. 2019;**112**(9):127-139

[53] Alfaro-Ayala JA, Martinez-Rodriguez G, Picon-Nunez M, Uribe-Ramirez AR, Gallegos-Munoz A. Numerical study of a low temperature water-in-glass evacuated tube solar collector. Energy Conversion and Management. 2015; **94**(4):472-481

[54] Ahmed R, Sreeram V, Mishra Y, Arif MD. A review and evaluation of the state-of-the-art in PV solar power forecasting: Techniques and optimization. Renewable and Sustainable Energy Reviews. 2020;**124**:1-26. Article 109792

[55] Feng C, Zhang J. SolarNet: A sky image-based deep convolutional neural network for intra-hour solar forecasting. Solar Energy. 2020;**204**(7):71-78

[56] Ammar MB, Chaabene M, Chtourou Z. Artificial neural network based control for PV/T panel to track optimum thermal and electrical power. Energy Conversion and Management. 2013;**65**(1):372-380

[57] Al-Waeli AHA, Sopian K, Kazem HA, Yousif JH, Chaichan MT, Ibrahim A, et al. Comparison of prediction methods of PV/T nanofluid and nano-PCM system using a measured dataset and artificial neural network. Solar Energy. 2018;**162**(3):378-396

[58] Ahmadi MH, Baghban A, Sadeghzadeh M, Zamen M, Mosavi A, Shamshirband S, et al. Evaluation of electrical efficiency of photovoltaic thermal solar collector. Engineering Applications of Computational Fluid Mechanics. 2020;**14**(1):545-565. DOI: 10.1080/19942060.2020.1734094


[59] Zamen M, Baghban A, Pourkiaei SM, Ahmadi MH. Optimization methods using artificial intelligence algorithms to estimate thermal efficiency of PV/T system. Energy Science and Engineering. 2019;**7**(2):821-824. DOI: 10.1002/ese3.312

[60] Cao Y, Kamrani E, Mirzaei S, Khandakar A, Vaferi B. Electrical efficiency of the photovoltaic/thermal collectors cooled by nanofluids: Machine learning simulation and optimization by evolutionary algorithm. Energy Reports. 2022;**8**(1):24-36

[61] Kalani H, Sardarabadi M, Passandideh-Fard M. Using artificial neural network models and particle swarm optimization for manner prediction of a photovoltaic thermal nanofluid based collector. Applied Thermal Engineering. 2017;**113**(2):1170-1177

[62] Renno C, Petito F. Triple-junction cell temperature evaluation in a CPV system by means of a random forest model. Energy Conversion and Management. 2018;**169**(5):124-136

[63] Agbulut U, Gurel AE, Ergun A, Ceylan I. Performance assessment of a V-trough photovoltaic system and prediction of power output with different machine learning algorithms. Journal of Cleaner Production. 2020;**268**:1-12. Article 122269

## **Chapter 8**

## Modeling a Petrochemical Unit with Artificial Neural Networks (ANN)

*Shafaati Akbar and Pourazad Hamidreza*

## **Abstract**

The purpose of this chapter is to model a petrochemical unit with neural networks in order to estimate the plant's product flow rates. Multilayer perceptron (MLP) and RBF neural networks are used in this work, and the outputs of both network types are compared to choose the more accurate one. The same data are used for training and modeling both networks. The data used for this modeling were collected by measuring the flow rates of input materials and output products of the plant in tons per day. **Table 1** shows the input materials and products.

**Keywords:** artificial neural networks, RBF, MLP, regression, petrochemical unit

## **1. Introduction**

To model a petrochemical unit with an artificial neural network, we first need a basic acquaintance with artificial neural networks, and we should answer the question of why artificial neural networks should be used instead of conventional methods.

An artificial neural network is a complex nonlinear computing system inspired by nature, and its main advantage over other computing systems lies in its internal structure [1].

Neural networks are composed of a large number of neurons with extensive connections to one another, through which they share information. A neural network performs computations through the organization of its neurons, the communication between them, and the information stored in them.

Conventional modeling methods require extensive mathematical calculations and bring many complications, especially for nonlinear systems. The work is time-consuming, and if a calculation error occurs, all steps must be repeated until the error is identified and fixed. Moreover, every parameter that influences the model must be considered, and a relationship describing how it affects the system must be defined; finding these relationships is itself complicated, yet important, because they strongly affect the accuracy of the model's output. Finally, all the resulting equations must be solved, which is very time-consuming.

There is no need to perform complex mathematical calculations in modeling with a neural network, and we can save time. Other advantages of neural networks compared with other methods include adaptability, nonlinearity, error tolerance, and flexibility against changing conditions.

To model with an artificial neural network, a dataset is needed to train the network, and these data must be collected by experimental tests, industrial devices, etc.

For example, in the modeling of a petrochemical unit, the goal is to predict the outputs from the input flow rates to the unit. To prepare basic data for training the network, the flow rates of inputs and outputs must therefore be measured under different operating conditions and entered into the network as training data.

In this modeling, the entire petrochemical unit is considered as a black box (**Figure 1**), and only the flow rates of input and output materials are treated as influencing parameters; none of the processes that take place inside the petrochemical unit are represented in the model designed with neural networks.
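To make this black-box setup concrete, the sketch below assembles a year of daily flow-rate measurements into a training set. The number of feed streams, the value ranges, and the 80/20 split are illustrative assumptions, not values from the chapter; only the 13 products and the tons-per-day unit come from the text:

```python
import numpy as np

# Black-box dataset preparation (illustrative sketch).
# Feed flow rates are the inputs, product flow rates the targets.
n_days, n_feeds, n_products = 365, 5, 13   # n_feeds is a hypothetical count
rng = np.random.default_rng(1)
feeds = rng.uniform(50, 200, size=(n_days, n_feeds))        # tons per day (toy values)
products = rng.uniform(10, 100, size=(n_days, n_products))  # tons per day (toy values)

# Scale each input stream to [0, 1] so no single stream dominates training
lo, hi = feeds.min(axis=0), feeds.max(axis=0)
X = (feeds - lo) / (hi - lo)

# Hold out the last 20% of days for testing the trained network
split = int(0.8 * n_days)
X_train, y_train = X[:split], products[:split]
X_test, y_test = X[split:], products[split:]
```

Any supervised model, RBF or MLP, can then be fitted on `X_train`/`y_train` and judged on the held-out days.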

Here are several related works:

Tufaner et al. developed a three-layer artificial neural network (ANN) and nonlinear regression model to predict the performance of biogas production from the anaerobic hybrid reactor (AHR). In this study, experimental data were used to estimate the biogas production rate with models produced using both ANNs and nonlinear regression methods. Moreover, 10 related variables, such as reactor fill ratio, influent pH, effluent pH, influent alkalinity, effluent alkalinity, organic loading rate, effluent chemical oxygen demand, effluent total suspended solids, effluent suspended solids, and effluent volatile suspended solids, were selected as inputs of the model [2].

D.S. Pandey et al. developed a multilayer feed-forward neural network to predict the lower heating value of gas (LHV), the lower heating value of gasification products including tars and entrained char (LHVp), and the syngas yield during gasification of municipal solid waste (MSW) in a fluidized bed reactor. These artificial neural networks (ANNs) with different architectures were trained using the Levenberg-Marquardt (LM) back-propagation algorithm. Nine input and three output parameters were used to train and test various network architectures in both multiple-output and single-output prediction paradigms using the available experimental datasets [3].

M. El-Sefy et al. developed a feed-forward back-propagation artificial neural network (ANN) model trained to simulate the interaction between the reactor core and the primary and secondary coolant systems in a pressurized water reactor. A nuclear power plant (NPP) is a complex dynamic system of systems with highly nonlinear behavior. In order to control the plant under both normal and abnormal conditions, the different systems in NPPs (e.g., the reactor core components and the primary and secondary coolant systems) are usually monitored continuously, resulting in very large amounts of data.

**Figure 1.**

*Assumed structure for the petrochemical unit for modeling by artificial neural network.*

The transients used for model training included perturbations in reactivity, steam valve coefficient, reactor core inlet temperature, and steam generator inlet temperature. Uncertainties of the plant physical parameters and operating conditions were also incorporated in these transients [4].

### **1.1 Introduction to radial basis function networks (RBF)**

**Radial basis neural networks** use a radial basis function instead of the logistic function as the activation function. The logistic function maps an arbitrary value to the 0–1 interval to answer a yes-or-no (binary) question [5].

These types of neural networks are suitable for classification and decision-making systems, but they handle continuous values poorly. The radial basis function instead answers the question "how far are we from the target?", which makes these networks suitable for function approximation and machine control (for example, as an alternative to a PID controller) [5].

Radial basis neural networks are a special type of artificial neural network that is distance-based: they measure the similarity between data points based on distance.

Unlike MLP networks, which can have multiple consecutive layers, an RBF network consists of three fixed layers: an input layer, through which the input data enter the network; a hidden (middle) layer, which contains the radial basis functions; and an output layer, which produces a linear combination of all the middle-layer outputs. The output layer uses a linear activation function or, equivalently, no activation function at all [6].
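This three-layer structure can be sketched numerically: a Gaussian radial basis hidden layer followed by a linear output layer fitted by least squares. The toy data, fixed centers, and width below are assumptions for illustration, not the chapter's actual network:

```python
import numpy as np

def rbf_features(X, centers, width):
    # Hidden layer: Gaussian activations based on distance to each center,
    # phi_j(x) = exp(-||x - c_j||^2 / (2 * width^2))
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_linear_output(X, y, centers, width):
    # Output layer: linear combination of hidden activations, least-squares fit
    Phi = rbf_features(X, centers, width)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def predict(X, centers, width, w):
    return rbf_features(X, centers, width) @ w

# Toy usage: approximate y = sin(x) on [0, 3]
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0])
centers = np.linspace(0.0, 3.0, 10).reshape(-1, 1)  # fixed centers for simplicity
w = fit_linear_output(X, y, centers, width=0.5)
y_hat = predict(X, centers, width=0.5, w=w)
```

With centers that cover the input range, the fitted linear combination tracks the smooth target closely, which illustrates why RBF networks suit function approximation.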

#### **1.2 Introduction to multilayer perceptron networks (MLP)**

One of the most basic neural models is the multilayer perceptron, which loosely simulates the transfer function of the human brain. This type of network mimics aspects of signal propagation in the brain, and because signals flow in one direction from input to output, such networks are also called feed-forward networks [1].

The perceptron is a machine learning algorithm in the field of supervised learning and is known as one of the first artificial neural network algorithms. It is a binary classification algorithm, meaning that it can decide whether an input belongs to a specific category or not [7].

**Figure 2.** *Schematic of an RBF neural network.*

A multilayer perceptron neural network consists of at least three layers, which are the input layer, a hidden layer, and the output layer. In this type of artificial neural network, the outputs of the first (input) layer are used as the inputs of the next (hidden) layer. This continues until, after a certain number of layers, the outputs of the last hidden layer are used as the inputs of the output layer. All the layers that are placed between the input layer and the output layer are called "Hidden Layers" (**Figures 2** and **3**).

## **2. Modeling by radial basis function networks (RBF) neural network**

The dataset required for training this network was collected by measuring the flow rates of input materials and products (outputs) every day for a year. **Table 1** shows the inputs and outputs of the petrochemical unit.

To test the network, experimental data were given to it, and the network's outputs were compared with the real outputs of the petrochemical unit, as shown in **Figure 4**.

The empty circles on the blue graph in **Figure 4** indicate the measured amounts of the products (experimental data, or targets), and the empty circles on the orange graph indicate the values predicted by the neural network. Some of these circles almost coincide, while others differ slightly; in the ideal case, the points would overlap. The name of each product is indicated by an arrow.

To better understand the amount of difference and whether the network has provided an acceptable performance or not, we can use linear regression between the data estimated by the network and the measured parameters (experimental data). **Figure 5** shows the regression between the experimental and predicted data used in **Figure 4**.
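The regression check described above amounts to computing the correlation coefficient between the measured and predicted series; a minimal sketch with toy numbers follows (the chapter's actual measurements are not reproduced):

```python
import numpy as np

# Toy measured (target) vs. predicted product flow rates, tons per day
measured = np.array([10.2, 8.5, 12.1, 9.9, 11.4])
predicted = np.array([10.0, 8.8, 12.3, 9.7, 11.1])

# Pearson correlation coefficient R between the two series;
# R close to 1 means the network's predictions track the measurements
r = np.corrcoef(measured, predicted)[0, 1]
```

The closer `r` is to 1, the better the network's predictions agree with the experimental data, which is exactly how the networks in this chapter are compared.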

*Modeling a Petrochemical Unit with Artificial Neural Networks (ANN) DOI: http://dx.doi.org/10.5772/intechopen.107723*


#### **Table 1.**

*Inputs and outputs of the petrochemical unit.*

**Figure 4.** *Difference between RBF network's outputs and experimental data (targets).*

As can be seen, the correlation coefficient between the estimated and experimental data is 0.987, which is acceptable for the petrochemical unit in non-essential and non-sensitive situations.

## **3. Modeling by multilayer perceptron (MLP) neural network**

The multilayer perceptron network considered for this modeling consists of three layers. The first layer has 80, the second layer has 35, and the third layer has 13 neurons.

**Figure 5.** *Regression between predicted data with RFB network and experimental data.*

#### **Figure 6.**

*Difference between multilayer perceptron network's outputs and experimental data (targets).*

The activation functions for these layers are **ReLU** for the first and second layers and **purelin** (linear) for the last layer, respectively.

The best performance is achieved when the error between the network's outputs and the experimental data is as low as possible; this error is measured by a performance function. In this modeling, the mean squared error (MSE) is used as the performance function, and the network is trained with the Levenberg-Marquardt algorithm [1, 8].

**Figure 7.** *Regression between data predicted with the MLP network and experimental data.*
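The MSE performance function mentioned above is simply the average of the squared differences between the network's outputs and the experimental targets. A minimal sketch, with illustrative values only:

```python
import numpy as np

# Illustrative targets (experimental data) and network outputs --
# not values from the chapter's dataset.
targets = np.array([12.0, 8.5, 30.2, 5.1])
outputs = np.array([11.6, 8.9, 29.5, 5.3])

# Mean squared error: the quantity the training algorithm minimizes.
mse = np.mean((targets - outputs) ** 2)
print(f"MSE = {mse:.4f}")
```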

After the training process, the same data used to test the RBF network in **Figure 4** are used to test the MLP network; the results are shown in **Figure 6**.

As in **Figure 4**, the blue curve represents the experimental data and the orange curve the data estimated by the neural network. The empty circles on the blue curve in **Figure 6** mark the measured product amounts (experimental data), and the empty circles on the orange curve mark the amounts estimated by the network.

As before, to quantify the difference and judge whether the network performs acceptably, we fit a linear regression between the data predicted by the network and the measured parameters (experimental data); the result is shown in **Figure 7**.

The correlation coefficient between the experimental data and the data estimated by the network is 0.995, which indicates good performance.

## **4. Conclusion**

Comparing the correlation coefficient of the RBF neural network (0.987) with that of the MLP neural network (0.995), it can be concluded that the MLP network estimates the amounts of the petrochemical unit's products more accurately under varying conditions.

Because of the complex processes inside the petrochemical unit that convert inputs into products, a large number of experimental samples is needed to model and train the neural network. In other words, everything that directly or indirectly affects the system under study and changes the amounts of products produced by the unit should be sampled: the changes must be recorded and compiled into the required dataset. One of the factors with the greatest effect on the products of a petrochemical unit is clearly the amount of input material (feed), so the changes in production that follow from changes in the feed should be recorded; these changes were measured in tons per day.

As noted, complex processes take place inside the petrochemical unit, involving equipment such as chemical reactors and distillation towers, each of which affects the amount of production. Because of limitations in measuring these factors, it was decided to measure only the input feed and the changes in the produced products, omitting the internal details and processes of the unit. This simplification reduced the accuracy of the designed neural network; to compensate, the number of samples collected from the input feed and the produced products was increased so that the network had more data for training. Collecting this amount of data to complete the desired dataset took a year.

## **Author details**

Shafaati Akbar1 \* and Pourazad Hamidreza2

1 Urmia University of Technology, Urmia, Iran

2 Sahand University of Technology, Tabriz, Iran

\*Address all correspondence to: akbar007.78.sh@gmail.com

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## **References**

[1] Haykin SS. Neural Networks and Learning Machines/Simon Haykin. New York: Prentice Hall; 2009. ch4, ch5

[2] Tufaner F, Demirci Y. Prediction of biogas production rate from anaerobic hybrid reactor by artificial neural network and nonlinear regressions models. Clean Technologies and Environmental Policy. 2020;**22**(3):713-724

[3] Pandey DS et al. Artificial neural network based modeling approach for municipal solid waste gasification in a fluidized bed reactor. Waste Management. 2016;**58**:202-213

[4] El-Sefy M et al. Artificial neural network for predicting nuclear power plant dynamic behaviors. Nuclear Engineering and Technology. 2021;**53**(10):3275-3285

[5] The Mostly Complete Chart of Neural Networks Explained. 2021. Available from: https://towardsdatascience.com/the-mostly-complete-chart-of-neural-networks-explained-3fb6f2367464

[6] Radial Basis Function Neural Network Simplified. 2021. Available from: https://towardsdatascience.com/radial-basis-function-neural-network-simplified-6f26e3d5e04d

[7] Multi-Layer Perceptron. Available from: https://www.sciencedirect.com/topics/computer-science/multilayer-perceptron

[8] Haykin SS. Neuronal Networks: A Comprehensive Foundation. Subsequent edition. New York: Prentice Hall; 2000. ch1, ch3
