**2. Most common activation functions used in chemical and process engineering applications**

A neural network contains hyperparameters that must be tuned prior to training in order to achieve the best configuration. Among them, the following can be mentioned: (i) the number of hidden neurons, (ii) the activation function, (iii) the optimizer, and (iv) regularization, together with their dependencies (learning rate, optimizer-specific parameters, dropout rate, etc.).
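As a minimal sketch of what such a tuning step can look like, the plain grid search below enumerates candidate configurations over these hyperparameters; the value ranges are hypothetical illustrations, not recommendations, and `train_and_validate` is a placeholder for an actual training loop.

```python
from itertools import product

# Hypothetical search space covering the hyperparameters listed above.
search_space = {
    "hidden_neurons": [5, 10, 20],
    "activation": ["sigmoid", "tanh", "relu"],
    "optimizer": ["sgd", "adam"],
    "learning_rate": [1e-3, 1e-2],
    "dropout_rate": [0.0, 0.2],
}

# A plain grid search: train and validate one model per combination,
# then keep the best-scoring configuration.
keys = list(search_space)
for values in product(*search_space.values()):
    config = dict(zip(keys, values))
    # score = train_and_validate(config)  # placeholder training loop
    print(config)
```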

Particularly, activation functions determine the output of the model, its accuracy, and the computational efficiency of training; therefore, they are an essential part of the structure of a neural network. The Sigmoid function, the Hyperbolic Tangent (TanH), and the ReLU (Rectified Linear Unit) are the most common in Chemical Engineering; however, recent studies have improved on these classical activation functions by defining new ones, such as Leaky ReLU, Swish, and H-Swish [11].


In the sigmoid activation function, the output values are bounded between 0 and 1, normalizing each neuron's output. However, it suffers from the vanishing-gradient problem, and its outputs are not zero-centered. To make modeling easier, the TanH was proposed, whose outputs are zero-centered, which helps when the inputs contain strongly negative, neutral, and strongly positive values.
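For reference, the standard definitions make these properties explicit; note that TanH is simply a rescaled sigmoid:

```latex
\sigma(x) = \frac{1}{1 + e^{-x}} \in (0,\,1), \qquad
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = 2\,\sigma(2x) - 1 \in (-1,\,1)
```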

To circumvent the computational expense of these saturating functions, the ReLU was proposed. It is a computationally efficient, piecewise-linear activation function that outputs the input directly if it is positive and zero otherwise. A further development is the Leaky ReLU, in which the slope to the left of x = 0 is changed to a small nonzero value, avoiding the dying ReLU problem, whereby some neurons can die for all inputs and remain inactive.
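As a concrete reference, here is a minimal NumPy sketch of these functions; the Leaky ReLU slope alpha = 0.01 and the Swish parameter beta = 1 are common but arbitrary choices.

```python
import numpy as np

def sigmoid(x):
    # Bounded in (0, 1); outputs are not zero-centered.
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Zero-centered, bounded in (-1, 1).
    return np.tanh(x)

def relu(x):
    # Passes positive inputs through; outputs zero otherwise.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # A small slope alpha for x < 0 keeps a nonzero gradient,
    # avoiding the "dying ReLU" problem.
    return np.where(x > 0.0, x, alpha * x)

def swish(x, beta=1.0):
    # Swish, x * sigmoid(beta * x), one of the newer smooth variants.
    return x * sigmoid(beta * x)

x = np.array([-2.0, 0.0, 3.0])
for f in (sigmoid, tanh, relu, leaky_relu, swish):
    print(f.__name__, f(x))
```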

Therefore, the correct definition of the activation function is a fundamental part of hyperparameter tuning to guarantee the best configuration of a neural network. Throughout the chapter, the summary tables indicate which activation function each work used.

**3. Applications to chemical and process engineering**

In recent decades, there have been a large number of studies using ANNs in chemical engineering, ranging from molecular property prediction [12] and fault diagnosis [13] to predictive control [14] and optimization [15, 16]. First-principles knowledge must be integrated with the neural network in order to retain a more physical understanding of the system [14]. In the following subsections, we present the principal papers of each area, with tables summarizing the characteristics of the ANNs used.

**3.1 Thermodynamics and transport phenomena**

Several data-driven models have been employed to predict phase equilibrium and transport phenomena coefficients for various chemical systems [17]. Indeed, these fields already carry some empiricism in their standard mathematical formulations. For example, flash algorithms rely on empirical binary interaction parameters in subjective mixing rules [18], and the majority of transport phenomena coefficients are estimated from empirical correlations of sometimes questionable reliability [19]. Therefore, ANNs offer a better way to find functional relationships between the model variables instead of first determining these constants [20].
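As an illustrative sketch of this idea (not any specific model from the cited works), a small MLP can be regressed directly on property data, letting the network discover the functional form; the temperature–viscosity data below are synthetic and the architecture is arbitrary.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for measured transport data: an Arrhenius-like
# viscosity trend with noise. Real applications would use lab data.
rng = np.random.default_rng(0)
T = rng.uniform(280.0, 380.0, size=(200, 1))      # temperature [K]
mu = 1e-3 * np.exp(1500.0 / T[:, 0])              # hypothetical trend [Pa s]
mu *= 1.0 + rng.normal(0.0, 0.01, size=200)       # 1% measurement noise

# The network learns T -> mu directly, with no correlation form
# assumed a priori; inputs are standardized for stable training.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(10,), activation="tanh",
                 max_iter=5000, random_state=0),
)
model.fit(T, mu)
print(model.predict([[300.0], [350.0]]))          # predicted viscosities
```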

Moreover, ANNs offer a potentially faster alternative for property-prediction calculations in process simulations, which otherwise limit process control applications that must run in real time. To this end, Poort et al. [21] studied the replacement of conventional Equations of State (EoS) for property and phase-stability calculations on a binary methanol–water mixture. They trained ANNs on data generated through the Thermodynamics for Engineering Applications (TEA) to represent four kinds of flash algorithms, achieving speed-ups of 15 times for property predictions and 35 times for phase classification.


Also noteworthy, ANNs have been used to predict whether a particular mixture forms an azeotrope, information essential for designing and controlling a separation process. Alves et al. [22] successfully developed an ANN classification model to determine whether binary mixtures can exhibit azeotropy based solely on the properties of the pure components as input variables. This shows the power of ANNs for this type of thermodynamic evaluation, since the model does not need to take the non-ideality of the mixture into account.
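In the same spirit, the sketch below shows the shape such a classifier can take; the descriptors and the labeling rule are synthetic stand-ins for real azeotropy data, and this is not Alves et al.'s actual model.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical pure-component descriptors for binary pairs, e.g. a
# boiling-point difference and a polarity difference; the label marks
# whether the pair forms an azeotrope. All values are synthetic.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = (np.abs(X[:, 0]) < 0.5).astype(int)  # toy rule in place of real labels

clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(8,), activation="relu",
                  max_iter=2000, random_state=1),
)
clf.fit(X, y)
print(clf.predict([[0.1, -0.3]]))        # 1 -> azeotrope predicted
```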






