#### **4.6.3 System modeling**

The modeling process based on ANFIS can broadly be classified in three steps:

#### Step 1. System identification

The first step in system modeling is the identification of the input and output variables, called the system's variables. Fuzzy IF-THEN rules based on the Takagi-Sugeno-Kang (TSK) model [33,34] are then formed, where the antecedents are defined by a set of non-linear parameters and the consequents are either linear combinations of the input variables and a constant term, or constants, generally called singletons.

Fig. 4.10. ANFIS structure of the model.

#### Step 2. Determining the network structure

Once the input and output variables are identified, the neuro-fuzzy system is realized using a six-layered network, as shown in Figure 4.10. In order to incorporate the capability of learning from input/output data sets into the fuzzy inference system, a corresponding adaptive neural network is generated; its final layer converts the fuzzy outputs obtained by the inference engine into a non-fuzzy output in the real-number domain. An adaptive network is a multi-layer feed-forward network consisting of nodes and the directional links through which the nodes are connected. As shown in Figure 4.10, layer 1 is the input layer, layer 2 describes the membership functions of each fuzzy input, layer 3 is the inference layer, and normalization is performed in layer 4. Layer 5 gives the output, and layer 6 is the defuzzification layer. The layers consist of fixed and adaptive nodes; each adaptive node has a set of parameters and performs a particular function (node function) on its incoming signals. The learning model may use either the back-propagation or the hybrid learning algorithm; the learning rules specify how the parameters of the adaptive nodes should be changed to minimize a prescribed error measure [37]. A change in the values of these parameters results in a change in the shape of the membership functions associated with the fuzzy inference system. The input, output, and node functions of each layer are explained in the subsequent paragraphs.

196 Fuzzy Inference System – Theory and Applications

#### **Layer 1: Input layer**

Each node in layer 1 represents one of the input variables of the model identified in step 1; this layer simply transmits these input variables to the fuzzification layer.

#### **Layer 2: Fuzzification layer**

The fuzzification layer describes the membership function of each input fuzzy set; membership functions are used to characterize fuzziness in fuzzy sets. The output of each node *i* in this layer is given by $\mu\_{A\_i}(x\_i)$, where $\mu\_A(x)$ is the membership function. Its value on the unit interval [0, 1] measures the degree to which element *x* belongs to the fuzzy set *A*; $x\_i$ is the input to node *i*, and $A\_i$ is the linguistic label for the input variable associated with this node.

Each node in this layer is an adaptive node; that is, the output of each node depends on the parameters pertaining to that node. Thus the membership function for *A* can be any appropriate parameterized membership function. The most commonly used membership functions are triangular, trapezoidal, Gaussian, and bell-shaped. Any of these choices may be used; the triangular and trapezoidal membership functions have been used extensively, especially in real-time implementations, due to their simple formulas and computational efficiency.

In our original fuzzy model [40] we used triangular membership functions. However, since these membership functions are composed of straight line segments, they are not smooth at the corner points specified by their parameters. The parameters of such membership functions can be optimized using direct search methods, but these are less efficient and more time consuming; moreover, since the derivatives of the functions are not continuous, the powerful and more efficient gradient methods cannot be used to optimize their parameters. Gaussian and bell-shaped membership functions are becoming increasingly popular for specifying fuzzy sets: they are non-linear and smooth, their derivatives are continuous, and gradient methods can easily be used to optimize their design parameters. Thus, in this model we have replaced the triangular membership functions with bell-shaped functions (Table 4.7). The bell or generalized bell (gbell) membership function is specified by a set of three fitting parameters {*a, b, c*} as:

$$\mu\_A(x) = \frac{1}{1 + \left[\left((x - c)/a\right)^2\right]^b} \tag{4.12}$$

The desired shape of the gbell membership function can be obtained by proper selection of its parameters: we can adjust *c* and *a* to vary the center and width of the membership function, and *b* to control the slope at the crossover points. The parameter *b* gives the gbell membership function one more degree of freedom than the Gaussian membership function, allowing the steepness at the crossover points to be adjusted. The parameters in this layer are referred to as premise parameters.
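As a quick illustration, Eq. (4.12) is a one-liner in code. The following is a minimal Python sketch (the parameter values are hypothetical, chosen only to show the crossover property):

```python
def gbell(x, a, b, c):
    """Generalized bell membership function of Eq. (4.12).

    c is the center, a the half-width, and b controls the slope at
    the crossover points x = c - a and x = c + a.
    """
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)

# At the center the grade is 1; at a crossover point it is 0.5
# for any value of b.
print(gbell(60.0, 25.0, 2.0, 60.0))  # 1.0
print(gbell(85.0, 25.0, 2.0, 60.0))  # 0.5
```

Increasing *b* makes the transition through the crossover points steeper, while *a* and *c* move the width and the center, matching the description above.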

Some Studies on Noise and Its Effects on Industrial/Cognitive Task Performance and Modeling 199


#### **Layer 3: Inference layer**

The third layer is the inference layer. Each node in this layer is a fixed node and represents the IF part of a fuzzy rule. This layer aggregates the membership grades using any fuzzy intersection operator that can perform the fuzzy AND operation [35]. Such intersection operators are commonly referred to as T-norms; typical choices are the min and product operators. For instance, consider the rule

IF *x1* is *A1* AND *x2* is *A2* AND *x3* is *A3* THEN *y* is *f*(*x1*, *x2, x3*)

where *f*(*x1*, *x2*, *x3*) is a linear function of the input variables or may be a constant. Using the product T-norm, the output of the *i*th node is given as:

$$w\_i = \mu\_{A\_1}(x\_1) \times \mu\_{A\_2}(x\_2) \times \mu\_{A\_3}(x\_3) \tag{4.13}$$

#### **Layer 4: Normalization layer**

The *i*th node of this layer is also a fixed node; it calculates the ratio of the *i*th rule's firing strength in the inference layer to the sum of all the rules' firing strengths:

$$
\overline{w}\_i = \frac{w\_i}{w\_1 + w\_2 + \dots + w\_R} \tag{4.14}
$$

where *i* = 1, 2, …, *R* and *R* is the total number of rules. The outputs of this layer are called normalized firing strengths.

#### **Layer 5: Output layer**

This layer represents the THEN part (i.e., the consequent) of the fuzzy rule. The nodes in this layer generate the qualified consequent (either fuzzy or crisp) of each rule, depending on its firing strength. Every node *i* in this layer is an adaptive node. The output of the node is computed as:

$$O\_i = \overline{w}\_i f\_i \tag{4.15}$$

where $\overline{w}\_i$ is the normalized firing strength from layer 4 and $f\_i$ is a linear function of the input variables, of the form $p\_i x\_1 + q\_i x\_2 + r\_i$, where $\{p\_i, q\_i, r\_i\}$ is the parameter set of node *i*, referred to as the consequent parameters; alternatively, $f\_i$ may be a constant. If $f\_i$ is a linear function of the input variables, the model is called a first-order Sugeno fuzzy model (as in our present model); if $f\_i$ is a constant, it is called a zero-order Sugeno fuzzy model. The consequent can be a linear function as long as it appropriately describes the output of the model within the fuzzy region specified by the antecedent of the rule. In the present case, however, the relationship between the input variables (noise level, cognitive task type, and age) and the output (reduction in cognitive task efficiency) is highly non-linear. In the Sugeno model, the consequent can then be taken as a singleton, i.e. a real number, without degrading the performance of the system.

#### **Layer 6: Defuzzification layer**

This layer aggregates the qualified consequents to produce a crisp output. The single node in this layer is a fixed node; it computes the weighted average of the output signals of the output layer as:

$$O = \sum\_{i} O\_{i} = \sum\_{i} \overline{w}\_{i} f\_{i} = \frac{\sum\_{i} w\_{i} f\_{i}}{\sum\_{i} w\_{i}} \tag{4.16}$$
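Taken together, Eqs. (4.12)-(4.16) define a complete forward pass through layers 2-6. The following is a minimal Python sketch for a hypothetical two-rule, three-input system with singleton consequents; it illustrates the data flow of Fig. 4.10, not the chapter's actual 27-rule model:

```python
def gbell(x, a, b, c):
    """Generalized bell membership function, Eq. (4.12)."""
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)

def sugeno_forward(x, rules):
    """Forward pass through layers 2-6 of the network in Fig. 4.10."""
    w = []
    for mfs, _ in rules:
        # Layer 2: membership grades; layer 3: product T-norm, Eq. (4.13)
        strength = 1.0
        for xi, params in zip(x, mfs):
            strength *= gbell(xi, *params)
        w.append(strength)
    total = sum(w)  # layer 4 divides by this sum, Eq. (4.14)
    # Layers 5-6: weight each singleton consequent f_i by its normalized
    # firing strength and sum, Eqs. (4.15)-(4.16).
    return sum((wi / total) * f for wi, (_, f) in zip(w, rules))

# Hypothetical rule base over (noise level, task type, age): each rule is
# ([(a, b, c) per input], singleton consequent in % efficiency reduction).
rules = [
    ([(25, 2, 60), (1, 2, 2), (10, 2, 25)], 0.0),     # lower noise, simpler task, younger
    ([(10, 2, 100), (1, 2, 4), (10, 2, 55)], 100.0),  # high noise, complex task, older
]
print(sugeno_forward((60, 2, 25), rules))  # close to 0: the first rule dominates
```

With inputs near the second rule's centers, e.g. `sugeno_forward((100, 4, 55), rules)`, the output moves close to 100 instead.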

#### Step 3. Learning algorithm and parameter tuning


The ANFIS model fine-tunes the parameters of the membership functions using either the back-propagation learning algorithm or the hybrid learning rule. Back propagation is an error-based supervised learning algorithm: it employs an external reference signal, which acts like a teacher and generates an error signal by comparing the reference with the obtained response. Based on the error signal, the network modifies its design parameters to improve the system performance, using the gradient descent method to update the parameters. The input/output data pairs are often called training data or learning patterns. They are clamped onto the network, and the node functions are propagated to the output unit, where the network output is compared with the desired output value. The error measure $E^P$, for pattern *P* at the output node in layer 6, may be given as:

$$E^P = \frac{1}{2} \left( T^P - O\_6^P \right)^2 \tag{4.17}$$

where $T^P$ is the target or desired output and $O\_6^P$ is the single-node output of the defuzzification layer of the network. Further, the sum of squared errors over the entire training data set is:

$$E = \sum\_{P} E^{P} = \frac{1}{2} \sum\_{P} \left( T^{P} - O\_{6}^{P} \right)^{2} \tag{4.18}$$

The derivative of the error measure with respect to the node output in layer 6 is given by delta ($\delta$):

$$
\delta = \frac{\partial E}{\partial O\_6} = -\left(T - O\_6\right) \tag{4.19}
$$

This delta value gives the rate at which the output must be changed in order to minimize the error function. Since the outputs of the adaptive nodes of the given network depend on the design parameters, the design parameters must be updated accordingly. The delta value of the output unit is therefore propagated backward to the inner layers, in order to distribute the error of the output unit to all the layers connected to it and to adjust the corresponding parameters. The delta value for layer 5 is given as:

$$\frac{\partial E}{\partial \text{O}\_5} = \frac{\partial E}{\partial \text{O}\_6} \frac{\partial \text{O}\_6}{\partial \text{O}\_5} \tag{4.20}$$

Similarly, for any *K*th layer, the delta value may be calculated using the chain rule as:

$$
\frac{\partial E}{\partial O\_K} = \frac{\partial E}{\partial O\_{K+1}} \frac{\partial O\_{K+1}}{\partial O\_K} \tag{4.21}
$$

Now, if $\alpha$ is a design parameter of the given adaptive network, then

$$\frac{\partial E}{\partial \alpha} = \sum\_{O \in P} \frac{\partial E}{\partial O} \frac{\partial O}{\partial \alpha} \tag{4.22}$$

|        | System's linguistic variables          | Linguistic values | Fuzzy intervals |
|--------|----------------------------------------|-------------------|-----------------|
| Input  | Noise level                            | Low               | 40-90           |
|        |                                        | Medium            | 80-100          |
|        |                                        | High              | 90-110          |
|        | Cognitive task type                    | Simple            | 1-3             |
|        |                                        | Moderate          | 2-4             |
|        |                                        | Complex           | 3-5             |
|        | Age                                    | Young age         | 15-35 years     |
|        |                                        | Medium age        | 30-50 years     |
|        |                                        | Old age           | 45-65 years     |
| Output | Reduction in cognitive task efficiency | None              | 0 %             |
|        |                                        | Low               | 25 %            |
|        |                                        | Moderate          | 50 %            |
|        |                                        | High              | 75 %            |
|        |                                        | Very high         | 100 %           |

Table 4.6. Inputs and outputs with their associated neural fuzzy values.

| MF type   | Error (linear output) | Error (constant output) | Epochs (iterations) |
|-----------|-----------------------|-------------------------|---------------------|
| Tri-mf    | 8.0327e-007           | 2.5532e-005             | 190                 |
| Trap-mf   | 1.0955e-006           | 0.2886                  | 190                 |
| Gbell-mf  | 6.0788e-007           | 2.1502e-005             | 190                 |
| Gauss-mf  | 6.1237e-007           | 2.2678e-005             | 190                 |
| Gauss2-mf | 1.0014e-006           | 2.1687e-005             | 190                 |
| Pi-mf     | 1.7942e-006           | 0.2886                  | 190                 |
| Dsig-mf   | 2.4415e-006           | 2.4847e-005             | 190                 |
| Psig-mf   | 1.4882e-006           | 2.4847e-005             | 190                 |

Table 4.7. Minimum error membership functions.

Fig. 4.11. (a) Membership functions of noise level.

where *P* is the set of adaptive nodes whose outputs depend on $\alpha$; the update for the parameter $\alpha$ is thus given by:

$$
\Delta \alpha = -\eta \frac{\partial E}{\partial \alpha} \tag{4.23}
$$

where $\eta$ is the learning rate, which may be calculated as:

$$\eta = \frac{K}{\sqrt{\sum\_{\alpha} \left(\partial E / \partial \alpha\right)^{2}}} \tag{4.24}$$

where *K* is the step size. The value of *K* must be chosen properly, as a change in its value influences the rate of convergence.
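Eqs. (4.19)-(4.24) amount to a normalized gradient descent on each design parameter. The sketch below illustrates this on a hypothetical one-parameter example, using a numerical central difference as a stand-in for the backpropagated derivative of Eqs. (4.19)-(4.22):

```python
def gbell(x, a, b, c):
    """Generalized bell membership function, Eq. (4.12)."""
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)

def numeric_grad(E, alpha, h=1e-6):
    """Central-difference stand-in for the backpropagated dE/d(alpha)."""
    return (E(alpha + h) - E(alpha - h)) / (2.0 * h)

# Toy setting: one training pattern and a "network" whose output is a
# single gbell grade, with the center c as the only design parameter.
x_in, T = 70.0, 0.9
def error(c):
    O = gbell(x_in, 25.0, 2.0, c)
    return 0.5 * (T - O) ** 2  # error measure of Eq. (4.17)

K = 0.5      # step size of Eq. (4.24)
c = 40.0     # initial center, e.g. as set by an expert
for _ in range(100):
    g = numeric_grad(error, c)
    eta = K / (g * g) ** 0.5   # Eq. (4.24) with a single parameter
    c -= eta * g               # Eq. (4.23): change by -eta * dE/d(alpha)
print(error(c) < 1e-3)  # True: the center has moved near a minimizing value
```

Because of the normalization in Eq. (4.24), every update moves the parameter by exactly *K*; once near the minimum the iterate oscillates within one step, which is why the choice of *K* influences convergence.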

Thus the design parameters are tuned according to the real input/output data pairs of the system. The change in parameter values results in a change in the shape of the membership functions initially defined by an expert, and the new membership functions obtained after training give a more realistic model of the system. The back-propagation algorithm, though widely used for training neural networks, may suffer from some problems: it is never assured of finding the global minimum, and since the error surface may have many local minima, it may get stuck during learning on flat or near-flat regions of the error surface. This makes progress slow and uncertain.

Another efficient learning algorithm, which can be used for training the network, is the hybrid learning rule. The hybrid learning rule is a combination of the least-squares estimator (LSE) and the gradient descent method (used in the back-propagation algorithm); it converges faster and gives more interpretable results. The training is done in two passes. In the forward pass, when training data are supplied at the input layer, the functional signals go forward to calculate each node output. The non-linear (premise) parameters in layer 2 remain fixed in this pass, so the overall output can be expressed as a linear combination of the consequent parameters, which can therefore be identified using the LSE method. The output of layer 6 is compared with the actual output, and the error measure is calculated as in eqs. (4.17) and (4.18). In the backward pass, the error rates propagate backward from the output end toward the input end, and the non-linear parameters in layer 2 are updated using the gradient descent method (eqs. (4.19)-(4.24)), as discussed for the back-propagation algorithm. Since the consequent parameters are optimally identified by LSE under the condition that the premise parameters are fixed, the hybrid algorithm converges much faster, as it reduces the search-space dimensions of the original pure back-propagation algorithm.
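The forward pass of the hybrid rule can be sketched as a least-squares fit. In this hypothetical one-input, two-rule illustration, the premise parameters are held fixed, so the overall output of Eq. (4.16) is linear in the singleton consequents, which LSE then recovers (synthetic data, not the chapter's):

```python
import numpy as np

def gbell(x, a, b, c):
    """Generalized bell membership function, Eq. (4.12)."""
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)

# Fixed premise parameters for a hypothetical one-input, two-rule system.
premise = [(25.0, 2.0, 40.0), (25.0, 2.0, 90.0)]

def norm_strengths(x):
    w = np.array([gbell(x, *p) for p in premise])
    return w / w.sum()  # normalized firing strengths, Eq. (4.14)

# Synthetic training data generated from known singleton consequents
# f = (10, 80); in a real run these are the unknowns to be identified.
true_f = np.array([10.0, 80.0])
X = np.linspace(30.0, 100.0, 20)
A = np.vstack([norm_strengths(x) for x in X])  # design matrix: one row per pattern
y = A @ true_f                                 # overall outputs, Eq. (4.16)

# Forward pass of the hybrid rule: identify the consequents by LSE.
f_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(f_hat, 6))  # recovers the consequents: [10. 80.]
```

In the full algorithm this least-squares step alternates with the backward gradient pass that adjusts the premise parameters.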

#### **4.6.4 Implementation**

We have implemented our model using ANFIS (Fuzzy Logic Toolbox) in MATLAB® [39]. The system is first designed as a Sugeno fuzzy inference system. It is a three-input, one-output system: the input variables are the noise level, cognitive task type, and age, and the reduction in cognitive task efficiency is taken as the output variable. The input parameters are represented by fuzzy sets, or linguistic variables (Table 4.6). We have chosen gbell membership functions to characterize these fuzzy sets, as they give the minimum error (Table 4.7). The membership functions for the input variables are shown in Figure 4.11(a-c).



Fig. 4.11. (b) Membership functions of cognitive task type.

Fig. 4.11. (c) Membership functions of age group.

The membership functions are then aggregated using the product T-norm to construct fuzzy IF-THEN rules that have a fuzzy antecedent part and a constant consequent. The total number of rules is 27. Some of the rules are given below:

Fig. 4.12. Typical rules and their graphic representations in the Sugeno approach.

R1: IF noise level is low AND cognitive task is simple AND age is young THEN reduction in cognitive task efficiency is approximately (none) 0%.

After construction of the fuzzy inference system, the model parameters are optimized using ANFIS. The network structure consists of 78 nodes. The total number of fitting parameters is 54, of which 27 are premise and 27 are consequent parameters. A hybrid learning rule is used to train the model according to the input/output data pairs. The data pairs were obtained from a questionnaire established for this purpose, and we designed and developed our model based on the conclusions of our studies [40, 41, 42]. Out of the total 155 input/output data sets, 124 (80%) data pairs were used for training the model. It was trained for 250 epochs with a step size of 0.01 and an error tolerance of 0%. To validate the model, the remaining 31 (20%) data sets were used for testing.

**5. Result and discussion**

The model was trained for 250 epochs, and it was observed that most of the learning was completed in the first 190 epochs, as the root mean square error (RMSE) settles down to almost 0% at the 190th epoch. Figure 5.1(a) shows the training RMSE curve for the model after training the fuzzy inference system. It is found that the shape of the membership functions is slightly modified.

Fig. 5.1. (a) Training root mean squared error.

Fig. 5.1. (b) Data testing.
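The 80/20 split and RMSE-based validation described above are straightforward to reproduce. The sketch below is a minimal Python illustration (the chapter's experiments were run with the MATLAB toolbox, and the data pairs here are placeholders, not the questionnaire data):

```python
import random

def train_test_split(pairs, train_frac=0.8, seed=1):
    """Shuffle the input/output pairs and split them, mirroring the
    155 -> 124 training / 31 testing split used for the model."""
    pairs = pairs[:]  # copy so the caller's list is untouched
    random.Random(seed).shuffle(pairs)
    cut = round(train_frac * len(pairs))
    return pairs[:cut], pairs[cut:]

def rmse(targets, outputs):
    """Root mean square error between target and model outputs."""
    n = len(targets)
    return (sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n) ** 0.5

# Placeholder data: 155 (input, output) pairs.
data = [((i, i % 5, i % 3), float(i % 100)) for i in range(155)]
train, test = train_test_split(data)
print(len(train), len(test))  # 124 31
```

Training RMSE is then tracked per epoch, and the held-out 20% is scored once at the end, as in Fig. 5.1(a-b).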
