**4. Automatic rolling bearing fault diagnosis**

The interpretation of a CWT can certainly be accomplished by operator visual inspection, given some practice and experience. However, computerized inspection is recommended to meet the increasing demand for on-line automated condition-monitoring applications.

Wavelet Analysis and Neural Networks for Bearing Fault Diagnosis 343


In this section, a new technique for automated detection and diagnosis of rolling bearing conditions is applied. To reduce the number of ANN inputs and speed up the training process, making the classification procedure suitable for on-line condition monitoring and diagnostics, the most dominant Laplace wavelet transform scales, selected by their scale-kurtosis level, are used for feature extraction; these scales carry the features most strongly correlated with the bearing condition. The extracted features in the time and frequency domains are used as the ANN input vectors for rolling bearing condition identification. The ANN classifier parameters (the learning-rate parameter and the number of hidden nodes) are optimized with a genetic algorithm (GA) by minimizing the mean square error (MSE).

#### **4.1 Feature extraction using Laplace wavelet analysis**

The predominant Laplace wavelet transform scales (the most informative levels) have been selected for feature extraction based on their scale-kurtosis value. Figure 27 shows the scale-kurtosis distribution for the different bearing conditions together with the corresponding wavelet scale threshold. Using the maximum kurtosis of a normal bearing as the threshold level for the wavelet scales (the dotted line in Figure 27), it can be seen that scales 12-22 are the most dominant and reveal the rolling bearing condition sufficiently.
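As a rough sketch of this selection step, the snippet below correlates a vibration signal with a damped-sinusoid (Laplace-type) wavelet over a range of scales, computes the kurtosis of the coefficient series at each scale, and keeps the scales whose kurtosis exceeds the healthy-bearing maximum. The wavelet damping, sampling rate, and scale grid are illustrative assumptions, not the chapter's actual values:

```python
import numpy as np

def laplace_wavelet(t, freq, beta=0.1):
    """Single-sided damped sinusoid (Laplace-type) wavelet on t >= 0.
    `beta` is an assumed damping ratio, not the chapter's value."""
    env = np.exp(-beta / np.sqrt(1.0 - beta**2) * 2.0 * np.pi * freq * t)
    return env * np.sin(2.0 * np.pi * freq * t)

def kurtosis(x):
    """Kurtosis factor (non-excess) of a series."""
    x = x - np.mean(x)
    return np.mean(x**4) / np.mean(x**2) ** 2

def scale_kurtosis(signal, fs, freqs):
    """Correlate the signal with the wavelet at each scale (frequency)
    and return the kurtosis of the coefficients per scale."""
    t = np.arange(0, 0.05, 1.0 / fs)  # 50 ms wavelet support (assumed)
    return np.array([
        kurtosis(np.convolve(signal, laplace_wavelet(t, f), mode="same"))
        for f in freqs
    ])

# Demo: impulsive (faulty) vs. smooth (healthy) synthetic vibration.
rng = np.random.default_rng(0)
fs, n = 12_000, 4096
healthy = rng.normal(size=n)
faulty = healthy.copy()
faulty[::400] += 8.0                   # periodic impacts from a race defect
freqs = np.linspace(500, 3000, 26)     # 26 candidate scales (illustrative)

k_healthy = scale_kurtosis(healthy, fs, freqs)
k_faulty = scale_kurtosis(faulty, fs, freqs)
threshold = k_healthy.max()            # healthy-bearing kurtosis as threshold
dominant = np.where(k_faulty > threshold)[0]
```

The impacts make the faulty signal's wavelet coefficients heavy-tailed, so its scale-kurtosis rises well above the near-Gaussian healthy baseline, which is exactly the thresholding idea of Figure 27.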

The extracted features for the dominant scales are:

1. *Time domain features*: the Root Mean Square (RMS), Standard Deviation (SD), and kurtosis factor.
2. *Frequency domain features*: the ratio of the WPS peak frequency (*fmax*) to the shaft rotational frequency (*frpm*), and the ratio of the WPS maximum amplitude (*Amax*) to the overall amplitude (Sum(*Ai*)).

The extracted features were linearly normalized to the range [0, 1] using the relationship *xnor = (x - xmin)/xmax*, and used as input vectors to the neural network.

Fig. 27. Kurtosis distribution of the Laplace wavelet transform scales (kurtosis factor versus wavelet level) for the healthy, outer race, inner race, and roller fault conditions, with the wavelet scale threshold marked.
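The five features and their normalization can be sketched as below. The coefficient series, sampling rate, and shaft speed are placeholder values, and `normalize` uses the conventional min-max form (x - xmin)/(xmax - xmin), which maps a vector exactly onto [0, 1]:

```python
import numpy as np

def time_domain_features(coeffs):
    """RMS, standard deviation, and kurtosis factor of wavelet coefficients."""
    rms = np.sqrt(np.mean(coeffs**2))
    sd = np.std(coeffs)
    centred = coeffs - np.mean(coeffs)
    kurt = np.mean(centred**4) / np.mean(centred**2) ** 2
    return rms, sd, kurt

def frequency_domain_features(coeffs, fs, f_rpm):
    """Peak frequency of the wavelet power spectrum (WPS) relative to shaft
    speed, and peak amplitude relative to the overall spectral amplitude."""
    spectrum = np.abs(np.fft.rfft(coeffs)) ** 2
    freqs = np.fft.rfftfreq(len(coeffs), d=1.0 / fs)
    k = np.argmax(spectrum[1:]) + 1          # skip the DC bin
    f_ratio = freqs[k] / f_rpm               # f_max / f_rpm
    a_ratio = spectrum[k] / spectrum.sum()   # A_max / Sum(A_i)
    return f_ratio, a_ratio

def normalize(x):
    """Min-max normalization of a feature vector to [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# Demo on a synthetic coefficient series (illustrative values only).
rng = np.random.default_rng(1)
coeffs = rng.normal(size=2048) + 0.5 * np.sin(
    2 * np.pi * 90 * np.arange(2048) / 12_000)
rms, sd, kurt = time_domain_features(coeffs)
f_ratio, a_ratio = frequency_domain_features(coeffs, fs=12_000, f_rpm=30.0)
x = normalize([rms, sd, kurt, f_ratio, a_ratio])  # the 5-element ANN input
```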

#### **4.2 Neural networks scheme**



A feed-forward multi-layer perceptron (MLP) neural network has been developed, consisting of three layers. The input layer of five source nodes represents the normalized features extracted from the predominant Laplace wavelet transform scales. The hidden layer contains four computation nodes; this number was chosen with a genetic algorithm that minimizes the Mean Square Error (MSE) between the actual network outputs and the corresponding target values. The output layer of four nodes represents the different bearing working conditions to be identified by the neural network.
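The GA-based selection of the network hyper-parameters can be sketched as below. This is a minimal illustration: synthetic data and plain batch gradient descent stand in for the chapter's training setup, and the population size, gene ranges, and mutation rate are assumptions, not the chapter's settings:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in data: 5 normalized features -> 4 one-hot condition codes.
X = rng.random((20, 5))
T = np.eye(4)[rng.integers(0, 4, size=20)]

def train_mse(n_hidden, lr, epochs=30):
    """Fitness: final training MSE of a 5-n_hidden-4 sigmoid MLP trained
    with plain batch gradient descent."""
    r = np.random.default_rng(0)                 # fixed init -> repeatable fitness
    W1 = r.normal(0.0, 0.5, (5, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = r.normal(0.0, 0.5, (n_hidden, 4)); b2 = np.zeros(4)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        H = sig(X @ W1 + b1)
        O = sig(H @ W2 + b2)
        dO = (O - T) * O * (1.0 - O)             # output-layer deltas
        dH = (dO @ W2.T) * H * (1.0 - H)         # back-propagated hidden deltas
        W2 -= lr * (H.T @ dO); b2 -= lr * dO.sum(axis=0)
        W1 -= lr * (X.T @ dH); b1 -= lr * dH.sum(axis=0)
    sig_out = sig(sig(X @ W1 + b1) @ W2 + b2)
    return float(np.mean((sig_out - T) ** 2))

def evolve(pop_size=6, generations=5):
    """Minimal GA over (hidden nodes, learning rate) minimizing the MSE fitness."""
    pop = [(int(rng.integers(2, 10)), float(rng.uniform(0.05, 1.0)))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: train_mse(*g))    # rank by fitness
        parents = pop[: pop_size // 2]           # elitist survivor selection
        children = []
        while len(parents) + len(children) < pop_size:
            i, j = rng.choice(len(parents), size=2, replace=False)
            h, lr = parents[i][0], parents[j][1] # crossover: one gene per parent
            if rng.random() < 0.4:               # mutation
                h = int(np.clip(h + rng.integers(-2, 3), 2, 12))
                lr = float(np.clip(lr * rng.uniform(0.5, 1.5), 0.01, 1.0))
            children.append((h, lr))
        pop = parents + children
    return min(pop, key=lambda g: train_mse(*g))

best_hidden, best_lr = evolve()
```

Fixing the weight initialization inside the fitness function keeps the GA's comparisons repeatable, so differences in fitness reflect the genes rather than random restarts.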

The four-digit output target nodes that need to be mapped by the ANN are distinguished as: (1, 0, 0, 0) for a new bearing (NB), (0, 1, 0, 0) for a bearing with outer race fault (ORF), (0, 0, 1, 0) for an inner race fault (IRF), and (0, 0, 0, 1) for a rolling element fault (REF). Figure 28a depicts the overall architecture of the proposed diagnostic system.

The training sample vector comprises the extracted features and the ideal target outputs, expressed as *[x1, x2, x3, x4, x5, T]^T*, where *x1*-*x5* represent the input extracted features and *T* is the four-digit target output.
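A minimal sketch of the target coding and sample assembly (the helper names are hypothetical):

```python
import numpy as np

# Four-digit target codes for the four bearing conditions.
TARGETS = {
    "NB":  (1, 0, 0, 0),   # new bearing
    "ORF": (0, 1, 0, 0),   # outer race fault
    "IRF": (0, 0, 1, 0),   # inner race fault
    "REF": (0, 0, 0, 1),   # rolling element fault
}

def training_sample(features, condition):
    """Assemble the training vector [x1, ..., x5, T]^T from five
    normalized features and a bearing-condition label."""
    x = np.asarray(features, dtype=float)
    if x.shape != (5,):
        raise ValueError("expected exactly five extracted features")
    return np.concatenate([x, TARGETS[condition]])

sample = training_sample([0.2, 0.8, 0.5, 1.0, 0.0], "IRF")
```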

The input vector is transformed into an intermediate vector of hidden variables *h* using the activation function *f1* (Figure 28b). The output *hj* of the *jth* node in the hidden layer is obtained as follows:

$$h_j = f_1\left(\sum_{i=1}^{N=5} w_{i,j}\, x_i + b_j\right) \tag{13}$$

where *bj* is the bias of the *jth* hidden node and *wi,j* is the weight of the connection between the *jth* node in the hidden layer and the *ith* input node.

The output vector *O = (o1, o2, …, oM)* of the network is obtained from the vector of intermediate variables *h* through a similar transformation using activation function *f2* at the output layer (Figure 28c). For example, the output of neuron *k* can be expressed as follows:

$$O_k = f_2\left(\sum_{l=1}^{M=4} w_{l,k}\, h_l + b_k\right) \tag{14}$$
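Equations (13) and (14) amount to two matrix-vector products with element-wise activations. A minimal sketch for the 5-4-4 network, assuming sigmoid for both *f1* and *f2* and random illustrative weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2, f1=sigmoid, f2=sigmoid):
    """One forward pass of the 5-4-4 MLP:
    h_j = f1(sum_i w_ij * x_i + b_j)   -- eq. (13)
    O_k = f2(sum_l w_lk * h_l + b_k)   -- eq. (14)"""
    h = f1(W1 @ x + b1)   # hidden activations, eq. (13)
    O = f2(W2 @ h + b2)   # network outputs, eq. (14)
    return h, O

rng = np.random.default_rng(3)
W1 = rng.normal(size=(4, 5)); b1 = np.zeros(4)   # 5 inputs -> 4 hidden nodes
W2 = rng.normal(size=(4, 4)); b2 = np.zeros(4)   # 4 hidden -> 4 outputs
x = np.array([0.2, 0.8, 0.5, 1.0, 0.0])          # normalized feature vector
h, O = forward(x, W1, b1, W2, b2)
```

With sigmoid outputs, each *Ok* lies in (0, 1); the predicted bearing condition is simply the index of the largest output, matching the four-digit target coding.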

The training of an MLP network is achieved by modifying the connection weights and biases iteratively to optimize a performance criterion. One widely used criterion is the minimization of the mean square error (MSE) between the actual network outputs (*Ok*) and the corresponding target values (*T*) over the training set. The most commonly used training algorithms for MLPs are based on back-propagation (BP). BP adopts a gradient-descent approach to adjusting the ANN connection weights: the MSE is propagated backward through the network and used to adjust the connection weights between the layers, thus improving the network classification performance. The process is repeated until the overall MSE drops below a predetermined threshold (the stopping criterion). After training, the ANN weights are fixed and the system is deployed to solve the bearing condition identification problem on unseen vibration data.


#### **4.3 Implementation of WPS-ANN for bearing fault classification**

The derived WT-ANN fault classification technique was validated using real and simulated rolling element bearing vibration signals. MATLAB was used for the wavelet feature extraction and the ANN classification, following the code flow chart shown in Figure 29.

Fig. 28. (a) The applied diagnosis system, (b) the input and hidden layers, and (c) the hidden and output layers.

Fig. 29. Flow chart of the MATLAB code for WT-ANN automatic bearing fault diagnosis.

The ANN was created, trained, and tested using the MATLAB Neural Network Toolbox with the Levenberg-Marquardt back-propagation (LMBP) training algorithm. In this work, an MSE goal of 10⁻²⁰, a minimum gradient of 10⁻¹⁰, and a maximum of 1000 iterations (epochs) were used; training stops as soon as any of these conditions is met. The initial weights and biases of the network were randomly generated by the program.
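The three stopping conditions can be sketched in a single training loop. Plain batch gradient descent on synthetic data stands in here for the toolbox's Levenberg-Marquardt update; the data and learning rate are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.random((40, 5))                          # synthetic normalized features
T = np.eye(4)[rng.integers(0, 4, size=40)]      # synthetic one-hot targets

sig = lambda z: 1.0 / (1.0 + np.exp(-z))
W1 = rng.normal(0.0, 0.5, (5, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 0.5, (4, 4)); b2 = np.zeros(4)

mse_goal, grad_min, max_epochs, lr = 1e-20, 1e-10, 1000, 0.2
stop_reason = "max_epochs"
for epoch in range(max_epochs):
    H = sig(X @ W1 + b1)                         # hidden layer, eq. (13)
    O = sig(H @ W2 + b2)                         # output layer, eq. (14)
    err = O - T
    mse = float(np.mean(err ** 2))
    if mse < mse_goal:                           # performance goal reached
        stop_reason = "mse_goal"; break
    dO = err * O * (1.0 - O)
    dH = (dO @ W2.T) * H * (1.0 - H)
    grads = [X.T @ dH, dH.sum(0), H.T @ dO, dO.sum(0)]
    if np.sqrt(sum(np.sum(g ** 2) for g in grads)) < grad_min:
        stop_reason = "grad_min"; break          # gradient has vanished
    W1 -= lr * grads[0]; b1 -= lr * grads[1]
    W2 -= lr * grads[2]; b2 -= lr * grads[3]
```

With an MSE goal as small as 10⁻²⁰, training in practice ends on the gradient or epoch limit; the goal acts as an "essentially zero error" target rather than a realistic stopping point.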

