To further analyze the performance of the new fusion model, a new *k*-NN classifier (*k* = 3) is added to the multiple classifier system. The new multiple classifier system is used to test fault diagnosis performance on data set B, and the sum rule is used for comparison with the new approach. The comparison results are presented in Table 10. From Table 10, it is clear that the new approach attains the highest diagnosis accuracy.

| 7-NN classifier on B1 | 7-NN classifier on B2 | Parzen classifier on B1 | Parzen classifier on B2 | 3-NN classifier on B1 | 3-NN classifier on B2 | Sum rule | New method |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 87.5% | 88% | 85.75% | 86.75% | 90% | 87.75% | 94.75% | 95% |

**Table 10.** Further comparison results of fault diagnosis performance

**4. Feature-level fusion for bearing fault diagnosis**

This section proposes a new multi-source feature-level fusion model for bearing fault diagnosis using GEP. At present, research on fault diagnosis based on feature-level fusion is still scarce, receiving far less attention than decision-level fusion, mainly because feature-level fusion is more difficult. However, applying feature-level fusion to fault diagnosis can extract fault feature information more effectively. It is a way to improve the performance and robustness of the bearing fault diagnosis system.

**4.1. Methodology**

GEP was invented by Ferreira [8] as a natural development of genetic algorithms and genetic programming. GEP uses linear chromosomes composed of genes containing terminal and non-terminal symbols. Chromosomes can be modified by mutation, transposition, root transposition, gene transposition, gene recombination, and one-point and two-point recombination. GEP genes are composed of a head and a tail. The head contains function (non-terminal) and terminal symbols, while the tail contains only terminal symbols. For each problem, the head length (denoted *h*) is chosen by the user, and the tail length (denoted *t*) is then computed as *t* = (*n* − 1) × *h* + 1, where *n* is the number of arguments of the function with the most arguments.

The flow of GEP is as follows:

Step 1. Set control parameters, select function classes, and initialize the population.

Step 2. Parse the chromosomes and evaluate the population.

Step 3. Apply operations such as selection, mutation, insertion sequence, recombination, mutation of random constants, and insertion sequence of random constants to create a new population.

Step 4. Implement the best-preservation (elitism) strategy.

Step 5. If the required precision is obtained, evolution finishes; otherwise, return to Step 2.

The new feature-level fusion model using GEP deals with the multi-sensor fusion problem. Assume that there are *I* sensors used in machine condition monitoring. For each sensor, the raw signal is divided into several signals by the same time segment. Each of these signals is processed to extract features. In this chapter, only the time-domain statistical characteristics of machine operating signals are taken into account. These time-domain feature parameters are presented in Equations (13)-(23), where *x*(*t*) is a signal series and *N* is its number of data points.
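As an illustration of the gene structure and tail-length rule described above, the following sketch decodes a GEP gene written in Karva notation (head followed by tail) into an expression tree and evaluates it. The symbol set, head length, and example gene are illustrative assumptions, not taken from this chapter.

```python
# Binary functions available to the head; terminals are single letters.
OPS = {"+": (2, lambda a, b: a + b),
       "-": (2, lambda a, b: a - b),
       "*": (2, lambda a, b: a * b)}

def tail_length(h, n_max):
    """t = (n - 1) * h + 1, where n_max is the largest function arity."""
    return (n_max - 1) * h + 1

def evaluate(gene, terminals):
    """Decode a gene breadth-first (Karva notation) and evaluate the tree."""
    root = [gene[0], []]          # each node is [symbol, children]
    queue, i = [root], 1
    while queue:
        node = queue.pop(0)
        arity = OPS[node[0]][0] if node[0] in OPS else 0
        for _ in range(arity):    # fill this node's arguments level by level
            child = [gene[i], []]
            i += 1
            node[1].append(child)
            queue.append(child)

    def run(node):
        if node[0] in OPS:
            return OPS[node[0]][1](*[run(c) for c in node[1]])
        return terminals[node[0]]

    return run(root)

# Head "+*a" (h = 3) with binary functions gives t = (2 - 1) * 3 + 1 = 4,
# so the gene has 7 symbols. "+*abbab" decodes to (b * b) + a.
print(tail_length(3, 2))                           # 4
print(evaluate("+*abbab", {"a": 2.0, "b": 3.0}))   # 11.0
```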

$$p\_1 = \frac{1}{N} \sum\_{n=1}^{N} \mathbf{x}(n) \tag{13}$$

$$p\_2 = \sqrt{\frac{\sum\_{n=1}^{N} (\mathbf{x}(n) - p\_1)^2}{N - 1}} \tag{14}$$

$$p\_3 = \sqrt{\frac{\sum\_{n=1}^{N} \mathbf{x}(n)^2}{N}} \tag{15}$$

$$p\_4 = \left(\frac{1}{N} \sum\_{n=1}^{N} \sqrt{|\mathbf{x}(n)|}\right)^2 \tag{16}$$

$$p\_5 = \max\_{1 \le n \le N} |\mathbf{x}(n)| \tag{17}$$

$$p\_6 = \frac{\sum\_{n=1}^{N} (\mathbf{x}(n) - p\_1)^3}{(N-1)p\_2^3} \tag{18}$$

$$p\_7 = \frac{\sum\_{n=1}^{N} (\mathbf{x}(n) - p\_1)^4}{(N-1)p\_2^4} \tag{19}$$

$$p\_8 = \frac{p\_5}{p\_3} \tag{20}$$

$$p\_9 = \frac{p\_5}{p\_4} \tag{21}$$

$$p\_{10} = \frac{p\_3}{\frac{1}{N} \sum\_{n=1}^{N} |\mathbf{x}(n)|} \tag{22}$$

$$p\_{11} = \frac{p\_5}{\frac{1}{N} \sum\_{n=1}^{N} |\mathbf{x}(n)|} \tag{23}$$
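As a concrete reading of Equations (13)-(23), the sketch below computes the eleven time-domain features from a plain list of signal samples. The function name and the feature names in the comments are standard conventions, assumed here rather than taken from the chapter.

```python
import math

def time_domain_features(x):
    """Features p1..p11 of Equations (13)-(23) for a signal series x."""
    N = len(x)
    abs_mean = sum(abs(v) for v in x) / N
    p1 = sum(x) / N                                           # (13) mean
    p2 = math.sqrt(sum((v - p1) ** 2 for v in x) / (N - 1))   # (14) standard deviation
    p3 = math.sqrt(sum(v * v for v in x) / N)                 # (15) root mean square
    p4 = (sum(math.sqrt(abs(v)) for v in x) / N) ** 2         # (16) square-root amplitude
    p5 = max(abs(v) for v in x)                               # (17) peak value
    p6 = sum((v - p1) ** 3 for v in x) / ((N - 1) * p2 ** 3)  # (18) skewness
    p7 = sum((v - p1) ** 4 for v in x) / ((N - 1) * p2 ** 4)  # (19) kurtosis
    p8 = p5 / p3                                              # (20) crest factor
    p9 = p5 / p4                                              # (21) clearance factor
    p10 = p3 / abs_mean                                       # (22) shape factor
    p11 = p5 / abs_mean                                       # (23) impulse factor
    return [p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11]
```

For a square-wave segment such as `[1.0, -1.0, 1.0, -1.0]`, the mean is 0, and the RMS, peak, and crest factor are all 1, which is a quick sanity check on the dimensionless features.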

In the pattern recognition process of bearing fault diagnosis, we assume that there are *M* conditions, including the normal condition. Let *S<sup>i</sup><sub>m</sub>* represent the set of all training samples belonging to the *m*-th condition (1 ≤ *m* ≤ *M*) from the *i*-th sensor source. The feature-level fusion model seeks a way to fuse the features from these different sensor sources. The new feature-level fusion model using GEP fuses the features by looking for a feature recognition function *ϕ* that maps the feature space to another space in which samples of the same class are similar and samples of different classes are dissimilar. The feature recognition function *ϕ* then directs, in the reverse direction, the building of a multi-source feature fusion model.

Functions +, −, ×, /, *sqrt*, *exp* are selected as the input functions of GEP. The number of generations is set to 5000, and the fitness function is defined as:

$$Fitness = \frac{\sum\_{m=1}^{M-1} \sum\_{m'=m+1}^{M} (\sigma\_m - \sigma\_{m'})^2}{\sum\_{m=1}^{M} \sum\_{i=1}^{I} \sum\_{k \in S\_m^i} (\varphi(P\_k^i) - \sigma\_m)^2} \tag{24}$$

where *P<sup>i</sup><sub>k</sub>* is the feature vector of the *k*-th sample from the *i*-th sensor source, and *σ<sub>m</sub>* is the mean mapping value over all samples of the *m*-th condition:

$$\sigma\_m = \frac{1}{I} \sum\_{i=1}^{I} \frac{\sum\_{k \in S\_m^i} \varphi(P\_k^i)}{|S\_m^i|} \tag{25}$$
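Equations (24) and (25) can be sketched as follows. The data layout and names are assumptions: `mapped[m][i]` holds the mapping values *ϕ*(*P<sup>i</sup><sub>k</sub>*) of the *m*-th condition's samples from the *i*-th sensor.

```python
def class_means(mapped):
    """sigma_m of Eq. (25): average over sensors of the per-sensor mean."""
    return [sum(sum(vals) / len(vals) for vals in per_sensor) / len(per_sensor)
            for per_sensor in mapped]

def fitness(mapped):
    """Eq. (24): between-class spread over within-class scatter."""
    sigma = class_means(mapped)
    M = len(sigma)
    between = sum((sigma[m] - sigma[mp]) ** 2
                  for m in range(M - 1) for mp in range(m + 1, M))
    within = sum((v - sigma[m]) ** 2
                 for m in range(M) for vals in mapped[m] for v in vals)
    return between / within

# Two conditions, one sensor each: class means 1.0 and 11.0.
print(fitness([[[0.0, 2.0]], [[10.0, 12.0]]]))  # 25.0
```

A larger fitness value means the class means are far apart relative to how tightly each class's mapped samples cluster around their own mean, which is exactly what GEP maximizes when searching for *ϕ*.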

After GEP training, a well-performing feature recognition function *ϕ* is obtained. Using *ϕ*, we can calculate the mean mapping value of each operating condition's samples from a certain sensor source. For building the multi-source feature evaluation matrix, only the correctly classified samples are selected for calculating the means. The multi-source feature evaluation matrix is composed of these mean values, as Equation 26 shows.

$$\begin{bmatrix} \rho\_1(1) & \rho\_1(2) & \dots & \rho\_1(11) \\\\ \rho\_2(1) & \rho\_2(2) & \dots & \rho\_2(11) \\\\ \vdots & \vdots & \ddots & \vdots \\\\ \rho\_M(1) & \rho\_M(2) & \dots & \rho\_M(11) \end{bmatrix} \tag{26}$$
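A minimal sketch of assembling this matrix, under an assumed data layout: `samples[m]` lists the 11-element feature vectors of the correctly classified samples of the *m*-th condition, pooled over sensor sources.

```python
def evaluation_matrix(samples, n_features=11):
    """rho[m][f]: mean of feature f over condition m's selected samples."""
    return [[sum(vec[f] for vec in cls) / len(cls) for f in range(n_features)]
            for cls in samples]

# One condition, two 2-feature samples: column-wise means.
print(evaluation_matrix([[[1.0, 2.0], [3.0, 4.0]]], n_features=2))  # [[2.0, 3.0]]
```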

In Equation 26, each element represents the mean value of a feature component for an operating condition. For example, *ρ*<sub>2</sub>(1) represents the mean of the first feature over all correctly classified samples of the second operating condition.

Each data set is divided into two halves, one for training and the other for testing. The task of data set A is to identify different types of faults, while the experiment on data set B is carried out to further investigate the diagnosis performance on developing faults when the fusion model is trained with incipient faulty samples. The experiment on data set C tests the diagnosis performance on incipient faults when the fusion model is trained with the serious faulty samples.

Table 12 gives the results of these three experiments.

| Data set | Training recognition accuracy | Testing recognition accuracy |
| --- | --- | --- |
| A | 83.75% | 81.25% |
| B | 83.75% | 72.50% |
| C | 76.25% | 81.88% |

**Table 12.** Fault diagnosis performance using feature-fusion model

From Table 12, we can see that the new feature-level fusion model using GEP achieves stable, good diagnosis performance. It is also clear that testing performance is higher than training performance in the experiment on data set C. That is to say, when the new feature-level fusion model is trained with the serious faulty samples, it can easily identify incipient faults.

In order to observe the performance change when the new feature-fusion model uses multi-source information instead of single-source information, the new method is also used to test bearing fault diagnosis performance with a single sensor source. Table 13 gives the performance comparison between more than one sensor (here, two sensors) and a single sensor. From Table 13, we can see that multi-sensor testing performance is considerably higher than single-sensor performance using the new feature-level fusion model.

| Data set | A | B | C |
| --- | --- | --- | --- |
| Multi-sensor testing performance increase | 0.56 | 0.48 | 0.57 |

**Table 13.** Performance comparison between multi-sensor and single sensor

**5. Conclusion**

This chapter has introduced some new methods for bearing fault diagnosis based on information fusion and intelligent algorithms. Bearing fault diagnosis has been an active research subject for over a decade and attracts a large number of researchers in different areas. However, most current techniques mainly deal with single-source data. Many studies have shown that an individual decision system with a single data source can only acquire a limited classification capability, which may not be enough for a particular application. We therefore study a new way of performing bearing fault diagnosis using information fusion technology and intelligent algorithms.

Information fusion is a field still under research. Generally, information fusion may happen at three levels: the sensor level, the feature level, and the decision level. Here, we propose a new feature-level fusion method and a new decision-level fusion method for bearing fault diagnosis. The feature-level fusion method uses GEP, a new intelligent algorithm, and is a parallel fusion method. The decision-level fusion approach is based on a new multiple classifier ensemble method. It analyzes the raw vibration signal and completes the feature extraction by using EMD and fractal feature parameter calculation. From the experimental results, we can see that these new fusion models can achieve good performance on the bearing fault diagnosis task.
