**3.1. Methodology**

Given a specific pattern recognition problem, different classifiers exhibit different classification performance. Very satisfactory results cannot always be obtained by studying a single classifier in isolation to improve its classification accuracy. A multiple classifier system (MCS) can overcome the limitations of individual classifiers and enhance classification accuracy. Techniques for combining the outputs of several classifiers have been applied to a wide range of real problems, and it has been shown that MCSs outperform the traditional approach of using a single high-performance classifier [26].

The most frequently used classifier combination approaches in MCSs include majority voting [30], the weighted combination (weighted averaging) [18], probabilistic schemes [16, 17], the Bayesian approach (naïve Bayes combination) [1, 18, 30], and the Dempster-Shafer (D-S) theory of evidence [5, 30]. This section proposes a new classifier combination method that treats the combination process as a linear programming problem.

Assume that *K* base classifiers are used in the MCS and that the bearing fault diagnosis problem involves *M* fault states, including the normal condition. Then, in the process of multiple classifier combination, a decision matrix can be given as follows, where $P\_k(F\_i|\mathbf{x})$ denotes the posterior probability of fault state $F\_i$ output by the $k$-th base classifier for input sample $\mathbf{x}$.

$$D(\mathbf{x}) = \begin{bmatrix} P\_1(F\_1|\mathbf{x}) & P\_1(F\_2|\mathbf{x}) & \dots & P\_1(F\_M|\mathbf{x}) \\ P\_2(F\_1|\mathbf{x}) & P\_2(F\_2|\mathbf{x}) & \dots & P\_2(F\_M|\mathbf{x}) \\ \vdots & \vdots & \ddots & \vdots \\ P\_K(F\_1|\mathbf{x}) & P\_K(F\_2|\mathbf{x}) & \dots & P\_K(F\_M|\mathbf{x}) \end{bmatrix} \tag{5}$$


The new method introduced in this section fuses the posterior probabilities in the decision matrix to construct a global classifier *E* that makes the final decision. The posterior probability output of global classifier *E* for each fault state is calculated as follows:

$$P\_E(F\_i|\mathbf{x}) = \sum\_{k=1}^K \beta\_k P\_k(F\_i|\mathbf{x}), \quad \forall \ i \in \{1, 2, \dots, M\}, \tag{6}$$

where $\beta\_k$ (with $\sum\_{k=1}^{K} \beta\_k = 1$) is a dynamic association weight in the MCS.
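As a minimal sketch, the weighted fusion of Equation 6 can be written out directly. The decision matrix and weights below are illustrative numbers, and `fuse_posteriors` is a hypothetical helper name, not code from the chapter:

```python
# Equation 6: P_E(F_i|x) = sum_k beta_k * P_k(F_i|x).
# D is the K x M decision matrix of Equation 5; beta holds the
# dynamic association weights (non-negative, summing to 1).

def fuse_posteriors(D, beta):
    K, M = len(D), len(D[0])
    return [sum(beta[k] * D[k][i] for k in range(K)) for i in range(M)]

# K = 3 base classifiers, M = 3 fault states; each row is one
# classifier's posterior distribution over the fault states.
D = [[0.7, 0.2, 0.1],
     [0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3]]
beta = [0.5, 0.3, 0.2]          # illustrative weights, summing to 1

P_E = fuse_posteriors(D, beta)  # ≈ [0.57, 0.29, 0.14]
```

Since each row of *D* and the weights both sum to 1, the fused vector is again a probability distribution over the fault states.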

This new decision-level fusion method for bearing fault diagnosis is based on the following assumption: a base classifier has higher real-time recognition accuracy if its posterior probabilities over the fault states differ more strongly from one another. That is to say, if an individual decision system is highly confident that the current operating condition belongs to a certain fault state, the posterior probability of that state will be much higher than the others. Using this hypothesis, the problem of multiple classifier combination can be converted into a linear programming problem, whose objective function is defined as: $\sqrt{[\sum\_{i=1}^{M}(P\_E(F\_i|\mathbf{x}) - 1/M)^2]/M}$.
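To make the assumption concrete: the objective is the RMS deviation of the fused posteriors from the uniform distribution, so a confident posterior vector scores higher than a nearly uniform one. A quick check with illustrative numbers (the helper name is mine, not the chapter's):

```python
import math

def objective(P):
    # sqrt([sum_i (P_E(F_i|x) - 1/M)^2] / M): the RMS deviation of the
    # fused posteriors from the uniform distribution with entries 1/M.
    M = len(P)
    return math.sqrt(sum((p - 1.0 / M) ** 2 for p in P) / M)

confident = [0.90, 0.05, 0.05]   # strongly favors one fault state
ambiguous = [0.40, 0.35, 0.25]   # close to uniform

assert objective(confident) > objective(ambiguous)
```

A perfectly uniform vector scores exactly zero, so maximizing this objective pushes the fused decision away from "don't know".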

In existing classifier ensemble methods, a base classifier's statistical performance is the major consideration. However, real-time decision information can also be taken into account. The new MCS method therefore uses within-class decision support [13], defined as the degree of support that a base classifier's recognition output for an individual class receives from the other classifiers' outputs for the same class in the MCS. This support degree is measured by the difference between the current output and its nearest counterpart. For example, the within-class decision support of $P\_k(F\_i|\mathbf{x})$, the posterior probability of the $i$-th state from the $k$-th base classifier, is: $1 - \min\_{1 \le k' \le K, k' \ne k} |P\_k(F\_i|\mathbf{x}) - P\_{k'}(F\_i|\mathbf{x})|$.

The real-time decision support value (DSV) of a base classifier in the MCS is the sum of the within-class decision support values of all its class recognition outputs. Its calculation formula follows directly:

$$\mu\_k = \sum\_{i=1}^{M} (1 - \min\_{1 \le k' \le K, k' \ne k} |P\_k(F\_i|\mathbf{x}) - P\_{k'}(F\_i|\mathbf{x})|), \quad \forall \ k \in \{1, 2, \dots, K\} \tag{7}$$
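Equation 7 can be transcribed directly. The decision matrix below is an illustrative example and `decision_support_values` is a name of my choosing:

```python
# Equation 7: the DSV mu_k sums, over all M fault states, the within-class
# decision support 1 - min_{k' != k} |P_k(F_i|x) - P_k'(F_i|x)|.

def decision_support_values(D):
    K, M = len(D), len(D[0])
    return [
        sum(1 - min(abs(D[k][i] - D[kp][i]) for kp in range(K) if kp != k)
            for i in range(M))
        for k in range(K)
    ]

D = [[0.7, 0.2, 0.1],    # classifiers 1 and 2 roughly agree;
     [0.6, 0.3, 0.1],    # classifier 3 deviates from both, so it
     [0.2, 0.5, 0.3]]    # receives the lowest decision support

mu = decision_support_values(D)   # ≈ [2.8, 2.8, 2.2]
```

Note that a classifier whose outputs sit close to some other classifier's outputs gets support near *M*, while an outlier is penalized state by state.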

In the proposed decision-level fusion method for bearing fault diagnosis, we set a rule: the higher the real-time decision support value of a base classifier, the larger its dynamic association weight. This rule can be described as follows:

$$\text{if } \text{ } \mu\_k < \mu\_{k'} \text{ then } \text{ } \beta\_k < \beta\_{k'}, \text{ } \forall k, k' \in \{1, 2, \dots, K\}, k' \neq k \tag{8}$$

$$\text{if } \; \mu\_k = \mu\_{k'} \text{ then } \; \beta\_k = \beta\_{k'}, \; \forall k, k' \in \{1, 2, \dots, K\}, k' \neq k \tag{9}$$

$$\text{if } \; \mu\_k > \mu\_{k'} \text{ then } \; \beta\_k > \beta\_{k'}, \; \forall k, k' \in \{1, 2, \dots, K\}, k' \neq k \tag{10}$$

That is to say, the relationship between dynamic association weights is determined by the relationship between the real-time decision support values of the different base classifiers. These relationships enter the linear programming problem in the form of relationship vectors, which are defined as Table 5 shows (for the example *K* = 3). From Table 5, it is clear that each real-time relationship between DSVs is re-expressed by one or two relationship vectors, and that each relationship vector is *K*-dimensional. All these relationship vectors together compose a relationship matrix, denoted *R*.


| Real-time relationship between DSVs | Relationship vectors |
| --- | --- |
| $\mu\_1 < \mu\_2$ | $[1, -1, 0]$ |
| $\mu\_1 = \mu\_2$ | $[-1, 1, 0]$ & $[1, -1, 0]$ |
| $\mu\_1 > \mu\_2$ | $[-1, 1, 0]$ |

**Table 5.** Example for relationship vector construction (K=3)
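The construction behind Table 5 can be sketched as follows, assuming the DSVs are already computed. Each row *r* of *R* encodes one constraint *r·β* ≤ 0 from Equations 8–10, with an equality of DSVs contributing two opposite rows; the function name is illustrative:

```python
def relationship_matrix(mu):
    # Row r encodes r . beta <= 0:
    #   mu_k < mu_k'  ->  beta_k - beta_k' <= 0   (one row)
    #   mu_k = mu_k'  ->  beta_k = beta_k'        (two opposite rows)
    K = len(mu)
    R = []
    for k in range(K):
        for kp in range(k + 1, K):
            row = [0.0] * K
            row[k], row[kp] = 1.0, -1.0        # beta_k - beta_k' <= 0
            if mu[k] < mu[kp]:
                R.append(row)
            elif mu[k] > mu[kp]:
                R.append([-v for v in row])    # beta_k' - beta_k <= 0
            else:
                R.append(row)
                R.append([-v for v in row])
    return R

# mu_1 = mu_2 > mu_3 yields N = 4 relationship vectors
R = relationship_matrix([2.8, 2.8, 2.2])
```

For *K* classifiers there are *K*(*K* − 1)/2 DSV pairs, so *N* ranges from *K*(*K* − 1)/2 rows (no ties) up to *K*(*K* − 1) rows (all DSVs equal).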


In order to simplify the fusion formula, a *K* × 1 vector $\boldsymbol{\beta}$ ($\boldsymbol{\beta} = [\beta\_1; \beta\_2; \dots; \beta\_K]$) is used in place of the individual $\beta\_k$ ($k \in \{1, 2, \dots, K\}$) for calculating the decision output of global classifier *E*. Equation 6 can then be written in the simplified form $P\_E(\mathbf{x}) = D(\mathbf{x})^T \boldsymbol{\beta}$, and the objective function $\sqrt{[\sum\_{i=1}^{M}(P\_E(F\_i|\mathbf{x}) - 1/M)^2]/M}$ is simplified first to $\sum\_{i=1}^{M}(P\_E(F\_i|\mathbf{x}) - 1/M)^2$ and then to $||D(\mathbf{x})^T \boldsymbol{\beta} - \frac{1}{M}||\_2$, where $\frac{1}{M}$ denotes the *M* × 1 vector whose entries all equal 1/*M*. Finally, the relationship matrix *R* formulates the constraint rules in the form $R\boldsymbol{\beta} \le 0$. The complete linear programming problem can now be stated as Equation 11 shows, where *N* is the number of relationship vectors in the current relationship matrix.

$$\max \quad ||D(\mathbf{x})^T \boldsymbol{\beta} - \frac{1}{M}||\_2$$

$$\text{subject to} \quad \quad [1]\_{1 \times K} \boldsymbol{\beta} = 1$$

$$[0]\_{K \times 1} \le \boldsymbol{\beta} \le [1]\_{K \times 1}$$

$$R\boldsymbol{\beta} \le [0]\_{N \times 1} \tag{11}$$

Solving the linear programming problem above yields the dynamic association weight vector $\boldsymbol{\beta}$. Using this dynamic association weight vector, the fusion decision vector of global classifier *E* can be calculated, and the final decision of bearing fault diagnosis is obtained by:

$$E(\mathbf{x}) = i \quad \text{with} \quad P\_E(F\_i|\mathbf{x}) = \max\_{1 \le i' \le M} P\_E(F\_{i'}|\mathbf{x}) \tag{12}$$
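Note that the objective in Equation 11 is a Euclidean norm, so a plain linear programming routine cannot be applied to it verbatim. As a toy stand-in for the chapter's solver (an assumption for illustration only, not the authors' method), the sketch below samples random weight vectors that respect the DSV ordering of Equations 8–10 (feasible by construction), keeps the one maximizing the objective, and applies the Equation 12 decision; all numbers and names are illustrative:

```python
import math
import random

def fuse_and_decide(D, mu, trials=500, seed=1):
    """Toy substitute for the optimization in Equation 11: sample weight
    vectors consistent with the DSV ordering, keep the one maximizing
    ||D(x)^T beta - 1/M||_2, and return the Equation 12 decision."""
    K, M = len(D), len(D[0])
    order = sorted(range(K), key=lambda k: mu[k])   # ascending DSV
    random.seed(seed)
    best_beta, best_val = None, -1.0
    for _ in range(trials):
        raw = sorted(-math.log(1.0 - random.random()) for _ in range(K))
        s = sum(raw)
        beta = [0.0] * K
        for pos, k in enumerate(order):             # larger DSV -> larger weight
            beta[k] = raw[pos] / s
        for k in range(K):                          # tied DSVs -> equal weights
            tied = [kp for kp in range(K) if mu[kp] == mu[k]]
            avg = sum(beta[kp] for kp in tied) / len(tied)
            for kp in tied:
                beta[kp] = avg
        P = [sum(beta[k] * D[k][i] for k in range(K)) for i in range(M)]
        val = math.sqrt(sum((p - 1.0 / M) ** 2 for p in P))
        if val > best_val:
            best_val, best_beta = val, beta
    P_E = [sum(best_beta[k] * D[k][i] for k in range(K)) for i in range(M)]
    return best_beta, P_E.index(max(P_E))           # Equation 12: argmax state

D = [[0.7, 0.2, 0.1],
     [0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3]]
beta, state = fuse_and_decide(D, mu=[2.8, 2.8, 2.2])
# for this D, every ordering-feasible beta selects the first fault state
```

Since the objective is convex, its maximum over the feasible polytope lies at a vertex, so a dedicated solver would examine vertices rather than random samples; the search above only illustrates the feasible set and objective.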
