### **1. Introduction**

190 Efficiency, Performance and Robustness of Gas Turbines


A gas turbine engine is a very complex and expensive mechanical system, and its failure can have catastrophic consequences. It is therefore desirable to equip the engine with an effective condition monitoring system. Such an automated system, based on measured parameters, monitors and diagnoses the engine without the need for shutdown and disassembly. To improve gas turbine reliability and reduce maintenance costs, many advanced monitoring systems have been developed in recent decades. The design and use of these systems were spurred by progress in instrumentation, communication techniques, and computer technology. In fact, the development and use of such systems has become standard practice for new engines.

As shown in (Rao, 1996), an advanced monitoring system consists of different components intended to cover all gas turbine subsystems. A diagnostic analysis of recorded gas path variables (pressures, temperatures, rotation speeds, fuel consumption, etc.) can be considered a principal component and an integral part of the system. Many types of gas path performance degradation, such as foreign object damage, fouling, tip rubs, seal wear, and erosion, are known and can be diagnosed. Detailed descriptions of these abrupt faults and gradual deterioration mechanisms can be found, for instance, in (Rao, 1996; Meher-Homji et al., 2001). In addition to these gas path faults, the analysis of gas path variables (gas path analysis, GPA) also allows detecting sensor malfunctions and incorrect operation of the control system (Tsalavoutas et al., 2000). Moreover, this analysis allows estimating the main measured engine performance characteristics, such as shaft power, thrust, overall engine efficiency, specific fuel consumption, and compressor surge margin.

GPA is an area of extensive study, and thousands of published works can be found in it. Some common observations that follow from these publications and help to explain the structure of the present chapter are given below.

According to the published literature, the total diagnostic process usually includes a preliminary stage of feature extraction and three principal stages: monitoring (fault detection), detailed diagnosis (fault localization), and prognosis. Each stage is usually represented by specific algorithms.

Feature extraction means extracting useful diagnostic information from raw measurement data. This stage includes measurement validation and the computation of deviations. The deviation of a monitored variable is determined as the discrepancy between a measured value and the value given by an engine baseline model. In contrast to the monitored variables themselves, which strongly depend on the engine operating mode, the deviations, when properly computed, do not depend on the mode and can be good indicators of engine health. Since these deviations are input parameters to all diagnostic algorithms, close attention should be paid to the accuracy of the baseline model and of the deviations. Some interesting studies devoted entirely to deviation computation, for instance (Mesbahi et al., 2001; Fast et al., 2009), were performed in the last decade. This issue is also one of the focuses of the present chapter.

Among GPA techniques, the fault localization algorithms may be considered the most important and sophisticated. They involve different mathematical gas turbine models to describe possible faults. Although recorded data are available, the models are required because real gas turbine faults occur rarely. Recorded data are sufficient to describe only some intensive and practically permanent deterioration mechanisms, such as compressor fouling and erosion. Compressor fouling is the most common cause of deterioration of stationary gas turbines; its impact on gas turbine performance is well described, see (Meher-Homji et al., 2001). In helicopter engines, compressor airfoils are often affected by erosion because of dust and sand in the ingested air (Meher-Homji et al., 2001).

The models connect faults of different engine components with the corresponding changes of monitored variables, which assists fault description. Among fault simulation tools, the nonlinear thermodynamic model is of utmost importance, and many other particular models can be derived from it. Its description includes the mass, energy, and momentum conservation laws and requires detailed knowledge of the gas turbine under analysis. The model can be classified as physics-based and constitutes a complex software package. Such sophisticated models have been used in gas turbine diagnostics since the works of H.I.H. Saravanamuttoo (Saravanamuttoo et al., 1983). Over the last two decades, the use of these models instead of less exact linear models has become standard practice.

All fault localization methods can be broken down into two main approaches. The first approach is based on pattern recognition theory, while the second applies system identification methods. Reliable fault localization is a complicated recognition problem because many negative factors impede correct diagnostic decisions. The main factors affecting diagnosis accuracy are a) fault variety and rare occurrence of the same fault, b) inadequacy of the engine models used, c) dependence of fault manifestations on engine operating conditions and engine-to-engine differences, d) sensor noise and malfunctions, and e) control system inaccuracy and possible malfunction. Advances in computer technology have inspired the application of pattern recognition techniques to gas turbine diagnosis. Many applications can be found in the literature, including, but not limited to, Artificial Neural Networks (Roemer et al., 2000; Ogaji et al., 2003; Volponi et al., 2003; Sampath et al., 2006; Butler et al., 2006; Greitzer et al., 1999; Romessis et al., 2006; Loboda, Yepifanov et al., 2007; Loboda, Yepifanov et al., 2006), Genetic Algorithms (Sampath et al., 2006), Support Vector Machines (Butler et al., 2006), Correspondence and Discrimination Analysis (Pipe, 1987), and the Bayesian Approach (Romessis et al., 2006; Loboda & Yepifanov, 2006). Regardless of the technique applied, fault classification is an integral part of the fault recognition process. Its accuracy is crucial to the success of diagnostic analysis, which is why the classification has to be as close to reality as possible. Since information about real faults accumulates as the engine fleet is used, the advantages of the recognition-based approach increase correspondingly.


The diagnostic algorithms based on gas turbine model identification constitute the second approach (Volponi et al., 2003; Tsalavoutas et al., 2000; MacIsaac & Muir, 1991; Benvenuti, 2001; Aretakis et al., 2003; Doel, 2003). The researchers apply different mathematical methods, for instance, the Kalman filter (Volponi et al., 2003) and weighted least squares (Doel, 2003). Aretakis et al. (2003) use a combinatorial approach to obtain the estimates when input information is limited. When data recorded over a prolonged period are at the researchers' disposal, they calculate successive estimates and analyze them in time (Tsalavoutas et al., 2000). Identification is an effective technique for enhancing model accuracy. During identification, the fault parameters are determined that minimize the distance between the simulated and measured gas path variables. Besides improving model accuracy, this simplifies the diagnostic process because the estimated fault parameters contain information about the current technical state of the components. A final diagnostic decision is made without the restrictions imposed by a rigid classification. This is the main advantage of the approach.
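To make the idea of identification more concrete, the following minimal sketch estimates two fault parameters by weighted least squares, assuming a linear relation between fault parameters and deviations. The influence coefficients, the noise values, and the sensor standard deviations are invented for illustration, and the code is not the algorithm of any of the cited works.

```python
def weighted_least_squares(h, dy, weights):
    """Estimate two fault parameters x minimizing sum_i w_i * (dy_i - (H x)_i)^2.

    Solves the 2x2 normal equations (H^T W H) x = H^T W dy by Cramer's rule.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (h1, h2), y, w in zip(h, dy, weights):
        a11 += w * h1 * h1
        a12 += w * h1 * h2
        a22 += w * h2 * h2
        b1 += w * h1 * y
        b2 += w * h2 * y
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("fault parameters are not distinguishable with these sensors")
    return [(b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det]


# Hypothetical influence coefficients: 3 monitored variables x 2 fault parameters.
H = [[1.0, 0.2],
     [-0.4, -0.9],
     [0.6, -0.3]]

# Simulated "measured" deviations for a known fault plus small sensor errors.
true_fault = [-0.02, -0.01]
noise = [0.0005, -0.0003, 0.0002]
clean_dY = [sum(c * t for c, t in zip(row, true_fault)) for row in H]
dY = [y + e for y, e in zip(clean_dY, noise)]

# Weights are taken as 1 / sigma^2 with an assumed sensor standard deviation.
sigma = [0.001, 0.001, 0.002]
W = [1.0 / s ** 2 for s in sigma]

estimate = weighted_least_squares(H, dY, W)
print("estimated fault parameters:", estimate)
```

With noise-free deviations the estimate reproduces the simulated fault; with noise it stays close to it, and the weights give less influence to the less accurate sensor.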

Although the two described approaches are applied primarily to fault localization, it is easy to show that they can be extended to the other two stages of the diagnostic process. Thus, all GPA methods can be realized through both pattern recognition and system identification techniques. A combination of the approaches is also possible.

Among recent trends in GPA, it is also worth mentioning the transition from conventional one-point diagnosis (one steady-state operating point considered) to multi-point diagnosis (multiple operating points) (Kamboukos & Mathioudakis, 2006) and to diagnosis under transient operating conditions (Turney & Halasz, 1993; Ogaji et al., 2003). To characterize diagnostic efficiency, the probabilities of correct and wrong diagnosis, united in a so-called confusion matrix (Davison & Bird, 2008; Butler et al., 2006), are now widely applied.
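As an illustration of the confusion matrix mentioned above, the sketch below estimates diagnosis probabilities from a list of simulated cases. The class labels and data are made up, and the orientation of the matrix (columns for actual classes, rows for diagnosed classes) is an assumption of this example rather than a convention taken from the cited works.

```python
def confusion_matrix(true_classes, diagnosed_classes, n_classes):
    """Return P[i][j]: estimated probability that actual class j is diagnosed as class i."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for actual, diagnosed in zip(true_classes, diagnosed_classes):
        counts[diagnosed][actual] += 1
    matrix = []
    for i in range(n_classes):
        row = []
        for j in range(n_classes):
            total_j = sum(counts[k][j] for k in range(n_classes))
            row.append(counts[i][j] / total_j if total_j else 0.0)
        matrix.append(row)
    return matrix


def mean_correct_probability(matrix):
    """Average of the diagonal: mean probability of correct diagnosis over classes."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)


# Made-up example: 3 fault classes, 4 simulated cases of each.
actual = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
diagnosed = [0, 0, 0, 1, 1, 1, 2, 1, 2, 2, 2, 0]

P = confusion_matrix(actual, diagnosed, 3)
p_mean = mean_correct_probability(P)
for row in P:
    print(row)
print("mean probability of correct diagnosis:", p_mean)
```

The diagonal elements are the probabilities of correct diagnosis for each class, and the off-diagonal elements show which classes are confused with each other.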

Thus, gas path analysis can be recognized as a developed area of general gas turbine diagnostics. GPA embraces different stages, approaches, options, and methods. The total number of algorithms and their variations is very large. Nevertheless, in this great variety of algorithms and publications, it is difficult to find clear recommendations on how to design a new monitoring system. The area does not seem to be sufficiently systematized and generalized.

A monitoring system brings advantages as early as the engine testing and production stages, so it is important that the system be developed as soon as possible. To design an effective system in a short time, the designer needs clear instructions on how to choose a system structure and how to tailor each system algorithm. That is why investigations in gas turbine diagnostics should take real diagnostic conditions into consideration as much as possible and should focus on practical recommendations for the designer.

The present chapter focuses on the reliability of gas path diagnosis. To enhance overall reliability, every particular problem of the total diagnostic process should be solved as exactly as possible. In the chapter these problems are considered in sequence, and new solutions are proposed to reduce the gap between the simulated diagnostic process and real engine maintenance conditions. The principles are formulated and practical recommendations are given to develop an effective condition monitoring system. The chapter is structured as follows: thermodynamic models, data validation and tracking the deviations, fault classification, fault recognition techniques, multi-point diagnosis, diagnosis under transient conditions, and system identification techniques.

### **2. Thermodynamic models**

A nonlinear thermodynamic model of steady-state operation can be characterized as component-based because each gas turbine component is represented in the model by its full manufacturer's performance map. The model is described by the following structural formula

$$
\vec{Y} = F(\vec{U}, \vec{\Theta}) \,. \tag{1}
$$

The model takes into account the influence of the operating point (mode) on the monitored variables $\vec{Y}$ through the vector $\vec{U}$ of operating conditions (control variables and ambient air parameters). The gas turbine health condition is reflected by a vector $\vec{\Theta}$ of fault parameters. As shown in Fig. 1, these parameters slightly shift the performance maps of engine components (compressors, combustor, turbines, and other devices) and in this way allow simulating different deterioration mechanisms of varying severity.

Fig. 1. Component map shifting by the fault parameters (v1,v2 - component performances)

The thermodynamic model is a typical physics-based model because it implements objective physical principles. The model is able to simulate baseline engine behaviour. Additionally, since the fault parameters change the component performances involved in the calculations, the model is capable of reflecting different types of gas turbine degradation. From the mathematical point of view, equation (1) is the result of solving a system of nonlinear algebraic equations reflecting mass and energy balance at steady states. All engine components must be thoroughly described to form the equations; therefore, the model is complex software comprising dozens of subroutines.


The thermodynamic model can be tailored to real data by means of system identification techniques. As a result, the dependency between the variables $\vec{U}$ and $\vec{Y}$ will be exact enough. Nevertheless, it is much more difficult to fit the dependency between $\vec{\Theta}$ and $\vec{Y}$ because empirical information on different faults is not available. The study (Loboda & Yepifanov, 2010) shows that differences between real and simulated faults can be noticeable.

The software of the nonlinear thermodynamic model allows calculating a matrix $H$ of fault influence coefficients for a linear model

$$
\vec{\delta Y} = H \vec{\delta \Theta} \, . \tag{2}
$$

The linear model computes a vector of relative deviations $\vec{\delta Y}$ induced by small changes $\vec{\delta \Theta}$ of the fault parameters at a fixed operating condition. For changes $\vec{\delta \Theta}$ typical of real faults, linearization errors are not too great. They seem to be smaller than the above-mentioned inadequacy of the thermodynamic model in describing the dependency $\vec{Y} = f(\vec{\Theta})$. That is why the linear model can be successfully applied for fault simulation. Additionally, it is useful for the analytical study of complex diagnostic issues. Consequently, we can state that the model will remain important in gas turbine diagnostics.
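As a rough illustration of how the matrix $H$ in equation (2) might be obtained from a nonlinear model, the sketch below perturbs each fault parameter in turn and records the relative change of each monitored variable. The function `toy_model` and all its coefficients are invented purely for this example and do not represent any real engine; only the finite-difference scheme is the point.

```python
def toy_model(u, theta):
    """Hypothetical stand-in for Y = F(U, Theta); not a real engine model.

    u:     single operating-condition value (for example, a power setting)
    theta: [flow_factor, efficiency_factor], both 1.0 for a healthy engine
    """
    flow, eff = theta
    pressure = 10.0 * u * flow
    temperature = 500.0 + 300.0 * u / (flow * eff)
    return [pressure, temperature]


def influence_matrix(model, u, theta0, step=1e-4):
    """Approximate H[i][j] = (relative change of Y_i) / (relative change of Theta_j)."""
    y0 = model(u, theta0)
    h = [[0.0] * len(theta0) for _ in y0]
    for j in range(len(theta0)):
        theta = list(theta0)
        theta[j] = theta0[j] * (1.0 + step)
        y = model(u, theta)
        for i in range(len(y0)):
            h[i][j] = ((y[i] - y0[i]) / y0[i]) / step
    return h


def linear_deviations(h, d_theta):
    """Equation (2): delta_Y = H * delta_Theta."""
    return [sum(h_ij * d for h_ij, d in zip(row, d_theta)) for row in h]


U = 1.0
THETA0 = [1.0, 1.0]
H = influence_matrix(toy_model, U, THETA0)

# Compare the linear prediction with the nonlinear model for a 2 % flow loss.
d_theta = [-0.02, 0.0]
dy_linear = linear_deviations(H, d_theta)
y0 = toy_model(U, THETA0)
y_fault = toy_model(U, [1.0 - 0.02, 1.0])
dy_exact = [(a - b) / b for a, b in zip(y_fault, y0)]

print("linear:   ", dy_linear)
print("nonlinear:", dy_exact)
```

For this toy model and a small fault, the two sets of deviations differ only slightly, which mirrors the remark that linearization errors are modest for changes typical of real faults.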

If the nonlinear thermodynamic model for steady states (the static model) is available, a dynamic model can be developed with less effort. Since transients provide more information than steady states and transient analysis allows continuous diagnosis, the dynamic gas turbine model is in increasing demand. As in the case of the static model, the dynamic model describes how the monitored variables $\vec{Y}$ depend on the quantities $\vec{U}$ and $\vec{\Theta}$. However, the vector $\vec{U}$ is given as a function of time, and the time variable $t$ is also added as an independent argument.

Given the above, the dynamic model is represented by the structural expression

$$
\vec{Y} = F(\vec{U}(t), \vec{\Theta}, t) \,. \tag{3}
$$

The separate influence of the variable $t$ is explained by the inertia of the gas turbine rotors, the moving gas, and heat exchange processes. Mathematically, the dynamic model is a system of differential equations including time derivatives, and at each time step the solution represents a quasi-steady-state operating point. The right-hand sides of the differential equations are calculated through a system of algebraic equations similar to that of the static model. That is why the dynamic model includes the majority of the static model subroutines, and the two models tend to form a common software package.
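The structure described above, a differential equation whose right-hand side comes from a static-like algebraic calculation, can be sketched with a deliberately simple example. Everything here is an assumption made for illustration: the single rotor, the square-root static relation, the first-order lag standing in for rotor inertia, the time constant, and the explicit Euler integration.

```python
def steady_state_speed(fuel, theta):
    """Hypothetical static relation: speed the rotor would settle at for a given fuel flow."""
    return 100.0 * theta * fuel ** 0.5


def simulate(fuel_of_t, theta, n0, t_end, dt=0.01, time_constant=2.0):
    """Integrate dn/dt = (n_ss(U(t), Theta) - n) / tau with the explicit Euler method."""
    t, n = 0.0, n0
    history = [(t, n)]
    steps = int(round(t_end / dt))
    for _ in range(steps):
        n_ss = steady_state_speed(fuel_of_t(t), theta)  # algebraic (static-like) part
        n += dt * (n_ss - n) / time_constant            # rotor inertia as a first-order lag
        t += dt
        history.append((t, n))
    return history


def fuel_step(t):
    """Operating condition U(t): fuel flow steps from 0.64 to 1.0 at about t = 1 s."""
    return 0.64 if t < 1.0 else 1.0


healthy = simulate(fuel_step, theta=1.0, n0=80.0, t_end=20.0)
degraded = simulate(fuel_step, theta=0.97, n0=80.0, t_end=20.0)

print("final speed, healthy: ", healthy[-1][1])
print("final speed, degraded:", degraded[-1][1])
```

The same static function is reused at every time step, which is the sense in which a dynamic model can share most of its subroutines with the static one.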

Athough the described models are sufficient to simulate a healthy condition and possible faults of gas turbines, the design of monitoring systems cannot be based only on these simulation tools. Models are not always adequate enough and every possibility to make them more accurate comparing with real data should be used. Additionally, diagnostic analysis of recorded data gives new information about possible faults. Although direct tracking time plots of measurements can provide some useful information about gas path

Gas Turbine Diagnostics 197

data to compute model coefficients (these data are called a reference set), the errors related with the model will be approximately equal in deviations of all particular probes. That is why, the differences between deviations of one probe and deviations of the other probes can denote probable errors and faults of this probe. In the synchronous deviation curves

constructed versus an engine operation time, such differences will be well visible.

*Y* , % vs. operation time *t,* hours

of a real temperature profile distortion because of a hot section problem.

Direct analysis of thermocouple probe measurements can be useful as well. To this effect, synchronous plots for all particular probes are constructed vs. the operation time. Engine operating conditions change from one time point to another and this explains common temporal changes of the curves. Anomalies in behaviour of a particular probe can confirm a probe's malfunction. Synchronized perturbations in curves of some probes may be the result

A gas turbine driver for an electric generator (let us call it GT2) has been chosen as a test case to analyse possible malfunctions of EGT thermocouple probes (Loboda, Feldshteyn et al, 2009). Figure 3 (a) shows EGT deviations for 5 probes and for the temperature averaged for 11 different probes. It is known that the washings took place at the time points *t* = 803, 1916, 3098, and 4317. As can be seen, deviation plots reflect in a variable manner the influence of the fouling and washings. The deviation dTtmed does it better than deviations of particular probes. Among deviations dTti, quantities dTt5 and dTt6 , for example, have almost the same diagnostic quality as dTtmed, while quantities dTt1 and dTt2 are of little quality. Such differences can be partly explained by variations in probe accuracy and reliability. For example, elevated random errors of the deviations dTt1 and dTt2 over the whole analyzed period can be induced by greater noise of the first and second EGT probes. The dTt1 fluctuations in the time interval 1900 - 2600 are probably results of frequent incipient faults of the first probe. However, the shifts of the deviations dTt1, dTt2, and dTt6 around the point *t* = 3351 present the most interest for the current analysis. The shifts look

Fig. 2. GT1 deviations

δ

(*TT* - exhaust gas temperature, *PC* - compressor pressure)

like a washing result but they have opposite directions.

degradation and sensor malfunctions, deviations of monitored variables from their baseline values provide more diagnostic information.

### **3. Data validation and computing the deviations**

A systematic change of the deviations is induced by engine degradation, for instance, compressor fouling. Noticeable random errors are also added because of high sensitivity of the deviations to sensor malfunctions and baseline value inaccuracy. As mentioned in section 1, close attention should be paid to deviation accuracy. The ways to make the deviations more accurate are considered below.

### **3.1 Deviations and sensor malfunctions**

Typically, historical engine sensor data previously filtered, averaged, and periodically recorded at steady states are used for gas turbine monitoring and diagnosis. The parameters recorded at the same time include measured values $\vec{U}_m^*$ and $\vec{Y}^*$ of engine operating conditions and monitored variables, respectively. For monitored variables $Y_i$, $i = 1, \dots, m$, the corresponding deviations

$$\delta Y_i^* = \frac{Y_i^* - Y_{0i}(\vec{U}_m^*)}{Y_{0i}(\vec{U}_m^*)} \tag{4}$$

are computed as relative differences between measured values $Y_i^*$ and engine baseline values $Y_{0i}(\vec{U}_m^*)$. The baseline values are written here as functions because healthy engine operation depends on operating conditions. The totality $\vec{Y}_0(\vec{U}_m)$ of baseline functions is usually called a baseline model.
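Equation (4) can be illustrated with a minimal sketch. The function name `relative_deviations` and the temperature values below are hypothetical and serve only to show the arithmetic; the baseline values are assumed to have been computed beforehand by some baseline model.

```python
import numpy as np

def relative_deviations(y_measured, y_baseline):
    """Equation (4): (Y* - Y0(U*)) / Y0(U*), computed element-wise."""
    y_measured = np.asarray(y_measured, dtype=float)
    y_baseline = np.asarray(y_baseline, dtype=float)
    return (y_measured - y_baseline) / y_baseline

# Hypothetical EGT values (kelvin) at three steady-state points.
egt_measured = np.array([808.0, 812.0, 790.0])
# Hypothetical baseline values at the same measured operating conditions.
egt_baseline = np.array([800.0, 800.0, 800.0])

dev = relative_deviations(egt_measured, egt_baseline)
print(dev * 100.0)  # deviations expressed in percent
```

Expressing the result in percent matches the units used in the deviation plots.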

Figure 2 illustrates the deviations computed for two monitored variables of a gas turbine power plant for natural gas pipelines (let us call this plant GT1). The behavior of the deviations shows that a compressor washing (*t* = 7970 h) as well as the previous and subsequent compressor fouling periods are clearly distinguishable. However, the fluctuations are still significant here and can mask degradation effects.

As follows from equation (4), the deviation errors can be induced both by sensor malfunctions and by inadequacy of the baseline model. Consequently, it is important to look for and exclude erroneous recorded data as well as to enhance the model. Let us first consider how to detect and identify sensor malfunction cases in recorded data. Various graphical tools help to solve this problem. Deviation plots are very good at malfunction detection, but the detected deviation anomalies are not always sufficient to determine the anomaly cause: sensor malfunction or model inadequacy. That is why additional plots and theoretical analysis are utilized to identify the anomaly.

Let us handle the problem of sensor malfunctions by the example of EGT measurements. The availability of parallel measurements by a suite of thermocouple probes installed in the same engine station gives us new possibilities of thermocouple malfunction detection by means of deviation analysis. If we choose the same baseline model arguments and the same data to compute model coefficients (these data are called a reference set), the errors related to the model will be approximately equal in the deviations of all particular probes. That is why the differences between the deviations of one probe and the deviations of the other probes can indicate probable errors and faults of this probe. In synchronous deviation curves plotted versus engine operation time, such differences will be clearly visible.
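The probe comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the median of the other probes is used as the comparison value, the threshold is arbitrary, and the names `probe_residuals` and `flag_suspect_points` as well as the synthetic data are hypothetical.

```python
import numpy as np

def probe_residuals(deviations):
    """For an array of shape (n_times, n_probes), return each probe's
    deviation minus the median deviation of all the other probes."""
    deviations = np.asarray(deviations, dtype=float)
    residuals = np.empty_like(deviations)
    for j in range(deviations.shape[1]):
        others = np.delete(deviations, j, axis=1)
        residuals[:, j] = deviations[:, j] - np.median(others, axis=1)
    return residuals

def flag_suspect_points(deviations, threshold):
    """Mark points where one probe departs from the others by more than threshold."""
    return np.abs(probe_residuals(deviations)) > threshold

# Synthetic example: 10 time points, 4 probes.
rng = np.random.default_rng(0)
common = 0.01 * np.linspace(0.0, 1.0, 10)            # model-related error shared by all probes
dev = common[:, None] + 0.0005 * rng.standard_normal((10, 4))
dev[5:8, 2] += 0.02                                  # simulated shift of the third probe only

flags = flag_suspect_points(dev, threshold=0.01)
print(np.argwhere(flags))  # (time index, probe index) pairs of suspect points
```

Because the shared model error is subtracted out, only the shifted probe stands out, which is the idea behind comparing synchronous deviation curves.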

Fig. 2. GT1 deviations δ*Y*, % vs. operation time *t*, hours (*TT* - exhaust gas temperature, *PC* - compressor pressure)

Direct analysis of thermocouple probe measurements can be useful as well. To this effect, synchronous plots for all particular probes are constructed vs. the operation time. Engine operating conditions change from one time point to another and this explains common temporal changes of the curves. Anomalies in behaviour of a particular probe can confirm a probe's malfunction. Synchronized perturbations in curves of some probes may be the result of a real temperature profile distortion because of a hot section problem.

A gas turbine driver for an electric generator (let us call it GT2) has been chosen as a test case to analyse possible malfunctions of EGT thermocouple probes (Loboda, Feldshteyn et al., 2009). Figure 3 (a) shows EGT deviations for 5 probes and for the temperature averaged over 11 different probes. It is known that the washings took place at the time points *t* = 803, 1916, 3098, and 4317. As can be seen, the deviation plots reflect the influence of fouling and washings to varying degrees. The deviation dTtmed does it better than the deviations of particular probes. Among the deviations dTti, quantities dTt5 and dTt6, for example, have almost the same diagnostic quality as dTtmed, while quantities dTt1 and dTt2 have poor diagnostic quality. Such differences can be partly explained by variations in probe accuracy and reliability. For example, elevated random errors of the deviations dTt1 and dTt2 over the whole analyzed period can be induced by greater noise of the first and second EGT probes. The dTt1 fluctuations in the time interval 1900 - 2600 probably result from frequent incipient faults of the first probe. However, the shifts of the deviations dTt1, dTt2, and dTt6 around the point *t* = 3351 are of the most interest for the current analysis. The shifts look like a washing result but they have opposite directions.


Parallel plots of the probe measurements themselves, shown in Fig. 3 (b), confirm the anomalies in recorded data. Small synchronized shifts can be seen here. It is visible in the graph that probes 7 and 8 are synchronously displaced by about 10 degrees during two time intervals *t* = 962.5-966.5 and *t* = 971.5-972.5. Additionally, the same measurement increase is observable in the probe 1 curve at time *t* = 971.5-972.5. In this way, unlike the independent outliers found beforehand (Loboda, Feldshteyn et al., 2009), the considered case presents correlated shifts in the data of some probes and is therefore more complicated. Two explanations can be proposed for this case. The first is related to some common problem of the measurement system that affects some probes and alters their data. In this case, the outliers can be classified as measurement errors. The second supposes that the measurements are correct but the real EGT profile changed at the noted time points. This is possible because there is no information that EGT and PTT probe profiles should be absolutely stable during engine operation. The available data are not sufficient to give a unique explanation; more recorded data need to be examined.

In addition to the case of thermocouples considered before, many other cases of abnormal sensor data behaviour have been detected and interpreted. The reader can find them in (Loboda, Feldshteyn et al., 2009; Loboda et al., 2004; Loboda, Yepifanov et al., 2009).

Fig. 3. EGT deviations and temperatures themselves (GT2): a) deviations dTti for 5 thermocouple probes and a deviation dTtmed of a mean EGT variable; b) EGT measurement by 11 thermocouple probes: anomaly cases

### **3.2 Deviations and baseline model adequacy**

In addition to rectification of recorded data, baseline model improvement provides further enhancement of deviation accuracy. Adequacy of this model depends on a correct choice of three elements: model arguments (elements of the vector $\vec{U}$), type of approximating function, and reference set. Many variations of these elements have been proposed, realized, and compared (Loboda et al., 2004; Loboda, Yepifanov et al., 2009) in order to select the best option for each element. The criterion to compare the choices was an integral quality of the corresponding deviations. Table 1 reflects the considered cases.

Summarizing the results of all proposals to make the baseline model more adequate, we draw two general conclusions. First, the majority of the proposed choices improved the deviations; however, the improvement was always very limited. In other words, many choices yielded practically the same deviation accuracy of about ±0.5%. This means that principally new choices should be proposed and considered to significantly enhance the deviations. Second, not all real operating conditions were included as arguments of the baseline model. Some variables of real operating conditions are not always measured or recorded, for example, inlet air humidity, air bleeding and bypass valves' positions, and engine box temperature. Since such variables exert influence upon a real engine and its monitored variables but are not taken into consideration in the baseline model, the corresponding deviation errors take place. A similar negative effect can occur if a sensor systematic error changes along with operation time.

| Model arguments | Approximating function | Reference set |
| --- | --- | --- |
| • Selection of the best power set parameter<br>• Selection of the best ambient air parameters<br>• Including additional arguments (e.g. humidity parameter) | • Full second order polynomials determined by the least square method (LSM)<br>• Second order polynomials determined by the singular value decomposition (SVD) method<br>• Full third order polynomials determined by the least square method (LSM)<br>• Neural networks, in particular, multilayer perceptron | • Reference set of different volume<br>• Reference set computed by the thermodynamic model<br>• Reference set of great volume for a model of a degraded engine. This model is simply converted into the necessary baseline model |

Table 1. Choices of the baseline model elements
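One of the options listed in Table 1, a full second order polynomial determined by the least square method, can be sketched as below. This is an illustrative assumption of how such a baseline function might be fitted, not the procedure used in the cited studies; the function names, the two normalized arguments, and the synthetic reference set are all hypothetical.

```python
import numpy as np

def second_order_features(u):
    """Full second order polynomial terms for operating conditions u
    of shape (n_points, n_args): constant, linear, and all quadratic terms."""
    u = np.asarray(u, dtype=float)
    n_points, n_args = u.shape
    columns = [np.ones(n_points)]
    columns += [u[:, i] for i in range(n_args)]
    columns += [u[:, i] * u[:, j] for i in range(n_args) for j in range(i, n_args)]
    return np.column_stack(columns)

def fit_baseline(u_ref, y_ref):
    """Least-squares estimate of the polynomial coefficients on a reference set."""
    coeffs, *_ = np.linalg.lstsq(second_order_features(u_ref), y_ref, rcond=None)
    return coeffs

def predict_baseline(coeffs, u):
    """Baseline values for given operating conditions."""
    return second_order_features(u) @ coeffs

# Synthetic reference set: two normalized arguments (hypothetical power set
# parameter u1 and ambient temperature parameter u2) and a noise-free quadratic response.
rng = np.random.default_rng(1)
u_ref = np.column_stack([rng.uniform(0.5, 1.0, 50), rng.uniform(0.9, 1.1, 50)])
u1, u2 = u_ref[:, 0], u_ref[:, 1]
y_ref = 500.0 + 300.0 * u1 - 50.0 * u2 + 40.0 * u1 * u2 + 20.0 * u1 ** 2

coeffs = fit_baseline(u_ref, y_ref)
y_fit = predict_baseline(coeffs, u_ref)
print(np.max(np.abs(y_fit - y_ref)))  # fitting error on the reference set
```

With real data, the fitted values would be substituted as baseline values into the deviation formula, and the size and composition of the reference set would matter as indicated in the third column of the table.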

Thus, practically all ways to enhance the baseline model with the available real data and thermodynamic model have been examined. The achieved deviation accuracy is already reasonably high, which is why further considerable accuracy enhancement will not come easily. It is necessary to find new operating condition variables, to determine their influence on monitored variables, and to make the baseline model more accurate. It seems to us that this will require hard combined work of engine and sensor designers, maintenance personnel, and diagnosticians.

As a small part of this work, the next subsection analyzes possible sources of deviation errors. This analysis was first performed in (Loboda, 2011).

### **3.3 Theoretical analysis of possible errors in real deviations**

The analysis is based on expression (4). For a monitored variable *Y*, this expression can be rewritten as

$$\delta Y^* = \frac{Y^*}{\hat{Y}_0(\vec{U}_m^*)} - 1. \tag{5}$$

This equation shows that inaccuracy of the deviation is completely determined by errors in the term $Y^* / \hat{Y}_0(\vec{U}_m^*)$, in which $\hat{Y}_0$ denotes an estimate of the baseline function for the variable *Y*. It is shown below that these errors can be divided into four types.

The measurement $Y^*$ differs from a true value *Y* by an error $E_Y$, called in this paper a Type I error. In its turn, the true value depends on a vector $\vec{U}$ of real operating conditions and on engine health conditions given by the vector $\Delta\vec{\Theta}$. As a consequence, the value $Y^*$ can be determined as

$$Y^* = Y(\vec{U}, \Delta\vec{\Theta}) + E_Y(\vec{U}, \Delta\vec{\Theta}). \tag{6}$$

The error $E_Y$ is defined here as a function because, in general, measurement errors may depend on the value *Y* and, consequently, on the variables $\vec{U}$ and $\Delta\vec{\Theta}$.

One more obvious cause of the deviation inaccuracy is related to measurement errors in the operating conditions, presented in equation (5) by the vector $\vec{U}_m^*$. Given a vector of measurement errors $\vec{E}_{Um}$, which presents Type II errors, the measured operating conditions are written as

$$\vec{U}_m^* = \vec{U}_m + \vec{E}_{Um}. \tag{7}$$

The next error type (Type III) is also related to engine operating conditions; however, it is not so evident. The point is that not all real operating conditions, denoted in the present paper by a [(*n*+*k*)×1]-vector $\vec{U}$, can be included as arguments of the baseline function. Some variables of real operating conditions are not always measured or recorded, for example, inlet air humidity, air bleeding and bypass valves' positions, and engine box temperature. Let us unite all these additional variables in a (*k*×1)-vector $\vec{E}_U$. Since such variables exert influence upon a real engine and its measured variable $Y^*$ but are not taken into consideration in the baseline function $\hat{Y}_0(\vec{U}_m^*)$, the corresponding deviation errors take place. A similar negative effect can occur if a sensor systematic error changes in time. Given that $\vec{U}_m \cup \vec{E}_U = \vec{U}$, the vector $\vec{U}_m$ can be given by $\vec{U}_m = \vec{U} \setminus \vec{E}_U$, and equation (7) is converted to the form

$$\vec{U}_m^* = \vec{U} \setminus \vec{E}_U + \vec{E}_{Um}. \tag{8}$$

Apart from the described errors related to the arguments of the function $\hat{Y}_0(\vec{U}_m^*)$, the function has its own error $E_{Y_0}$ (Type IV error). It can result from such factors as a systematic error in measurements of the variable *Y*, an inadequate function type, an improper algorithm for estimating the function's coefficients, errors in the reference set, limited volume of the set data, and influence of engine deterioration on these data. Given $E_{Y_0}$ and a true function $Y_0$, the function estimate $\hat{Y}_0$ can be written as

$$\hat{Y}_0(\vec{U}_m^*) = Y_0(\vec{U}_m^*) + E_{Y_0}(\vec{U}_m^*). \tag{9}$$

Let us now substitute equations (6), (8), and (9) into expression (5). As a result, the deviation $\delta Y^*$ is written as

$$\delta Y^* = \frac{Y(\vec{U}, \Delta\vec{\Theta}) + E_Y(\vec{U}_m^* - \vec{E}_{Um} + \vec{E}_U, \Delta\vec{\Theta})}{Y_0(\vec{U} \setminus \vec{E}_U + \vec{E}_{Um}) + E_{Y_0}(\vec{U}_m^*)} - 1. \tag{10}$$

The dependency $E_Y(\vec{U}_m^* - \vec{E}_{Um} + \vec{E}_U, \Delta\vec{\Theta})$ in this expression can be simplified for the following reasons: a) $E_Y \ll Y$; b) $\vec{E}_{Um} \ll \vec{U}_m^*$; c) the influence of $\vec{E}_U$ and $\Delta\vec{\Theta}$ on *Y* and, consequently, on $E_Y$ is significantly smaller than the influence of $\vec{U}_m^*$. Taking these considerations into account, we arrive at a final expression for the deviation

$$\delta Y^* = \frac{Y(\vec{U}, \Delta\vec{\Theta}) + E_Y(\vec{U}_m^*)}{Y_0(\vec{U} \setminus \vec{E}_U + \vec{E}_{Um}) + E_{Y_0}(\vec{U}_m^*)} - 1. \tag{11}$$

This expression includes the four error types introduced above, namely $E_Y$, $\vec{E}_{Um}$, $\vec{E}_U$, and $E_{Y_0}$. Let us now analyze how each error can influence the accuracy of the deviation $\delta Y^*$. This analysis is performed under the following assumptions. First, the same sensors were employed to measure the currently analyzed values $Y^*$ and $\vec{U}_m^*$ as well as the reference set data. Second, gross errors have been filtered out. Third, the systematic error and the distribution of random errors in $Y^*$ and $\vec{U}_m^*$ do not depend on engine operating time.

*Type I error*. Since the sensor performance is invariable, every systematic change of the error $E_Y$ will be accompanied by the same change in $E_{Y_0}$. As a consequence, accuracy of the deviation $\delta Y^*$ will not be affected by the systematic component of $E_Y$. As to the random component, it is usually given by the multidimensional Gaussian distribution. That is why the corresponding error component in the deviations $\delta Y^*$ can also be described by this distribution.
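The cancellation of the systematic component can be made explicit in a simplified case. As an illustration only, assume that the sensor of *Y* has a constant bias $b$ (a symbol introduced here for the example), that the same bias is present in the reference set so that $E_Y = E_{Y_0} = b$, and that all other errors are absent. Expression (11) then reduces to

$$\delta Y^* = \frac{Y + b}{Y_0 + b} - 1 = \frac{Y - Y_0}{Y_0 + b} = \frac{\delta Y}{1 + b / Y_0} \approx \delta Y \left(1 - \frac{b}{Y_0}\right), \qquad \delta Y = \frac{Y - Y_0}{Y_0}.$$

Under these assumptions the bias does not shift the deviation; it only rescales it by a factor close to one, because $b \ll Y_0$.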



*Type II errors*. It is easy to show that the systematic component of the errors $\vec{E}_{Um}$ cannot significantly influence the deviation $\delta Y^*$. As to the random component, it can be described by the multidimensional Gaussian distribution, as in the case of the monitored variables *Y*. Because every change of the arguments $\vec{U}_m^*$ has an influence on the baseline values of all monitored variables, their baseline values $\hat{Y}_0$ and, consequently, the deviations $\delta Y^*$ are correlated. Thus, random errors of operating conditions induce correlated deviation errors that cannot be described by the multidimensional Gaussian distribution. It is very likely that the permanent noise with a small scatter observed in Fig. 2 results from the errors of Type I and Type II.
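The correlation mechanism can be illustrated with a small simulation. Everything in this sketch is an assumption made for the example: the two linear baseline functions, the single operating condition `u`, the noise level, and the function name `simulate_type2` are hypothetical, and the engine is taken as healthy with error-free measurements of the monitored variables.

```python
import numpy as np

def baseline_egt(u):
    """Hypothetical baseline function for exhaust gas temperature."""
    return 700.0 + 150.0 * u

def baseline_pc(u):
    """Hypothetical baseline function for compressor pressure."""
    return 8.0 + 6.0 * u

def simulate_type2(n_points=2000, sigma_u=0.01, seed=0):
    """Deviations of two monitored variables when only the measured
    operating condition carries random (Type II) errors."""
    rng = np.random.default_rng(seed)
    u_true = np.full(n_points, 0.8)
    u_measured = u_true + sigma_u * rng.standard_normal(n_points)
    # True monitored values follow the true operating condition.
    egt = baseline_egt(u_true)
    pc = baseline_pc(u_true)
    # Baseline values are evaluated at the noisy measured condition, as in equation (5).
    d_egt = egt / baseline_egt(u_measured) - 1.0
    d_pc = pc / baseline_pc(u_measured) - 1.0
    return d_egt, d_pc

d_egt, d_pc = simulate_type2()
corr = np.corrcoef(d_egt, d_pc)[0, 1]
print(corr)  # correlation between the two deviation error series
```

Since both baseline values are computed from the same noisy argument, their errors move together, so the resulting deviation errors are strongly correlated even though the underlying sensor noise is a single independent random sequence.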

Fig. 4. Simulated and real deviation errors of the GT1 (Z2 - normalized deviation of the EGT; Z3 - normalized deviation of the power turbine temperature)

*Type III error*. The presence of such an error was confirmed after analyzing all other error types. This error occurs because the additional operating conditions $\vec{E}_U$ do not change the baseline function but exert influence on a real engine and, accordingly, on all variables *Y*. For this reason, any change of $\vec{E}_U$ can induce synchronous errors of the deviations $\delta Y^*$ of all monitored variables. It is very likely that most deviation fluctuations in Fig. 2 originate from the Type III errors.

*Type IV error.* This error varies in time along with changes in the operating conditions $\vec{U}_m^*$, producing perturbations in the deviation variable $\delta Y^*$. The perturbations can be either independent or correlated, depending on the particular causes of the error $E_{Y_0}$. Although baseline function adequacy is a challenge, the error can be reduced to an acceptable level by applying a proper function type and using a representative reference set.

Distributions of simulated and real deviation errors are shown in Fig. 4. Normalized deviations

$$Z^* = \delta Y^* / a_Y \tag{12}$$

are presented here. The parameter $a_Y$ denotes the amplitude of random errors in the deviation variable $\delta Y^*$. Such normalization simplifies fault class description and enhances diagnosis reliability. Diagram (a) illustrates the deviation errors simulated through the multidimensional Gaussian distribution of sensor errors (Type I and Type II errors). Such simulation is traditionally applied in gas turbine fault recognition algorithms. Diagram (b) shows the errors extracted from real data-based deviations. Both diagrams show a visible correlation between the errors of the presented deviations. But there are visible differences as well: the distribution of real deviation errors is less regular. If they are not taken into consideration in fault recognition algorithms, these differences can affect the reliability of gas turbine diagnosis. Therefore, to make the diagnosis more reliable, simulated noise should be as close as possible to real errors.
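The normalization of Eq. (12) and the traditional noise simulation can be illustrated with a minimal Python sketch. It assumes that the deviation is the relative difference between a measured value and its baseline value, as in Eq. (14), and it simplifies the multidimensional Gaussian distribution to independent Gaussian errors for each sensor. The function names and the numerical values are hypothetical and serve for illustration only.

```python
import random


def normalized_deviations(measured, baseline, amplitude):
    """Return Z* = deltaY* / a_Y for every monitored variable.

    measured  -- measured values Y* of the monitored variables
    baseline  -- baseline (healthy engine) values Y0 at the same operating point
    amplitude -- random error amplitudes a_Y of the relative deviations
    """
    z = []
    for y, y0, a in zip(measured, baseline, amplitude):
        delta = (y - y0) / y0   # relative deviation deltaY* (assumed definition)
        z.append(delta / a)     # normalization of Eq. (12)
    return z


def simulate_measurements(baseline, sigma, rng):
    """Add independent Gaussian relative errors to the baseline values.

    Independence of the sensor errors is a simplifying assumption.
    """
    return [y0 * (1.0 + rng.gauss(0.0, s)) for y0, s in zip(baseline, sigma)]


if __name__ == "__main__":
    rng = random.Random(1)          # fixed seed for repeatability
    baseline = [800.0, 650.0]       # hypothetical temperatures, K
    sigma = [0.002, 0.002]          # hypothetical relative sensor errors
    amplitude = [0.006, 0.006]      # hypothetical amplitudes a_Y
    noisy = simulate_measurements(baseline, sigma, rng)
    print(normalized_deviations(noisy, baseline, amplitude))
```

Real deviation errors, in contrast, would be taken from recorded data instead of being drawn from `rng.gauss`.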

### **4. Classification for fault recognition algorithms**

As mentioned in the introduction, mathematical models are widely used in gas turbine diagnostics to describe engine performance degradation and faults, and the deviations are employed to reveal the influence of the degradation. For this reason, a classification for fault recognition algorithms is usually formed in the space of the deviations with the use of nonlinear or linear gas turbine models.

### **4.1 Fault classification in the deviation space**

For the purposes of diagnosis, the existing variety of engine faults should be broken down into a limited number of classes. The hypothesis commonly used in pattern recognition theory states that a recognized object can belong to only one of *q* classes

$$D_1, D_2, \dots, D_q \tag{13}$$

that are set before recognition itself. This assumption is also accepted in gas turbine diagnostics. As a rule, each fault class corresponds to one engine component and is described by its fault parameters. If only one fault parameter is changed, a single fault class is formed, whereas a multiple fault class is created by an independent variation of two or more fault parameters.

As mentioned in the introduction, the deviations are good indicators of engine faults. That is why the normalized deviations *Z* computed for *m* available monitored variables form an appropriate space to recognize the faults. Accordingly, a recognition space (diagnostic space) is formed on the basis of the vector $\vec{Z}$ (*m*×1) that unites the elemental normalized deviations. One value $\vec{Z}^*$ of this vector, computed with the measurements $\vec{Y}^*$, presents a pattern to be recognized.

Some recognition techniques, for example, the Bayesian approach, need a probabilistic description of the classification used. In this case each fault class $D_j$ should be described by its probability density function $f(\vec{Z}^*/D_j)$. The difficulty of this approach lies in the density functions themselves, because assessing them is a principal problem of mathematical statistics. That is why the probabilistic description can be realized only for simplified fault classes.
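A minimal sketch of recognition with a probabilistic class description may help here. It assumes a strongly simplified case in which every class density is Gaussian with independent components, so each class is given only by a mean vector and a vector of standard deviations. The function names and the two-class example are hypothetical and are not taken from the cited works.

```python
import math


def gaussian_density(z, mean, sigma):
    """Density f(Z/D_j) of a class with independent Gaussian components."""
    p = 1.0
    for zi, mi, si in zip(z, mean, sigma):
        p *= math.exp(-0.5 * ((zi - mi) / si) ** 2) / (si * math.sqrt(2.0 * math.pi))
    return p


def bayes_posteriors(z, classes, priors):
    """Posterior probabilities P(D_j/Z) by the Bayes rule.

    classes -- list of (mean, sigma) pairs, one pair per fault class
    priors  -- a priori class probabilities P(D_j)
    """
    joint = [gaussian_density(z, mean, sigma) * prior
             for (mean, sigma), prior in zip(classes, priors)]
    total = sum(joint)
    return [value / total for value in joint]


def recognize(z, classes, priors):
    """Return the index of the class with the greatest posterior probability."""
    posteriors = bayes_posteriors(z, classes, priors)
    return max(range(len(posteriors)), key=posteriors.__getitem__)


if __name__ == "__main__":
    # Two hypothetical classes in a two-dimensional deviation space.
    classes = [([0.0, 0.0], [1.0, 1.0]), ([3.0, 3.0], [1.0, 1.0])]
    priors = [0.5, 0.5]
    print(recognize([0.2, -0.1], classes, priors))
```

For realistic fault classes the densities are not of such a simple form, which is the difficulty noted above.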

A larger number of recognition techniques use a statistical description of the classification. In this case the fault classes are given by samples of patterns, namely vectors $\vec{Z}^*$. In this way, a whole fault classification is a union of the pattern samples of all classes. Apart from simplifying the class formation process, the replacement of the density functions by pattern samples allows creating more complex fault classes solely on the basis of real data. However, gas turbine faults are still often simulated mathematically because real faults appear rarely and physical fault simulation is costly.

The deviations induced by the faults that are embedded into the thermodynamic model via a change $\Delta\vec{\Theta}$ can be computed according to the formula

$$Z_i = \frac{Y_i(\vec{U}, \vec{\Theta}_0 + \Delta\vec{\Theta}) - Y_i(\vec{U}, \vec{\Theta}_0)}{Y_i(\vec{U}, \vec{\Theta}_0)\, a_{Y_i}} + \varepsilon_i, \quad i = 1, \dots, m. \tag{14}$$

The vector $\vec{\Theta}_0$ corresponds here to a healthy engine. Random errors $\varepsilon_i$ make the deviations more realistic. They can be added directly to the systematic parts of the deviations or can be introduced through the simulation of random measurement errors in $Y_i$ and $\vec{U}$. The deviation vectors $\vec{Z}^*$ (patterns) for faults of different type and severity are generated by the model through changing the structure and length of the vector $\Delta\vec{\Theta}$. The resulting totality **Zl\*** of all the classification's patterns is typically called a learning set because it is applied to train the recognition technique used, for example, a neural network.
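The generation of patterns by Eq. (14) can be sketched in Python as follows. The function `toy_model` is a hypothetical stand-in for a thermodynamic model, and the uniform distribution of fault severity, the Gaussian form of the errors $\varepsilon_i$, and all numerical values are assumptions made only for this illustration. Each class here is a single fault class formed by varying one fault parameter.

```python
import random


def toy_model(u, theta):
    """Hypothetical stand-in for a thermodynamic model Y(U, Theta)."""
    return [500.0 * u * theta[0] * theta[1],
            300.0 * u * theta[0] / theta[1]]


def deviation_pattern(model, u, theta0, delta, amplitude, noise_sigma, rng):
    """One pattern Z computed by Eq. (14) for a given fault parameter change."""
    healthy = model(u, theta0)
    faulty = model(u, [t + d for t, d in zip(theta0, delta)])
    return [(yf - yh) / (yh * a) + rng.gauss(0.0, noise_sigma)
            for yf, yh, a in zip(faulty, healthy, amplitude)]


def build_pattern_set(model, u, theta0, amplitude, n_per_class,
                      max_severity, noise_sigma, seed):
    """Generate patterns and class labels, one single fault class per parameter."""
    rng = random.Random(seed)
    patterns, labels = [], []
    for j in range(len(theta0)):
        for _ in range(n_per_class):
            delta = [0.0] * len(theta0)
            # Assumed: fault severity is uniformly distributed.
            delta[j] = -rng.uniform(0.0, max_severity)
            patterns.append(deviation_pattern(model, u, theta0, delta,
                                              amplitude, noise_sigma, rng))
            labels.append(j)
    return patterns, labels


if __name__ == "__main__":
    learning, labels = build_pattern_set(toy_model, 1.0, [1.0, 1.0],
                                         [0.01, 0.02], 100, 0.05, 0.3, seed=1)
    print(len(learning), labels[0], labels[-1])
```

Calling `build_pattern_set` again with a different `seed` gives a second set with the same class structure but other random numbers.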

There is a common statistical rule that a function determined on one portion of random data should be tested on another. Consequently, to verify the technique trained on the learning set, we need one more set. The necessary set **Zv\***, called a validation set, is created in the same way as the set **Zl\***. The only exception is that different series of random numbers are involved in the calculations of the fault severity and the errors in the deviations. The class of every pattern in the validation set is known beforehand. Therefore, applying the trained technique to this set, we can compare each diagnosis with the known class and compute a vector $\vec{P}$ of true diagnosis probabilities and an averaged probability *P*. These probabilities quantify class distinguishability and engine diagnosability and are good criteria to tailor and compare recognition techniques.

The patterns of the learning and validation sets described above are generated at a fixed operating mode given by a constant vector $\vec{U}$. Such a classification is intended for diagnosis at the same mode. The principle that makes the classification and fault recognition more universal is described below.
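The computation of these probabilities on a validation set can be sketched as follows. The sketch assumes that classes are numbered from zero and that the averaged probability is the arithmetic mean over classes; the function name is hypothetical.

```python
def diagnosis_probabilities(true_classes, diagnosed_classes, n_classes):
    """Return the vector of true diagnosis probabilities and their mean.

    true_classes      -- known class index of every validation pattern
    diagnosed_classes -- class index produced by the trained technique
    """
    correct = [0] * n_classes
    total = [0] * n_classes
    for true, diagnosed in zip(true_classes, diagnosed_classes):
        total[true] += 1
        if diagnosed == true:
            correct[true] += 1
    if 0 in total:
        raise ValueError("every class needs at least one validation pattern")
    p_vector = [c / n for c, n in zip(correct, total)]
    p_mean = sum(p_vector) / n_classes   # assumed: simple mean over classes
    return p_vector, p_mean


if __name__ == "__main__":
    true_classes = [0, 0, 1, 1, 1, 2]
    diagnosed = [0, 1, 1, 1, 0, 2]
    print(diagnosis_probabilities(true_classes, diagnosed, 3))
```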

### **4.2 Universal fault classification**

The principle of a universal fault classification was proposed and investigated in (Loboda & Feldshteyn, 2007; Loboda & Yepifanov, 2010). During diagnosis at different operating modes, it was found that the class representation in the diagnostic space $\vec{Z}$ is not
strongly dependent on a mode change. Therefore, we intended to draw up a classification that would be independent of operating conditions. The classification was created by incorporating patterns from different steady states into every class. In this case, the region occupied by a class is more diffuse, which induces greater class intersection. This inevitably leads to additional losses of diagnosis reliability, but the investigations have shown that these losses are insignificant. The new classification was compared with a conventional classification for one operating mode. The comparison was made under different diagnostic conditions (different engines, steady state operating conditions, and fault class types). Additionally, the comparison was performed under transient operating conditions. The resulting losses did not exceed 2%. Thus, the universal classification does not significantly reduce the diagnosis reliability level. On the other hand, the suggested classification drastically simplifies gas turbine diagnosis because it is formed once and then used without changes to diagnose an engine at any operating mode. Therefore, the diagnostic algorithms based on the universal fault classification can be successfully implemented in real condition monitoring systems.

Another way to enhance a conventional classification is to embed a real fault class into the model-based classification. The idea of such a mixed classification was proposed and validated in (Loboda & Yepifanov, 2010).
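The way patterns from different steady states are incorporated into every class can be sketched as a simple pooling of per-mode samples. The data layout (one dictionary per operating mode that maps a class label to its patterns) and the class labels are assumptions of this illustration.

```python
def universal_class_samples(samples_by_mode):
    """Merge per-mode pattern samples into one sample per fault class.

    samples_by_mode -- list with one dictionary per steady-state mode; each
                       dictionary maps a class label to the list of patterns
                       simulated at that mode
    """
    pooled = {}
    for mode_samples in samples_by_mode:
        for label, patterns in mode_samples.items():
            pooled.setdefault(label, []).extend(patterns)
    return pooled


if __name__ == "__main__":
    mode_1 = {"compressor": [[-1.9, -0.8]], "turbine": [[0.7, 1.5]]}
    mode_2 = {"compressor": [[-2.1, -1.0]], "turbine": [[0.9, 1.3]]}
    pooled = universal_class_samples([mode_1, mode_2])
    print({label: len(patterns) for label, patterns in pooled.items()})
```

A recognition technique trained on the pooled samples is then applied without changes at any of the included modes.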

### **4.3 Mixed fault classification**

Since model errors, which can be significant, are transmitted to the model-based classification, the idea arises to make the description of some classes more accurate using real data. Such a mixed fault classification will incorporate both model-based and data-driven fault classes. The classification will combine a profound common diagnosis with a higher diagnostic accuracy for the data-driven classes. To support the idea, a data-driven class of fouling based on real fouling data was created and incorporated into the model-based classification. The resulting mixed classification and a conventional model-based classification were embedded into a diagnostic algorithm. It was found that the application of the model-based classification to real data influenced by compressor fouling causes severe diagnosis errors of over 30 percent. However, the switch to the mixed classification reduces the error to only 3 percent or less.

The next way to make the classification more realistic consists in inserting real deviation errors into the description of model-based classes, as proposed in (Loboda, 2011).

### **4.4 Classification with more realistic deviation errors**

Most researchers also take into account random errors in the monitored variables and operating conditions, applying the Gaussian distribution to that end. However, as shown in section 3, the difference between such traditionally simulated errors and real deviation errors can be significant. That is why it is proposed to extract the noise component from real deviations and integrate it into the description of simulated fault classes. Three alternative schemes of deviation error representation in diagnostic algorithms, two existing and one new, have been realized. They were compared using the probability of correct diagnosis *P*. Figure 5 illustrates the fault classification with simulated measurement errors as in Fig. 4 (a).


To continue the comparison of recognition techniques, the paper (Loboda, Feldshteyn et al., 2011) compares two network types: a multilayer perceptron (MLP) and a radial basis network (RBN). To draw firm conclusions on the networks' applicability, comparative calculations were repeated for different variations of diagnostic conditions. In particular, two different engines were chosen as test cases. The comparison results are shown in Table 2. It can be seen that the differences between the techniques are very small. On average for all cases presented in the table for the GT1, the RBN gains only 0.0009 (0.09%).

| Class type | Network type | Mode 1, basic node numbers | Mode 1, enlarged node numbers | Mode 2, basic node numbers | Mode 2, enlarged node numbers |
|---|---|---|---|---|---|
| Singular | | Case 1 | Case 2 | Case 3 | Case 4 |
| | MLP | 0.8157 | 0.8184 | 0.8031 | 0.8059 |
| | RBN | 0.8115 | 0.8186 | 0.8009 | 0.8058 |
| Multiple | | Case 5 | Case 6 | Case 7 | Case 8 |
| | MLP | 0.8738 | 0.8765 | 0.8660 | 0.8686 |
| | RBN | 0.8745 | 0.8783 | 0.8663 | 0.8701 |

Table 2. Probabilities *P* for the networks compared on GT1 data

The comparison was repeated for an aircraft turbofan engine denoted as GT3. The corresponding average probability increment was found to be 0.0028 (0.28%). In this way, the advantage of the radial basis network in the application to the analyzed turbofan seems to be a little more notable than in the case of the industrial gas turbine. Summing up the comparison results, the conclusion is that the radial basis network is a little more accurate than the perceptron; however, the difference can be considered insignificant.

The comparison of recognition techniques was completed in the paper (Estrada Moreno & Loboda, 2011), in which the MLP and the probabilistic neural network (PNN) are compared. The comparison under different diagnostic conditions has revealed that the diagnosis by the PNN is less reliable. However, the averaged difference of the probability is not greater than 0.5%. Once more we can state that the compared techniques are practically equal in accuracy.

The fact that the fault recognition performances of four different recognition techniques, namely, the Bayesian approach, MLP, RBN, and PNN, are very close is worthy of some discussion. What explanation can be provided for this? We believe that all four techniques are sophisticated enough and are well suited for solving this specific problem, namely gas turbine fault recognition. None of these techniques can further enhance diagnostic accuracy because the accuracy achieved is near the theoretical accuracy level that is inherent to the problem being solved: gas turbine fault recognition with the given classification. Following this idea, we suppose that within the approach used and the classification accepted no other recognition technique will be able to considerably enhance diagnostic accuracy. Instead, all efforts should be made to reduce fault class intersections, for example, by reducing measurement inaccuracy, installing more sensors in the gas path, and decreasing deviation errors. The options of multipoint diagnosis and diagnosis during transient operation will also result in a higher diagnostic accuracy.

Preliminary calculations have shown that the distinguishability of fault classes can change by up to 6% when real errors are replaced by simulated errors. Thus, the diagnostic performance estimated with simulated noise can be inaccurate. The case was also investigated when the errors for the learning and validation sets were extracted from different time portions of real data. The loss of diagnosability for this case was found to be drastic: from *P* = 90% - 94% in the previous cases to *P* = 59%. This happened because the real deviation errors included in the validation set were much greater than the learning set errors. The errors increased because the baseline model is adequate on the reference set data but loses its accuracy on the subsequently recorded data. Such a problem seems to be very probable in real diagnosis, and we should be careful to avoid or mitigate it.
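One simple way to notice such an error increase is to compare the scatter of deviation errors in the two time portions. The following sketch is only an illustrative check and is not a procedure taken from the cited works; the function names, the use of the root mean square as a scatter measure, and the data are assumptions.

```python
import math


def rms(errors):
    """Root mean square of a sequence of deviation errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def error_growth(errors, split_index):
    """Ratio of the RMS error after split_index to the RMS error before it.

    errors      -- deviation errors of one monitored variable in time order
    split_index -- first index of the later (validation) portion
    """
    return rms(errors[split_index:]) / rms(errors[:split_index])


if __name__ == "__main__":
    # Hypothetical errors: the later portion has a doubled scatter.
    errors = [1.0, -1.0, 1.0, -1.0, 2.0, -2.0, 2.0, -2.0]
    print(error_growth(errors, 4))
```

A ratio well above one suggests that the baseline model has lost adequacy on the later data.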

Fig. 5. 3D plot of four fault classes with simulated sensor errors

Although the proposed scheme is more realistic, it cannot automatically replace the existing noise simulation modes. The new scheme is more complex to implement. Additionally, it needs both the thermodynamic model and extensive real data, two things rarely available together. Therefore, the proposed scheme of deviation error representation can rather be recommended for a final precise estimation of gas turbine diagnosability.

Thus, we have completed the analysis of different improvements of a conventional fault classification. Let us now consider the problem of choosing a recognition technique.
