1. Introduction

In the process industries, advanced process control is widely recognized as essential to meet the challenges arising from the trend toward more complex, larger-scale circuit configurations, plant-wide integration, and the need to operate with fewer personnel. In these highly automated process environments, algorithms to detect and classify abnormal trends in process measurements are critically important.

Process diagnostic algorithms can be derived from a continuum spanning first-principles models on one end to entirely data-driven or statistical models on the other. The latter are typically based on historical process data and are seen as the most cost-effective approach to dealing with complex systems. As a consequence, data-driven diagnostic methods have seen considerable growth over the last couple of decades.

Data-driven fault diagnosis can be traced back to the control charts invented by Walter Shewhart at Bell Laboratories in the 1920s to improve the reliability of telephony transmission systems. In these statistical process control charts, variables of interest were plotted as time series within statistical upper and lower limits. Shewhart's methodology was subsequently popularized by Deming, and statistical concepts such as Shewhart control charts (1931), cumulative sum charts (1954), and exponentially weighted moving average charts were well established by the 1960s [1].
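The charting idea is simple enough to sketch in a few lines. The snippet below is an illustrative example only (the function names and simulated data are our own, not from the chapter): control limits are estimated from an in-control reference sample, and new observations are flagged when they fall outside those limits.

```python
import numpy as np

def shewhart_limits(x, n_sigma=3.0):
    """Center line and +/- n_sigma control limits from reference data."""
    center = x.mean()
    sigma = x.std(ddof=1)
    return center - n_sigma * sigma, center, center + n_sigma * sigma

def out_of_control(x, lcl, ucl):
    """Indices of observations falling outside the control limits."""
    return np.where((x < lcl) | (x > ucl))[0]

rng = np.random.default_rng(0)
phase1 = rng.normal(10.0, 1.0, 200)   # in-control reference sample
lcl, center, ucl = shewhart_limits(phase1)

phase2 = rng.normal(10.0, 1.0, 50)    # new observations to monitor
phase2[25:] += 4.0                    # simulated step change (fault)
alarms = out_of_control(phase2, lcl, ucl)
```

A cumulative sum or exponentially weighted moving average chart replaces the raw observation with a cumulative or exponentially weighted statistic before the same limit check, which makes small sustained shifts easier to detect.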


These univariate control charts do not exploit the correlation that may exist between process variables. In the case of process data, crosscorrelation is present, owing to restrictions enforced by mass and energy conservation principles, as well as the possible existence of a large number of different sensor readings on essentially the same process variable. These shortcomings have given rise to multivariate statistical process control and related methods, which have proliferated over the last few decades. These approaches can be viewed in terms of the elementary operations involved in the fault diagnostic process, as outlined in Figure 1 [2].

In this diagram, (i) a raw data matrix, representative of the process, is preprocessed or transformed to (ii) a data matrix X and then mapped to (iii) a feature space (F) within some bounded region (iv) L<sub>F</sub>. These features can be used to (v) reconstruct the data (X̂), from which (vi) an error matrix (E) is generated, with scores again mostly confined to some bounded region (vii) L<sub>E</sub>.

Fault detection and fault diagnosis are typically done in both the feature space (F) and the error space (E), based on the use of forward (ℑ) and reverse (ℜ) mapping models and suitable confidence limits L<sub>F</sub> and L<sub>E</sub> for the feature and error spaces. Alternatively, forward mapping into the feature space alone can be used for process monitoring.
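The operations (i)–(vii) can be collected into a generic monitoring skeleton. The class below is a hypothetical sketch of this framework (all names are ours, not from the chapter), instantiated with a linear, PCA-style choice of forward and reverse mapping and simple percentile-based limits for the feature and error spaces; any other mapping model could be substituted.

```python
import numpy as np

class FaultDiagnosisFramework:
    """Sketch of the generalized framework: preprocess, map to a
    feature space F, reconstruct, and monitor the error space E,
    with empirical limits L_F and L_E."""

    def fit(self, X_raw, k=2):
        # (i)-(ii): preprocess, scaling to zero mean and unit variance
        self.mu = X_raw.mean(axis=0)
        self.sd = X_raw.std(axis=0, ddof=1)
        X = (X_raw - self.mu) / self.sd
        # forward-mapping model: here, the first k principal directions
        _, _, Vt = np.linalg.svd(X, full_matrices=False)
        self.P = Vt[:k].T
        # (iv), (vii): limits L_F and L_E from the training data
        F, E = self.map(X_raw)
        self.LF = np.percentile((F**2).sum(axis=1), 99)
        self.LE = np.percentile((E**2).sum(axis=1), 99)
        return self

    def map(self, X_raw):
        X = (X_raw - self.mu) / self.sd
        F = X @ self.P          # (iii): forward mapping into F
        X_hat = F @ self.P.T    # (v): reconstruction of the data
        E = X - X_hat           # (vi): error matrix
        return F, E

    def detect(self, X_raw):
        # flag observations outside either bounded region
        F, E = self.map(X_raw)
        return ((F**2).sum(axis=1) > self.LF) | ((E**2).sum(axis=1) > self.LE)

rng = np.random.default_rng(1)
latent = rng.normal(size=(400, 2))
W = np.array([[1.0, 0.5, 0.2], [0.0, 1.0, 0.7]])
X_ok = latent @ W + 0.1 * rng.normal(size=(400, 3))  # correlated variables
model = FaultDiagnosisFramework().fit(X_ok, k=2)
flags = model.detect(X_ok + 4.0)     # simulated offset on all sensors
```

The two detection tests correspond to monitoring in the feature space and in the error space, respectively; a fault can manifest in either one.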

Preprocessing of the data prior to fault diagnosis has received considerable attention over the last decade or so as a basis for the development of methods that can deal with nonlinearities in the data, lagged variables and unfolding of higher dimensional data. These approaches will mostly be discussed in the second part of the chapter.

Figure 1. Generalized framework for unsupervised process fault diagnosis.

Process Fault Diagnosis for Continuous Dynamic Systems Over Multivariate Time Series DOI: http://dx.doi.org/10.5772/intechopen.85456

## 1.1 Steady state systems


Time Series Analysis - Data, Methods, and Applications


Linear steady state Gaussian processes and the use of principal component analysis will first be considered as an example on the basis of this general framework, after which other methods proposed over the last few decades will be reviewed.

As mentioned in the previous section, univariate control charts do not exploit the correlation that may exist between process variables. When the assumptions of linearity, steady state, and Gaussian behavior hold, multivariate statistical process control based on principal component analysis can be used very effectively for early detection and analysis of abnormal plant behavior. Since principal component analysis plays such a major role in the design of these diagnostic models, a brief outline of the methodology is in order.

Analysis, monitoring, and diagnosis of process operating performance based on the use of principal components is well established. The basic theory can be summarized as follows, where X ∈ R<sup>N×M</sup> is the data matrix representative of the process, with M variables and N observations, typically scaled to zero mean and unit variance; S is the covariance matrix of the process variables; P is the loading matrix of the first k < M principal components; Λ is a diagonal matrix containing the k corresponding eigenvalues of the decomposition; P̃ is the loading matrix of the M − k remaining principal components; and Λ̃ is a diagonal matrix containing the M − k remaining eigenvalues. The T<sup>2</sup>- and Q-diagnostics (Eqs. 2 and 3) are commonly used in process monitoring schemes.

$$\mathbf{S} = \frac{\mathbf{X}^T \mathbf{X}}{N - 1} = \mathbf{P} \boldsymbol{\Lambda} \mathbf{P}^T + \widetilde{\mathbf{P}} \widetilde{\boldsymbol{\Lambda}} \widetilde{\mathbf{P}}^T \tag{1}$$

$$\mathbf{Q} = (\mathbf{x} - \hat{\mathbf{x}})^T (\mathbf{x} - \hat{\mathbf{x}}) = \mathbf{x}^T \mathbf{C} \mathbf{x}, \text{ where } \mathbf{C} = \widetilde{\mathbf{P}} \widetilde{\mathbf{P}}^T \tag{2}$$

$$\mathbf{T}^2 = \mathbf{t}^T \boldsymbol{\Lambda}^{-1} \mathbf{t} = \mathbf{x}^T \mathbf{D} \mathbf{x}, \text{ where } \mathbf{D} = \mathbf{P} \boldsymbol{\Lambda}^{-1} \mathbf{P}^T \tag{3}$$
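As a concrete illustration of Eqs. (1)–(3), the following sketch (numpy, with made-up data and variable names of our own) forms the covariance matrix of scaled data, splits its eigendecomposition into retained and residual parts, and evaluates T² and Q for one observation:

```python
import numpy as np

rng = np.random.default_rng(42)
N, M, k = 500, 4, 2
latent = rng.normal(size=(N, k))
X = latent @ rng.normal(size=(k, M)) + 0.1 * rng.normal(size=(N, M))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # zero mean, unit variance

S = X.T @ X / (N - 1)                   # covariance matrix, Eq. (1)
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]       # sort eigenvalues, descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

P, Lam = eigvecs[:, :k], np.diag(eigvals[:k])            # retained part
P_res, Lam_res = eigvecs[:, k:], np.diag(eigvals[k:])    # residual part

# Eq. (1): the two subspaces together reproduce S exactly
assert np.allclose(S, P @ Lam @ P.T + P_res @ Lam_res @ P_res.T)

x = X[0]                                # one scaled observation
t = P.T @ x                             # scores of the observation
T2 = t @ np.linalg.inv(Lam) @ t         # Eq. (3)
Q = x @ (P_res @ P_res.T) @ x           # Eq. (2), with C = P_res P_res^T
```

Note that Q equals the squared reconstruction error, since the residual x − x̂ is the projection of x onto the residual subspace.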

In classical multivariate statistical process control based on principal component analysis, the control limits required for automated process monitoring are based on the assumption that the data are normally distributed. The α upper control limit for T<sup>2</sup> is calculated from N observations based on the F-distribution, that is,

$$\mathrm{UCL}_{T^2(\mathrm{PCA})} = \frac{k(N+1)(N-1)\,F_{\alpha,k,N-k}}{N(N-k)} \tag{4}$$

The upper control limit for Q is calculated by means of the χ<sup>2</sup> distribution as:

$$\mathrm{UCL}_{Q(\mathrm{PCA})} = \Lambda_1 \left[ \frac{c_{\alpha}\sqrt{2\Lambda_2\theta^2}}{\Lambda_1} + 1 + \frac{\Lambda_2\theta(\theta-1)}{\Lambda_1^2} \right]^{1/\theta} \tag{5}$$

where $\Lambda_i = \sum_{j=k+1}^{M} \lambda_j^i$ (for $i$ = 1, 2, 3) and $\theta = 1 - 2\Lambda_1\Lambda_3/\big(3\Lambda_2^2\big)$. The standard normal deviate $c_{\alpha}$ corresponds to the upper (1−α) percentile, while M is the total number of principal components (variables). The residual Q is more likely to have a normal distribution than the principal component scores, since it is a measure of the nondeterministic behavior of the system.
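Assuming the full set of eigenvalues λ_j of the decomposition is available, Eqs. (4) and (5) can be evaluated with standard distribution functions. The sketch below uses scipy.stats; the function names are hypothetical:

```python
import numpy as np
from scipy import stats

def ucl_t2(N, k, alpha=0.01):
    """Eq. (4): F-distribution based upper control limit for T^2."""
    F = stats.f.ppf(1.0 - alpha, k, N - k)
    return k * (N + 1) * (N - 1) * F / (N * (N - k))

def ucl_q(eigvals, k, alpha=0.01):
    """Eq. (5): upper control limit for Q from the residual eigenvalues."""
    lam = np.sort(np.asarray(eigvals))[::-1][k:]   # the M - k smallest
    L1, L2, L3 = lam.sum(), (lam**2).sum(), (lam**3).sum()
    theta = 1.0 - 2.0 * L1 * L3 / (3.0 * L2**2)
    c = stats.norm.ppf(1.0 - alpha)                # standard normal deviate
    return L1 * (c * np.sqrt(2.0 * L2 * theta**2) / L1
                 + 1.0 + L2 * theta * (theta - 1.0) / L1**2) ** (1.0 / theta)

# e.g., limits for N = 100 observations and k = 2 retained components
t2_limit = ucl_t2(100, 2)
q_limit = ucl_q([2.0, 1.0, 0.5, 0.3, 0.2], k=2)
```

An observation is then declared abnormal when its T² or Q value exceeds the corresponding limit.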

## 1.2 Unsteady state systems

Unlike steady state systems, unsteady state or dynamic systems show time dependence. This time dependence implies the presence of autocorrelation and/or nonstationarity [3]. Autocorrelation arises when the observations within a time series are not independent, while nonstationarity means that the parameters governing a process change over time, for example, the mean, covariance, or other higher order statistics. Therefore, in principle at least, these systems cannot be treated directly by the methods dealing with steady state systems.

Broadly speaking, methodologies dealing with dynamic process systems are all aimed at dealing with the issues arising from the time dependence of the data. Essentially, these approaches are based on the analysis of a segment of the time series data, as captured by a fixed or a moving window, as indicated in Figure 2. The time series segment amounts to observation of the process over a time interval, and the window length should be sufficient to capture the dynamics of the system.

Figure 2. Dynamic process monitoring as an extension of steady state approaches.

Dynamic process monitoring can be as simple as monitoring the mean or the variance of a signal, in which case a test window as shown in Figure 2 would not be required, and model maintenance would not be an issue. In more complex systems, characterized by large multivariate sets of signals or high-dimensional signals, such as streaming video or hyperspectral data, feature extraction is often model-based. That is, a model derived from the data in the base window is applied to the data in the test window. For example, principal component models can be used for this purpose.

Where models are used and the nature of the signals changes as a result of process drift, recalibration of the models needs to be done either at regular intervals or episodically, that is, when a change occurs. Some models, such as those based on principal and independent components, can be updated recursively, as discussed in more detail in Sections 4 and 5. Alternatively, the model is updated ab initio at regular intervals.

Moreover, most feature extraction methods are unsupervised, that is, the time series data are unlabeled. Where supervised methods are used, features are extracted based on their ability to predict some label, such as the future evolution of the time series.

these variables. These features can subsequently be dealt with by the same methods used for steady state systems, such as principal component analysis, independent component analysis, kernel methods, etc., some of which are considered in more detail below.

## 2.1 Dynamic principal component analysis (DPCA)

In dynamic PCA, first proposed by Ku et al. [4], the PCA model is built on the data matrix X residing in the window, to account for auto- and crosscorrelation between variables. This approach implicitly estimates the autoregressive structure of the data (e.g., [5]). As functions of the model, the T<sup>2</sup>- and Q-statistics will also be functions of the lag parameters. Since the mean and covariance structures are assumed to be invariant, the same global model is used to evaluate observations at any future time point.

Although dynamic PCA is designed to deal with autocorrelation in the data, the resultant score variables will still be autocorrelated, or even crosscorrelated when no autocorrelation is present [4, 6]. These autocorrelated score variables have the drawback that they can lead to higher rates of false alarms when using Hotelling's T<sup>2</sup>-statistic.

Several remedies have been proposed to alleviate this problem, for example, wavelet filtering [7], ARMA filtering [6], and the use of residuals from predictive models [8]. Nonlinear PCA models have been considered by several authors [9–13].

## 2.2 Independent component analysis

Stefatos and Hamza [14] and Hsu et al. [15] have introduced diagnostic methods using an approach based on dynamic independent component analysis capable of accurately detecting and isolating the root causes of individual faults. Nonlinear variants of these approaches have been investigated by Cai et al. [16], who have integrated the kernel FastICA algorithm with a manifold learning method known as locality preserving projection. Moreover, kernel FastICA was used to integrate FastICA and kernel PCA to exploit the advantages of both algorithms, as indicated by Zhang and Qin [17], Zhang [18], and Zhang et al. [19].

## 2.3 Slow feature analysis

Slow feature analysis [20] is an unsupervised learning method, whereby functions g(x) are identified to extract slowly varying features y(t) from rapidly varying signals x(t). This is done virtually instantaneously, that is, one time slice of the output is based on very few time slices of the input. Extensions of the method have been proposed by other authors [21–23].

## 2.4 Multiscale methods

Multiscale methods can be seen as a complementary approach preceding feature extraction from the time series. In this case, each process variable is extended or replaced by different versions of the variable at different scales. For example, with multiscale PCA, wavelets are used to decompose the process variables under scrutiny into multiple scale representations before application of PCA to detect and identify faulty conditions in process operations. In this way, autocorrelation of the variables is implicitly accounted for, resulting in a more sensitive method for detecting process anomalies. Multiscale PCA constitutes a promising extension of
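The window-based, lag-augmented representation that underlies methods such as dynamic PCA can be illustrated as follows. This is a sketch with hypothetical names: each row of the augmented matrix stacks an observation with its previous `lags` samples, after which the steady state machinery (T², Q, and their control limits) can be applied unchanged.

```python
import numpy as np

def lag_augment(X, lags):
    """Stack each row of an N x M matrix with its `lags` predecessors,
    giving an (N - lags) x (M * (lags + 1)) lag-augmented matrix."""
    N = X.shape[0]
    blocks = [X[lags - j : N - j] for j in range(lags + 1)]
    return np.hstack(blocks)

rng = np.random.default_rng(7)
e = rng.normal(size=500)
s = np.empty(500)                      # AR(1) signal: autocorrelated
s[0] = e[0]
for i in range(1, 500):
    s[i] = 0.9 * s[i - 1] + e[i]
# three sensors observing the same underlying dynamics
X = np.column_stack([s, 0.5 * s + rng.normal(size=500),
                     rng.normal(size=500)])

Xd = lag_augment(X, lags=2)    # rows contain [x_t, x_{t-1}, x_{t-2}]
```

Applying PCA to `Xd` rather than `X` lets the loadings absorb the autoregressive structure of the signals, which is the essence of the dynamic extension.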
