4. Case study: Tennessee Eastman process

Finally, as an example of the application of a process monitoring scheme incorporating feature extraction from time series data in a moving window, the following study can be considered. It is based on the Tennessee Eastman benchmark process, which is widely used in these types of studies. The feature extraction process considered here is an extension of the recurrence quantification analysis discussed in Section 2.5.2. Instead of using thresholded recurrence plots, unthresholded or global recurrence plots are considered, as explained in more detail below.
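To make the distinction concrete, the sketch below contrasts a conventional (thresholded) recurrence plot with the unthresholded or global variant used in this case study. It is a minimal illustration in Python, assuming Euclidean distances; the function name and the toy signal are assumptions for illustration only.

```python
import numpy as np
from scipy.spatial.distance import cdist

def recurrence_plot(segment, threshold=None):
    """Distance matrix of a (multivariate) signal segment.

    segment: array of shape (n_samples, n_variables).
    With a threshold, the classical binary recurrence plot is returned;
    without one, the unthresholded (global) recurrence plot is kept.
    """
    d = cdist(segment, segment, metric="euclidean")  # d[i, j] = ||x_i - x_j||
    if threshold is None:
        return d                                     # global recurrence plot
    return (d <= threshold).astype(int)              # thresholded recurrence plot

# Toy signal, for illustration only
t = np.linspace(0, 8 * np.pi, 400)
signal = np.column_stack([np.sin(t), np.cos(t)]) + 0.05 * np.random.randn(400, 2)
global_rp = recurrence_plot(signal)        # retains all distance information
binary_rp = recurrence_plot(signal, 0.2)   # thresholding discards magnitudes
```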

#### 4.1 Tennessee Eastman process data

The Tennessee Eastman (TE) process was proposed by Downs and Vogel [97] and has been used as a benchmark in numerous process control and monitoring studies [98]. It captures the dynamic behavior of an actual chemical process, the layout of which is shown in Figure 3.

The plant consists of five units, namely a reactor, condenser, compressor, stripper, and separator, and involves eight components (four gaseous reactants A, C, D, and E, one inert component B, and three liquid products F, G, and H) [97]. In this instance, the plantwide control structure suggested by Lyman and Georgakis [99] was used to simulate the process and to generate data related to varying operating conditions. The data set is available at http://web.mit.edu/braatzgroup.

A total of four data sets were used, that is, one data set associated with NOC and the remaining three associated with three different fault conditions. The TE process comprises 52 variables, of which 22 are continuous process measurements, 19 are composition measurements, and the remaining 11 are manipulated variables. These variables are presented in Table 3. Each data set consisted of 960 measurements sampled at 3 min intervals.

| Variable | Process measurement | Variable | Composition measurement | Variable | Manipulated variable |
|---|---|---|---|---|---|
| 1 | A Feed | 23 | Reactor feed component A | 42 | D feed flow |
| 2 | D Feed | 24 | Reactor feed component B | 43 | E feed flow |
| 3 | E Feed | 25 | Reactor feed component C | 44 | A feed flow |
| 4 | Total Feed | 26 | Reactor feed component D | 45 | Total feed flow |
| 5 | Recycle flow | 27 | Reactor feed component E | 46 | Compressor recycle valve |
| 6 | Reactor feed rate | 28 | Reactor feed component F | 47 | Purge valve |
| 7 | Reactor pressure | 29 | Purge component A | 48 | Separator product liquid flow |
| 8 | Reactor level | 30 | Purge component B | 49 | Stripper product liquid flow |
| 9 | Reactor temperature | 31 | Purge component C | 50 | Stripper steam valve |
| 10 | Purge rate | 32 | Purge component D | 51 | Reactor cooling water flow |
| 11 | Separator temperature | 33 | Purge component E | 52 | Condenser cooling water flow |
| 12 | Separator level | 34 | Purge component F | | |
| 13 | Separator pressure | 35 | Purge component G | | |
| 14 | Separator underflow | 36 | Purge component H | | |
| 15 | Stripper level | 37 | Product component D | | |
| 16 | Stripper pressure | 38 | Product component E | | |
| 17 | Stripper underflow | 39 | Product component F | | |
| 18 | Stripper temperature | 40 | Product component G | | |
| 19 | Stripper steam flow | 41 | Product component H | | |
| 20 | Compressor work | | | | |
| 21 | Reactor cooling water outlet temperature | | | | |
| 22 | Separator cooling water outlet temperature | | | | |

Table 3. Description of variables in the Tennessee Eastman process.

| Fault number | Description | Type |
|---|---|---|
| 3 | D feed temperature | Step change |
| 9 | Reactor feed D temperature | Random variation |
| 15 | Condenser cooling water valve | Sticking |

Table 4. Description of faults 3, 9, and 15 in the Tennessee Eastman process.

Figure 3. Process flow of Tennessee Eastman benchmark problem.

The NOC samples were used to construct an off-line process monitoring model that consisted of a moving window of length b sliding along the time series with a step size s. The three fault conditions are summarized in Table 4. Fault conditions 3, 9, and 15 are the most difficult to detect, and many fault diagnostic approaches fail to do so reliably.

In this case study, the approach previously proposed by Bardinas et al. [96] is applied to the three fault conditions in the TE process. The methodology can be briefly summarized as shown in Figure 4.

A window of user-defined length b slides along the time series (A) with a user-defined step size s, yielding time series segments (B), each of which can be represented by a similarity matrix (C) that is subsequently considered as an image from which features can be extracted via algorithms normally used in multivariate image analysis (D).

Figure 4. Process monitoring methodology (after Bardinas et al., 2018). (A) Time series matrix, (B) Segmented time series matrix, (C) Distance matrices, and (D) Features and labels.
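As a rough illustration of steps (A) to (C), the following Python sketch segments a multivariate time series with a window of length b and step size s and converts each segment into a distance-matrix "image". The placeholder data, function names, and parameter values are assumptions for illustration only.

```python
import numpy as np
from scipy.spatial.distance import cdist

def sliding_segments(X, b, s):
    """Split a multivariate time series X (n_samples x n_variables) into
    windows of length b taken every s samples (steps A to B in Figure 4)."""
    return [X[start:start + b] for start in range(0, len(X) - b + 1, s)]

def segment_to_image(segment):
    """Represent a segment by its pairwise distance matrix (step C), which is
    then treated as a grayscale image for feature extraction (step D)."""
    return cdist(segment, segment, metric="euclidean")

# Placeholder data standing in for the 52 TE variables; b and s are the
# window length and step size tuned by grid search in Section 4.3
X = np.random.randn(5000, 52)
segments = sliding_segments(X, b=1000, s=20)
images = [segment_to_image(seg) for seg in segments]
```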

#### 4.2 Feature extraction

Two sets of features were extracted from the similarity or distance matrices, namely features from the gray level co-occurrence matrices of the images, as well as local binary pattern features, as briefly discussed below.

### 4.2.1 Gray level co-occurrence matrices (GLCMs)

GLCMs capture the distribution of gray level pairs of neighboring pixels in an image, based on the spatial relationship between the pixels. More formally, if $y(i,j)$ is an element of a GLCM associated with an image $I$ of size $R \times S$ with $L$ gray levels, then $y(i,j)$ can be defined as

$$y(i,j) = \sum\_{r=1}^{R} \sum\_{s=1}^{S} \begin{cases} \mathbf{1}, & \text{if } I(r,s) = i, \text{and } I(r+\Delta r, s+\Delta s) = j\\ \mathbf{0}, & \text{otherwise} \end{cases} \tag{8}$$



where $(r, s)$ and $(r + \Delta r, s + \Delta s)$ denote the positions of the reference and neighboring pixels, respectively. From this matrix, various textural descriptors can be defined. Only four of these were used, as defined by Haralick et al. [100], namely contrast, correlation, energy, and homogeneity.
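As an illustration, the sketch below computes these four GLCM descriptors from a distance-matrix image. It assumes the scikit-image library; the quantization to a fixed number of gray levels and the choice of pixel offsets are assumptions, since those details are not specified here.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops  # scikit-image >= 0.19

def glcm_features(distance_image, levels=32):
    """Contrast, correlation, energy, and homogeneity of a distance-matrix
    'image' (Eq. (8), Haralick et al. [100]). The quantization to `levels`
    gray levels and the one-pixel offsets are illustrative assumptions."""
    # Quantize the continuous distance values to integer gray levels
    img = distance_image - distance_image.min()
    img = (img / (img.max() + 1e-12) * (levels - 1)).astype(np.uint8)

    # Co-occurrence counts for a one-pixel offset in four directions
    glcm = graycomatrix(img, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=levels, symmetric=True, normed=True)

    # Average each descriptor over the four directions
    return np.array([graycoprops(glcm, prop).mean()
                     for prop in ("contrast", "correlation",
                                  "energy", "homogeneity")])
```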

### 4.2.2 Local binary patterns (LBPs)

LBPs are nonparametric descriptors of the local structure of the image [101]. The LBP operator is defined for a pixel in the image as a set of binary values obtained by comparing the center pixel intensity with its neighboring pixels. If the intensity of a neighboring pixel equals or exceeds that of the center pixel, that neighbor is set to 1 (otherwise 0). Formally, given the central pixel's coordinates $(x_0, y_0)$, the resulting LBP can be obtained in decimal form as

$$LBP\left(x\_0, y\_0\right) = \sum\_{p=0}^{P-1} s\left(i\_p - i\_0\right) 2^p \tag{9}$$

where the gray level intensity value of the central pixel is $i_0$ and that of its $p$th neighbor is $i_p$, with $p = 0, 1, \dots, P-1$ for $P$ neighboring pixels. Moreover, the function $s(\cdot)$ is defined as

$$s(\mathbf{x}) = \begin{cases} \mathbf{0}, & \text{if } \mathbf{x} < \mathbf{0} \\ \mathbf{1}, & \text{if } \mathbf{x} \ge \mathbf{0} \end{cases} \tag{10}$$
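A direct transcription of Eqs. (9) and (10) for the basic 3 × 3 neighborhood ($P = 8$) might look as follows; the neighbor ordering and the use of a histogram of LBP codes as the feature vector are assumptions, as these details are implementation specific.

```python
import numpy as np

def lbp_image(img):
    """3 x 3 local binary pattern operator following Eqs. (9) and (10):
    each of the P = 8 neighbors of a pixel contributes s(i_p - i_0) * 2**p."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]   # neighbors p = 0 .. 7
    rows, cols = img.shape
    centre = img[1:-1, 1:-1]                        # central pixels i_0
    codes = np.zeros((rows - 2, cols - 2), dtype=int)
    for p, (dr, dc) in enumerate(offsets):
        neighbour = img[1 + dr:rows - 1 + dr, 1 + dc:cols - 1 + dc]  # i_p
        codes += (neighbour >= centre).astype(int) * (2 ** p)  # s(i_p - i_0) 2^p
    return codes

def lbp_histogram(img, bins=256):
    """Normalized histogram of LBP codes, used as the feature vector."""
    codes = lbp_image(img)
    hist, _ = np.histogram(codes, bins=bins, range=(0, bins))
    return hist / hist.sum()
```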


#### 4.3 Selection of window length and step size

Apart from the selection of a feature extraction method, one of the main choices that need to be made in the process monitoring scheme is the length of the sliding window. If this is too small, the essential dynamics of the time series would not be captured. On the other hand, if it is too large, it would result in a considerable lag before any change in the process can be detected. There is also the possibility that transient changes may go undetected altogether. In the case of a moving window, the step size of the moves also needs to be considered. The selection of these two parameters can be done by means of a grid search, the results of which are shown in Figure 5.

As indicated in Figure 5, the optimal window size was b = 1000 and the step size was s = 20 for both the GLCM and LBP features that were used as predictors. With these settings, the 500-tree random forest model [102] was able to differentiate between the normal operating conditions and the three fault classes with a reliability of approximately 82%.
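A grid search of this kind can be sketched as follows, assuming a hypothetical helper `build_feature_table` that runs the pipeline of Figure 4 for a given b and s and returns one feature vector and one condition label per window; the helper, the cross-validation scheme, and the parameter grids are assumptions, not part of the original study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def grid_search(X, y, b_values, s_values, build_feature_table):
    """Return the (b, s) pair giving the best cross-validated accuracy.

    `build_feature_table` is a hypothetical helper that performs the
    segmentation, distance-matrix computation, and GLCM/LBP feature
    extraction for a given window length b and step size s."""
    best = (None, None, -np.inf)
    for b in b_values:
        for s in s_values:
            features, labels = build_feature_table(X, y, b, s)
            model = RandomForestClassifier(n_estimators=500, random_state=0)
            score = cross_val_score(model, features, labels, cv=5).mean()
            if score > best[2]:
                best = (b, s, score)
    return best   # e.g. (1000, 20, ~0.82) as reported for this case study
```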

Figure 5. Grid search optimization of the window length (b) and step size (s).

Figure 6. Principal component score plots of GLCM (left) and LBP features (right).

In Figure 6, principal component score plots of the two optimal feature sets are shown. The large LBP feature set could not be visualized reliably, as the first two principal components captured only 52.5% of the variance of the features. In contrast, the variance of the smaller GLCM feature set, consisting of four features, was captured well by its first two principal components. Here, the differences between the normal operating data ("0" legend) and the fault conditions ("3," "9," and "15") are clear.
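Score plots such as those in Figure 6 can be generated along the following lines, assuming feature matrices and condition labels produced by the pipeline above; the standardization step and plotting details are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def score_plot(features, labels, title):
    """Plot the first two principal component scores of a feature set,
    colored by condition (0 = NOC; 3, 9, 15 = fault classes)."""
    pca = PCA(n_components=2)
    scores = pca.fit_transform(StandardScaler().fit_transform(features))
    labels = np.asarray(labels)
    for cls in np.unique(labels):
        mask = labels == cls
        plt.scatter(scores[mask, 0], scores[mask, 1], s=10, label=str(cls))
    plt.xlabel(f"PC 1 ({100 * pca.explained_variance_ratio_[0]:.1f}% variance)")
    plt.ylabel(f"PC 2 ({100 * pca.explained_variance_ratio_[1]:.1f}% variance)")
    plt.title(title)
    plt.legend()
    plt.show()
```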

#### 4.4 Discussion

The approach outlined in Section 2.5.4 and considered in more detail in the above case study is an extension of recurrence quantification analysis, with the advantage that no information is lost, since the similarity matrix of the signal is not thresholded. Also, while thresholding does not preclude the use of a wide range of feature extraction algorithms, recurrence quantification analysis has mostly been applied to dynamic systems based on a set of engineered features that allow some modicum of physical interpretation.

In most diagnostic systems, this is not essential, and therefore more predictive feature sets may be constructed. These features could be engineered, as was considered in the case study, or they could be learned by taking advantage of state-of-the-art developments in deep learning.

In addition, the following general observations can be made, not only with regard to the approach considered in this case study, but also with regard to other approaches reviewed in this chapter.

