Step 0: Select one of the scaled trajectories *B*<sub>*k*</sub> as the reference trajectory *B*<sub>REF</sub> according to the technical requirements. Set the weight matrix **W** equal to the identity matrix. Then execute the following steps for a specified maximum number of iterations.

Step 1: Apply the DTW method between *B*<sub>*i*</sub>, *i*=1,…,*I*, and *B*<sub>REF</sub>. Let $\tilde{B}\_i$, *i*=1,…,*I*, be the synchronized trajectories, whose common duration is the same as that of *B*<sub>REF</sub>.

Step 2: Compute the average trajectory $\bar{B}$ from the average values of all $\tilde{B}\_i$.

Step 3: For each variable, compute the sum of squared deviations from $\bar{B}$; its inverse will be the new weight of that variable for the next iteration:

$$W(j,j) = \left[ \sum\_{i=1}^{I} \sum\_{k=1}^{b\_{REF}} \left[ \tilde{B}\_i(k,j) - \bar{B}(k,j) \right]^2 \right]^{-1} \tag{8}$$

As a diagonal matrix, **W** should be normalized so that the sum of the weights is equal to the number of variables *N*; that is, **W** is replaced by:

$$\mathbf{W} \leftarrow \mathbf{W} \, \frac{N}{\sum\_{j=1}^{N} W(j,j)} \tag{9}$$

Step 4: In most cases no more than three iterations are needed, so the same reference trajectory is kept: *B*<sub>REF</sub>=*B*<sub>*k*</sub>. If more iterations are needed, set the reference equal to the average trajectory: *B*<sub>REF</sub>=$\bar{B}$.

**2.6 Offline implementation of DTW for batch monitoring**

Now suppose that the complete trajectory of a new batch, *B*<sub>RAW,NEW</sub> (*b*<sub>NEW</sub>×*N*), is available and needs to be monitored using MPCA/MICA. It has to be synchronized before the monitoring scheme is applied, because the new batch trajectory *B*<sub>RAW,NEW</sub> will most probably not match the reference trajectory *B*<sub>REF</sub>.

For scaling, each variable in the new batch *B*<sub>RAW,NEW</sub> is divided by the average range from the reference trajectory, giving the scaled new trajectory *B*<sub>NEW</sub>. *B*<sub>NEW</sub> is then synchronized with the reference trajectory *B*<sub>REF</sub> using **W** from Eqs. (8) and (9) in the synchronization procedure; the result, $\tilde{B}\_{NEW}$ (*b*<sub>NEW</sub>×*N*), can be used in the MPCA/MICA model.

**3. Orthonormal function approximation**

For synchronous batch processes, the data are assumed to form a three-way array: *j*=1,2,…,*J* variables are measured at *k*=1,2,…,*K* time intervals throughout *i*=1,2,…,*I* batch runs. The most effective way to unfold the three-way data for monitoring is to place its slices (*I*×*J*) side by side to the right, starting with the slice for the first time interval, which yields a large two-dimensional matrix (*I*×*JK*) (Nomikos and MacGregor, 1994, 1995; Wold et al., 1987). Each column of this two-dimensional matrix is treated as a new variable when building the PCA model. In some cases, however, the batch processes are asynchronous, so the two-dimensional matrix (*I*×*JK*) cannot be formed. Unlike DTW, which translates, expands and contracts the process measurements to produce equal durations, the orthonormal function approach removes the problem caused by different operating times by turning the implicit system information into a few key parameters that cover the necessary part of the operating conditions for each variable in each batch (Chen and Liu, 2000; Neogi and Schlags, 1998).

#### **3.1 Orthonormal function**

In Orthonormal Function Approximation (OFA), the process measurements of each variable in each batch run are mapped onto the same number of orthonormal coefficients, which represent the key information. The profile of each variable in each batch run is a univariate trajectory that can be represented as a function *F*(*t*) and approximated in terms of an orthonormal set {*φn*} of continuous functions:

$$F(t) \approx F\_N(C, t) = \sum\_{n=0}^{N-1} \alpha\_n \phi\_n(t) \tag{10}$$

where the coefficients *C* = {*αn*}, with $\alpha\_n = \int F(t)\,\phi\_n(t)\,dt$, are the projections of *F*(*t*) onto each basis function. The coefficients *C* of the orthonormal functions are therefore representative of the measured variable *F*(*t*) of one batch run. In practice the integral is not evaluated directly; given a set of *K* measurements, the coefficients *αn* are derived by orthonormal decomposition of *F*(*t*):

$$\begin{aligned} E\_0(t\_k) &= F(t\_k) \\ \alpha\_n &= [\Phi\_n^T \Phi\_n]^{-1} \Phi\_n^T \mathbf{E}\_n \\ E\_{n+1}(t\_k) &= E\_n(t\_k) - \alpha\_n \phi\_n(t\_k) \\ k &= 1, 2, \dots, K\_i \\ n &= 0, 1, \dots, N - 1; \quad i = 1, 2, \dots, I \end{aligned} \tag{11}$$

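where **E**<sub>*n*</sub> = [*E<sub>n</sub>*(*t*<sub>1</sub>) *E<sub>n</sub>*(*t*<sub>2</sub>) … *E<sub>n</sub>*(*t*<sub>*K<sub>i</sub>*</sub>)]<sup>T</sup> and Φ<sub>*n*</sub> = [*φ<sub>n</sub>*(*t*<sub>1</sub>) *φ<sub>n</sub>*(*t*<sub>2</sub>) … *φ<sub>n</sub>*(*t*<sub>*K<sub>i</sub>*</sub>)]<sup>T</sup>. The Legendre polynomial basis is regarded as an effective choice because each batch run spans a finite time interval (Chen and Liu, 2000):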

$$\begin{aligned} \phi\_n(t) &= \sqrt{\frac{2n+1}{2}} P\_n(t) \\ P\_n(t) &= \frac{1}{2^n n!} \frac{d^n}{dt^n} [(t^2 - 1)^n] \end{aligned} \tag{12}$$
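The decomposition of Eq. (11) with the Legendre basis of Eq. (12) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `legendre_basis` and `ofa_coefficients` are hypothetical, and the samples of each batch are assumed to be equally spaced and mapped linearly onto [-1, 1].

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_basis(t, n_terms):
    """Columns are phi_n(t) = sqrt((2n+1)/2) * P_n(t) for n = 0..n_terms-1 (Eq. 12)."""
    t = np.asarray(t, dtype=float)
    # legvander returns [P_0(t), P_1(t), ..., P_{n_terms-1}(t)] column by column.
    P = legendre.legvander(t, n_terms - 1)
    scale = np.sqrt((2 * np.arange(n_terms) + 1) / 2.0)
    return P * scale


def ofa_coefficients(f, n_terms):
    """Sequential decomposition of Eq. (11) for one variable of one batch run."""
    f = np.asarray(f, dtype=float)
    # Assumption: equally spaced samples, batch duration rescaled to [-1, 1].
    t = np.linspace(-1.0, 1.0, f.size)
    Phi = legendre_basis(t, n_terms)
    residual = f.copy()                      # E_0 = F
    alpha = np.empty(n_terms)
    for n in range(n_terms):
        phi_n = Phi[:, n]
        # alpha_n = (Phi_n^T Phi_n)^{-1} Phi_n^T E_n
        alpha[n] = phi_n @ residual / (phi_n @ phi_n)
        # E_{n+1} = E_n - alpha_n * phi_n
        residual = residual - alpha[n] * phi_n
    return alpha


# Two synthetic batches of different duration give coefficient vectors of equal length.
short_batch = np.sin(np.linspace(0.0, 3.0, 75))
long_batch = np.sin(np.linspace(0.0, 3.0, 120))
alpha_short = ofa_coefficients(short_batch, 4)
alpha_long = ofa_coefficients(long_batch, 4)
```

Because the number of coefficients depends only on `n_terms`, batches with different numbers of samples end up with vectors of the same length, which is the property the method relies on.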

In Eq. (12), *t*∈[-1,1]. When *n*=0, the constant coefficient *α*<sub>0</sub> corresponds to $\phi\_0(t) = P\_0(t)/\sqrt{2}$ with *P*<sub>0</sub>(*t*)=1. Before applying the orthonormal function approximation, variables with different units need to be pretreated so that they are put on an equal basis. Mean centering of the measurement data is not necessary, however, because the constant level is already captured by *α*<sub>0</sub>, the coefficient of the *φ*<sub>0</sub> basis function; mean centering would only make this coefficient equal to zero. The ratio convergence test for mathematical series is applied to determine the approximation error associated with reducing the number of basis functions (Moore and Anthony, 1989). The measure of approximation effectiveness can be obtained as:

$$G(N) = \frac{\left\|F\_N\right\|^2 - \left\|F\_{N-1}\right\|^2}{\left\|F\_N\right\|^2} = \frac{\alpha\_{N-1}^2}{\sum\_{n=0}^{N-1} \alpha\_n^2} \tag{13}$$

where $\left\|F\_N\right\|^2$ is the square of the Euclidean function norm of the approximation *F<sub>N</sub>*(*C*, *t*). When a consistent minimum *G<sub>ij</sub>*(*N*) is reached, the required optimal number of terms *N<sub>ij</sub>* can be chosen for the measurement variable *j* at batch *i* (Moore and Anthony, 1989). Most of the behavior of the original *F*(*t*) is then extracted by the coefficients *C*. To make sure that the expansion *F<sub>N</sub>*(*t*) has enough terms to extract the main behavior of *F*(*t*) in every batch, the maximum number of terms over all batch runs is taken for each variable:

$$N\_j = \max\_i \{N\_{ij}\} \tag{14}$$

Fig. 4. The three-way array **X**, with batch runs of different duration, maps into a coefficient matrix **Θ**

[Figure: the array **X** (axes: batches *I*, variables *J*, time *K*) is mapped by OFA into **Θ**, whose rows hold the blocks *C*<sub>*i*,1</sub>, *C*<sub>*i*,2</sub>, …, *C*<sub>*i*,*J*</sub> of widths *N*<sub>1</sub>, *N*<sub>2</sub>, …, *N<sub>J</sub>*.]

Therefore, the problem caused by the different operating time of each batch run is eliminated by the orthonormal approximation method when the same number of coefficients is used for the same measured variable. In this way, the key parameters contain the necessary part of the operating condition for each variable in each batch run. As in the multiway method, the coefficients are reorganized into time-ordered blocks, and the blocks can be arranged into the multiway matrix **Θ** ($I \times \sum\_{j=1}^{J} N\_j$):

$$\mathbf{\Theta} = \begin{bmatrix} C\_{1,1} & C\_{1,2} & \cdots & C\_{1,J} \\ C\_{2,1} & C\_{2,2} & \cdots & C\_{2,J} \\ \vdots & \vdots & & \vdots \\ C\_{I,1} & C\_{I,2} & \cdots & C\_{I,J} \end{bmatrix} \tag{15}$$

where $C\_{i,j} = [\alpha\_{i,0}, \alpha\_{i,1}, \dots, \alpha\_{i,N\_j-1}]$ represents the coefficient vector of the approximation function for the measurement variable *j* at batch *i*, and *N<sub>j</sub>* is the needed number of terms for variable *j*.

**3.2 Offline implementation of OFA for batch monitoring**

When a new batch is completed, the orthonormal function transformation is applied, and all the variables of the batch along the time trajectory become a row vector composed of a series of coefficients, $[C\_{new,1}, C\_{new,2}, \dots, C\_{new,J}]$, which can be projected onto **Θ** to implement the PCA/ICA algorithm.

**4. Online monitoring schemes**

**4.1 Traditional online monitoring schemes**

The first approach assumes that the future measurements are in perfect accordance with their mean trajectories as calculated from the reference database, and therefore fills the unknown part of *x<sub>new</sub>* with zeros. In other words, the batch is supposed to operate normally for the rest of its duration, with no deviations from its mean trajectories. According to the analysis of Nomikos and MacGregor (1995), the advantages of this approach are a good graphical representation of the batch operation in the *t* plots and quick detection of an abnormality in the SPE plot, whereas its drawback is that the *t* scores are slow to detect an abnormal operation, especially at the beginning of the batch run.

The second approach assumes that the future deviations from the mean trajectories will remain, for the rest of the batch duration, at their current values at time interval *k*. The unknown part of *x<sub>new</sub>* is therefore filled with the current scaled values, on the assumption that the same errors will persist for the rest of the batch run. Although the SPE chart is less sensitive than in the first approach, the *t* scores pick up an abnormality more quickly (Nomikos and MacGregor, 1995). Nomikos and MacGregor (1995) also suggested that the future deviations could be assumed to decay linearly or exponentially from their current values to the end of the batch run, which shares the advantages and disadvantages of the first two approaches.

In the third approach, the unknown future observations are regarded as missing data from a batch in MPCA. To be consistent with the values already measured up to the current time *k*, and with the correlation structure of the observation variables in the database as defined by the *p*-loading matrices of the MPCA model, one can use the sub-model of principal components of the reference database without undue consideration of the unknown future values. MPCA projects the already known measurements $x\_{new,k}$ ($kJ \times 1$) into the reduced space and calculates the *t* scores at each time interval as:

$$t\_{R,k} = (P\_k^T P\_k)^{-1} P\_k^T x\_{new,k} \tag{16}$$
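The three ways of handling unknown future measurements described in Section 4.1 can be compared in a short sketch. Everything named here is an assumption made for illustration: `P` stands for a (*KJ*×*R*) MPCA loading matrix with orthonormal columns, ordered interval by interval as in the unfolding above, `x_known` holds the scaled measurements up to interval *k* (length *kJ*), and `n_vars` is *J*. The third function follows Eq. (16), reading *P<sub>k</sub>* as the first *kJ* rows of `P`.

```python
import numpy as np


def scores_zero_fill(P, x_known):
    """Approach 1: future deviations from the mean trajectories are set to zero."""
    x_full = np.zeros(P.shape[0])
    x_full[: x_known.size] = x_known
    return P.T @ x_full


def scores_current_deviation(P, x_known, n_vars):
    """Approach 2: the deviations of the latest interval persist to the end of the batch."""
    n_future = (P.shape[0] - x_known.size) // n_vars
    future = np.tile(x_known[-n_vars:], n_future)
    return P.T @ np.concatenate([x_known, future])


def scores_missing_data(P, x_known):
    """Approach 3, Eq. (16): project the known part onto the matching rows of P."""
    P_k = P[: x_known.size, :]
    # Least squares gives t = (P_k^T P_k)^{-1} P_k^T x_known when P_k has full column rank.
    t, *_ = np.linalg.lstsq(P_k, x_known, rcond=None)
    return t


# Synthetic example: K = 6 intervals, J = 2 variables, R = 2 components.
rng = np.random.default_rng(0)
P, _ = np.linalg.qr(rng.normal(size=(12, 2)))   # orthonormal columns
x_batch = P @ np.array([1.5, -0.5])             # a batch lying exactly in the model plane
x_known = x_batch[:6]                           # measurements up to interval k = 3

t_zero = scores_zero_fill(P, x_known)
t_hold = scores_current_deviation(P, x_known, n_vars=2)
t_miss = scores_missing_data(P, x_known)
```

Very early in a batch, when *kJ* is smaller than the number of components, the matrix inverted in Eq. (16) is singular; `lstsq` then returns a minimum-norm solution, so scores from the first few intervals should be read with caution.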
