*d*(*i*, *j*) is the weighted local distance between the *i*-th vector of the *T* trajectory and the *j*-th vector of the *R* trajectory, where **W** is a positive definite weight matrix that reflects the relative importance of each measured variable.

**2.4 Synchronization based on combination of symmetric and asymmetric DTW**

**2.4.1 The advantage and disadvantage of symmetric and asymmetric DTW**

As mentioned above, DTW works with pairs of patterns. The question is therefore whether the symmetric or the asymmetric algorithm is suitable for synchronization.

Let *Bi* (*bi*×*N*), *i*=1,2,…,*I*, be a training set of good-quality batches for the MPCA/MICA models, where *bi* is the number of observations and *N* is the number of measured variables, and let *BREF* (*bREF*×*N*) be a defined reference batch trajectory. The objective is to synchronize each *Bi* with *BREF*.

Symmetric DTW algorithms include all points of the original trajectories, but they produce expanded trajectories of various lengths, because the length is determined by DTW. After synchronization, each *Bi* is individually synchronized with *BREF*, but unfortunately not with the other batches.

Although asymmetric algorithms may eliminate some points, they produce synchronized trajectories of equal length, because the time axis of each *Bi* is mapped onto that of *BREF*, so that all trajectories are synchronized with the reference trajectory *BREF* and with each other.

Unavoidably, the asymmetric algorithms have to skip some points in the optimal path, so the characteristics of some segments may be left out after synchronization. An MPCA/MICA model constructed from such 'trimmed' trajectories is incomplete and may cause missed or false alarms.

**2.4.2 The circumstance of combination of symmetric and asymmetric DTW**

The essence of DTW is to match pairs of points of two trajectories during synchronization. First, with the symmetric DTW algorithm, the optimal path is reconstructed following the three constraints above and Eq. 5 and 6. When the points of *Bi* are then aligned with *BREF* in the asymmetric synchronization, the following situations may appear:

(a) A point of *Bi* may be copied several times, because it matches several points of *BREF*;

(b) A point of *Bi* may be matched with a point at a different time index of *BREF*, which means it will be shifted after synchronization;

(c) More than one point of *Bi* may be averaged into a single point that is aligned with a particular point of *BREF*, because they are all aligned with only that one point of *BREF* in the symmetric DTW algorithm. Although some local features of the points may be smoothed, it has been shown that this ensures that all *Bi* have the same duration *bREF* after the asymmetric operation.

**2.4.3 An improvement of DTW algorithm for more measurements**

In some processes, the number of measurements may be too large for the available memory to hold all the calculated minimum accumulated distances *DA*(*i*, *j*). Gao et al. (2001) presented a solution to this 'out of memory' problem. Their idea is that the *DA*(*i*, *j*) should not all be worked out and kept until the final result *DA*(*t*, *r*), which would accumulate a large number of intermediate results. Instead, the computation can be organized as local dynamic programming in strips of adjacent time intervals. The improved algorithm, under the three constraints and Eq. 5 and 6, is given below and is shown in Fig. 3.


3) The local optimal path is searched between the columns (*i*–1, :) and (*i*, :). The start point of the path is (*IP*, *JP*) and the relay end point is (*IE*, *JE*), where *IE*=*IP*+1 and *JE* is determined by the following comparison:

$$\begin{aligned} J_E &= \underset{j}{\arg\min} \{ D_A(I_E, J_P),\ D_A(I_E, J_P + 1),\ \dots,\ D_A(I_E, q) \} \\ q &= \min \{ r,\ \mathrm{fix} [ I_E (r - 1) / (t - 1) + M ] \} \end{aligned} \tag{7}$$

where *fix* is the function that keeps only the integer part of the result of the computation.

4) Delete the column of *DA* (*i*-1, :), then set *IP*←*IE*, *JP*←*JE*;

5) Repeat step 2 to step 4 until *i*=*t* (*t* is one end point of the pair);

6) If (*IP*, *JP*) is (*t*, *r*), the search stops; otherwise, if (*IP*, *JP*) is (*t*, *p*) with *p*<*r*, the rest of the path runs from the point (*t*, *p*) to the final point (*t*, *r*).

Fig. 3. The local optimization between two columns in the improved DTW
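The column-by-column search of steps 3) to 6) can be illustrated with a minimal sketch that keeps only two columns of *DA* in memory. It is not the implementation of Gao et al. (2001): the function name `strip_dtw_path`, the local recurrence (minimum over the three neighbouring predecessors) and the initialization of the first column are assumptions, since Eq. 5 and 6 and steps 1) and 2) are not reproduced here. The sketch returns only the relay points (*IE*, *JE*) and does not fill in the local path between them.

```python
import numpy as np

def strip_dtw_path(T, R, W=None, M=3):
    """Relay points of a strip-wise DTW search that stores two columns of D_A.

    T (t x N) and R (r x N) are trajectories; indices in the returned
    path are 1-based, as in the text. Assumes t >= 2 for the band formula.
    """
    T = np.asarray(T, dtype=float)
    R = np.asarray(R, dtype=float)
    t, r = len(T), len(R)
    W = np.eye(T.shape[1]) if W is None else np.asarray(W, dtype=float)

    def d(i, j):
        # Weighted local distance between T(i, :) and R(j, :), 1-based.
        diff = T[i - 1] - R[j - 1]
        return float(diff @ W @ diff)

    # Assumed initialization: accumulate along j for i = 1 (slot 0 is padding).
    prev = np.full(r + 1, np.inf)
    prev[1] = d(1, 1)
    for j in range(2, r + 1):
        prev[j] = prev[j - 1] + d(1, j)

    path = [(1, 1)]
    JP = 1
    for i in range(2, t + 1):
        # Assumed recurrence: local distance plus the cheapest predecessor.
        cur = np.full(r + 1, np.inf)
        for j in range(1, r + 1):
            cur[j] = d(i, j) + min(prev[j], prev[j - 1], cur[j - 1])
        # Eq. 7: relay end point within the band, never moving backwards.
        q = min(r, int(i * (r - 1) / (t - 1) + M))
        q = max(q, JP)  # guard so the search window is never empty
        JE = JP + int(np.argmin(cur[JP:q + 1]))
        path.append((i, JE))
        prev = cur      # step 4: the older column is discarded
        JP = JE
    # Step 6: if the search ended at (t, p) with p < r, walk to (t, r).
    path.extend((t, j) for j in range(JP + 1, r + 1))
    return path
```

Because the relay point is chosen greedily in each strip, the result is not guaranteed to coincide with the globally optimal path of the full dynamic programme; the memory requirement, however, drops from *t*×*r* values to two columns of length *r*.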

### **2.5 Procedure of synchronization of batch trajectories**

The iterative procedure proposed for the synchronization of batch trajectories of unequal length (Kassidas et al., 1998) is a practical approach for industrial processes and is presented below.

First of all, each variable from each batch should be scaled as preparation. Let *Bi*, *i*=1,…,*I*, be the scaled batch trajectories obtained from *I* good-quality raw batches. The scaling method is to find the average range of each variable in the raw batches by averaging its range over all batches, then to divide each variable in all batches by its average range, and to store the average ranges for monitoring. Then synchronization begins.
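As an illustration of the scaling step, the following sketch computes the average range of each variable over the raw batches and divides every batch by it. The function name `average_range_scale` is hypothetical, and the sketch assumes that every variable has a non-zero average range.

```python
import numpy as np

def average_range_scale(raw_batches):
    """Scale batches (each b_i x N) by the per-variable average range.

    Returns the scaled batches and the average ranges, which have to be
    stored for scaling new batches during monitoring.
    """
    # Range (max - min) of each variable within each batch: shape (I, N).
    ranges = np.array([b.max(axis=0) - b.min(axis=0) for b in raw_batches])
    avg_range = ranges.mean(axis=0)  # average over the I batches
    scaled = [np.asarray(b, dtype=float) / avg_range for b in raw_batches]
    return scaled, avg_range
```

The batches may have different numbers of observations *bi*, which is why they are kept in a list rather than stacked into one array.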


not be formed. Unlike the translation, expansion and contraction of process measurements used in DTW to generate equal durations, an orthonormal function is employed to eliminate the problem resulting from the different operating times. It turns the implicit system information into several key parameters which cover the necessary part of the operating conditions for each variable in each batch (Chen and Liu, 2000; Neogi and Schlags, 1998).

**3.1 Orthonormal function**

Based on the concept of Orthonormal Function Approximation (OFA), the process measurements of each variable in each batch run can be mapped onto the same number of orthonormal coefficients to represent the key information. As a univariate trajectory, the profile of each variable in each batch run can be represented as a function *F*(*t*), which can be approximated in terms of an orthonormal set {*φn*} of continuous functions:

$$F(t) \approx \hat{F}_N(C, t) = \sum_{n=0}^{N-1} \alpha_n \varphi_n(t) \tag{10}$$

where the coefficients, *C* = {*αn*}, with *αn* = ⟨*F*(*t*), *φn*(*t*)⟩ = ∫ *F*(*t*) *φn*(*t*) d*t*, are the projections of *F*(*t*) onto each basis function. Therefore, the coefficients *C* of the orthonormal functions are representative of the measured variable *F*(*t*) of one batch run. Since *F*(*t*) is available only as a set of *K* measurements, the coefficient *αn* can be derived in practice with the orthonormal decomposition of *F*(*t*):

$$\begin{aligned} E_0(t_k) &= F(t_k) \\ \alpha_n &= \frac{\mathbf{\Phi}_n^{T} \mathbf{E}_n}{\mathbf{\Phi}_n^{T} \mathbf{\Phi}_n} \\ E_{n+1}(t_k) &= E_n(t_k) - \alpha_n \varphi_n(t_k) \end{aligned} \qquad k = 1,2,\dots,K_i;\quad n = 0,1,\dots,N-1;\quad i = 1,2,\dots,I \tag{11}$$

where **E***n* = [*En*(*t*1) *En*(*t*2) … *En*(*tKi*)]T and **Φ***n* = [*φn*(*t*1) *φn*(*t*2) … *φn*(*tKi*)]T. The Legendre polynomial basis function is regarded as an effective choice because of the finite time interval of each batch run (Chen and Liu, 2000):

$$\varphi_n(t) = \sqrt{\frac{2n+1}{2}}\, P_n(t), \qquad P_n(t) = \frac{1}{2^n n!} \frac{d^n}{dt^n} \left[ (t^2 - 1)^n \right] \tag{12}$$

where *t*∈[-1,1]. When *n*=0, the constant coefficient *α0* is for *φ*0(*t*) = *P*0(*t*)/√2 and *P*0(*t*)=1.

Before applying the orthonormal function approximation, the variables of the system with different units need to be pretreated in order to be put on an equal basis. However, mean centering of the measurement data is not necessary, because the constant coefficient *α0* belongs to the *φ*0 orthonormal basis function; mean centering would only drive this constant coefficient to zero. The ratio convergence test for mathematical series is applied to determine the approximation error associated with the reduction in the number of basis functions (Moore and Anthony, 1989). The measure of approximation effectiveness can be obtained as:

Step 0: Select one of the scaled trajectories, *Bk*, as the reference trajectory *BREF* according to the technical requirements. Set the weight matrix **W** equal to the identity matrix. Then execute the following steps for a specified maximum number of iterations.

Step 1: Apply the DTW method between each *Bi*, *i*=1,…,*I*, and *BREF*. Let *B̃i*, *i*=1,…,*I*, be the synchronized trajectories, whose common duration is the same as that of *BREF*.

Step 2: Compute the average trajectory *B̄* from the average values of all *B̃i*.

Step 3: For each variable, compute the sum of squared deviations from *B̄*; its inverse is the new weight of that variable for the next iteration.

$$\mathbf{W}(j, j) = \left[ \sum_{i=1}^{I} \sum_{k=1}^{b_{REF}} \left[ \tilde{B}_{i}(k, j) - \overline{B}(k, j) \right]^2 \right]^{-1} \tag{8}$$

Since **W** is a diagonal matrix, it should be normalized so that the sum of the weights is equal to the number of variables, that is, **W** is replaced by:

$$\mathbf{W} \gets \mathbf{W} \left\{ N / \left[ \sum_{j=1}^{N} \mathbf{W}(j, j) \right] \right\} \tag{9}$$

Step 4: In most cases no more than three iterations are needed, so the same reference trajectory is kept: *BREF*=*Bk*. If more iterations are needed, set the reference equal to the average trajectory: *BREF*=*B̄*.
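Steps 2 and 3, together with Eq. 8 and 9, can be sketched as follows. The function name `update_weights` is hypothetical, and the sketch assumes that the synchronized trajectories all have the shape *bREF*×*N* and that no variable has a zero sum of squared deviations.

```python
import numpy as np

def update_weights(synced_batches):
    """Average trajectory and normalized diagonal weight matrix (Eq. 8, 9)."""
    B = np.stack([np.asarray(b, dtype=float) for b in synced_batches])  # (I, b_REF, N)
    B_bar = B.mean(axis=0)                       # step 2: average trajectory
    ssd = ((B - B_bar) ** 2).sum(axis=(0, 1))    # step 3: per-variable deviations
    w = 1.0 / ssd                                # Eq. 8: inverse as new weight
    w *= w.size / w.sum()                        # Eq. 9: weights sum to N
    return np.diag(w), B_bar
```

A variable that varies little from batch to batch after synchronization receives a large weight, so the next DTW iteration relies more heavily on it.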

#### **2.6 Offline implementation of DTW for batch monitoring**

Now suppose that the complete trajectory of a new batch, *B*RAW,NEW (*b*NEW×*N*), is available and needs to be monitored using MPCA/MICA. It has to be synchronized before the monitoring scheme is applied, because the new batch trajectory *B*RAW,NEW will most probably not match the reference trajectory *BREF*.

For scaling, each variable in the new batch *B*RAW,NEW is divided by the average range stored for that variable in the synchronization procedure, which gives the scaled new trajectory *B*NEW. *B*NEW is then synchronized with the reference trajectory *BREF*, using the **W** obtained from Eq. 8 and 9 in the synchronization procedure, to give the result *B̃*NEW (*bREF*×*N*), which can be used in the MPCA/MICA model.
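The final step, turning a symmetric warping path into a trajectory with the duration of the reference, can be sketched along the lines of situations (a) to (c) of Section 2.4.2. The function name `align_to_reference` and the representation of the path as 1-based index pairs (*i*, *j*), with *i* running over the new batch and *j* over the reference, are assumptions made for illustration.

```python
import numpy as np

def align_to_reference(B, path, b_ref):
    """Map trajectory B onto the reference time axis of length b_ref.

    path is a list of 1-based pairs (i, j): point i of B is matched
    with point j of the reference.
    """
    B = np.asarray(B, dtype=float)
    sums = np.zeros((b_ref, B.shape[1]))
    counts = np.zeros(b_ref)
    for i, j in path:
        # A point of B matched to several j is copied (a) or shifted (b);
        # several points matched to one j are accumulated for averaging (c).
        sums[j - 1] += B[i - 1]
        counts[j - 1] += 1
    if np.any(counts == 0):
        raise ValueError("path leaves a reference point unmatched")
    return sums / counts[:, None]
```

For example, with the path [(1, 1), (2, 1), (3, 2)] the first two points of the batch are averaged into the first reference time index, and the result has exactly `b_ref` rows whatever the length of the batch.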
