**2.1 Symmetric and asymmetric DTW algorithm**

Let *T* and *R* denote the multivariate trajectories of two batches, stored as matrices of dimension *t*×*N* and *r*×*N*, respectively, where *t* and *r* are the numbers of observations and *N* is the number of measured variables. In most cases *t* and *r* are not equal, so the two batches are not synchronized because they do not share a common length. Even if *t*=*r*, their trajectories may not be synchronized because of their different local characteristics. If one applies the monitoring scheme of MPCA (Nomikos and MacGregor, 1994) or of MICA (Yoo et al., 2004) after simply adding or deleting some measured points artificially, unnecessary variation will be included in the statistical model, and the subsequent statistical tests will be less sensitive in detecting faulty batches.

Based on the principle of dynamic programming, DTW warps the two trajectories so that similar events are matched and the distance between them is minimized; to achieve this minimum distance, it shifts, compresses or expands some feature vectors (Nadler and Smith, 1993).

On-Line Monitoring of Batch Process with Multiway PCA/ICA 243


**2.2 Endpoints, local and global constraints**

In order to find the best path through the *t*×*r* grid, three rules of the DTW algorithm should be specified.

(1) Endpoint constraints: *c*(1)=(1,1), *c*(*K*)=(*t*, *r*).

(2) Local constraints: the predecessor of each point (*i*, *j*) of *F*\* except (1,1) is exactly one of (*i*-1, *j*), (*i*-1, *j*-1) or (*i*, *j*-1), as shown in Fig. 2.

(3) Global constraints: the search area is a strip of width *M* around the diagonal of the *t*×*r* grid, as shown in Fig. 3.

The endpoint constraints state that the initial and final points of both trajectories are located with certainty. The local continuity constraints take the characteristics of the time indices into account to avoid excessive compression or expansion of the two time scales (Myers et al., 1980). By requiring a monotonic, non-negative path, the local constraints also prevent excessive compression or expansion across several neighboring points (Itakura, 1975). The global constraints prevent large deviations from the linear path.

Fig. 2. Local continuity constraint with no constraint on slope

**2.3 Minimum accumulated distance of the optimal path**

As mentioned above, the best path that the DTW algorithm finds through the grid of vector-to-vector distances should minimize some total distance between the two trajectories. Calculating the optimal normalized total distance is impractical; a feasible substitute is the minimum accumulated distance *D*<sub>A</sub>(*i*, *j*) from point (1,1) to point (*i*, *j*) (Kassidas et al., 1998). A suitable form is:

$$D_A(i, j) = d(i, j) + \min[D_A(i-1, j),\ D_A(i-1, j-1),\ D_A(i, j-1)], \quad D_A(1, 1) = d(1, 1) \tag{5}$$

where

$$d(i, j) = [T(i,:) - R(j,:)]\, W\, [T(i,:) - R(j,:)]^T \tag{6}$$
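The recursion of Eq. 5 with the local distance of Eq. 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, *W* is assumed to be diagonal and is passed as a list `w` of per-variable weights, indices are 0-based, and the global constraint is not applied.

```python
def local_distance(x, y, w):
    """Eq. 6 for a diagonal W (assumption): weighted squared difference."""
    return sum(wk * (a - b) ** 2 for a, b, wk in zip(x, y, w))


def accumulated_distance(T, R, w):
    """Fill the t-by-r table of minimum accumulated distances (Eq. 5).

    T and R are lists of observation vectors; entry D[i][j] corresponds to
    D_A(i+1, j+1) in the 1-based notation of the text.
    """
    t, r = len(T), len(R)
    D = [[0.0] * r for _ in range(t)]
    for i in range(t):
        for j in range(r):
            d = local_distance(T[i], R[j], w)
            if i == 0 and j == 0:
                D[i][j] = d  # starting value D_A(1,1) = d(1,1)
                continue
            # Admissible predecessors under the local constraint.
            candidates = []
            if i > 0:
                candidates.append(D[i - 1][j])
            if i > 0 and j > 0:
                candidates.append(D[i - 1][j - 1])
            if j > 0:
                candidates.append(D[i][j - 1])
            D[i][j] = d + min(candidates)
    return D


# Two univariate toy trajectories (N = 1) with t = 4 and r = 3.
T = [[0.0], [1.0], [1.0], [2.0]]
R = [[0.0], [1.0], [2.0]]
D = accumulated_distance(T, R, w=[1.0])
total = D[-1][-1]  # accumulated distance at the end point (t, r)
```

In this toy case the repeated value in `T` is absorbed by the warping, so the accumulated distance at the end point is zero even though the two trajectories differ in length.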


Fig. 1. Sketch map of nonlinear time alignment for two univariate trajectories *R* and *T* with DTW

Let *i* and *j* denote the time indices of the *T* and *R* trajectories, respectively. DTW finds the optimal route as a sequence *F*\* of *K* points on a *t*×*r* grid:

$$F^{*} = \{c(1), c(2), \dots, c(K)\}, \quad \max(t, r) \le K \le t + r \tag{1}$$

where

$$c(k) = [i(k), j(k)] \tag{2}$$

and each point *c*(*k*) is an ordered pair indicating a position in the grid. Fig. 1 illustrates the main idea of DTW for two univariate trajectories *T* and *R*.
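As a small illustration of Eqs. 1 and 2, the sketch below represents a path as a list of (*i*, *j*) pairs and checks that it is admissible. The function name `is_valid_path` is hypothetical, and the admissible steps are assumed to be the unit steps of the local constraint in Section 2.2.

```python
def is_valid_path(path, t, r):
    """Return True if `path` is an admissible symmetric warping path.

    `path` is a list of 1-based (i, j) pairs, as in Eq. 2. It must start at
    (1, 1), end at (t, r), and advance by one of the unit steps
    (1, 0), (0, 1) or (1, 1) between consecutive points.
    """
    if not path or path[0] != (1, 1) or path[-1] != (t, r):
        return False
    allowed = {(1, 0), (0, 1), (1, 1)}
    for (i0, j0), (i1, j1) in zip(path, path[1:]):
        if (i1 - i0, j1 - j0) not in allowed:
            return False
    return True


t, r = 4, 3
# Shortest admissible path: diagonal steps wherever possible, K = max(t, r) = 4.
diagonal_first = [(1, 1), (2, 2), (3, 3), (4, 3)]
# Longest path under unit steps: no diagonal step at all, K = t + r - 1 = 6.
staircase = [(1, 1), (2, 1), (3, 1), (4, 1), (4, 2), (4, 3)]
```

Both example paths have a length *K* inside the bounds of Eq. 1.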

Most DTW algorithms can be classified as either symmetric or asymmetric. In the symmetric scheme, the time index *i* of *T* and the time index *j* of *R* are both mapped onto a common time index *k*, as in Eqs. 1 and 2; the synchronization result is not ideal, however, because the synchronized trajectories are often longer than the reference trajectory. The asymmetric scheme, on the other hand, maps the time index of *T* onto the time index of *R* or vice versa, so that one trajectory is expanded or compressed towards the other. Compared with Eqs. 1 and 2, the sequence becomes:

$$F' = \{c(1), c(2), \dots, c(j), \dots, c(r)\} \tag{3}$$

and

$$c(j) = (i(j), j) \tag{4}$$

This implies that the path will go through each vector of *R*, but it may skip some vectors of *T*.
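The difference between the two schemes can be made concrete with a short sketch that reduces a symmetric path to the asymmetric form of Eqs. 3 and 4. The text does not state how *i*(*j*) is chosen when several points of *T* are matched to the same point of *R*; keeping the last matched index is an assumption made here purely for illustration, and the function name `to_asymmetric` is hypothetical.

```python
def to_asymmetric(path, r):
    """Reduce a symmetric path to one point per index j of R (Eqs. 3 and 4).

    `path` is a monotone list of 1-based (i, j) pairs that visits every
    j in 1..r. When several i share the same j, the last one is kept
    (an illustrative assumption, not taken from the text).
    """
    i_of_j = {}
    for i, j in path:
        i_of_j[j] = i  # later pairs overwrite earlier ones for the same j
    return [(i_of_j[j], j) for j in range(1, r + 1)]


# Symmetric path on a 4-by-3 grid: T(2) and T(3) are both matched to R(2).
symmetric_path = [(1, 1), (2, 2), (3, 2), (4, 3)]
asymmetric_path = to_asymmetric(symmetric_path, r=3)

# Indices of T that the asymmetric path no longer visits.
skipped = sorted({i for i, _ in symmetric_path} - {i for i, _ in asymmetric_path})
```

Here the asymmetric path has exactly *r* = 3 points, one for each vector of *R*, while index 2 of *T* is skipped.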
