*Acceleration of Convergence of the Alternating Least Squares Algorithm for Nonlinear Principal Components Analysis*

**4. The v*ε* acceleration of the ALS algorithm**

We briefly introduce the v*ε* algorithm of Wynn [Wynn, 1962] used in the acceleration of the ALS algorithm. The v*ε* algorithm is utilized to speed up the convergence of a slowly convergent vector sequence and is very effective for linearly converging sequences. Kuroda and Sakakihara [Kuroda and Sakakihara, 2006] proposed the *ε*-accelerated EM algorithm that speeds up the convergence of the EM sequence via the v*ε* algorithm and demonstrated that its speed of convergence is significantly faster than that of the EM algorithm. Wang et al. [Wang et al., 2008] studied the convergence properties of the *ε*-accelerated EM algorithm.

Let {**Y**(*t*)}*t*≥0 = {**Y**(0), **Y**(1), **Y**(2), ...} be a linearly convergent sequence generated by an iterative computational procedure and let {**Y**˙(*t*)}*t*≥0 = {**Y**˙(0), **Y**˙(1), **Y**˙(2), ...} be the accelerated sequence of {**Y**(*t*)}*t*≥0. Then the v*ε* algorithm generates {**Y**˙(*t*)}*t*≥0 by using

$$\dot{\mathbf{Y}}^{(t-1)} = \mathbf{Y}^{(t)} + \left[ \left[ \mathbf{Y}^{(t-1)} - \mathbf{Y}^{(t)} \right]^{-1} + \left[ \mathbf{Y}^{(t+1)} - \mathbf{Y}^{(t)} \right]^{-1} \right]^{-1}, \tag{6}$$

where [**Y**]<sup>−1</sup> = **Y**/||**Y**||<sup>2</sup> and ||**Y**|| is the Euclidean norm of **Y**. For the detailed derivation of Equation (6), see Appendix B. When {**Y**(*t*)}*t*≥0 converges to a limit point **Y**(∞), it is known that, in many cases, {**Y**˙(*t*)}*t*≥0 generated by the v*ε* algorithm converges to **Y**(∞) faster than {**Y**(*t*)}*t*≥0.
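Equation (6) takes only a few lines to implement. The sketch below is our own illustration (the function names are not from this chapter); it applies one v*ε* update to three consecutive iterates using the Samelson inverse defined above:

```python
import numpy as np

def samelson_inverse(y):
    """Samelson inverse of a vector: [y]^{-1} = y / ||y||^2."""
    return y / np.dot(y, y)

def v_epsilon_step(y_prev, y_curr, y_next):
    """One v-epsilon update (Equation (6)) from three consecutive iterates."""
    s = samelson_inverse(y_prev - y_curr) + samelson_inverse(y_next - y_curr)
    return y_curr + samelson_inverse(s)

# For an exactly geometric sequence y(t) = a + c * rho**t, a single update
# recovers the limit a.
a, c, rho = np.array([1.0, 2.0]), np.array([1.0, 1.0]), 0.5
y0, y1, y2 = a + c, a + rho * c, a + rho**2 * c
print(v_epsilon_step(y0, y1, y2))  # → [1. 2.]
```

This exactness on geometric sequences is why the method works so well on linearly convergent iterations, whose error is asymptotically geometric.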

We assume that {**X**∗(*t*)}*t*≥0 generated by PRINCIPALS converges to a limit point **X**∗(∞). Then v*ε*-PRINCIPALS produces a faster-converging sequence {**X**˙∗(*t*)}*t*≥0 from {**X**∗(*t*)}*t*≥0 by using the v*ε* algorithm and thus accelerates the convergence of PRINCIPALS. The general procedure of v*ε*-PRINCIPALS iterates the following two steps:

- *PRINCIPALS step*: Compute model parameters **A**(*t*) and **Z**(*t*) and determine optimal scaling parameter **X**∗(*t*+1).
- *Acceleration step*: Calculate **X**˙∗(*t*−1) using {**X**∗(*t*−1), **X**∗(*t*), **X**∗(*t*+1)} from the v*ε* algorithm:

$$\text{vec}\,\dot{\mathbf{X}}^{\*(t-1)} = \text{vec}\,\mathbf{X}^{\*(t)} + \left[ \left[ \text{vec}(\mathbf{X}^{\*(t-1)} - \mathbf{X}^{\*(t)}) \right]^{-1} + \left[ \text{vec}(\mathbf{X}^{\*(t+1)} - \mathbf{X}^{\*(t)}) \right]^{-1} \right]^{-1},$$

where vec**X**∗ = (**X**∗′<sub>1</sub> **X**∗′<sub>2</sub> ··· **X**∗′<sub>*p*</sub>)′ stacks the columns of **X**∗, and check the convergence by

$$\left\| \text{vec}(\dot{\mathbf{X}}^{\*(t-1)} - \dot{\mathbf{X}}^{\*(t-2)}) \right\|^2 < \delta,$$

where *δ* is a desired accuracy.

Before starting the iteration, we determine initial data **X**∗(0) satisfying the restriction (3) and execute the *PRINCIPALS step* twice to generate {**X**∗(0), **X**∗(1), **X**∗(2)}.

v*ε*-PRINCIPALS is designed to generate {**X**˙∗(*t*)}*t*≥0 converging to **X**∗(∞). Thus the estimate of **X**∗ can be obtained from the final value of {**X**˙∗(*t*)}*t*≥0 when v*ε*-PRINCIPALS terminates. The estimates of **Z** and **A** can then be calculated immediately from the estimate of **X**∗ in the *Model parameter estimation step* of PRINCIPALS.

Note that **X**˙∗(*t*−1) obtained at the *t*-th iteration of the *Acceleration step* is not used as the estimate **X**∗(*t*+1) at the (*t*+1)-th iteration of the *PRINCIPALS step*. Thus v*ε*-PRINCIPALS speeds up the convergence of {**X**∗(*t*)}*t*≥0 without affecting the convergence properties of ordinary PRINCIPALS.
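The two-step procedure above can be sketched as a wrapper around a generic base update. Everything below is our own illustration under the stated assumptions (the name `base_step` standing in for one PRINCIPALS sweep on the vectorized parameters is hypothetical), not the chapter's code:

```python
import numpy as np

def samelson_inverse(y):
    """Samelson inverse [y]^{-1} = y / ||y||^2, as in Equation (6)."""
    return y / np.dot(y, y)

def v_epsilon_accelerated(base_step, x0, delta=1e-8, max_iter=100_000):
    """Run a base fixed-point iteration and return the v-epsilon accelerated
    estimate.  The accelerated value is never fed back into base_step, so the
    base sequence's convergence properties are untouched."""
    x_prev = np.asarray(x0, dtype=float)
    x_curr = base_step(x_prev)      # X*(1)
    x_next = base_step(x_curr)      # X*(2): two base steps before accelerating
    x_dot_prev = None
    for _ in range(max_iter):
        s = samelson_inverse(x_prev - x_curr) + samelson_inverse(x_next - x_curr)
        x_dot = x_curr + samelson_inverse(s)
        # Convergence check on the accelerated sequence only.
        if x_dot_prev is not None and np.sum((x_dot - x_dot_prev) ** 2) < delta:
            return x_dot
        x_dot_prev = x_dot
        x_prev, x_curr = x_curr, x_next
        x_next = base_step(x_curr)
    return x_dot_prev

# Toy linear iteration x <- 0.5 x + 1 with fixed point 2.
print(v_epsilon_accelerated(lambda x: 0.5 * x + 1.0, np.array([0.0])))  # → [2.]
```

Mirroring the text, the loop keeps the base and accelerated sequences separate and tests convergence on the accelerated differences, as in the display above Equation (7)'s section.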

#### **Numerical experiments 1: Comparison of the number of iterations and CPU time**

We study how much faster v*ε*-PRINCIPALS converges than ordinary PRINCIPALS. All computations are performed with the statistical package R [R Development Core Team, 2008] executing on an Intel Core i5 3.3 GHz with 4 GB of memory. CPU times (in seconds) are measured by the function proc.time<sup>1</sup>. For all experiments, *δ* for convergence of v*ε*-PRINCIPALS is set to 10<sup>−8</sup> and PRINCIPALS terminates when |*θ*(*t*+1) − *θ*(*t*)| < 10<sup>−8</sup>, where *θ*(*t*) is the *t*-th update of *θ* calculated from Equation (4). The maximum number of iterations is also set to 100,000.

<sup>1</sup> Times are typically available to 10 msec.

We apply these algorithms to a random data matrix of 100 observations on 20 variables with 10 levels and measure the number of iterations and CPU time taken for *r* = 3. The procedure is replicated 50 times.
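A data matrix of this shape is easy to simulate. The sketch below is our own illustration; the chapter does not specify the generating distribution, so we draw levels uniformly, which is an assumption:

```python
import numpy as np

rng = np.random.default_rng(2011)  # arbitrary seed
# 100 observations on 20 variables, each taking one of 10 levels (1..10)
data = rng.integers(1, 11, size=(100, 20))
print(data.shape)
```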

Table 1 gives summary statistics of the numbers of iterations and CPU times of PRINCIPALS and v*ε*-PRINCIPALS over the 50 simulated data sets. Figure 1 shows the scatter plots of the number of iterations and CPU time. The second to fifth columns of the table and the figure show that PRINCIPALS requires more iterations and a longer computation time than v*ε*-PRINCIPALS. The sixth and seventh columns of the table give summary statistics of the iteration and CPU time speed-ups, which compare the speed of convergence of PRINCIPALS with that of v*ε*-PRINCIPALS. The iteration speed-up is defined as the number of iterations required by PRINCIPALS divided by the number required by v*ε*-PRINCIPALS; the CPU time speed-up is defined analogously. The speed-up values show that v*ε*-PRINCIPALS converges 3.23 times faster than PRINCIPALS in terms of the mean number of iterations and 2.92 times faster in terms of the mean CPU time. Figure 2 shows boxplots of the iteration and CPU time speed-ups. Table 1 and Figure 2 show that v*ε*-PRINCIPALS substantially accelerates the convergence of {**X**∗(*t*)}*t*≥0.

Fig. 1. Scatter plots of the number of iterations and CPU time from 50 simulated data.

Figure 3 shows the scatter plots of the iteration and CPU time speed-ups against the number of iterations of PRINCIPALS. The figure demonstrates that the v*ε* acceleration greatly speeds up the convergence of {**X**∗(*t*)}*t*≥0 and that the speed-up grows with the number of iterations of PRINCIPALS. Whenever PRINCIPALS requires more than 400 iterations, the v*ε* acceleration is more than 3 times faster than PRINCIPALS, and both speed-ups attain their maxima at around 1,000 iterations of PRINCIPALS. The advantage of the v*ε* acceleration is thus evident.

Fig. 2. Boxplots of iteration and CPU time speed-ups from 50 simulated data.

Fig. 3. Scatter plots of iteration and CPU time speed-ups for the number of iterations of PRINCIPALS from 50 simulated data.

|              | PRINCIPALS |          | v*ε*-PRINCIPALS |          | Speed-up  |          |
|--------------|-----------:|---------:|----------------:|---------:|----------:|---------:|
|              | Iteration  | CPU time | Iteration       | CPU time | Iteration | CPU time |
| Minimum      | 136.0      | 2.64     | 46.0            | 1.07     | 1.76      | 1.69     |
| 1st Quartile | 236.5      | 4.44     | 85.0            | 1.81     | 2.49      | 2.27     |
| Median       | 345.5      | 6.37     | 137.0           | 2.72     | 3.28      | 2.76     |
| Mean         | 437.0      | 8.02     | 135.0           | 2.70     | 3.23      | 2.92     |
| 3rd Quartile | 573.2      | 10.39    | 171.2           | 3.40     | 3.74      | 3.41     |
| Maximum      | 1564.0     | 28.05    | 348.0           | 6.56     | 5.71      | 5.24     |

Table 1. Summary statistics of the numbers of iterations and CPU times of PRINCIPALS and v*ε*-PRINCIPALS and iteration and CPU time speed-ups from 50 simulated data.

#### **Numerical experiments 2: Studies of convergence**

We present the results of convergence studies of v*ε*-PRINCIPALS from Kuroda et al. [Kuroda et al., 2011]. The data set used in the experiments was obtained in teacher evaluation by students and consists of 56 observations on 13 variables with 5 levels each; the lowest evaluation level is 1 and the highest is 5.

The rates of convergence of these algorithms are assessed as

$$\tau = \lim\_{t \to \infty} \tau^{(t)} = \lim\_{t \to \infty} \frac{||\mathbf{X}^{\*(t)} - \mathbf{X}^{\*(t-1)}||}{||\mathbf{X}^{\*(t-1)} - \mathbf{X}^{\*(t-2)}||} \quad \text{for PRINCIPALS},$$

$$\dot{\tau} = \lim\_{t \to \infty} \dot{\tau}^{(t)} = \lim\_{t \to \infty} \frac{||\dot{\mathbf{X}}^{\*(t)} - \dot{\mathbf{X}}^{\*(t-1)}||}{||\dot{\mathbf{X}}^{\*(t-1)} - \dot{\mathbf{X}}^{\*(t-2)}||} \quad \text{for v}\varepsilon\text{-PRINCIPALS}.$$

If the inequality 0 < *τ*˙ < *τ* < 1 holds, we say that {**X**˙∗(*t*)}*t*≥0 converges faster than {**X**∗(*t*)}*t*≥0. Table 2 provides the rates of convergence *τ* and *τ*˙ for each *r*. The table shows that *τ*˙ < *τ* for each *r*, so {**X**˙∗(*t*)}*t*≥0 converges faster than {**X**∗(*t*)}*t*≥0, and we conclude that v*ε*-PRINCIPALS significantly improves the rate of convergence of PRINCIPALS.

| *r* | *τ*   | *τ*˙  |
|-----|-------|-------|
| 1   | 0.060 | 0.001 |
| 2   | 0.812 | 0.667 |
| 3   | 0.489 | 0.323 |
| 4   | 0.466 | 0.257 |
| 5   | 0.493 | 0.388 |
| 6   | 0.576 | 0.332 |
| 7   | 0.473 | 0.372 |
| 8   | 0.659 | 0.553 |
| 9   | 0.645 | 0.494 |
| 10  | 0.678 | 0.537 |
| 11  | 0.592 | 0.473 |
| 12  | 0.648 | 0.465 |

Table 2. Rates of convergence *τ* and *τ*˙ of PRINCIPALS and v*ε*-PRINCIPALS.

The speed of convergence of v*ε*-PRINCIPALS is investigated by

$$\dot{\rho} = \lim\_{t \to \infty} \dot{\rho}^{(t)} = \lim\_{t \to \infty} \frac{||\dot{\mathbf{X}}^{\*(t)} - \mathbf{X}^{\*(\infty)}||}{||\mathbf{X}^{\*(t+2)} - \mathbf{X}^{\*(\infty)}||} = 0. \tag{7}$$
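At finite *t* these limits can only be approximated from stored iterates. A small helper along these lines (our own sketch, not from the chapter) estimates the rate for any sequence of iterates:

```python
import numpy as np

def rate_estimate(iterates):
    """Finite-t estimate of the convergence rate:
    ||x(t) - x(t-1)|| / ||x(t-1) - x(t-2)|| from the last three iterates."""
    x2, x1, x0 = iterates[-1], iterates[-2], iterates[-3]
    return np.linalg.norm(x2 - x1) / np.linalg.norm(x1 - x0)

# A geometric sequence with ratio 0.7 yields the estimate 0.7 at every t.
xs = [np.array([3.0, -1.0]) + 0.7**t * np.array([1.0, 2.0]) for t in range(10)]
print(round(rate_estimate(xs), 6))  # → 0.7
```

The same quotient applied to the accelerated iterates gives the finite-*t* analogue of *τ*˙.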

If {**X**˙∗(*t*)}*t*≥0 converges to the same limit point **X**∗(∞) as {**X**∗(*t*)}*t*≥0 and Equation (7) holds, we say that {**X**˙∗(*t*)}*t*≥0 accelerates the convergence of {**X**∗(*t*)}*t*≥0; see Brezinski and Zaglia [Brezinski and Zaglia, 1991]. In the experiments, {**X**˙∗(*t*)}*t*≥0 converges to the final value of {**X**∗(*t*)}*t*≥0 and *ρ*˙ is reduced to zero for all *r*. We see from these results that v*ε*-PRINCIPALS accelerates the convergence of {**X**∗(*t*)}*t*≥0.

**[Backward elimination]**

**Stage A:** *Initial fixed-variables stage*

**A-1** Assign *q* variables to subset **X**<sub>*V*1</sub>, usually *q* := *p*.

**A-2** Solve the eigenvalue problem (8).

**A-3** Look carefully at the eigenvalues, determine the number *r* of principal components to be used.

**A-4** Specify kernel variables which should be involved in **X**<sub>*V*1</sub>, if necessary. The number of kernel variables is less than *q*.

**Stage B:** *Variable selection stage (Backward)*

**B-1** Remove one variable from among the *q* variables in **X**<sub>*V*1</sub>, make a temporary subset of size *q* − 1, and compute *P* based on the subset. Repeat this for each variable in **X**<sub>*V*1</sub>, then obtain *q* values on *P*. Find the best subset of size *q* − 1 which provides the largest *P* among these *q* values and remove the corresponding variable from the present **X**<sub>*V*1</sub>. Put *q* := *q* − 1.

**B-2** If *P* or *q* is smaller (or larger) than the preassigned value, go back to **B-1**. Otherwise stop.

**[Forward selection]**

**Stage A:** *Initial fixed-variables stage*

**A-1** ∼ **A-3** Same as A-1 to A-3 in Backward elimination.

**A-4** Redefine *q* as the number of kernel variables (here, *q* ≥ *r*). If you have kernel variables, assign them to **X**<sub>*V*1</sub>. If not, put *q* := *r*, find the best subset of *q* variables which provides the largest *P* among all possible subsets of size *q*, and assign it to **X**<sub>*V*1</sub>.

**Stage B:** *Variable selection stage (Forward)*

**B-1** Adding one of the *p* − *q* variables in **X**<sub>*V*2</sub> to **X**<sub>*V*1</sub>, make a temporary subset of size *q* + 1 and obtain *P*. Repeat this for each variable in **X**<sub>*V*2</sub>, then obtain *p* − *q* values on *P*. Find the best subset of size *q* + 1 which provides the largest (or smallest) *P* among the *p* − *q* values and add the corresponding variable to the present subset of **X**<sub>*V*1</sub>. Put *q* := *q* + 1.

**B-2** If *P* or *q* is larger than the preassigned value, go to **B-1**. Otherwise stop.

In Backward elimination, to find the best subset of *q* − 1 variables, we perform M.PCA for each of the *q* possible subsets of *q* − 1 variables among the *q* variables selected in the previous selection step. The total number of estimations for M.PCA from *q* = *p* − 1 to *q* = *r* is therefore large, i.e., *p* + (*p* − 1) + ··· + (*r* + 1) = (*p* − *r*)(*p* + *r* + 1)/2. In Forward selection, the total number of estimations for M.PCA from *q* = *r* to *q* = *p* − 1 is <sub>*p*</sub>C<sub>*r*</sub> + (*p* − *r*) + (*p* − (*r* + 1)) + ··· + 2 = <sub>*p*</sub>C<sub>*r*</sub> + (*p* − *r* − 1)(*p* − *r* + 2)/2.

#### **Numerical experiments 3: Variable selection in M.PCA for simulated data**

We apply PRINCIPALS and v*ε*-PRINCIPALS to variable selection in M.PCA of qualitative data using simulated data consisting of 100 observations on 10 variables with 3 levels.

Table 3 shows the number of iterations and CPU time taken by the two algorithms for finding a subset of *q* variables based on 3 (= *r*) principal components. The values of the second to fifth columns in the table indicate that the number of iterations of PRINCIPALS is very large and a long computation time is taken for convergence, while v*ε*-PRINCIPALS converges considerably faster than PRINCIPALS. We can see from the sixth and seventh columns in the
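As a sanity check on the estimation counts for Backward elimination and Forward selection given above, the closed forms can be compared with direct summation (our own illustrative code):

```python
from math import comb

def backward_count(p, r):
    # p + (p-1) + ... + (r+1) M.PCA fits, stepping q from p-1 down to r
    return sum(range(r + 1, p + 1))

def forward_count(p, r):
    # pCr fits to pick the initial subset of size r, then p-q candidate fits
    # at each size q = r, ..., p-2, i.e. (p-r) + (p-r-1) + ... + 2
    return comb(p, r) + sum(range(2, p - r + 1))

p, r = 10, 3  # the setting of Numerical experiments 3
print(backward_count(p, r), (p - r) * (p + r + 1) // 2)          # → 49 49
print(forward_count(p, r), comb(p, r) + (p - r - 1) * (p - r + 2) // 2)  # → 147 147
```

The agreement confirms the arithmetic: both selection directions require on the order of dozens to hundreds of M.PCA fits, which is why accelerating each fit matters.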
