tremendously simplifies the task of learning new skills or adapting to new environments. Constructing internal neural representations from a linear combination of a reduced set of basis functions might be crucial for generalizing to novel tasks and new environmental conditions (Flash & Hochner (2005); Poggio & Bizzi (2004)).

**2.3 Extraction of synergies**

Synergies or movement primitives are viewed as small building blocks of movement that are inherently present within movements and are shared across several movements. For example, in a set of one hundred grasping movements, there might be five or six synergies that are shared and common across all the movements; in that sense, these hundred hand movements are composed of synergies. How do we decompose these hundred hand movements into a few building blocks of movement? This is the problem we are trying to solve.

Several methods have been used to extract these primitives. Many researchers view this as a problem of extracting basis functions; in fact, PCA can be viewed as extracting basis functions that are orthogonal to each other. Radial basis functions have also been used as synergy approximations. Gradient descent and non-negative matrix factorization methods (d'Avella et al. (2003)) and multivariate statistical techniques (Santello et al. (2002)) have been used to extract synergies. Departing from the above interpretations of synergies, Todorov & Ghahramani (2004) suggested that synergistic control may not mean dimensionality reduction or simplification, but might instead imply task optimization using optimal feedback control.

In the coming sections we will use linear and nonlinear dimensionality reduction methods in extracting the synergies.

**2.4 Dimensionality reduction methods for extracting synergies**

In the previous section, we listed different methods used to extract synergies. In this section these methods are limited to dimensionality reduction methods, as these are of relevance to this chapter.

Based on principal component analysis, Jerde et al. (2003) found support for the existence of postural synergies of angular configuration: the shape of the human hand can be predicted using a reduced set of variables and postural synergies. Similarly, Santello et al. (1998) showed that a small number of postural synergies were sufficient to describe how human subjects grasped a large set of different objects. Moreover, Mason et al. (2001) used singular value decomposition (SVD) to demonstrate that a large number of hand postures during reach-to-grasp can be constructed from a small number of principal components or eigen postures.

With PCA, Braido & Zhang (2004) examined the temporal co-variation between finger-joint angles. Their results supported the view that the multi-joint acts of the hand are subject to stereotypical motion patterns controlled via simple kinematic synergies. In the above mentioned study of eigen postures, Mason et al. (2001) also investigated the temporal evolutions of the eigen postures and observed similar kinematic synergies across subjects and grasps. In addition, kinematic synergies have been observed in the spatiotemporal coordination between thumb and index finger movements and in the coordination of tip-to-tip finger movements (Cole & Abbs (1986)).

**3. Preparing the hand kinematics for dimensionality reduction**

In this section, we first recorded the joint angles of ten subjects who participated in an experiment of reaching and grasping tasks while wearing a dataglove. We then transformed the recorded joint angles into joint angular velocities and further preprocessed them to prepare the datasets used as inputs to the dimensionality reduction methods.

Fig. 2. (a) Objects grasped by the subjects. (b) An example of a rapid grasp of a wooden toy nut. (c) Sensors of the CyberGlove used for analysis (dark circles). (d) A sample rapid-movement profile (finger-joint-angular-velocity profile); the dotted lines mark the onset and end of movement. Abbreviations: T, thumb; I, index finger; M, middle finger; R, ring finger; P, pinky finger; CMC, carpometacarpal joint; MCP, metacarpophalangeal joint; IP, interphalangeal joint; PIP, proximal interphalangeal joint; DIP, distal interphalangeal joint.

#### **3.1 Experiment**

The experimental setup consisted of a right-handed CyberGlove (CyberGlove Systems LLC, San Jose, CA, USA) equipped with 22 sensors which can measure angles at all the finger joints. For the purpose of reducing computational burden, in this study we only considered 10 of the sensors which correspond to the metacarpophalangeal (MCP) and interphalangeal (IP) joints of the thumb and the MCP and proximal interphalangeal (PIP) joints of the other four fingers as shown in Fig. 2(c). These ten joints can capture most characteristics of the hand in grasping tasks.


Fig. 3. Angular-velocity matrix. (a) Cascading the angular-velocity profiles of the 10 joints to form one row of the angular-velocity matrix. (b) Each row of the angular-velocity matrix represents a grasping task; in each row the angular-velocity profiles of the 10 joints are separated by dotted red lines. One hundred such tasks put together form the angular-velocity matrix. Angular velocities are in radians/sample.

A typical task consisted of grasping objects of various shapes and sizes as shown in Fig. 2(a). Objects (wooden and plastic) of different shapes (spheres, circular discs, rectangles, pentagons, nuts, and bolts) and different dimensions were used in the grasping tasks and were selected based on two strategies: one was gradually increasing the size of similarly shaped objects, and the other was using different shapes. Start and stop times of each task were signaled by computer-generated beeps. In each task, the subject sat resting his/her right hand at a corner of a table and, upon hearing the beep, grasped the object placed on the table. At the time of the start beep the hand was in a rest posture; the subject then grasped the object and held it until the stop beep. Between grasps, the subjects were given enough time to avoid the effects of fatigue on succeeding tasks. The experiment was split into two phases, a training phase and a testing phase, the difference between the two being the velocity and types of grasps.

#### **3.2 Training**

In the training phase, subjects were instructed to rapidly grasp 50 objects, one at a time. This was repeated for the same 50 objects, so the whole training phase comprised 100 rapid grasps. Only these 100 rapid grasps were used in extracting synergies.

#### **3.3 Testing**


In the testing phase, subjects were instructed to grasp the above 50 objects naturally (slower than the rapid grasps) and then to repeat the same set again. So far the tasks involved only grasping actions. To widen the scope of applicability of the synergies, subjects were also asked to pose 36 American Sign Language (ASL) postures: starting from an initial posture, they stopped at one ASL posture. These postures consisted of 10 numbers (0-9) and 26 letters (A-Z). Note that these movements are different from grasping tasks. The testing phase thus consisted of 100 natural grasps and 36 ASL postural movements. The synergies were derived from the hand movements collected in the training phase using linear and nonlinear dimensionality reduction methods. They were then used in the reconstruction of the movements collected during the testing phase.

#### **3.4 Preprocessing**

After obtaining the joint angles at various times from the rapid grasps, angular velocities were calculated. These angular velocities were filtered to remove noise. Only the relevant projectile movement (about 0.45 seconds, or 39 samples at a sampling rate of 86 Hz) of the entire angular-velocity profile was preserved and the rest was truncated (Fig. 2(d)).

Next an angular-velocity matrix, denoted *V*, was constructed for each subject. Angular-velocity profiles of the 10 joints corresponding to one rapid grasp were cascaded such that each row of the angular-velocity matrix represented one movement in time. The matrix consisted of 100 rows and 39 × 10 = 390 columns:

$$V = \begin{bmatrix} v_1^1(1) & \cdots & v_1^1(39) & \cdots & v_{10}^1(1) & \cdots & v_{10}^1(39) \\ \vdots & & \vdots & & \vdots & & \vdots \\ v_1^g(1) & \cdots & v_1^g(39) & \cdots & v_{10}^g(1) & \cdots & v_{10}^g(39) \\ \vdots & & \vdots & & \vdots & & \vdots \\ v_1^{100}(1) & \cdots & v_1^{100}(39) & \cdots & v_{10}^{100}(1) & \cdots & v_{10}^{100}(39) \end{bmatrix} \tag{1}$$

where $v_i^g(t)$ represents the angular velocity of joint *i* (*i* = 1, ..., 10) at time *t* (*t* = 1, ..., 39) in the *g*-th rapid-grasping task (*g* = 1, ..., 100). An illustration of this transformation is shown in Fig. 3.
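To illustrate this preprocessing step, the sketch below assembles such a matrix in MATLAB. It is a minimal sketch rather than the exact pipeline used in the experiment: the variable grasps is a hypothetical cell array holding one 10-joint-by-T matrix of recorded joint angles per rapid grasp, and the noise filtering and movement-onset detection described above are omitted.

```
% Minimal sketch: build the 100-by-390 angular-velocity matrix V of equation (1).
% 'grasps' is a hypothetical 1x100 cell array; grasps{g} holds a 10-by-T matrix
% of joint angles (sampled at 86 Hz) for the g-th rapid grasp.
nGrasps = 100; nJoints = 10; nSamples = 39;
V = zeros(nGrasps, nJoints * nSamples);
for g = 1:nGrasps
    angles = grasps{g};              % 10 x T joint angles for this grasp
    vel = diff(angles, 1, 2);        % angular velocities (radians/sample)
    vel = vel(:, 1:nSamples);        % keep the 39-sample projectile phase only
    % Cascade the 10 joint profiles into one row: joint 1 first, then joint 2, ...
    V(g, :) = reshape(vel', 1, nJoints * nSamples);
end
```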

#### **4. Linear dimensionality reduction methods**

In this section we derive synergies using two linear dimensionality reduction methods, namely PCA and multidimensional scaling (MDS). The angular-velocity matrix computed in preprocessing was used as the input to both methods. Linear methods are easy to use and demand less computational power than nonlinear methods, which is why we start with them.

#### **4.1 Principal component analysis**

The main advantage of PCA is that it requires less computation time while giving equally effective results when compared to gradient descent methods (Vinjamuri et al. (2007)). PCs are essentially the patterns most commonly used across the data; in this case, the PCs are the synergies that are most commonly used across different movements. Moreover, when graphically visualized, these PCs revealed anatomical implications of physiological properties of human hand prehension (Vinjamuri et al. (2010a;b)).


There are several ways to implement PCA; three are shown below. The first method has three steps: (1) subtract the mean from the data, (2) calculate the covariance matrix, and (3) compute the eigenvalues and eigenvectors of the covariance matrix; the principal components are the eigenvectors. The second method uses singular value decomposition (SVD). The third method is the princomp function readily available in the Statistics Toolbox of MATLAB, which essentially implements the first method.

```
load hald; %Load sample dataset in MATLAB
Data = ingredients';
%Fetch dimensions of the data (M Dimensions x N observations)
[M,N] = size(Data);
MeanofData = mean(Data,2); %Calculate Mean of the Data
Data = Data - repmat(MeanofData,1,N); %Subtract Mean from the Data
Covariance = 1 / (N-1) * Data * Data'; %Calculate the covariance of the Data
[pc, latent] = eig(Covariance); %Find the eigenvectors and eigenvalues
latent = diag(latent); %Take only diagonal elements that are eigen values
[rd, rindices] = sort(-1*latent); %Sort eigen values in decreasing order
latent = latent(rindices); %Reorder eigen values
pc = pc(:,rindices); %Reorder principal components (eigenvectors)
score = (pc' * Data)'; %Project the Data into PC space
```

```
load hald; %Load sample dataset in MATLAB
Data = ingredients';
%Fetch dimensions of the data (M Dimensions x N observations)
[M,N] = size(Data);
MeanofData = mean(Data,2); %Calculate Mean of the Data
Data = Data - repmat(MeanofData,1,N); %Subtract Mean from the Data
Y = Data' / sqrt(N-1); %Normalized Data
[u,S,pc] = svd(Y); %Perform SVD
S = diag(S); %Extract diagonal elements, the singular values
latent = S .* S; %Calculate the eigen values
score = (pc' * Data)'; %Project the Data into PC space
```

```
load hald; %Load sample dataset in MATLAB
[pc,score,latent,tsquare] = princomp(ingredients); %Perform PCA
```

Here PCA using SVD (Jolliffe (2002)) was performed on the angular-velocity matrix *V* of each subject:

$$V = U \Sigma S \tag{2}$$

where *U* is a 100-by-100 matrix with orthonormal columns so that $U^\top U = I_{100\times 100}$ (the 100-by-100 identity matrix); *S* is a 100-by-390 matrix with orthonormal rows so that $S S^\top = I_{100\times 100}$; and $\Sigma$ is a 100-by-100 diagonal matrix $\mathrm{diag}\{\lambda_1, \lambda_2, ..., \lambda_{100}\}$ with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{100} \ge 0$. Matrix *V* can be approximated by another matrix $\tilde{V}$ with reduced rank *m* by replacing $\Sigma$ with $\Sigma_m$, which contains only the *m* largest singular values, i.e., $\lambda_1, ..., \lambda_m$ (the other singular values are replaced by zeros). The approximation matrix $\tilde{V}$ can be written in a more compact form:

$$\tilde{V} = U_m \,\mathrm{diag}\{\lambda_1, \dots, \lambda_m\}\, S_m \tag{3}$$

where *Um* is a 100-by-*m* matrix containing the first *m* columns of *U* and *Sm* is an *m*-by-390 matrix containing the first *m* rows of *S*. Denoting *W* = *Um* diag{*λ*1, ..., *λm*}, we have

$$V \approx \tilde{V} = W S_m. \tag{4}$$

Then each row of *Sm* is called a *principal component* (PC), and *W* is called the weight matrix.
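For concreteness, here is a minimal MATLAB sketch of equations (2)-(4), assuming the angular-velocity matrix V from Section 3.4 is in the workspace (variable names are illustrative, not taken from the original analysis):

```
% Sketch: SVD-based synergy extraction, following equations (2)-(4).
[U, Sig, Q] = svd(V, 'econ');          % V = U*Sig*Q', so S in equation (2) is Q'
lambda = diag(Sig);                    % singular values, lambda_1 >= ... >= lambda_100
m = 6;                                 % number of synergies kept (chosen later via the variance index)
Sm = Q(:, 1:m)';                       % m-by-390 matrix of principal components (synergies)
W = U(:, 1:m) * diag(lambda(1:m));     % 100-by-m weight matrix
Vtilde = W * Sm;                       % rank-m approximation of V, equation (4)
```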

For easy comparison, let us name the elements of *Sm* in a way similar to (1):

$$S_m \equiv \begin{bmatrix} s_1^1(1) & \cdots & s_1^1(39) & \cdots & s_{10}^1(1) & \cdots & s_{10}^1(39) \\ \vdots & & \vdots & & \vdots & & \vdots \\ s_1^m(1) & \cdots & s_1^m(39) & \cdots & s_{10}^m(1) & \cdots & s_{10}^m(39) \end{bmatrix} \tag{5}$$

and name the elements of *W* in the following way:

$$W = \begin{bmatrix} w_1^1 & \cdots & w_m^1 \\ \vdots & & \vdots \\ w_1^g & \cdots & w_m^g \\ \vdots & & \vdots \\ w_1^{100} & \cdots & w_m^{100} \end{bmatrix} \tag{6}$$

According to (4), each row of *V* can be approximated by a linear combination of *m* PCs; combining (1), (4), (5), and (6), we have

$$v_i^g(t) \approx \sum_{j=1}^{m} w_j^g \, s_i^j(t) \tag{7}$$

for *i* = 1, ..., 10, *g* = 1, ..., 100, and *t* = 1, ..., 39.

Thus the above SVD procedure has found a solution to the synergy-extraction problem: The angular-velocity profiles (obtained by rearranging all joints row-wise for the PCs)

$$\begin{bmatrix} s_1^j(1) & \cdots & s_1^j(39) \\ s_2^j(1) & \cdots & s_2^j(39) \\ \vdots & & \vdots \\ s_{10}^j(1) & \cdots & s_{10}^j(39) \end{bmatrix}, \quad j = 1, \ldots, m$$

can be viewed as synergies. According to (4) or (7), these synergies can serve as "building blocks" to reconstruct joint-angular-velocity profiles of hand movements.
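Continuing the sketch above (and still assuming its variable names), each row of Sm can be rearranged into a 10-by-39 synergy, and a grasp can then be reconstructed from the synergies according to (7):

```
% Sketch: rearrange each PC into a 10-joint-by-39-sample synergy, then
% reconstruct one grasp as a weighted sum of synergies (equation (7)).
nJoints = 10; nSamples = 39;
synergy = cell(1, m);
for j = 1:m
    synergy{j} = reshape(Sm(j, :), nSamples, nJoints)';   % row i = joint i profile
end
g = 1;                                  % reconstruct the first rapid grasp
vg = zeros(nJoints, nSamples);
for j = 1:m
    vg = vg + W(g, j) * synergy{j};     % v_i^g(t) ~ sum_j w_j^g * s_i^j(t)
end
```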

To decide *m*, the number of PCs or synergies that we want to use in reconstruction of the testing movements, we consider the accuracy of approximation in (4) or (7). The approximation accuracy can be measured by an index defined as

$$\frac{\lambda\_1^2 + \lambda\_2^2 + \dots + \lambda\_m^2}{\lambda\_1^2 + \lambda\_2^2 + \dots + \lambda\_{100}^2}.$$

The larger this index is, the closer the approximation is. The index also indicates the fraction of the total variance of the data matrix accounted for by the PCs. To ensure a satisfactory approximation, the index should be greater than some threshold. In this study, we used 95% as the threshold (a commonly used threshold, Jolliffe (2002)) to determine the number of PCs or synergies (i.e., *m*). With this threshold we found that six synergies can account for 95% of the variance in the postures. Fig. 4 shows the six kinematic synergies obtained for subject 1 using PCA.
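A minimal sketch of this selection rule, reusing lambda from the SVD sketch above:

```
% Sketch: pick the smallest m whose cumulative squared singular values
% account for at least 95% of the total variance (the index defined above).
index = cumsum(lambda.^2) / sum(lambda.^2);
m = find(index >= 0.95, 1, 'first');
```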

Fig. 4. Six kinematic synergies obtained for subject 1 using PCA. Each synergy is about 0.45 s in duration (39 samples at 86 Hz). Abbreviations: T, thumb; I, index finger; M, middle finger; R, ring finger; P, pinky finger; MCP, metacarpophalangeal joint; IP, interphalangeal joint; PIP, proximal IP joint.

#### **4.2 Multidimensional scaling**

Classical multidimensional scaling (MDS) can still be grouped under the linear methods. It is introduced here to give the reader a different perspective on dimensionality reduction, with a slightly different analytical approach than the PCA discussed previously. PCA and MDS perform dimensionality reduction in different ways: PCA operates on a covariance matrix, whereas MDS operates on a distance matrix. In MDS, a Euclidean distance matrix is calculated from the original matrix; this is simply the matrix of pairwise distances between the rows (observations) of the input matrix. The method tries to preserve these pairwise distances in a low-dimensional space, thus performing dimensionality reduction while preserving the inherent structure of the data.

PCA and MDS were compared using a simple example in MATLAB below.

```
load hald; %Load sample dataset in MATLAB
[pc,score,latent,tsquare] = princomp(ingredients); %Perform PCA
D = pdist(ingredients); %Calculate pairwise distances between ingredients
[Y, e] = cmdscale(D); %Perform Classical MDS
```

In PCA, score represents the data projected into the PC space. Compare this to Y calculated in MDS: they are the same. Similarly, when the posture matrix *V* was used as input to MDS in place of the sample dataset, it yielded the same synergies as PCA. MDS is introduced here because we build upon this method for the nonlinear methods in the next section.
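As a quick check of this equivalence on the sample dataset (a sketch; absolute values are compared because the sign of each PCA/MDS axis is arbitrary):

```
% Sketch: confirm that the PCA scores and the classical MDS embedding of the
% same data match, up to the arbitrary sign of each axis.
nc = 2;                                            % compare the first two dimensions
err = max(max(abs(abs(score(:, 1:nc)) - abs(Y(:, 1:nc)))));
disp(err)                                          % expected to be near zero
```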

#### **5. Nonlinear dimensionality reduction methods**

So far, we have investigated the use of linear dimensionality reduction methods (PCA and MDS) in extracting synergies. In this section we use nonlinear dimensionality reduction methods for the same purpose. The motivation to explore nonlinear methods is that physiologists who study motor control have propounded that there are inherent nonlinearities in the human motor system. By using nonlinear methods we could probably achieve improved precision in the reconstruction of natural movements. The nonlinear methods applied in this chapter are Isomap, locally linear embedding (LLE), and kernel PCA (kPCA). The first two methods, Isomap and LLE, are built on the framework of the classical multidimensional scaling discussed in the previous section; kPCA is built on the framework of PCA.

#### **5.1 Isomap**

Isomap is similar to PCA and MDS. Although Isomap performs linear estimations in the data-point neighborhoods, the synergies extracted are nonlinear because these small neighborhoods are stitched together without trying to maintain linearity.
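To make this idea concrete, the sketch below outlines the standard Isomap recipe applied to the angular-velocity matrix V (illustrative only, not necessarily the implementation used in this chapter; the neighborhood size k is arbitrary and the neighborhood graph is assumed to be connected):

```
% Sketch of the standard Isomap recipe: neighborhood graph, geodesic (shortest-
% path) distances, then classical MDS on the geodesic distance matrix.
k = 7;                                   % illustrative neighborhood size
D = squareform(pdist(V));                % Euclidean distances between grasps
n = size(D, 1);
G = inf(n);                              % neighborhood-graph distances
G(1:n+1:end) = 0;                        % zero distance from a grasp to itself
[~, idx] = sort(D, 2);
for i = 1:n
    nb = idx(i, 2:k+1);                  % k nearest neighbors of grasp i
    G(i, nb) = D(i, nb);
    G(nb, i) = D(i, nb)';                % keep the graph symmetric
end
for p = 1:n                              % Floyd-Warshall shortest paths
    G = min(G, repmat(G(:, p), 1, n) + repmat(G(p, :), n, 1));
end
[Y, e] = cmdscale(G);                    % embed geodesic distances with classical MDS
```

Because the graph distances follow the data manifold rather than straight lines through the joint-velocity space, the resulting embedding can capture nonlinear structure that PCA and plain MDS would miss.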
