#### **7. Random frameworks**

In feature selection, the random subspace method can improve performance by combining many classifiers, each of which corresponds to a different random feature subset. In this section, the random subspace method is applied to 2DPCA in various ways to improve its performance.

#### **7.1 Two-dimensional random subspace analysis (2DRSA)**

The main disadvantage of 2DPCA is that it needs many more coefficients for image representation than PCA. Many works have tried to solve this problem. In Yang et al. (2004), PCA is used after 2DPCA for further dimensionality reduction, but it remains unclear how the dimension of 2DPCA could be reduced directly. Several methods address this problem by applying a bilateral-projection scheme to 2DPCA. In Zhang & Zhou (2005); Zuo et al. (2005), the right- and left-multiplying projection matrices are calculated independently, while in Kong et al. (2005); Ye (2004) an iterative algorithm is applied to obtain the optimal solution for these projection matrices. A non-iterative optimization algorithm was proposed in Liu & Chen (2006). In Xu et al. (2004), an iterative procedure was proposed in which the right projection is calculated from the images reconstructed by the left projection, and the left projection is calculated from the images reconstructed by the right projection. Nevertheless, all of the above methods obtain only a locally optimal solution.

Another method for dealing with high-dimensional spaces was proposed in Ho (1998b), called the Random Subspace Method (RSM). This method is one of the ensemble classification methods, like Bagging (Breiman, 1996) and Boosting (Freund & Schapire, 1995). However, Bagging and Boosting do not reduce the high dimensionality: Bagging randomly selects a number of samples from the original training set to learn each individual classifier, while Boosting specifically weights each training sample. The RSM, in contrast, can effectively exploit the high dimensionality of the data. It constructs an ensemble of classifiers on independently selected feature subsets and combines them using a heuristic such as majority voting, the sum rule, etc.
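To make the RSM mechanics concrete, here is a minimal Python sketch; it is an assumed illustration, not the implementation of the cited works. Each base classifier is a nearest-neighbor rule restricted to an independently drawn subset of *r* feature dimensions, and the ensemble decides by majority vote. All names and defaults (20 classifiers, *r* equal to half the features) are assumptions.

```python
import numpy as np

def rsm_predict(X_train, y_train, x_test, n_classifiers=20, r=None, seed=0):
    """Illustrative Random Subspace Method with 1-NN base classifiers.

    X_train: (N, n_features) array; y_train: length-N labels;
    x_test: (n_features,) vector. Defaults are assumed, not from the paper.
    """
    rng = np.random.default_rng(seed)
    n_features = X_train.shape[1]
    r = r or n_features // 2                    # half the features (Ho, 1998b)
    votes = []
    for _ in range(n_classifiers):
        idx = rng.choice(n_features, size=r, replace=False)  # random subset
        dists = np.linalg.norm(X_train[:, idx] - x_test[idx], axis=1)
        votes.append(y_train[int(np.argmin(dists))])         # 1-NN decision
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]            # majority voting
```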

There are many reasons why the Random Subspace Method is suitable for the face recognition task. Firstly, this method can take advantage of high dimensionality while staying away from the curse of dimensionality (Ho, 1998b). Secondly, the random subspace method is useful for critical training sample sizes (Skurichina & Duin, 2002). Normally in face recognition, the dimension of the feature is extremely large compared to the available number of training samples, so applying RSM can avoid both the curse of dimensionality and the SSS problem. Thirdly, the nearest neighbor classifier, a popular choice in the 2D face-recognition domain (Kong et al., 2005; Liu & Chen, 2006; Yang et al., 2004; Ye, 2004; Zhang & Zhou, 2005; Zuo et al., 2005), can be very sensitive to the sparsity of the high-dimensional space: its accuracy is often far from optimal because of the lack of samples in that space, and the RSM brings significant performance improvements compared to a single classifier (Ho, 1998a; Skurichina & Duin, 2002). Finally, since there is no hill climbing in RSM, there is no danger of being trapped in local optima (Ho, 1998b).

The RSM was applied to PCA for face recognition in Chawla & Bowyer (2005), where the random selection is applied directly to the PCA feature vector to construct the multiple subspaces. Nevertheless, the information contained in each element of the PCA feature vector is not equivalent: the elements corresponding to larger eigenvalues contain more useful information. Therefore, applying RSM directly to the PCA feature vector is seldom appropriate.


*S*1: Project image, **A**, by Eq. (10).
*S*2: For *i* = 1 to the number of classifiers
*S*3: Randomly select an *r*-dimensional random subspace, **Z***<sup>r</sup><sub>i</sub>*, from **Y** (*r* < *m*).
*S*4: Construct the nearest neighbor classifier, **C***<sup>r</sup><sub>i</sub>*.
*S*5: End For
*S*6: Combine the outputs of all classifiers by majority voting.

Table 4. Two-Dimensional Random Subspace Analysis Algorithm

Different from PCA, the 2DPCA feature is in matrix form. Thus, RSM is more suitable for 2DPCA, because the elements along the column direction (i.e., across the rows of the feature matrix) are not ordered by eigenvalue.

A framework of Two-Dimensional Random Subspace Analysis (2DRSA) (Sanguansat et al., n.d.) was proposed to extend the original 2DPCA. The RSM is applied to the feature space of 2DPCA to generate a vast number of feature subspaces, which are constructed by an autonomous, pseudorandom procedure that selects a small number of dimensions from the original feature space. For an *m* by *n* feature matrix, there are 2*<sup>m</sup>* such selections that can be made, and with each selection a feature subspace can be constructed. Individual classifiers are then created based only on the attributes in the chosen feature subspace. The outputs of the different individual classifiers are combined by uniform majority voting to give the final prediction.

The Two-Dimensional Random Subspace Analysis consists of two parts, 2DPCA and RSM. After the data samples are projected into the 2D feature space via 2DPCA, the RSM is applied, taking advantage of the high dimensionality of this space to obtain multiple lower-dimensional subspaces. A classifier is then constructed on each of those subspaces, and a combination rule is applied at the end for prediction on the test sample. The 2DRSA algorithm is listed in Table 4. The image matrix, **A**, is projected into the feature space by the 2DPCA projection in Eq. (10). This feature space contains the data samples in matrix form, the *m* × *d* feature matrix **Y** in Eq. (10). The dimensions of the feature matrix **Y** depend on the height of the image (*m*) and the number of selected eigenvectors of the image covariance matrix **G** (*d*). Therefore, the information embedded in each element is sorted by eigenvalue only along the row direction, not along the column direction. This means the method should randomly pick some rows of the feature matrix **Y** to construct the new feature matrix **Z**. The dimension of **Z** is *r* × *d*, where *r* should normally be less than *m*. The results in Ho (1998b) have shown that, for a variety of data sets, adopting half of the feature components usually yields good performance.
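The following Python sketch follows the steps of Table 4 under stated assumptions: images arrive as a NumPy array of shape (N, *m*, *n*) with integer class labels, nearest neighbors are found with the Frobenius norm, and all function names and defaults are illustrative, not the authors' reference implementation.

```python
import numpy as np

def two_dpca_basis(train_imgs, d):
    """d leading eigenvectors of the image covariance matrix G (cf. Eq. (10))."""
    mean = train_imgs.mean(axis=0)
    G = np.zeros((train_imgs.shape[2],) * 2)
    for A in train_imgs:
        G += (A - mean).T @ (A - mean)
    G /= len(train_imgs)
    _, eigvecs = np.linalg.eigh(G)            # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :d]            # n x d matrix of leading eigenvectors

def two_drsa_predict(train_imgs, train_labels, test_img,
                     d=10, n_classifiers=20, r=None, seed=0):
    """2DRSA sketch (Table 4): random row subsets of the 2DPCA feature
    matrix Y, one nearest-neighbor classifier per subset, majority voting."""
    X = two_dpca_basis(train_imgs, d)
    train_Y = [A @ X for A in train_imgs]     # S1: m x d feature matrices
    test_Y = test_img @ X
    m = test_Y.shape[0]
    r = r or m // 2                           # half of the rows (Ho, 1998b)
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_classifiers):            # S2-S5
        rows = rng.choice(m, size=r, replace=False)          # S3: subspace Z_i
        dists = [np.linalg.norm(Y[rows] - test_Y[rows]) for Y in train_Y]
        votes.append(train_labels[int(np.argmin(dists))])    # S4: 1-NN
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]          # S6: majority voting
```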

#### **7.2 Two-dimensional diagonal random subspace analysis (2D<sup>2</sup>RSA)**

An extension of 2DRSA was proposed in Sanguansat et al. (2007b), namely Two-Dimensional Diagonal Random Subspace Analysis. It consists of two parts, DiaPCA and RSM. Firstly, all images are transformed into diagonal face images as in Section 6.1. After the transformed image samples are projected into the 2D feature space via DiaPCA, the RSM is applied, taking advantage of the high dimensionality of this space to obtain multiple lower-dimensional subspaces. A classifier is then constructed on each of those subspaces, and a combination rule is applied at the end for prediction on the test sample. Similar to 2DRSA, the 2D<sup>2</sup>RSA algorithm is listed in Table 5.
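Since 2D<sup>2</sup>RSA only prepends a diagonal-image transformation (step *S*1 of Table 5 below) to the 2DRSA pipeline, little is needed beyond the previous sketch. The diagonal-face construction here is one plausible reading of Section 6.1 (cyclically shifting each row); treat it as an assumption, not the exact transform of Sanguansat et al. (2007b).

```python
import numpy as np

def diagonal_image(A):
    """Assumed diagonal-face transform (cf. Section 6.1): row i of the
    output is row i of A cyclically shifted left by i pixels, so every
    row of the diagonal image mixes several columns of the original."""
    return np.stack([np.roll(A[i], -i) for i in range(A.shape[0])])

# 2D^2RSA per Table 5: S1 transforms every image into its diagonal face,
# then steps S2-S7 coincide with 2DRSA (reusing the sketch above):
# label = two_drsa_predict(
#     np.stack([diagonal_image(A) for A in train_imgs]), train_labels,
#     diagonal_image(test_img))
```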


*S*1: Transform images into diagonal images.
*S*2: Project image, **A**, by Eq. (10).
*S*3: For *i* = 1 to the number of classifiers
*S*4: Randomly select an *r*-dimensional random subspace, **Z***<sup>r</sup><sub>i</sub>*, from **Y** (*r* < *m*).
*S*5: Construct the nearest neighbor classifier, **C***<sup>r</sup><sub>i</sub>*.
*S*6: End For
*S*7: Combine the outputs of all classifiers by majority voting.

Table 5. Two-Dimensional Diagonal Random Subspace Analysis Algorithm.

#### **7.3 Random subspace method-based image cross-covariance analysis**

As discussed in Section 6.2, not all elements of the covariance matrix are used in 2DPCA. Although the image cross-covariance matrix can switch among these elements to formulate many versions of the image cross-covariance matrix, a fraction (*m* − 1)/*m* of the elements of the covariance matrix is still not considered at the same time. To integrate this information, the Random Subspace Method (RSM) can be used here by randomly selecting the shift number *L* to construct a set of multiple subspaces. This means each subspace is formulated from a different version of the image cross-covariance matrix. Individual classifiers are then created based only on the attributes in the chosen feature subspace, and the outputs of the different individual classifiers are combined by uniform majority voting to give the final prediction. Moreover, the RSM can be used again to construct subspaces corresponding to different numbers of basis vectors *d*. Consequently, the number of all random subspaces of ICCA reaches *d* × *L*. This means that applying the RSM to ICCA can construct more subspaces than 2DRSA. As a result, the RSM-based ICCA can alternatively be regarded as a generalized 2DRSA.
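A hedged sketch of the RSM-based ICCA ensemble follows. The projection itself depends on the image cross-covariance construction of Section 6.2, which is not reproduced in this section, so `icca_basis` is a hypothetical placeholder; only the randomization over the shift *L* and the basis count *d*, and the voting scheme, are taken from the text above.

```python
import numpy as np

def icca_basis(train_imgs, L, d):
    """Hypothetical placeholder: an n x d basis built from the L-shifted
    image cross-covariance matrix. The real construction is Section 6.2's."""
    raise NotImplementedError("see Section 6.2 for the ICCA construction")

def rsm_icca_predict(train_imgs, train_labels, test_img,
                     n_classifiers=20, max_shift=None, max_d=None, seed=0):
    """RSM-based ICCA sketch: each classifier draws a random shift L and a
    random basis count d, giving up to d x L distinct subspaces; the final
    label is the majority vote over all 1-NN decisions."""
    m, n = test_img.shape
    max_shift = max_shift or m - 1
    max_d = max_d or n
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_classifiers):
        L = int(rng.integers(1, max_shift + 1))   # random cross-covariance version
        d = int(rng.integers(1, max_d + 1))       # random number of basis vectors
        X = icca_basis(train_imgs, L, d)          # hypothetical (Section 6.2)
        test_Y = test_img @ X
        dists = [np.linalg.norm(A @ X - test_Y) for A in train_imgs]
        votes.append(train_labels[int(np.argmin(dists))])
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]
```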

#### **8. Conclusions**

This chapter has presented extensions of 2DPCA in several frameworks, i.e. bilateral-projection, kernel, supervised, alignment-based and random approaches. All of these methods can improve the performance of traditional 2DPCA for the image recognition task. The bilateral projection obtains the smallest feature matrix compared to the others. Class information can be embedded in the projection matrix by the supervised frameworks, which means the discriminant power should increase. The alternate alignment of pixels in the image can reveal latent information that is useful for the classifier. The kernel-based 2DPCA can achieve the highest performance, but appropriate kernel parameters and a large amount of memory are required to manipulate the kernel matrix, while the random subspace method is good for robustness.

#### **9. References**


Belhumeur, P. N., Hespanha, J. P. & Kriegman, D. J. (1997). Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection, *IEEE Trans. Pattern Anal. and Mach. Intell.* 19: 711–720.

Breiman, L. (1996). Bagging predictors, *Machine Learning* 24(2): 123–140.

Chawla, N. V. & Bowyer, K. (2005). Random subspaces and subsampling for 2D face recognition, *Computer Vision and Pattern Recognition*, Vol. 2, pp. 582–589.

Chen, L., Liao, H., Ko, M., Lin, J. & Yu, G. (2000). A new LDA based face recognition system which can solve the small sample size problem, *Pattern Recognition* 33(10): 1713–1726.

Freund, Y. & Schapire, R. E. (1995). A decision-theoretic generalization of on-line learning and an application to boosting, *European Conference on Computational Learning Theory*, pp. 23–37.

Fukunaga, K. (1990). *Introduction to Statistical Pattern Recognition*, second edn, Academic Press.

Ho, T. K. (1998a). Nearest neighbors in random subspaces, *Proceedings of the 2nd Int'l Workshop on Statistical Techniques in Pattern Recognition*, Sydney, Australia, pp. 640–648.

Ho, T. K. (1998b). The random subspace method for constructing decision forests, *IEEE Trans. Pattern Anal. and Mach. Intell.* 20(8): 832–844.

Huang, R., Liu, Q., Lu, H. & Ma, S. (2002). Solving the small sample size problem of LDA, *Pattern Recognition* 3: 29–32.

Kong, H., Li, X., Wang, L., Teoh, E. K., Wang, J.-G. & Venkateswarlu, R. (2005). Generalized 2D principal component analysis, *IEEE International Joint Conference on Neural Networks (IJCNN)* 1: 108–113.

Liu, J. & Chen, S. (2006). Non-iterative generalized low rank approximation of matrices, *Pattern Recognition Letters* 27: 1002–1008.

Liu, J., Chen, S., Zhou, Z.-H. & Tan, X. (2010). Generalized low-rank approximations of matrices revisited, *Neural Networks, IEEE Transactions on* 21(4): 621–632.

Lu, J., Plataniotis, K. N. & Venetsanopoulos, A. N. (2003). Regularized discriminant analysis for the small sample size problem in face recognition, *Pattern Recogn. Lett.* 24(16): 3079–3087.

Nguyen, N., Liu, W. & Venkatesh, S. (2007). Random subspace two-dimensional PCA for face recognition, *Proceedings of the 8th Pacific Rim Conference on Advances in Multimedia Information Processing*, PCM'07, Springer-Verlag, Berlin, Heidelberg, pp. 655–664. URL: *http://portal.acm.org/citation.cfm?id=1779459.1779555*

Sanguansat, P. (2008). 2DPCA feature selection using mutual information, *Computer and Electrical Engineering, 2008. ICCEE 2008. International Conference on*, pp. 578–581.

Sanguansat, P., Asdornwised, W., Jitapunkul, S. & Marukatat, S. (2006a). Class-specific subspace-based two-dimensional principal component analysis for face recognition, *International Conference on Pattern Recognition*, Vol. 2, Hong Kong, China, pp. 1246–1249.

Sanguansat, P., Asdornwised, W., Jitapunkul, S. & Marukatat, S. (2006b). Two-dimensional linear discriminant analysis of principle component vectors for face recognition, *IEICE Trans. Inf. & Syst. Special Section on Machine Vision Applications* E89-D(7): 2164–2170.

Sanguansat, P., Asdornwised, W., Jitapunkul, S. & Marukatat, S. (2006c). Two-dimensional linear discriminant analysis of principle component vectors for face recognition, *IEEE International Conference on Acoustics, Speech, and Signal Processing*, Vol. 2, Toulouse, France, pp. 345–348.

Sanguansat, P., Asdornwised, W., Jitapunkul, S. & Marukatat, S. (2007a). Image cross-covariance analysis for face recognition, *IEEE Region 10 Conference on Convergent Technologies for the Asia-Pacific*, Taipei, Taiwan.
**2**

**Application of Principal Component Analysis to Elucidate Experimental and Theoretical Information**

Cuauhtémoc Araujo-Andrade et al.\*

*Unidad Académica de Física, Universidad Autónoma de Zacatecas, México*

**1. Introduction**

Principal Component Analysis has been widely used in different scientific areas and for different purposes. The versatility and potentialities of this unsupervised method for data analysis have allowed the scientific community to explore its applications in different fields. Even though the principles of PCA are the same as far as algorithms and fundamentals are concerned, the strategies employed to elucidate information from a specific data set (experimental and/or theoretical) mainly depend on the expertise and needs of each researcher.

In this chapter, we will describe how PCA has been used in three different theoretical and experimental applications to explain the relevant information in the data sets. These applications provide a broad overview of the versatility of PCA in data analysis and interpretation. Our main goal is to give an outline of the capabilities and strengths of PCA to elucidate specific information. The examples reported include the analysis of matured distilled beverages, the determination of heavy metals attached to bacterial surfaces and the interpretation of quantum chemical calculations. They were chosen as representative examples of the application of three different approaches to data analysis: the influence of data pre-treatments on the scores and loadings values; the use of specific optical, chemical and/or physical properties to qualitatively discriminate samples; and the use of spatial orientations to group conformers, correlating structures and relative energies. This fully justifies their selection as case studies. This chapter also aims to be a reference for those researchers who, not being in the field, may use these methodologies to take the maximum advantage of their experimental results.

\* Claudio Frausto-Reyes<sup>2</sup>, Esteban Gerbino<sup>3</sup>, Pablo Mobili<sup>3</sup>, Elizabeth Tymczyszyn<sup>3</sup>, Edgar L. Esparza-Ibarra<sup>1</sup>, Rumen Ivanov-Tsonchev<sup>1</sup> and Andrea Gómez-Zavaglia<sup>3</sup>
*<sup>1</sup>Unidad Académica de Física, Universidad Autónoma de Zacatecas, <sup>2</sup>Centro de Investigaciones en Óptica, A.C. Unidad Aguascalientes, <sup>3</sup>Centro de Investigación y Desarrollo en Criotecnología de Alimentos (CIDCA), <sup>1,2</sup>México, <sup>3</sup>Argentina*
