**1. Introduction**


When you have obtained measures on a large number of variables, there may exist redundancy in those variables. Redundancy means that some of the variables are correlated with one another, possibly because they are measuring the same "thing". Because of this redundancy, it should be possible to reduce the observed variables to a smaller number of variables. For example, if a group of variables is strongly correlated, you do not need all of them in your analysis but only one, since the evolution of the whole group can be predicted from that single variable. This raises the central issue of how to select or build a representative variable for each group of correlated variables.

The simplest solution is to keep one variable and discard all the others, but this is not reasonable. Another alternative is to combine the variables in some way, perhaps by taking a weighted average, in the spirit of the well-known Human Development Index published by the UNDP. However, such an approach raises the basic question of how to set the appropriate weights. If one has sufficient insight into the nature and magnitude of the interrelations among the variables, one might choose the weights using one's own judgment. Obviously, this introduces a certain amount of subjectivity into the analysis and may be questioned by practitioners. To overcome this shortcoming, another method is to let the data set itself uncover the relevant weights of the variables. Principal Components Analysis (PCA) is a variable reduction method that can be used to achieve this goal. Technically, this method delivers a relatively small set of synthetic variables, called principal components, that account for most of the variance in the original data set.
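To make the redundancy idea concrete, here is a minimal sketch in Python. The data are synthetic and purely illustrative (not taken from this Chapter): three noisy measurements of the same underlying quantity are strongly correlated, so a single composite, here an ad-hoc equal-weight average, carries almost all of their information.

```python
# Synthetic illustration of redundancy: three variables measuring one "thing".
import numpy as np

rng = np.random.default_rng(0)
n = 500
core = rng.normal(size=n)                 # the common "thing" being measured
x1 = core + 0.1 * rng.normal(size=n)      # three noisy measurements of it
x2 = core + 0.1 * rng.normal(size=n)
x3 = core + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# Ad-hoc composite: equal weights chosen by judgment, not by the data.
composite = X @ np.array([1 / 3, 1 / 3, 1 / 3])

# The three variables are nearly interchangeable with the composite:
# all pairwise correlations are close to 1.
print(np.corrcoef(np.column_stack([X, composite]), rowvar=False).round(2))
```

PCA replaces the arbitrary equal weights above with weights extracted from the data themselves, as described in the remainder of this Chapter.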

Introduced by Pearson (1901) and Hotelling (1933), Principal Components Analysis has become a popular data-processing and dimension-reduction technique, with numerous applications in engineering, biology, economics and social science. Today, PCA can be run in statistical software by students and professionals alike, but it is often poorly understood. The goal of this Chapter is to dispel the magic behind this statistical tool. The Chapter presents the basic intuitions for how and why principal component analysis works, and provides guidelines for interpreting the results. Mathematical details will be kept to a minimum. By the end of this Chapter, readers of all levels will have a better understanding of PCA as well as of when, why and how to apply this technique. They will be able to determine the number of meaningful components to retain, create factor scores and interpret the components. Emphasis is placed on examples explaining in detail the steps involved in implementing PCA in practice.


The variance and the standard deviation are important in data analysis because of their relationships to correlation and the normal curve. Correlation between a pair of variables measures the extent to which their values co-vary; the term covariance immediately comes to mind. There are numerous models for describing how two variables change together, such as linear, exponential and others; it is the linear correlation that is used in PCA. The linear correlation coefficient for two variables *x* and *y* is given by:

$$r(x,y)=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{(n-1)\,\sigma_x\,\sigma_y} \qquad (2)$$

where $\bar{x}$ and $\bar{y}$ are the sample means and $\sigma_x$ and $\sigma_y$ denote the standard deviations of *x* and *y*, respectively. This definition is the most widely-used type of correlation coefficient in statistics and is also called the Pearson correlation or product-moment correlation. Correlation coefficients lie between -1.00 and +1.00. A value of -1.00 represents a perfect negative correlation and a value of +1.00 a perfect positive correlation, while a value of 0.00 represents a lack of correlation. Correlation coefficients are used to assess the degree of collinearity or redundancy among variables. Notice that the value of the correlation coefficient does not depend on the specific measurement units used.
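As a quick check of equation (2), the coefficient can be computed directly from its definition and compared with NumPy's built-in routine; the two small data vectors below are made up for the example.

```python
# Pearson correlation computed from the definition in equation (2).
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
n = len(x)

# Sum of cross-deviations, divided by (n - 1) times both standard deviations.
r = ((x - x.mean()) * (y - y.mean())).sum() / (
    (n - 1) * x.std(ddof=1) * y.std(ddof=1)
)

print(round(r, 4))                          # 0.8
print(round(np.corrcoef(x, y)[0, 1], 4))    # 0.8 — same value

# Rescaling a variable (a change of measurement units) leaves r unchanged.
r_scaled = np.corrcoef(1000 * x, y)[0, 1]
```

Note the `ddof=1` argument: it makes NumPy use the sample standard deviation (dividing by n - 1), matching the denominator of equation (2).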

When correlations among several variables are computed, they are typically summarized in the form of a correlation matrix. For the five variables in Table 1, we obtain the results reported in Table 2.

|        | X1   | X2   | X3   | X4    | X5    |
|--------|------|------|------|-------|-------|
| **X1** | 1.00 | 0.94 | 0.77 | -0.03 | -0.08 |
| **X2** |      | 1.00 | 0.74 | 0.02  | -0.04 |
| **X3** |      |      | 1.00 | 0.21  | 0.19  |
| **X4** |      |      |      | 1.00  | 0.95  |
| **X5** |      |      |      |       | 1.00  |

Table 2. **Correlations among variables**

In this Table, the cell where a given row and column intersect shows the correlation between the two corresponding variables. For example, the correlation between variables X1 and X2 is 0.94. As can be seen from the correlations, the five variables seem to hang together in two distinct groups. First, notice that variables X1, X2 and X3 show relatively strong correlations with one another. This could be because they are measuring the same "thing". In the same way, variables X4 and X5 correlate strongly with each other, a possible indication that they measure the same "thing" as well. Notice that these two variables show very weak correlations with the rest of the variables.

Given that the 5 variables contain some "redundant" information, it is likely that they are not really measuring five different independent constructs, but rather two underlying constructs or factors. What are these factors? To what extent does each variable measure each of these factors? The purpose of PCA is to provide answers to these questions. Before presenting the mathematics of the method, let's see how PCA works with the data in Table 1.
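To preview the answer, the correlation matrix of Table 2 can be eigendecomposed directly. This is a hedged sketch: the raw Table 1 data are not reproduced here, only the reported upper triangle is used, with the lower triangle filled in by symmetry. The first two components absorb most of the total variance, and their loadings separate the {X1, X2, X3} group from the {X4, X5} group.

```python
# Eigendecomposition of the Table 2 correlation matrix.
import numpy as np

R = np.array([
    [ 1.00,  0.94,  0.77, -0.03, -0.08],
    [ 0.94,  1.00,  0.74,  0.02, -0.04],
    [ 0.77,  0.74,  1.00,  0.21,  0.19],
    [-0.03,  0.02,  0.21,  1.00,  0.95],
    [-0.08, -0.04,  0.19,  0.95,  1.00],
])

# eigh returns eigenvalues in ascending order; reverse for descending.
eigvals, eigvecs = np.linalg.eigh(R)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Share of the total variance carried by each component.
explained = eigvals / eigvals.sum()
print(explained.round(2))

# Loadings of the first two components: the first weights X1-X3 heavily,
# the second X4-X5, matching the two groups visible in Table 2.
print(eigvecs[:, :2].round(2))
```

Two eigenvalues dominate, confirming that two underlying factors suffice to summarize the five variables; the construction of these components is detailed in the sections that follow.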

We believe that a good understanding of this Chapter will facilitate that of the following chapters and of the novel extensions of PCA proposed in this book (sparse PCA, kernel PCA, multilinear PCA, …).
