**5.3 Cumulative percent of total variance accounted for**

When determining the number of meaningful components, remember that the subspace of retained components must account for a reasonable share of the variance in the data. It is customary to express the eigenvalues as percentages of their total. The ratio of an eigenvalue to the sum of all eigenvalues is the proportion of variance accounted for by the corresponding principal component. The cumulative percent of variance explained by the first *q* components is calculated with the formula:

$$r_q = \frac{\sum_{j=1}^{q} \lambda_j}{\sum_{j=1}^{p} \lambda_j} \times 100 \tag{27}$$
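As an illustration of equation (27), the following minimal sketch computes the cumulative percent of variance for every *q* from a list of eigenvalues. The function name `cumulative_percent_variance` and the eigenvalues used are hypothetical and chosen only for the example; they do not come from the dataset analysed in this chapter.

```python
from itertools import accumulate


def cumulative_percent_variance(eigenvalues):
    """Return the cumulative percent of variance r_q for q = 1, ..., p.

    Eigenvalues are sorted in decreasing order first, so that the q-th
    entry of the result corresponds to the first q principal components.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("the sum of the eigenvalues must be positive")
    # Running sum of the eigenvalues divided by their total, times 100.
    return [100.0 * partial / total for partial in accumulate(ordered)]


# Hypothetical eigenvalues of a correlation matrix with p = 5 variables
# (illustrative values only, not taken from the chapter's data).
eigenvalues = [2.5, 1.5, 0.6, 0.3, 0.1]

for q, r_q in enumerate(cumulative_percent_variance(eigenvalues), start=1):
    print(f"first {q} component(s): {r_q:.1f}% of total variance")
```

With these illustrative values, a practitioner who wants the retained components to cover a chosen share of the total variance would pick the smallest *q* whose cumulative percent reaches that share.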


**6. Interpretation of principal components**

Running a PCA has become easy with statistical software. However, interpreting the results can be a difficult task. Here are a few guidelines that should help practitioners through the analysis.

**6.1 The visual approach of correlation**

Once the analysis is complete, we wish to assign a name to each retained component that describes its content. To do this, we need to know which variables explain the components. The correlations of the variables with the principal components are useful tools that help interpret the meaning of the components. The correlations between each variable and each principal component are given in Table 4.

| Variables | PCA 1 | PCA 2 |
|-----------|-------|--------|
| X1 | 0.943 | -0.241 |
| X2 | 0.939 | -0.196 |
| X3 | 0.902 | 0.064 |
| X4 | 0.206 | 0.963 |
| X5 | 0.159 | 0.975 |

Notes: PCA1 and PCA2 denote the first and second principal component, respectively.

Table 4. **Correlation variable-component**

Those correlations are also known as component loadings. A coefficient greater than 0.4 in absolute value is considered significant (see Stevens (1986) for a discussion). We can interpret PCA1 as being highly positively correlated with variables X1, X2 and X3, and weakly positively correlated with variables X4 and X5. So X1, X2 and X3 are the most important variables in the first principal component. PCA2, on the other hand, is highly positively correlated with X4 and X5, and weakly negatively correlated with X1 and X2. So X4 and X5 are the most important variables in explaining the second principal component. Therefore, the name of the first component comes from variables X1, X2 and X3, while that of the second component comes from X4 and X5.

It can be shown that the coordinate of a variable on a component is the correlation coefficient between that variable and the principal component. This allows us to plot the reduced-dimension representation of the variables in the plane constructed from the first two components. A variable that is highly correlated with a component forms a small angle with that component's axis. Figure 2 represents this graph for our dataset. For each variable we have plotted its loading on component 1 on the horizontal dimension and its loading on component 2 on the vertical dimension.

The graph also presents a visual aspect of correlation patterns among variables. The cosine of the angle θ between two vectors *x* and *y* is computed as:

$$\cos(x, y) = \frac{\langle x, y \rangle}{\|x\| \, \|y\|} \tag{28}$$
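The reading of the loadings and of the angles between variables can also be checked numerically. The sketch below takes the loadings reported in Table 4, flags those exceeding the 0.4 threshold in absolute value, and computes the cosine of the angle between pairs of variables in the plane of the first two components. The helper names `salient_variables` and `cosine` are hypothetical. The cosine computed in this two-dimensional plane should be read only as an approximation of the correlation between the original variables.

```python
import math

# Loadings (variable-component correlations) as reported in Table 4:
# (loading on PCA 1, loading on PCA 2).
loadings = {
    "X1": (0.943, -0.241),
    "X2": (0.939, -0.196),
    "X3": (0.902, 0.064),
    "X4": (0.206, 0.963),
    "X5": (0.159, 0.975),
}

THRESHOLD = 0.4  # rule of thumb quoted in the text (Stevens, 1986)


def salient_variables(component):
    """Names of variables whose loading on `component` (0 or 1) exceeds the threshold."""
    return [name for name, coords in loadings.items()
            if abs(coords[component]) > THRESHOLD]


def cosine(u, v):
    """Cosine of the angle between two vectors: <u, v> / (||u|| ||v||)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


print("PCA 1 is named from:", salient_variables(0))
print("PCA 2 is named from:", salient_variables(1))

# Variables in the same group point in nearly the same direction,
# variables from different groups are nearly orthogonal in this plane.
print(f"cos(X1, X2) = {cosine(loadings['X1'], loadings['X2']):.3f}")
print(f"cos(X1, X4) = {cosine(loadings['X1'], loadings['X4']):.3f}")
```

A cosine close to 1 corresponds to a small angle between two variables on the plot, while a cosine close to 0 corresponds to variables drawn at roughly a right angle.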
