It can be seen that the eigenvalue for component 1 is 2.653, while the eigenvalue for component 2 is 1.980. This means that the first component accounts for 2.653 units of total variance, while the second component accounts for 1.980 units. The third component accounts for about 0.27 units of variance. Note that the sum of the eigenvalues is 5, which is also the number of variables. How do we determine how many components are worth interpreting?

Table 3. **Eigenvalues from PCA**

| Component | Eigenvalue | % of variance | Cumulative % |
|---|---|---|---|
| 1 | 2.653 | 53.057 | 53.057 |
| 2 | 1.980 | 39.597 | 92.653 |
| 3 | 0.269 | 5.375 | 98.028 |
| 4 | 0.055 | 1.095 | 99.123 |
| 5 | 0.044 | 0.877 | 100.000 |

Several criteria have been proposed for determining how many meaningful components should be retained for interpretation. This section will describe three criteria: the Kaiser eigenvalue-one criterion, the Cattell scree test, and the cumulative percent of variance accounted for.

**5.1 Kaiser method**

The Kaiser (1960) method provides a handy rule of thumb that can be used to retain meaningful components. This rule suggests keeping only components with eigenvalues greater than 1. This method is also known as the eigenvalue-one criterion. The rationale for this criterion is straightforward. Each observed variable contributes one unit of variance to the total variance in the data set. Any component that displays an eigenvalue greater than 1 accounts for a greater amount of variance than does any single variable. Such a component is therefore accounting for a meaningful amount of variance, and is worthy of being retained. On the other hand, a component with an eigenvalue of less than 1 accounts for less variance than does one variable. The purpose of principal component analysis is to reduce variables into a relatively smaller number of components; this cannot be effectively achieved if we retain components that account for less variance than do individual variables. For this reason, components with eigenvalues less than 1 are of little use and are not retained. When a covariance matrix is used, this criterion retains components whose eigenvalue is greater than the average variance of the data (Kaiser-Guttman criterion).

However, this method can lead to retaining the wrong number of components under circumstances that are often encountered in research. The thoughtless application of this rule can lead to errors of interpretation when differences in the eigenvalues of successive components are trivial. For example, if component 2 displays an eigenvalue of 1.01 and component 3 displays an eigenvalue of 0.99, then component 2 will be retained but component 3 will not; this may mislead us into believing that the third component is meaningless when, in fact, it accounts for almost exactly the same amount of variance as the second component. Statistical tests can be used to test for differences between successive eigenvalues. In fact, the Kaiser criterion ignores error associated with each

**5.2 Cattell scree test**

The scree test is another device for determining the appropriate number of components to retain. First, the eigenvalues are plotted against the component number. As eigenvalues are constrained to decrease monotonically from the first principal component to the last, the scree plot shows the decreasing rate at which variance is explained by additional principal components. To choose the number of meaningful components, we next look at the scree plot and stop at the point where it begins to level off (Cattell, 1966; Horn, 1965). The components that appear *before* the "break" are assumed to be meaningful and are retained for interpretation; those appearing *after* the break are assumed to be unimportant and are not retained. Between the components before and after the break lies a scree.

The scree plot of eigenvalues derived from Table 3 is displayed in Figure 1. The component numbers are listed on the horizontal axis, while eigenvalues are listed on the vertical axis. The figure shows a relatively large break between components 2 and 3, after which each successive component accounts for smaller and smaller amounts of the total variance. This agrees with the preceding conclusion that two principal components provide a reasonable summary of the data, accounting for about 93% of the total variance.
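The break between components 2 and 3 can also be checked numerically. The sketch below is only a rough stand-in for reading the plot by eye: it takes the eigenvalues from Table 3, computes the drop between each pair of neighbouring components, and reports how many components lie before the largest drop. The helper names `successive_drops` and `components_before_largest_drop` are hypothetical, and the "largest drop" rule is an assumption made for illustration, not Cattell's procedure.

```python
# Eigenvalues copied from Table 3.
eigenvalues = [2.653, 1.980, 0.269, 0.055, 0.044]


def successive_drops(values):
    """Return the drop in eigenvalue between each pair of neighbouring components."""
    return [a - b for a, b in zip(values, values[1:])]


def components_before_largest_drop(values):
    """Hypothetical helper: count the components that precede the largest drop.

    This is a crude numerical proxy for the visual scree test,
    not Cattell's own procedure.
    """
    drops = successive_drops(values)
    return drops.index(max(drops)) + 1


drops = successive_drops(eigenvalues)
n_retained = components_before_largest_drop(eigenvalues)

for k, d in enumerate(drops, start=1):
    print(f"components {k} -> {k + 1}: drop = {d:.3f}")
print("components before the largest drop:", n_retained)
```

For many data sets the largest drop falls right after the first component, so a rule like this should be read alongside the plot and not in place of it.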

Sometimes a scree plot will display a pattern in which it is difficult to determine exactly where a break exists. When this happens, the scree plot must be supplemented with additional criteria, such as the Kaiser method or the cumulative percent of variance accounted for.
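As a minimal sketch of how these two supplementary criteria could be applied to the eigenvalues in Table 3, the code below counts the components with eigenvalues greater than 1 and finds the smallest number of components whose cumulative share of variance reaches a target. The function names `kaiser_count` and `cumulative_count` are hypothetical, and the 90% target is an assumption chosen for illustration; the text does not prescribe a threshold. The cutoff of 1 applies when the analysis is based on a correlation matrix.

```python
# Eigenvalues copied from Table 3.
eigenvalues = [2.653, 1.980, 0.269, 0.055, 0.044]


def kaiser_count(values, cutoff=1.0):
    """Count components whose eigenvalue exceeds the cutoff (1 for a correlation matrix)."""
    return sum(1 for v in values if v > cutoff)


def cumulative_count(values, target=0.90):
    """Smallest number of leading components whose share of total variance reaches target.

    The default target of 0.90 is an illustrative assumption.
    """
    total = sum(values)
    running = 0.0
    for k, v in enumerate(values, start=1):
        running += v
        if running / total >= target:
            return k
    return len(values)


n_kaiser = kaiser_count(eigenvalues)
n_cumulative = cumulative_count(eigenvalues)

print("Kaiser criterion retains:", n_kaiser)
print("90% cumulative variance retains:", n_cumulative)
```

With the Table 3 eigenvalues both criteria point to two components, which matches the reading of the scree plot; raising the target to 95% would add a third component, so the chosen threshold matters.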
