The difference between these classification methods is that TWINSPAN creates groups and also finds indicator species for those groups, while cluster analysis requires a before-the-fact assignment of group membership as input. In this case, hierarchical clustering will be used to identify groups for vegetation classification. TWINSPAN produces no graphical output. The bulk of the results is the description of each division. For each division, TWINSPAN identifies the indicator pseudo-species and their signs (positive or negative, for one end of the ordination or the other) and lists the samples assigned to each subgroup. Two popular agglomerative polythetic techniques are Group Average and Flexible. McCune et al. (2002) recommend Ward's method in addition. Gauch (1982a) preferred to use divisive polythetic techniques such as TWINSPAN.

TWINSPAN is a program for classifying species and samples, producing an ordered two-way table of their occurrence. The process of classification is hierarchical; samples are successively divided into categories, and species are then divided into categories on the basis of the sample classification. TWINSPAN, like DECORANA, has been widely used by ecologists.

This method works with qualitative data only. In order not to lose the information about species abundances, the concepts of pseudo-species and pseudo-species cut levels were introduced. Each species can be represented by several pseudo-species, depending on its quantity in the sample. A pseudo-species is present if the species quantity exceeds the corresponding cut level.

For example, TWINSPAN was performed for vegetation analysis in 270 plots using the ordinal scale of van der Maarel (1979). The end of the results file is the ordered two-way table summarizing the classification (Fig. 3). The table has species (not pseudo-species) as rows and samples as columns. The results of the TWINSPAN classification are presented in Fig. 4. According to this table and figure, and the eigenvalue of each division, the vegetation of the study area was classified into six main types. Each type differs from the others in terms of its environmental needs. These types are as follows:

1. *Artemisia sieberi-Eurotia ceratoides*
2. *Artemisia aucheri-Astragalus spp.-Bromus tomentellus*
3. *Artemisia sieberi-Zygophyllum eurypterum*
4. *Zygophyllum eurypterum-Artemisia sieberi*
5. *Seidlitzia rosmarinus*
6. *Halocnemum strobilaceum*

**Figure 3.** TWINSPAN of the vegetation cover in 270 quadrats and 9 species

**4. Methods of ordination analysis**

Ordination serves to summarize community data (such as species abundance data) by producing a low-dimensional ordination space in which similar species and samples are plotted close together, and dissimilar species and samples are placed far apart (Peet, 1980).

Classification and Ordination Methods as a Tool for Analyzing of Plant Communities 231


Ordination methods can be divided into two main groups: direct and indirect methods. Direct methods use species and environment data in a single, integrated analysis. Indirect methods use the species data only (Fig. 5). Ordination techniques are used to describe relationships between species composition patterns and the underlying environmental gradients that influence these patterns. Although community ecology is a fairly young science, the application of quantitative methods began fairly early (McIntosh, 1985).

**Figure 5.** Schematic comparison of Ordination techniques

230 Multivariate Analysis in Management, Engineering and the Sciences

**Figure 4.** Schematic comparison of Ordination techniques

Informal ordination techniques for vegetation began to be used in 1930. Such informal and largely subjective methods became widespread in the early 1950s (Whittaker 1967). In 1951, Curtis and McIntosh developed the 'continuum index', which later led to conceptual links between species responses to gradients and multivariate methods. Shortly thereafter, Goodall (1954) introduced the term 'ordination' in an ecological context for Principal Components Analysis.

Each method was applied to data from the north-east of Semnan, Iran. If the objective of a study is to examine the distribution patterns of six plant types in the rangelands, ordination can be used to determine which species are commonly found in association with one another, and how the species composition of the community changes as each environmental factor increases or decreases (Zare Chahouki et al., 2010). The objective of this method was to establish a monitoring system that may serve to identify and predict future vegetation changes and to assess the impacts of conservation and management practices.

There are several ordination techniques, which differ slightly in the mathematical approach used to calculate species and sample similarity/dissimilarity. Rather than reinventing the wheel by discussing each of these techniques in detail, we offer only a brief description of the most commonly used methods here; our example study illustrates the most frequent use of ordination methods in community ecology. Further details can be found in the following.

### **Polar Ordination (PO)**

Bray and Curtis (1957) developed polar ordination, which became the first widely used ordination technique in ecology.

Polar ordination arranges samples with respect to poles (also termed endpoints or reference points) according to a distance matrix (Bray and Curtis 1957). These endpoints are two samples with the highest ecological distance between them, or two samples suspected of being at opposite ends of an important gradient. This method is especially useful for investigating ecological change (e.g., succession, recovery).

For example, Fig. 6 shows the ordination diagram for vegetation types and soil variables obtained by Bray-Curtis analysis.

The endpoints for axis 1 were *Halocnemum strobilaceum* and *Artemisia aucheri-Astragalus spp.-Bromus tomentellus*. Distances (ordination scores) are measured from *Halocnemum strobilaceum*. The sum of squares of non-redundant distances in the original matrix was 0.199621E+12. Axis 1 extracted 100.00% of the original distance matrix, and the sum of squares of residual distances remaining was 0.672048E+05. The regression coefficient for this axis was -6.40, and the variance in distances from the first endpoint was 0.65.

The endpoints for axis 2 were *Artemisia sieberi-Zygophyllum eurypterum* and *Ar.au-As.spp-Br.to*. Distances (ordination scores) are measured from *Artemisia sieberi-Zygophyllum eurypterum*. The regression coefficient for this axis was -3.53, and the variance in distances from the first endpoint was 0.0.

Axis 2 extracted 1.87% of the original distance matrix; the cumulative value was 98.15%. The sum of squares of residual distances remaining was 0.948501E-01.

Polar ordination has strengths and weaknesses. Its advantages are (Beals 1984):

1. It is a simple, easy-to-understand geometric method that is easily taught.
2. It is ideal for evaluating problems with discrete endpoints, and for testing specific hypotheses (e.g., reference condition or experimental design) by subjectively selecting the endpoints.

Its weaknesses are (Beals 1984):

1. Axes are not orthogonal. With large data sets, it may be difficult to get a consistent ordination.
2. It is not completely objective, so it will not always give the same answer. However, this is a function of the decision regarding reference stands, and really amounts to viewing the ordination from different angles, although the problem of non-orthogonal axes can cause considerable distortion of the ordination space.
3. Distances are not metric (i.e., they are relative only).
4. There is no explicit statement of an underlying model.

Some of these problems can be overcome by using rules to define the reference stands.

In the earliest versions of PO, these endpoints were the two samples with the highest ecological distance between them, or two samples suspected of being at opposite ends of an important gradient (thus introducing a degree of subjectivity).

Beals (1984) extended Bray-Curtis ordination and discussed its variants, and is thus a useful reference. In polar ordination, the simplest method is to choose the pair of samples, not including the previous endpoints, with the maximum distance of separation.

**Figure 6.** Bray-Curtis ordination diagram of the environmental data. For vegetation types and variables abbreviations. (∆) is the representative of the vegetation types.

These patterns are consistent with others in the literature (cited and reanalyzed in Palmer 1986).
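The way polar ordination places samples between two poles can be illustrated with a minimal sketch. It assumes Bray-Curtis dissimilarity, takes the two most distant samples as endpoints, and scores each sample with the usual geometric projection onto the axis between the poles. The function names and the small abundance matrix are hypothetical and are not data from the study area.

```python
import numpy as np

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors (assumes a + b is not all zero)."""
    return np.abs(a - b).sum() / (a + b).sum()

def polar_axis(abund):
    """Score samples on one polar axis, using the two most distant samples as poles."""
    n = abund.shape[0]
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = bray_curtis(abund[i], abund[j])
    # Endpoints: the pair of samples with the largest dissimilarity.
    p1, p2 = np.unravel_index(np.argmax(d), d.shape)
    D = d[p1, p2]
    # Projection of each sample onto the axis running from pole 1 to pole 2.
    scores = (D**2 + d[p1] ** 2 - d[p2] ** 2) / (2 * D)
    return (int(p1), int(p2)), scores

# Hypothetical data: 4 samples (rows) by 3 species (columns).
abund = np.array([[10, 0, 0],
                  [6, 4, 0],
                  [2, 6, 2],
                  [0, 2, 8]], dtype=float)
poles, scores = polar_axis(abund)
print(poles, np.round(scores, 2))
```

The first pole always scores 0 and the second scores the full inter-pole distance; the remaining samples fall between them according to their dissimilarity to each pole.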

**Table 3.** PCA applied to the correlation matrix of the environmental factors in the study area

| Axis | Eigenvalue | % of variance | Cum. % of variance | Broken-stick eigenvalue |
|---|---|---|---|---|
| 1\* | 13.494 | 61.335 | 61.335 | 3.691 |
| 2\* | 5.512 | 25.053 | 86.388 | 2.691 |
| 3 | 1.460 | 6.636 | 93.024 | 2.191 |
| 4 | 0.968 | 4.398 | 97.422 | 1.857 |
| 5 | 0.567 | 2.578 | 100.000 | 1.607 |
| 6 | 0.000 | 0.000 | 100.000 | 1.407 |
| 7 | 0.000 | 0.000 | 100.000 | 1.241 |
| 8 | 0.000 | 0.000 | 100.000 | 1.098 |
| 9 | 0.000 | 0.000 | 100.000 | 0.973 |
| 10 | 0.000 | 0.000 | 100.000 | 0.862 |

\*Non-trivial principal component as based on broken-stick eigenvalues

| Factor | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| gr1 | -0.2636 | 0.0012 | -0.0447 | -0.0562 | 0.3161 | 0.1371 |
| gr2 | -0.2589 | 0.0904 | 0.0166 | -0.1657 | 0.2022 | 0.0355 |
| clay1 | 0.1792 | 0.3148 | 0.1002 | -0.0093 | 0.1005 | -0.1242 |
| clay2 | 0.1504 | 0.2595 | -0.3168 | -0.3208 | -0.3702 | -0.2055 |
| silt1 | 0.2476 | 0.0278 | -0.1910 | 0.3450 | 0.0191 | 0.1166 |
| silt2 | 0.2691 | 0.0624 | -0.0028 | 0.0323 | 0.0133 | -0.0807 |
| sand1 | -0.2437 | -0.1583 | 0.0828 | -0.2235 | -0.0573 | -0.0706 |
| sand2 | -0.2356 | -0.1862 | 0.1819 | 0.0264 | 0.1395 | 0.0824 |
| lim1 | 0.0828 | -0.3939 | -0.0644 | -0.0424 | 0.2794 | 0.0946 |
| lim2 | 0.1606 | -0.3190 | 0.0101 | -0.1881 | 0.3162 | 0.0212 |
| O.M1 | -0.0253 | 0.3944 | -0.0388 | -0.0561 | 0.4768 | 0.0649 |
| O.M2 | -0.0768 | 0.2109 | 0.2962 | 0.3680 | 0.0688 | -0.0525 |
| A.W1 | 0.2440 | 0.1148 | -0.2414 | 0.1038 | 0.2249 | 0.1069 |
| A.W2 | 0.2353 | 0.1306 | -0.2399 | 0.0725 | 0.3501 | 0.1342 |
| gyp1 | 0.2662 | -0.0688 | 0.0925 | -0.0716 | 0.0125 | -0.1236 |
| gyp2 | 0.2662 | -0.0688 | 0.0925 | -0.0716 | 0.0125 | -0.1257 |
| EC1 | 0.2662 | -0.0693 | 0.0957 | -0.0628 | 0.0188 | -0.1148 |
| EC2 | 0.2653 | -0.0729 | 0.1017 | -0.0773 | 0.0127 | -0.1281 |
| pH1 | -0.1360 | -0.1130 | -0.6739 | 0.0644 | -0.1513 | 0.2438 |
| pH2 | -0.2205 | -0.1334 | -0.2747 | 0.3324 | 0.2260 | -0.8329 |
| elevat | -0.1945 | 0.2594 | 0.0252 | 0.3383 | -0.1141 | 0.0904 |
| sl | -0.1345 | 0.2559 | -0.1863 | -0.5878 | 0.1327 | -0.1505 |

### **Principal Components Analysis (PCA)**

Principal Components Analysis (PCA) was one of the earliest ordination techniques applied to ecological data. PCA uses a rigid rotation to derive orthogonal axes that maximize the variance in the data set. Both species and sample ordinations result from a single analysis. Computationally, PCA is the basic eigenanalysis technique: it maximizes the variance explained by each successive axis.

The sum of the eigenvalues will equal the sum of the variance of all variables in the data set. PCA is relatively objective and provides a reasonable but crude indication of relationships.
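This property can be checked against the eigenvalues reported in Table 3. The identity below is a sketch that assumes, as the table caption states, that PCA was applied to the correlation matrix, here written $\mathbf{R}$, of $p = 22$ variables, so that each standardized variable contributes a variance of 1. The symbols $\lambda_i$, $\mathbf{R}$ and $p$ are notation introduced here for illustration.

```latex
\sum_{i=1}^{p} \lambda_i = \operatorname{tr}(\mathbf{R}) = p = 22,
\qquad
13.494 + 5.512 + 1.460 + 0.968 + 0.567 = 22.001 \approx 22
```

The small excess over 22 is consistent with rounding of the tabulated eigenvalues to three decimals.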

PCA was invented in 1901 by Karl Pearson (Dunn et al., 1987). It is now mostly used as a tool in exploratory data analysis and for making predictive models.

PCA is a method that reduces data dimensionality by performing a covariance analysis between factors (Feoli and Orlóci, 1992).

This method is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of uncorrelated variables called principal components.

The number of principal components is less than or equal to the number of original variables. This transformation is defined in such a way that the first principal component has as high a variance as possible (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (uncorrelated with) the preceding components (ter Braak and Šmilauer, 1998).

PCA was used to determine the association between plant communities and environmental variables, i.e. in an indirect, non-canonical way (ter Braak and Loomans, 1987).

For example, to determine the variables most effective in separating the vegetation types, PCA was performed on 22 factors in six vegetation types. The results of the PCA ordination are presented in Table 3 and Fig. 7. Broken-stick eigenvalues for the data set indicate that the first two principal components (PC1 and PC2) captured more variance than expected by chance. Together they accounted for 86% of the total variance in the data set: 61% and 25% were accounted for by the first and second principal components, respectively. This means that the first principal component is by far the most important for representing the variation among the six vegetation types.
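The broken-stick comparison can be reproduced with a short sketch. It assumes the common form of the broken-stick model, in which the expected eigenvalue of axis k for p variables is the sum of 1/j for j from k to p. With p = 22 this gives the broken-stick column of Table 3. The function name is hypothetical, and the eigenvalues are copied from Table 3.

```python
import numpy as np

def broken_stick(p):
    """Expected eigenvalues under the broken-stick model for p variables.

    b_k = sum_{j=k}^{p} 1/j, so the values sum to p over all axes.
    """
    inv = 1.0 / np.arange(1, p + 1)
    # Reverse cumulative sum: element k-1 holds 1/k + 1/(k+1) + ... + 1/p.
    return np.cumsum(inv[::-1])[::-1]

eigenvalues = np.array([13.494, 5.512, 1.460, 0.968, 0.567])  # Table 3, axes 1-5
expected = broken_stick(22)[:len(eigenvalues)]

# An axis is treated as non-trivial when its eigenvalue exceeds the expectation.
nontrivial = [k + 1 for k, (ev, bs) in enumerate(zip(eigenvalues, expected)) if ev > bs]
print(np.round(expected, 3))
print(nontrivial)
```

Under this simple rule only axes 1 and 2 exceed their broken-stick expectation, which agrees with the interpretation given in the text.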

Considering the correlations of the variables with the components, the first component includes silt and gravel in the 20-80 depth, available moisture in the 0-20 depth, and sand, gypsum and EC in both depths. The second component consists of clay in the 0-20 depth and lime in both depths.



In the study area, environmental conditions in the *Halocnemum strobilaceum* type differ from the others. Given the position of this type among the four quadrants of the diagram, it has a high correlation with the first axis. Therefore, this type is most strongly related to the variables of the first axis.


Because of the greater distance of the *H. strobilaceum* type from the second axis, this type has a weak relation with factors such as clay and lime. The *Artemisia sieberi-Eurotia ceratoides* and *Seidlitzia rosmarinus* types have an inverse relation with the indicator environmental characteristics of the first and second axes, except for clay, sand and gravel. The *A. aucheri-Astragalus spp.-Bromus tomentellus* type is more strongly related to the indicator characteristics of the first and second axes.

The indicator environmental factors of the first and second axes in the *A. sieberi-Zygophyllum eurypterum* and *Z. eurypterum-A. sieberi* types are approximately similar. The *A. sieberi-Z. eurypterum* type has a direct relationship with gravel and sand, and an inverse relationship with EC, silt, available moisture and gypsum, while the *A. aucheri-As. spp.-B. tomentellus* type has a direct relationship with clay and an inverse relationship with lime.

**Figure 7.** PCA–ordination diagram of the vegetation types related to the environmental factors in the study area. For vegetation types abbreviations, see Appendix A.

PCA can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. It is a way of identifying patterns in data and expressing the data so as to highlight their similarities and differences. Since patterns can be hard to find in data of high dimension, where the luxury of graphical representation is not available, PCA is a powerful tool for analyzing data.


One advantage of PCA is that, once you have found patterns in the data, you can compress the data by reducing the number of dimensions without much loss of information. While PCA finds the mathematically optimal solution (in the sense of minimizing the squared error), it is sensitive to outliers, which produce the large errors that PCA tries to avoid. It is therefore common practice to remove outliers before computing PCA.

However, in some contexts, outliers can be difficult to identify. For example, in data mining algorithms such as correlation clustering, the assignment of points to clusters and outliers is not known beforehand.

A recently proposed generalization of PCA, based on weighted PCA, increases robustness by assigning different weights to data objects based on their estimated relevancy.

Although PCA has severe faults with many community data sets, it is probably the best technique to use when a data set approximates multivariate normality. PCA is usually a poor method for community data, but it is the best method for many other kinds of multivariate data (Bakus, 2007).

In general, once eigenvectors are found from the covariance matrix, the next step is to order them by eigenvalue, highest to lowest. This gives the components in order of significance. You can then decide to ignore the components of lesser significance. You do lose some information, but if the eigenvalues are small, you do not lose much. If you leave out some components, the final data set will have fewer dimensions than the original.

To be precise, if you originally have $n$ dimensions in your data, and so you calculate $n$ eigenvectors and eigenvalues, and then you choose only the first $p$ eigenvectors, then the final data set has only $p$ dimensions. The next step is to form a feature vector, which is just a fancy name for a matrix of vectors. This is constructed by taking the eigenvectors that you want to keep from the list of eigenvectors, and forming a matrix with these eigenvectors in the columns.

Deriving the new data set is the final step in PCA, and is also the easiest. Once we have chosen the components (eigenvectors) that we wish to keep in our data and formed a feature vector, we simply take the transpose of the vector and multiply it on the left of the original data set, transposed.
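The steps described above (ordering eigenvectors by eigenvalue, forming the feature vector, and projecting the data) can be sketched with NumPy as follows. This is an illustrative outline rather than the procedure used for the study data: the function name and the random test matrix are hypothetical, and the data are assumed to be mean-centred before projection.

```python
import numpy as np

def pca_project(data, k):
    """Project the rows of `data` (samples x variables) onto the first k principal components."""
    centered = data - data.mean(axis=0)              # assumption: mean-centre each variable
    cov = np.cov(centered, rowvar=False)             # covariance matrix of the variables
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]                # reorder: highest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    feature_vector = eigvecs[:, :k]                  # kept eigenvectors, one per column
    # Transposed feature vector times transposed data, turned back to samples x k.
    final = (feature_vector.T @ centered.T).T
    return eigvals, feature_vector, final

# Hypothetical data: 50 samples of 4 correlated variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))
eigvals, feature_vector, scores = pca_project(data, 2)
print(scores.shape)
```

The variance of each column of `scores` equals the corresponding eigenvalue, and the columns are uncorrelated, which is the property the text attributes to principal components.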

In the case of keeping both eigenvectors for the transformation, we get the data and the plot found in Figure 7. This plot is basically the original data, rotated so that the eigenvectors are the axes. This is understandable, since no information is lost in this decomposition.

Figure 7 shows a sample PCA ordination diagram of the vegetation types in relation to the environmental factors.

In contrast to Correspondence Analysis and related methods (see below), species are represented by arrows. This implies that the abundance of the species is continuously increasing in the direction of the arrow, and decreasing in the opposite direction.

Classification and Ordination Methods as a Tool for Analyzing of Plant Communities 239


### **Canonical correspondence analysis (CCA)**

Canonical correspondence analysis (CCA) is a direct gradient analysis that displays the variation of vegetation in relation to the included environmental factors by using environmental data to order samples (Kent & Coker, 1992). This method combines multiple regression techniques together with various forms of correspondence analysis or reciprocal averaging (Ter Braak, 1986, 1987). The statistical significance of the relationship between the species and the whole set of environmental variables was evaluated using Monte Carlo permutation tests.

In CCA, the ordination axes are linear combinations of the environmental variables, chosen so that the dispersion of the species scores is as large as possible. In other words, the first axis is defined by the best set of weights for the environmental variables. The species data are modelled with a nonlinear response to these linear combinations, so the ordination reflects how the species behave in relation to the environment. Because the nonlinear species responses are analysed together with the environmental factors, the axes show which environmental variables are most important.

In ecological studies, the ordination of samples and species is constrained by their relationships to environmental variables.

The advantages of CCA are as follows (Palmer, 1993):

1. Patterns result from the combination of several explanatory variables, and many extensions of multiple regression (e.g. stepwise analysis and partial analysis) also apply to CCA.

2. It is possible to test hypotheses (though in CCA, hypothesis testing is based on randomization procedures rather than distributional assumptions).

3. Another advantage of CCA lies in the intuitive nature of its ordination diagram, or triplot. It is called a triplot because it simultaneously displays three pieces of information: samples as points, species as points, and environmental variables as arrows (or points). If data sets are large, CCA triplots can get very crowded. In that case, the parts of the triplot can be separated into biplots or scatterplots (e.g. plotting the arrows in a different panel of the same figure), or the arrows can be rescaled so that the species and sample scores are more spread out. It is also possible to plot only the most abundant species (but by all means, keep the rare species in the analysis).

4. When species responses are unimodal, and the important underlying environmental variables have been measured, CCA is most likely to be useful.

One limitation of CCA is that correlation does not imply causation: a variable that appears to be strong may merely be related to an unmeasured but 'true' gradient. As with any technique, results should be interpreted in light of these limitations (McCune, 1999).

CCA was used to examine the relationships between the measured variables and the distribution of plant communities (Ter Braak, 1986). CCA expresses species relationships as linear combinations of environmental variables and combines the features of CA with canonical correlation analysis (Green, 1989). This provides a graphical representation of the relationships between species and environmental factors.

238 Multivariate Analysis in Management, Engineering and the Sciences

Canonical Correlation Analysis is presented as the standard method to relate two sets of variables (Gittins, 1985). However, the latter method is useless if there are many species compared to sites, as in many ecological studies, because its ordination axes are very unstable in such cases.


In Canonical Correspondence Analysis, the sample scores are constrained to be linear combinations of explanatory variables. CCA focuses more on species composition, i.e. relative abundance.

When a combination of environmental variables is highly related to species composition, this method will create an axis from these variables that makes the species response curves most distinct. The second and higher axes also maximize the dispersion of species, subject to the constraints that they are linear combinations of the explanatory variables and that they are orthogonal to all previous axes.

Monte Carlo permutation tests were subsequently used within canonical correspondence analysis (CCA) to determine the significance of relations between species composition and environmental variables (ter Braak, 1987).
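The logic of a Monte Carlo permutation test can be illustrated with a simple statistic. The sketch below is not the CCA test itself: it assumes a plain Pearson correlation between two hypothetical vectors as the test statistic, whereas in CCA the statistic would be the species-environment correlation or an eigenvalue. The p-value is the proportion of runs (including the observed one) with a statistic at least as large as the observed value.

```python
import numpy as np

def permutation_p_value(x, y, n_permutations=99, seed=0):
    """Permutation test for the correlation between x and y."""
    rng = np.random.default_rng(seed)
    observed = np.corrcoef(x, y)[0, 1]
    count = 0
    for _ in range(n_permutations):
        # Shuffling y breaks any real association with x.
        permuted = rng.permutation(y)
        if np.corrcoef(x, permuted)[0, 1] >= observed:
            count += 1
    # p = (1 + number of permutations >= observed) / (1 + number of permutations)
    p_value = (1 + count) / (1 + n_permutations)
    return observed, p_value

# Hypothetical data: y depends exactly on x.
x = np.arange(10.0)
y = 2.0 * x + 1.0
observed, p = permutation_p_value(x, y)
```

With 99 permutations the smallest attainable p-value is 1/100 = 0.01, so a reported P = 0.01 means that no randomized run matched or exceeded the observed statistic.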

The outcome of CCA is highly dependent on the scaling of the explanatory variables. Unfortunately, we cannot know a priori what the best transformation of the data will be, and it would be arrogant to assume that our measurement scale is the same scale used by plants and animals. Nevertheless, we must make intelligent guesses (Bakus, 2007).

It is probably obvious that the choice of variables in CCA is crucial for the output. Meaningless variables will produce meaningless results. However, a meaningful variable that is not necessarily related to the most important gradient may still yield meaningful results (Palmer 1988).

Explanatory variables need not be continuous in CCA. Indeed, dummy variables representing a categorical variable are very useful. A dummy variable takes the value 1 if the sample belongs to that category and 0 otherwise. Dummy variables are useful if you have discrete experimental treatments, year effects, different bedrock types, or in the case of the bryophyte example, host tree species (Bakus, 2007).
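Dummy coding of a categorical variable can be sketched as follows. The bedrock labels and the helper name `dummy_code` are hypothetical and serve only to illustrate the 1/0 coding described above.

```python
import numpy as np

def dummy_code(labels):
    """Return the sorted categories and a samples x categories 0/1 matrix."""
    categories = sorted(set(labels))
    matrix = np.array(
        [[1 if label == category else 0 for category in categories]
         for label in labels]
    )
    return categories, matrix

# Hypothetical bedrock type recorded for four samples.
bedrock = ["limestone", "granite", "limestone", "shale"]
categories, dummies = dummy_code(bedrock)
# Each row contains a single 1, in the column of the sample's category.
```

Since the dummy columns of one categorical variable always sum to 1, one of them is redundant in a regression-type analysis and is usually dropped.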

If many variables are included in an analysis, much of the inertia becomes 'explained'. Any linear transformation of variables (e.g. kilograms to grams, meters to inches, Fahrenheit to Centigrade) will not affect the outcome of CCA whatsoever.
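The invariance to linear transformations can be checked numerically for the regression step on which CCA builds. The sketch below is a simplification under stated assumptions: it uses ordinary (unweighted) least squares rather than the weighted regression of CCA, and the variables `temp_c`, `mass_kg` and `response` are hypothetical random data.

```python
import numpy as np

def fitted_values(X, y):
    """Fitted values of an ordinary least-squares regression with intercept."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef

rng = np.random.default_rng(1)
temp_c = rng.uniform(5, 30, size=20)      # temperature in Centigrade
mass_kg = rng.uniform(0.1, 2.0, size=20)  # mass in kilograms
response = rng.normal(size=20)            # stand-in for sample scores

X_original = np.column_stack([temp_c, mass_kg])
# Same variables expressed in Fahrenheit and grams.
X_rescaled = np.column_stack([temp_c * 9 / 5 + 32, mass_kg * 1000])

same = np.allclose(fitted_values(X_original, response),
                   fitted_values(X_rescaled, response))
```

The regression coefficients change with the units, but the fitted values do not, which is why the ordination is unaffected. Nonlinear transformations such as a logarithm do not share this property.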


There are as many constrained axes as there are explanatory variables. The total 'explained inertia' is the sum of the eigenvalues of the constrained axes. The remaining axes are unconstrained, and can be considered 'residual'. The total inertia in the species data is the sum of eigenvalues of the constrained and the unconstrained axes, and is equivalent to the sum of eigenvalues, or total inertia, of CA. Thus, explained inertia, compared to total inertia, can be used as a measure of how well species composition is explained by the variables. Unfortunately, a strict measure of 'goodness of fit' for CCA is elusive, because the arch effect itself has some inertia associated with it (Bakus, 2007).
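The bookkeeping of explained and total inertia is simple arithmetic, sketched below. The eigenvalues are hypothetical and are not taken from Table 4; the function name `explained_inertia` is likewise only illustrative.

```python
def explained_inertia(constrained_eigenvalues, unconstrained_eigenvalues):
    """Return explained inertia, total inertia and their ratio."""
    explained = sum(constrained_eigenvalues)
    total = explained + sum(unconstrained_eigenvalues)
    return explained, total, explained / total

# Hypothetical analysis with two constrained and two residual axes.
explained, total, fraction = explained_inertia([0.60, 0.15], [0.20, 0.05])
```

As noted above, this ratio is a useful summary but not a strict goodness-of-fit measure.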

The ordination diagrams of canonical correlation analysis and redundancy analysis display the same data tables; the difference lies in the precise weighting of the species (ter Braak, 1987, 1990; ter Braak & Looman, 1994). Recent, good ecological examples of canonical correlation analysis, with many more sites than species, are Van der Meer (1991) and Varis (1991).

For example, according to Tables 4 and 5, the first axis (eigenvalue = 0.869) accounted for 98.7% of the variation in the environmental factors data. The correlation between the first axis and the species-environmental variables was 0.99, and the Monte Carlo permutation test for the first axis was highly significant (P = 0.01). The second axis (eigenvalue = 0.003 in Table 4) explained 0.4% of the variation in the data set. The correlation between the second axis and the species-environmental variables was 0.92. In addition, the Monte Carlo test for the second axis was highly significant (P = 0.02).


| | Axis 1 | Axis 2 | Axis 3 |
|---|---|---|---|
| Eigenvalue | 0.869 | 0.003 | 0.003 |
| Variance in species data: | | | |
| % of variance explained | 98.7 | 0.4 | 0.3 |
| Cumulative % explained | 98.7 | 99.1 | 99.4 |
| Pearson Correlation, Spp-Envt\* | 0.998 | 0.920 | 0.959 |
| Kendall (Rank) Corr., Spp-Envt | 0.481 | 0.706 | 0.584 |

\* Correlation between sample scores for an axis derived from the species data and the sample scores that are linear combinations of the environmental variables. Set to 0.000 if axis is not canonical.

**Table 4.** Canonical correspondence analysis for environmental data.

| Axis | Spp-Envt Corr. | Mean | Minimum | Maximum | p |
|---|---|---|---|---|---|
| 1 | 0.998 | 0.838 | 0.195 | 0.996 | 0.0100 |
| 2 | 0.920 | 0.607 | 0.072 | 0.935 | 0.0200 |
| 3 | 0.959 | 0.342 | 0.032 | 0.709 | 0.0100 |

p = proportion of randomized runs with species-environment correlation greater than or equal to the observed species-environment correlation; i.e., p = (1 + no. permutations >= observed)/(1 + no. permutations)

**Table 5.** Monte Carlo test results for the species-environment correlations.

Species responses to environmental conditions cannot be inferred in a causal way from multivariate analysis or any other statistical method; however, these techniques are useful to identify spatial distribution patterns and to assess which of the included environmental variables contribute most to species variability and which factors should be experimentally tested (Díez et al., 2003).

The results of the CCA ordination are presented in Fig. 8. Each environmental factor is an indicator of a specific habitat. The *Artemisia sieberi-Eurotia ceratoides*, *A. sieberi-Zygophyllum eurypterum* and *Zygophyllum eurypterum-A. sieberi* types have a nonlinear relation with gravel, sand, silt, clay, lime, organic matter and available moisture. The strength of the relation depends on the relative distance between the points representing the soil characteristics and those representing the vegetation types. The *H. strobilaceum* type has a nonlinear relation with gypsum and EC in both layers; that is, EC and gypsum are indicators of the habitat of this type. The *A. sieberi-Z. eurypterum* and *Z. eurypterum-A. sieberi* types have a nonlinear relation with them, while the *A. aucheri-As. sp.* and *S. rosmarinus* types differ from each other and have a weaker nonlinear relation with the ecological factors.

**Figure 8.** CCA ordination diagram of the environmental data. For vegetation type and variable abbreviations, see Appendix A. (∆) represents the vegetation types; (\*) represents the environmental factors.

#### **Reciprocal Averaging (RA) - Correspondence Analysis**

RA is an ordination technique related conceptually to weighted averages. Because one algorithm for finding the solution involves the repeated averaging of sample scores and species scores (citations), Correspondence Analysis (CA) is also known as reciprocal averaging (Gittins, 1985).


RA places sampling units and species on the same gradients, and maximizes variation between species and sample scores using a correlation coefficient. It serves as a relatively objective analysis of community data.

CA is a graphical ordination technique that simultaneously displays the rows (sites) and columns (species) of a data matrix in a low-dimensional space (Gittins, 1985). Row identifiers (species) plotted close together are similar in their relative profiles, and column identifiers plotted close together are correlated, enabling one to interpret not only which of the taxa are clustered, but also why they are clustered (Zhang et al., 2005). Reciprocal averaging and canonical correlation analysis are linear methods, so, if well produced, their ordination diagrams are biplots or the superposition of biplots (a triplot). For illustration I use the Dune Meadow Data from Jongman et al. (1987).

Reciprocal averaging is performed in PC-ORD by selecting the corresponding options in the program. RA yields both normal and transpose ordinations automatically. Like DCA, RA ordinates both species and samples simultaneously. RA selects the linear combination of environmental variables that maximizes the description of the species scores; this gives the first RA axis. In RA, composite gradients are linear combinations of environmental variables, giving a much simpler analysis, and the non-linearity enters the model through a unimodal model for a few composite gradients, taken care of in RA by weighted averaging. It provides a summary of the species-environment relations.

Results are generally superior to those from PCA. However, RA axis ends are compressed relative to the middle, and the second axis is often a distortion of the first axis, resulting in an arch effect.

For example, the analysis of variance in Table 4 showed a significant correlation between the species and soil axes. The eigenvalues represent the variance in the sample scores. RA axis 1 has an eigenvalue of 0.86; RA axis 2, with an eigenvalue of 0.017, is less important. Table 6 shows the sample scores of the classified sites. Total variance (inertia) in the species data is 0.8887.

The results of the RA ordination are presented in Fig. 6. Six site groups were determined in relation to the environmental factors.

The eigenvalue of the CA axis is equivalent to the correlation coefficient between species scores and sample scores (Gauch 1982b, Pielou 1984). It is not possible to arrange rows and/or columns in such a way that makes the correlation higher. The second and higher axes also maximize the correlation between species scores and sample scores, but they are constrained to be uncorrelated with (orthogonal to) the previous axes.

Since CA is a unimodal model, species are represented by a point rather than an arrow (Figure 7). This is (under some choices of scaling; see ter Braak and Šmilauer 1998) the weighted average of the samples in which that species occurs. With some simplifying assumptions (ter Braak and Looman 1987), the species score can be considered an estimate of the location of the peak of the species response curve (Figure 7).
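The repeated averaging that gives RA its name can be sketched as a short iteration. This is a minimal illustration for the first axis only, under stated assumptions: the abundance matrix `Y` is hypothetical, the function name `reciprocal_averaging` is not from any package, and the rescaling step uses abundance-weighted standardization, which is one common choice among several.

```python
import numpy as np

def reciprocal_averaging(Y, n_iter=1000, tol=1e-10):
    """First RA/CA axis of a samples x species abundance matrix Y."""
    Y = np.asarray(Y, dtype=float)
    row_tot = Y.sum(axis=1)  # sample totals
    col_tot = Y.sum(axis=0)  # species totals
    grand = Y.sum()
    species = np.arange(Y.shape[1], dtype=float)  # arbitrary starting scores
    eigenvalue = 0.0
    for _ in range(n_iter):
        # Sample scores: weighted average of the species scores.
        samples = Y @ species / row_tot
        # Species scores: weighted average of the sample scores.
        new_species = Y.T @ samples / col_tot
        # Centre (weighted) to remove the trivial constant solution.
        new_species -= (col_tot @ new_species) / grand
        # Rescale to unit weighted variance; at convergence the
        # shrinkage factor estimates the eigenvalue of the axis.
        norm = np.sqrt((col_tot @ new_species**2) / grand)
        new_species /= norm
        converged = np.max(np.abs(new_species - species)) < tol
        species, eigenvalue = new_species, norm
        if converged:
            break
    samples = Y @ species / row_tot
    return samples, species, eigenvalue

# Hypothetical abundances: 4 samples (rows) x 4 species (columns).
Y = np.array([[10, 5, 0, 0],
              [4, 8, 3, 0],
              [0, 3, 9, 2],
              [0, 0, 4, 10]])
samples, species, lam = reciprocal_averaging(Y)
```

Each sample score returned here is the abundance-weighted average of the scores of the species it contains, so samples and species are placed on the same axis. Higher axes would require an additional step that removes the correlation with earlier axes, which is omitted from this sketch.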


**Figure 9.** RA ordination diagram of the environmental data. For vegetation type and variable abbreviations, see Appendix A. (∆) represents the vegetation types; (+) represents the environmental factors.


However, RA axis ends are compressed relative to the middle, and the second axis is often a distortion of the first axis, resulting in an arched effect.

Classification and Ordination Methods as a Tool for Analyzing of Plant Communities 245

for the analysis of community data along gradients. DCA ordinates samples and species simultaneously. It is not appropriate for the analysis of a matrix of similarity values between

Detrended Correspondence Analysis (DCA) eliminates the arch effect by detrending (Hill and Gauch 1982). There are two basic approaches to detrending: by polynomials and by segments (ter Braak and Šmilauer 1998). Detrending by polynomials is the more elegant of the two: a regression is performed in which the second axis is a polynomial function of the first axis, after which the second axis is replaced by the residuals from this regression. Similar procedures are followed for the third and higher axes. Unfortunately, results of detrending by polynomials can be unsatisfactory and hence detrending by segments is preferred. To detrend the second axis by segments, the first axis is divided up into segments, and the samples within each segment are centered to have a zero mean for the second axis (see illustrations in Gauch 1982). The procedure is repeated for different 'starting points' of the segments. Although results in some cases are sensitive to the number of segments (Jackson and Somers 1991), the default of 26 segments is usually satisfactory. Detrending of

One way to determine this relationship is to analyze the species data first by detrended correspondence analysis (DCA) and to examine the length of the maximum gradient. If the gradient exceeds 3 sd (sd¼standard deviation) (most of the species are replaced along the gradient), the data show unimodal response (Hill & Gauch, 1980). For example, in North East rangeland of Semnan, DCA axis 1 has an eigenvalue of 0.86 and a gradient length of 15.44. DCA axis 2 with an eigenvalue of 0.016 and a gradient length of 0.39 is less important. Fig 8 shows ordination diagram for vegetation types and soil variables. Table 5 shows the

N NAME AX1 AX2 AX3 RANKED 1 RANKED 2 EIG=0.861 EIG=0.017

2 Ha.sp 0 27 12 3 Ar.si-Zy.eu 1713 2Ha.st 27

4 Zy.eu-Ar.si 1704 9 14 4Zy.eu-A.si 1704 4Zy.eu-A.si 9

As.spp-B.to 1710 12 12 6Se.ro 1694 3 Ar.si-Zy.eu 8

Figure 8 is an example of ordination plots showing the sites plotted on two axes. The ordination was a detrended correspondence analysis, and the sites with the same treatment

**Table 7.** Sample Scores- Weighted are weighted mean species scores (FIRST 6 EIGENVECTORS)

6 Se.ro 1694 0 15 2Ha.st 0 6Se.ro

B.to <sup>39</sup>

B.to 1710 1 Ar.si-Er.ce 23

1 Ar.si-Er.ce 1714 23 10 1 Ar.si-Er.ce 1714 5 Ar.au-As.spp-

3 Ar.si-Zy.eu 1713 8 0 5Ar.au-As.spp-

community data (Gauch, 1982b).

higher axes proceeds by a similar process.

score classified site.

5 Ar.au-

level are outline for clarity.

**Table 6.** Sample scores - which are weighted mean species scores

Row identifiers (species) plotted close together are similar in their relative profiles, and column identifiers plotted close together are correlated, enabling one to interpret not only which of the taxa are clustered, but also why they are clustered (Bakus,2007).

Reciprocal averaging (RA) yields both normal and transpose ordinations automatically. Like DCA, RA ordinates both species and samples simultaneously. Instead of maximizing 'variance explained', CA maximizes the correspondence between species scores and sample scores.

If species scores are standardized to zero mean and unit variance, the eigenvalues also represent the variance in the sample scores (but not, as is often misunderstood, the variance in species abundance).

The CA distortion is called the arch effect, which is not as serious as the horseshoe effect of PCA because the ends of the gradients are not incurved. Nevertheless, the distortion is prominent enough to seriously impair ecological interpretation (Bakus, 2007).

In other words, the spacing of samples along an axis may not affect true differences in species composition. The problems of gradient compression and the arch effect led to the development of Detrended Correspondence Analysis.

#### **Detrended Correspondence Analysis (DCA)**

Detrended correspondence analysis (DCA), an ordination technique used to describe patterns in complex data sets, and produced the following sequence of ordination axis scores (ter Braak,1986).

DCA is an eigenvector ordination technique based on Reciprocal Averaging, correcting for the arch effect produced from RA. Hill and Gauch (1980) report DCA results are superior to those of RA. Other ecologists criticize the detrending process of DCA. DCA is widely used for the analysis of community data along gradients. DCA ordinates samples and species simultaneously. It is not appropriate for the analysis of a matrix of similarity values between community data (Gauch, 1982b).

244 Multivariate Analysis in Management, Engineering and the Sciences

Row identifiers (species) plotted close together are similar in their relative profiles, and column identifiers plotted close together are correlated, enabling one to interpret not only which of the taxa are clustered, but also why they are clustered (Bakus, 2007).

However, RA axis ends are compressed relative to the middle, and the second axis is often a distortion of the first axis, resulting in an arch effect.

| N | Name | AX1 | AX2 | AX3 | Ranked 1 (EIG = 0.861) | Ranked 2 (EIG = 0.017) |
|---|---|---|---|---|---|---|
| 1 | Ar.si-Er.ce | 2443 | 55 | -97 | 1 Ar.si-Er.ce 2443 | 5 Ar.au-As.spp-B.to 206 |
| 2 | Ha.st | -25 | 0 | 0 | 3 Ar.si-Zy.eu 2441 | 2 Ha.st 55 |
| 3 | Ar.si-Zy.eu | 2441 | -73 | -72 | 5 Ar.au-As.spp-B.to 2435 | 1 Ar.si-Er.ce 0 |
| 4 | Zy.eu-Ar.si | 2421 | -69 | -25 | 4 Zy.eu-Ar.si 2421 | 4 Zy.eu-Ar.si 69 |
| 5 | Ar.au-As.spp-B.to | 2435 | 206 | 76 | 6 Se.ro 2399 | 3 Ar.si-Zy.eu 73 |
| 6 | Se.ro | 2399 | -161 | 131 | 2 Ha.st -25 | 6 Se.ro 161 |

**Table 6.** Sample scores, which are weighted mean species scores

Detrended Correspondence Analysis (DCA) eliminates the arch effect by detrending (Hill and Gauch 1980). There are two basic approaches to detrending: by polynomials and by segments (ter Braak and Šmilauer 1998). Detrending by polynomials is the more elegant of the two: a regression is performed in which the second axis is a polynomial function of the first axis, after which the second axis is replaced by the residuals from this regression. Similar procedures are followed for the third and higher axes. Unfortunately, results of detrending by polynomials can be unsatisfactory and hence detrending by segments is preferred. To detrend the second axis by segments, the first axis is divided up into segments, and the samples within each segment are centered to have a zero mean for the second axis (see illustrations in Gauch 1982). The procedure is repeated for different 'starting points' of the segments. Although results in some cases are sensitive to the number of segments (Jackson and Somers 1991), the default of 26 segments is usually satisfactory. Detrending of higher axes proceeds by a similar process.
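The two detrending approaches described above can be sketched on a synthetic arch. This is a simplified illustration under stated assumptions: the function names and the artificial scores are invented for the example, the detrending is applied once to finished axis scores, and only a single set of segment starting points is used, so it should not be read as the algorithm implemented in DECORANA.

```python
import numpy as np

def detrend_by_polynomial(axis1, axis2, degree=2):
    """Replace axis 2 by the residuals of a polynomial regression on axis 1."""
    coeffs = np.polyfit(axis1, axis2, degree)
    return axis2 - np.polyval(coeffs, axis1)

def detrend_by_segments(axis1, axis2, n_segments=26):
    """Centre axis 2 to zero mean within equal-width segments of axis 1."""
    edges = np.linspace(axis1.min(), axis1.max(), n_segments + 1)
    # Use interior edges only, so the maximum falls in the last segment.
    segment = np.digitize(axis1, edges[1:-1])
    detrended = axis2.astype(float).copy()
    for s in np.unique(segment):
        in_segment = segment == s
        detrended[in_segment] -= detrended[in_segment].mean()
    return detrended

# Synthetic arch (assumption): axis 2 is an exact quadratic function of axis 1.
axis1 = np.linspace(-1.0, 1.0, 260)
axis2 = 1.0 - 2.0 * axis1 ** 2

by_poly = detrend_by_polynomial(axis1, axis2)
by_segments = detrend_by_segments(axis1, axis2)

arch_range = axis2.max() - axis2.min()
segment_range = by_segments.max() - by_segments.min()
print(arch_range, segment_range)   # the arch is largely flattened
```

In this idealized case the polynomial residuals vanish entirely, whereas segment detrending leaves a small within-segment trend whose size depends on the number of segments.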

One way to determine this relationship is to analyze the species data first by detrended correspondence analysis (DCA) and to examine the length of the maximum gradient. If the gradient exceeds 3 sd (sd = standard deviation), meaning that most of the species are replaced along the gradient, the data show a unimodal response (Hill & Gauch, 1980). For example, in the north-east rangelands of Semnan, DCA axis 1 has an eigenvalue of 0.86 and a gradient length of 15.44. DCA axis 2, with an eigenvalue of 0.016 and a gradient length of 0.39, is less important. Fig. 8 shows the ordination diagram for vegetation types and soil variables. Table 5 shows the scores of the classified sites.
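The rule of thumb above can be written as a small helper. The function name and the label "linear" for short gradients are assumptions made for illustration; the gradient lengths are the two values reported in the text.

```python
def suggested_response_model(gradient_length_sd, threshold_sd=3.0):
    """Gradients longer than the threshold (in sd units) suggest a unimodal response."""
    return "unimodal" if gradient_length_sd > threshold_sd else "linear"

# Gradient lengths reported for the Semnan rangeland data (sd units).
gradient_lengths = {"DCA axis 1": 15.44, "DCA axis 2": 0.39}

# The rule is applied to the longest gradient.
longest = max(gradient_lengths.values())
model = suggested_response_model(longest)
print(longest, model)
```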


**Table 7.** Sample scores, which are weighted mean species scores (first 6 eigenvectors)

Figure 8 is an example of an ordination plot showing the sites plotted on two axes. The ordination was a detrended correspondence analysis, and the sites with the same treatment level are outlined for clarity.

One additional note: the different plots illustrate another common approach when using ordination, namely including only data on certain species thought to be more important as indicator species. This allows different runs of the analysis to detect similarities or differences in composition based on a particular group.

Classification and Ordination Methods as a Tool for Analyzing of Plant Communities 247


(∆) represents the vegetation types; (+) represents the environmental factors.

**Figure 10.** DCA–ordination diagram of the environmental data. For vegetation types and variables abbreviations.

#### **Nonmetric Multidimensional Scaling (NMS)**

NMS (also abbreviated NMDS) actually refers to an entire family of related ordination techniques. These techniques use rank-order information to identify similarity in a data set. NMS is a truly nonparametric ordination method which seeks the best reduced-space portrayal of relationships. The jury is still out on this type of ordination. Gauch (1982b) claims NMS is not worth the extra computational effort and that it gives effective results only for easy data sets with low diversity. Others hold that NMS is extremely effective (Kenkel and Orloci, 1986; Bradfield and Kenkel, 1987).
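The reliance on rank order can be made concrete with a sketch of Kruskal's stress, the quantity that NMS tries to minimize. The code below only evaluates the stress of a given configuration; a full NMS program would also move the points iteratively to reduce it. The function names and the three-point example are assumptions made for illustration.

```python
import numpy as np

def isotonic_fit(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to a sequence."""
    blocks = []  # each block is [mean, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the non-decreasing order is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, n2 = blocks.pop()
            mean1, n1 = blocks.pop()
            blocks.append([(mean1 * n1 + mean2 * n2) / (n1 + n2), n1 + n2])
    return np.concatenate([np.full(n, mean) for mean, n in blocks])

def kruskal_stress(dissimilarities, config):
    """Stress-1 of a configuration, using only the rank order of dissimilarities."""
    config = np.asarray(config, dtype=float)
    i, j = np.triu_indices(len(config), k=1)
    dist = np.linalg.norm(config[i] - config[j], axis=1)
    observed = np.asarray(dissimilarities, dtype=float)[i, j]
    order = np.argsort(observed, kind="stable")
    fitted = np.empty_like(dist)
    fitted[order] = isotonic_fit(dist[order])
    return np.sqrt(np.sum((dist - fitted) ** 2) / np.sum(dist ** 2))

# Three samples placed on a single ordination axis (hypothetical).
config = np.array([[0.0], [1.0], [3.0]])

# Dissimilarities whose rank order matches the ordination distances.
same_ranks = np.array([[0.0, 0.1, 0.9],
                       [0.1, 0.0, 0.5],
                       [0.9, 0.5, 0.0]])
# Dissimilarities whose rank order is reversed.
reversed_ranks = np.array([[0.0, 0.9, 0.1],
                           [0.9, 0.0, 0.5],
                           [0.1, 0.5, 0.0]])

stress_good = kruskal_stress(same_ranks, config)
stress_bad = kruskal_stress(reversed_ranks, config)
print(stress_good, stress_bad)
```

Stress is zero whenever the ordination distances are in the same rank order as the dissimilarities, whatever their actual magnitudes, which is why the method is described as nonparametric.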

DCA and NMDS are the two most popular methods for indirect gradient analysis. They have remained side by side for so long partly because they have different strengths and weaknesses. While the choice between the two is not always straightforward, it is worthwhile outlining a few of the key differences.


Some of the issues are relatively minor: for example, computation time is rarely an important consideration, except for the largest data sets. Some issues are not entirely resolved: the degree to which noise affects NMDS, and the degree to which NMDS finds local rather than global optima, still need to be determined (Bakus, 2007).

Since NMDS is a distance-based method, all information about species identities is hidden once the distance matrix is created. For many, this is the biggest disadvantage of NMDS (Bakus, 2007).
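The loss of species identities can be demonstrated directly: shuffling the species columns of a data table leaves the distance matrix unchanged. The sketch below assumes the quantitative form of the Sørensen distance (also known as Bray-Curtis); the function name and the abundance values are hypothetical.

```python
import numpy as np

def sorensen_distance_matrix(Y):
    """Quantitative Sørensen (Bray-Curtis) distance between all pairs of samples."""
    Y = np.asarray(Y, dtype=float)
    diff = np.abs(Y[:, None, :] - Y[None, :, :]).sum(axis=2)
    total = (Y[:, None, :] + Y[None, :, :]).sum(axis=2)
    return diff / total

# Hypothetical abundances: 3 samples (rows) x 4 species (columns).
Y = np.array([[9, 5, 1, 0],
              [4, 8, 5, 1],
              [0, 1, 5, 8]])
relabelled = Y[:, [2, 0, 3, 1]]   # same samples, species columns shuffled

D_original = sorensen_distance_matrix(Y)
D_relabelled = sorensen_distance_matrix(relabelled)
print(np.allclose(D_original, D_relabelled))   # True
```

Because the two matrices are identical, any ordination computed from them alone cannot tell which species produced the pattern.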

**Figure 11.** NMS ordination of plant species and environmental factors along the rangelands of Semnan, Iran

DCA is based on an underlying model of species distributions, the unimodal model, while NMDS is not. Thus, DCA is closer to a theory of community ecology. However, NMDS may be the method of choice if species composition is determined by factors other than position along a gradient. For example, the species present on islands may have more to do with vicariance biogeography and chance extinction events than with environmental preferences; for such a system, NMDS would be a better *a priori* choice. As De'ath (1999) points out, there are two classes of ordination methods: 'species composition representation' (e.g. NMDS) and 'gradient analysis' (e.g. DCA). The choice between the methods should ultimately be governed by this philosophical distinction.


The NMDS approach can in fact be tested each time measures of resemblance or dissimilarity are used to classify OTUs, whatever the causes and origins of arrangements found (Guiller et al., 1998).

In the biplots, where only the first two axes were used, all methods based upon PCA gave a fair representation of the relative numerical importance of the rare species. The weights in CCA are given by a diagonal matrix containing the square roots of the row sums of the species data table. This means that a site where many individuals have been observed contributes more to the regression than a site with few individuals. CCA should only be used when the sites have approximately the same number of individuals, or when one explicitly wants to give high weight to the richest sites. This problem of CCA was one of our incentives for looking for alternative methods for canonical ordination of community composition data.

For the analysis of sites representing short gradients, PCA may be suitable. For longer gradients, many species are replaced by others along the gradient and this generates many zeros in the species data table. Community ecologists have repeatedly argued that the Euclidean distance (and thus PCA) is inappropriate for raw species abundance data involving null abundances (e.g. Orlóci 1978; Wolda 1981; Legendre and Legendre 1998). For that reason, CCA is often the method favoured by researchers who are analysing compositional data, despite the problem posed by rare species.

Detrended correspondence analysis (DCA) is perhaps the most widely used method of indirect vegetation ordination, whereas direct ordination of vegetation and environment is achieved with canonical correspondence analysis (CCA). CCA is a relatively new method in which the axes of a vegetation ordination are restricted to linear combinations of environmental variables (Zhang et al., 2006).

DCA and CA analyses should be run with the 'downweight rare species' option selected. We generally do not recommend NMS with the Euclidean distance measure; it performed the worst empirically, and has no advantages over the other methods (Culman et al., 2008).

Among the widely used ordination techniques for plant community analysis, correspondence analysis (CA) has been shown to be superior to others such as PCA (Gauch, 1982). Most community data sets are heterogeneous and contain one or more gradients with lengths of at least two or three half-changes, which makes CA results ordinarily superior to PCA results. However, for relatively homogeneous data sets with short gradients, PCA may be better (Palmer, 1993). Despite the considerable superiority of CA over PCA, CA is not superior to DCA, which corrects its two major faults, the "arch effect" and the "compression of the ends of the first axis" (Gauch, 1982; Kent & Coker, 1992).

For complex and heterogeneous data sets, DCA is distinctive in its effectiveness and robustness (Gauch, 1982). Comparative tests of different indirect ordination techniques have shown that DCA provides a good result (Cazzier & Penny, 2002). This study found that DCA provides better results than CA (Malik & Husein, 2006).
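The objection raised above to Euclidean distance for abundance data containing many zeros can be illustrated with a three-site example. The abundances are hypothetical, and the Sørensen (Bray-Curtis) distance is used only as one example of a measure that is not affected in the same way.

```python
import numpy as np

# Hypothetical abundances of three species at three sites.
sites = np.array([
    [0, 1, 1],   # site A
    [1, 0, 0],   # site B: shares no species with A
    [0, 4, 4],   # site C: same species as A, at higher abundance
], dtype=float)

def euclidean(x, y):
    """Euclidean distance between two abundance vectors."""
    return np.sqrt(np.sum((x - y) ** 2))

def sorensen(x, y):
    """Quantitative Sørensen (Bray-Curtis) distance between two abundance vectors."""
    return np.sum(np.abs(x - y)) / np.sum(x + y)

a, b, c = sites
print(euclidean(a, b), euclidean(a, c))   # A appears closer to B than to C
print(sorensen(a, b), sorensen(a, c))     # A is closer to C than to B
```

Under Euclidean distance, site A is nearer to a site with which it shares no species than to a site with the same species at higher abundance; the Sørensen distance reverses that ranking.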

Non-metric multidimensional scaling (NMS) (PC-ORD v. 4.25, 1999) was used to identify environmental variables correlated with plant species composition. A random starting location and the Sørensen distance measure were used with the NMS autopilot "slow and thorough" method. Stepwise multiple linear regression (S-PLUS, 2000) was used to select models correlating vegetation cover and structure with environmental factors. Environmental explanatory factors that were not significant contributors (as determined by stepwise selection at α = 0.05) were excluded from the final model (Davies et al., 2007).

A Monte Carlo test of 30 runs with randomized data indicated that the minimum stress of the 2-axis NMS ordination was lower than would be expected by chance (p = 0.0968). The final stress and instability of the 2-D solution were 23.71 and 0.00001, respectively. The first ordination axis (NMS1) captured 41.9% of the variability in the dataset and the second (NMS2) captured 31.8%, giving a cumulative 73.7% of the variance in the dataset explained (Fig. 11).
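The reported figures can be checked with a short calculation. The p value formula below is a common convention for randomization tests, in which the real run is counted among the outcomes; the source does not state how its p value was computed, and the count of 2 randomized runs with stress at or below the observed value is an inference that happens to reproduce the reported figure, not a number given in the text.

```python
def monte_carlo_p(n_as_good, n_randomized_runs):
    """Proportion of runs (real run included) with stress at or below the observed stress."""
    return (1 + n_as_good) / (1 + n_randomized_runs)

# Assumption: 2 of the 30 randomized runs reached a stress as low as the real data.
p_value = monte_carlo_p(n_as_good=2, n_randomized_runs=30)
print(round(p_value, 4))   # 0.0968

# Cumulative variance represented by the two NMS axes.
cumulative = round(41.9 + 31.8, 1)
print(cumulative)          # 73.7
```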
