126 Remote Sensing of Planet Earth

**6. Training areas development**

Training is the identification of a sample of pixels of known class membership obtained from reference data. These training pixels are used to derive spectral signatures for classification, and the signature statistics are evaluated to ensure adequate separability. The pixels of the image are then allocated to the class with the greatest similarity to the training data metrics (Alberti et al., 2004). The training stage of a supervised classification is designed to provide the necessary information, and the training sites were used to train the supervised classification algorithm. In remote sensing, the aim of the training stage has typically been the production of descriptive statistics for each class, which may then be used by the selected classifier to determine class membership (Foody & Mathur, 2006). Obtaining enough training data has long been a difficult problem in land cover applications. Two sets of training data were finally prepared: the first for the traditional method and the second for the advanced method. The use of different datasets for classifying the same area with different classifiers will be discussed in section 8.3.

Meanwhile, the selection of training sets was based on field surveys, reference information from SPOT-5 images and visual inspection of the image of the particular area. Only the training samples believed to be the most useful and informative were selected for the classification. Training data acquisition can be a very costly process, and training data that are not carefully selected may introduce error. Collection of training data is a crucial step in image classification and directly influences the classification accuracy (Wang et al., 2007). Training set size can greatly affect the classification result. However, size is only one

For the advanced method, knowledge of the statistical distribution is not required. Rather, NNs learn it from a representative training set. In our case, the training phase of the NN was based on the back-propagation (BP) learning rule to minimize the mean square error (MSE) between the desired target vectors and the actual output vectors. Training patterns were presented to the network, and the weights of each node were adjusted so that the approximation created by the NN minimized the error between the desired output and the actual output created by the network. In a network, each connecting line has an associated weight. NNs are trained by adjusting these connection weights so that the calculated outputs approximate the desired outputs. In the learning phase, input patterns from the training data are fed forward through a network initialized with random synapse weights. The root-mean-square error (RMSE) is calculated between the network outputs and the desired outputs. The errors are back-propagated through the network and the synapse weights are adjusted in order to reduce the total RMSE. This process continues until a convergence criterion is satisfied (Rumelhart et al., 1986). The successful generalization of the NNs used in this application is indicated by the low residual RMS errors. The training is finished when the output value is equal to the ideal output value. The mean square of the network errors (MSE) is given by Equation 3 (Moghadassi et al., 2009):

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\tau_i - \alpha_i\right)^2 \qquad (3)$$

where $\tau_i$ is the target output, $\alpha_i$ is the output from the neuron, and $N$ is the number of outputs over which the squared error is averaged.

**7. Accuracy assessment**

Accuracy assessment is an important aspect of land cover mapping as a guide to map quality. The accuracy assessment sites were used to provide a statistical assessment of the accuracy produced by each of the classification approaches tested in this project. These sites were set aside until the map was completed and the accuracy assessment was performed. This process ensured that the accuracy data were completely independent of the training data (Thomas et al., 2003).

The error matrix is the standard method used to assess classification accuracy. In the error matrix, the columns represent the reference data, while the rows represent the classified data (Table 3). It is typical to extract several statistics from the error matrix: overall accuracy, Kappa coefficient, producer's accuracy and user's accuracy. To conduct the accuracy assessment, a total of 500 sample plots covering the different land cover types were randomly allocated and examined using field data, a SPOT-5 image with 5 m spatial resolution and high-resolution Google Earth imagery. Luedeling & Buerkert (2008) also used Google Earth imagery as one of their validation methods. The sample pixels used for accuracy assessment were selected using stratified random sampling. In addition, the test pixels were uniformly distributed over the entire image.
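As an illustration of stratified random sampling of test pixels, the following minimal sketch draws points from each mapped class of a classified image. The function name `stratified_random_sample`, the proportional allocation rule and the guarantee of at least one point per class are assumptions made for this example; the allocation actually used in the study is not stated.

```python
import random
from collections import defaultdict


def stratified_random_sample(class_map, total_points, seed=0):
    """Draw random pixel coordinates from every class (stratum) of a map.

    class_map is a 2D list of class labels. Points are allocated to each
    class in proportion to its area (an assumption of this sketch), with
    at least one point per class, so the final number of points can
    differ slightly from total_points because of rounding.
    """
    rng = random.Random(seed)

    # Group pixel coordinates by mapped class.
    pixels_by_class = defaultdict(list)
    for r, row in enumerate(class_map):
        for c, label in enumerate(row):
            pixels_by_class[label].append((r, c))

    n_pixels = sum(len(p) for p in pixels_by_class.values())
    sample = {}
    for label, pixels in pixels_by_class.items():
        quota = max(1, round(total_points * len(pixels) / n_pixels))
        quota = min(quota, len(pixels))
        # Sampling without replacement inside the stratum.
        sample[label] = rng.sample(pixels, quota)
    return sample
```

Calling `stratified_random_sample(class_map, total_points=500)` on a classified image returns a dictionary that maps each class label to its list of sampled `(row, column)` positions, which can then be checked against the reference data.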


Table 3. Population error matrix with pij representing the proportion of area in the mapped land cover category i and the reference land cover category j.
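Using the notation of Table 3, the four statistics extracted from the error matrix can be written in their standard forms as sketched below. The marginal proportions $p_{i+}$ and $p_{+j}$ and the number of classes $q$ are notation introduced here for convenience and are not taken from the original table.

```latex
\begin{align}
p_{i+} &= \sum_{j=1}^{q} p_{ij}, \qquad p_{+j} = \sum_{i=1}^{q} p_{ij} \\
\text{Overall accuracy} &= \sum_{i=1}^{q} p_{ii} \\
\text{User's accuracy of class } i &= \frac{p_{ii}}{p_{i+}} \\
\text{Producer's accuracy of class } j &= \frac{p_{jj}}{p_{+j}} \\
\kappa &= \frac{\sum_{i=1}^{q} p_{ii} - \sum_{i=1}^{q} p_{i+}\,p_{+i}}
               {1 - \sum_{i=1}^{q} p_{i+}\,p_{+i}}
\end{align}
```

When the matrix holds sample counts rather than proportions, each count is first divided by the total number of sample points.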

Overall accuracy is the simplest and one of the most popular accuracy measures and is computed by dividing the total correct (i.e., the sum of the major diagonal) by the total number of pixels in the error matrix (Congalton, 1991). Meanwhile, Rosenfield and Fitzpatrick-Lins (1986) identified the Kappa coefficient as a suitable accuracy measure in thematic classification for representing class accuracy. Its strength lies in the fact that it takes all the elements (diagonal and off-diagonal) of the confusion matrix into consideration, in contrast to overall accuracy, which considers only the diagonal elements of the matrix. In addition, two types of thematic error can be measured in a confusion matrix, both of which take into account the accuracy of individual categories. One is given by the producer's accuracy, which indicates the proportion of ground-based reference samples correctly assigned. It details errors of omission, i.e., when a pixel is omitted from its correct category. The other is given by the user's accuracy, which indicates the proportion of data from the estimation map representing that category on the ground. It is a measure of errors of commission, i.e., when a pixel is committed to an incorrect category (Avelar et al., 2009).

Analysis of Land Cover Classification in Arid Environment: A Comparison Performance of Four Classifiers 129

land class were correctly classified whereas 19 out of 26 observations for shadow class were correctly classified. Most of the incorrect pixels were classified as mountain class. This is not surprising because most of the shadow appears within mountainous areas.

| Classified | Urb | Mou | Lan | Veg | Rit | Sha | Total | UA (%) |
|---|---|---|---|---|---|---|---|---|
| Urb | **108** | 82 | 22 | 1 | 0 | 0 | 213 | 50.7 |
| Mou | 26 | **129** | 1 | 5 | 0 | 1 | 162 | 79.6 |
| Lan | 26 | 6 | **56** | 0 | 2 | 0 | 90 | 62.2 |
| Veg | 0 | 0 | 0 | **5** | 0 | 0 | 5 | 100.0 |
| Rit | 0 | 0 | 0 | 0 | **4** | 0 | 4 | 100.0 |
| Sha | 1 | 6 | 0 | 0 | 0 | **19** | 26 | 73.1 |
| Total | 161 | 223 | 79 | 11 | 6 | 20 | **500** | |
| PA (%) | 67.1 | 57.9 | 70.9 | 45.7 | 66.7 | 95.0 | | |

Overall accuracy = 64.2%; Kappa coefficient = 0.479

Table 4. Error matrix derived from Minimum Distance-to-Mean classifier (rows: classified data; columns: reference data)

Meanwhile, the ML algorithm was the second traditional method tested in this project. It gave a better result than the MD classifier. The overall accuracy was 77.6%, while the kappa coefficient had a value of 0.659. By class, the user's accuracy was 69.8% for urban, 83.4% for mountain, 82.9% for land, 92.6% for vegetation, 66.7% for ritual area (lowest) and 100.0% for shadow (highest). Although the overall classification result was better than for MD, two classes (vegetation and ritual area) showed lower percentages than with MD. The producer's accuracy varied between 61.5% (shadow) and 100.0% (ritual area). The urban, mountain, land and vegetation classes had values of 85.4%, 69.6%, 73.3% and 96.2% respectively. Further evaluation of the error matrix shows that 388 out of 500 points taken from the same random samples were correctly classified. The classifier had some difficulty separating cleared land from land under construction (urban) and mountain from urban area, as shown by the error matrix, in which 68 points were wrongly assigned to the urban class (53 points from mountain, 15 points from land). This is understandable because their spectral characteristics are very similar. However, the result for the urban class revealed that a significant improvement (nearly 20%) was achieved compared to the MD classifier. In

| Classified | Urb | Mou | Lan | Veg | Rit | Sha | Total | UA (%) |
|---|---|---|---|---|---|---|---|---|
| Urb | **164** | 53 | 15 | 1 | 0 | 2 | 235 | 69.8 |
| Mou | 15 | **126** | 7 | 0 | 0 | 3 | 151 | 83.4 |
| Lan | 11 | 2 | **63** | 0 | 0 | 0 | 76 | 82.9 |
| Veg | 4 | 1 | 4 | **18** | 0 | 0 | 27 | 66.7 |
| Rit | 1 | 0 | 0 | 0 | **2** | 0 | 3 | 66.7 |
| Sha | 0 | 0 | 0 | 0 | 0 | **8** | 8 | 100.0 |
| Total | 195 | 182 | 89 | 19 | 2 | 13 | **500** | |
| PA (%) | 84.1 | 69.2 | 70.8 | 94.7 | 100.0 | 61.5 | | |

Overall accuracy = 76.2%; Kappa coefficient = 0.649

Table 5. Error matrix derived from Maximum Likelihood classifier (rows: classified data; columns: reference data)
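The accuracy statistics described in this section can be recomputed directly from an error matrix. The sketch below does so for the counts of Table 5 (Maximum Likelihood), assuming rows hold the classified data and columns the reference data. The names `TABLE_5` and `accuracy_statistics` are illustrative only and do not come from the original study.

```python
# Counts of Table 5 (Maximum Likelihood); rows = classified, columns = reference.
CLASSES = ["Urb", "Mou", "Lan", "Veg", "Rit", "Sha"]
TABLE_5 = [
    [164, 53, 15, 1, 0, 2],
    [15, 126, 7, 0, 0, 3],
    [11, 2, 63, 0, 0, 0],
    [4, 1, 4, 18, 0, 0],
    [1, 0, 0, 0, 2, 0],
    [0, 0, 0, 0, 0, 8],
]


def accuracy_statistics(matrix):
    """Return overall accuracy, kappa, user's and producer's accuracies."""
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]        # classified totals
    col_totals = [sum(col) for col in zip(*matrix)]  # reference totals
    diagonal = [matrix[i][i] for i in range(len(matrix))]

    overall = sum(diagonal) / n
    # Agreement expected by chance, from the products of the marginals.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (overall - chance) / (1 - chance)

    # User's accuracy: correct / classified total (errors of commission).
    users = [d / r if r else float("nan") for d, r in zip(diagonal, row_totals)]
    # Producer's accuracy: correct / reference total (errors of omission).
    producers = [d / c if c else float("nan") for d, c in zip(diagonal, col_totals)]
    return overall, kappa, users, producers


overall, kappa, users, producers = accuracy_statistics(TABLE_5)
print(f"Overall accuracy = {100 * overall:.1f}%, kappa = {kappa:.3f}")
for name, ua, pa in zip(CLASSES, users, producers):
    print(f"{name}: UA = {100 * ua:.1f}%, PA = {100 * pa:.1f}%")
```

For these counts the diagonal sums to 381 of 500 points, so the sketch reproduces the overall accuracy of 76.2% and the Kappa coefficient of 0.649 printed with Table 5.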
