## **4. Traditional method**

Of the many classifiers, MD and ML may be the most popular due to their simple theory and availability in almost any image processing or GIS software package. Both classifiers are also recognised as statistical methods.

### **4.1 Minimum distance to mean (MD)**

MD is a non-parametric classifier that makes no assumptions about the distribution of the data for the features of interest. It is computationally simple and fast, requiring only the mean vector of each band from the training data. Candidate pixels are assigned to the class whose sample mean is spectrally closest. This method does not consider class variability; thus, large differences in the variance of the classes often lead to misclassification (Lu et al., 2004). The minimum distance algorithm allocates a pixel by its minimum Euclidean distance to the center of each class. The pixel is assigned to the closest class, or marked as unknown if it is farther than a predefined distance from any class mean. However, a pixel lying on the edge of a class may be closer to the mean of a neighboring class, in which case it will be assigned to that neighboring class (Avelar et al., 2009).
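As a sketch, the allocation rule above can be expressed in a few lines of NumPy; the function name, toy class means, and distance threshold below are illustrative, not taken from the source:

```python
import numpy as np

def minimum_distance_classify(pixels, class_means, max_distance=None):
    """Assign each pixel to the class whose mean vector is closest in
    Euclidean distance. Pixels farther than `max_distance` from every
    class mean are labelled -1 (unknown).

    pixels:      (n_pixels, n_bands) array
    class_means: (n_classes, n_bands) array of training-sample means
    """
    # Distance from every pixel to every class mean: (n_pixels, n_classes)
    d = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    if max_distance is not None:
        labels[d.min(axis=1) > max_distance] = -1  # too far from all classes
    return labels

# Toy example: two class means in a two-band feature space
means = np.array([[10.0, 10.0], [50.0, 50.0]])
px = np.array([[12.0, 9.0], [48.0, 52.0], [200.0, 200.0]])
print(minimum_distance_classify(px, means, max_distance=20.0))  # [ 0  1 -1]
```

Note that only the class means enter the decision, which is exactly why large differences in class variance can cause misclassification.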

### **4.2 Maximum likelihood (ML)**

ML is a parametric classifier that assumes a normal spectral distribution of the data within each class. An equal prior probability among the classes is also assumed. This classifier is based on the probability that a pixel belongs to a particular class. It takes the variability of classes into account by using the covariance matrix; thus, it requires more computation per pixel compared with MD. The ML classifier considers that the geometrical shape of the set of pixels belonging to a class can be described by an ellipsoid. Pixels are grouped according to their position in the influence zone of a class ellipsoid. The probability that a pixel is a member of each class is evaluated, and the pixel is assigned to the class with the highest probability value, or left as unknown if the probability value lies below a pre-defined threshold (Avelar et al., 2009).
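The decision rule can be sketched as follows, assuming normal class distributions with equal priors and class means and covariance matrices already estimated from training data; the function name and toy statistics are illustrative:

```python
import numpy as np

def ml_classify(pixels, class_means, class_covs, min_prob=None):
    """Maximum-likelihood classification assuming each class follows a
    multivariate normal distribution with equal prior probability.
    Pixels whose highest class-conditional density falls below
    `min_prob` are labelled -1 (unknown)."""
    n_pixels, n_bands = pixels.shape
    log_density = np.empty((n_pixels, len(class_means)))
    for k, (mu, cov) in enumerate(zip(class_means, class_covs)):
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        diff = pixels - mu
        # The Mahalanobis term is what encodes the class ellipsoid
        maha = np.einsum('ij,jk,ik->i', diff, inv, diff)
        log_density[:, k] = -0.5 * (n_bands * np.log(2 * np.pi) + logdet + maha)
    labels = log_density.argmax(axis=1)
    if min_prob is not None:
        labels[log_density.max(axis=1) < np.log(min_prob)] = -1
    return labels

# A high-variance class (index 0) and a compact class (index 1):
means = [np.array([0.0, 0.0]), np.array([10.0, 10.0])]
covs = [np.eye(2) * 100.0, np.eye(2) * 1.0]
# The pixel [6, 6] is Euclidean-closer to class 1, yet its likelihood is
# higher under the broad class 0: ML uses the covariance, MD does not.
print(ml_classify(np.array([[6.0, 6.0]]), means, covs))  # [0]
```

The toy case at the end shows where ML and MD disagree: the covariance matrix lets ML favour a spectrally broad class even when another class mean is geometrically closer.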

ML requires training pixels for each class and is therefore dependent on the availability of enough representative spectral training pixels to produce reasonable estimates of the mean class vector (the class spectral signature) and covariance matrix needed by the classification algorithm. For each class, training pixels can be collated from all images of the same resolution to give a pooled training sample set (Lim et al., 2009). When the training samples are limited, inaccurate estimation of the mean vector and covariance matrix often results in poor classification. Traditional pixel-based classification approaches are also limited in the analysis of heterogeneous landscapes and lead to the reported 'salt and pepper' results (Aplin et al., 1999; Lu and Weng, 2007). Therefore, the ML classifier needs more training data to characterize the classes than the other methods (Pignatti et al., 2009).
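The pooled estimation of per-class statistics described above might be sketched like this; the function name and toy arrays are illustrative assumptions:

```python
import numpy as np

def pooled_class_statistics(training_sets):
    """Estimate the mean vector (class spectral signature) and the
    covariance matrix for one class from training pixels collated
    across several images of the same resolution.

    training_sets: list of (n_i, n_bands) arrays, one per image
    """
    pooled = np.vstack(training_sets)      # collate pixels from all images
    mean = pooled.mean(axis=0)
    cov = np.cov(pooled, rowvar=False)     # band-to-band covariance
    return mean, cov

# Two-band training pixels for one class, drawn from two images:
sets = [np.array([[1.0, 2.0], [3.0, 4.0]]), np.array([[5.0, 6.0]])]
mean, cov = pooled_class_statistics(sets)
```

A rule of thumb is that the covariance estimate needs noticeably more pixels per class than there are bands, which is why ML degrades faster than MD when training data are scarce.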
