## 3. Feature subset selection

Feature subset selection is the process of choosing, from all available features, those that are most useful for a particular classification problem. The most popular method for feature reduction in remote sensing is the principal components (PC) transform [6], which maps the original data into a smaller set of new, largely uncorrelated variables, so that a reduced number of variables represents most of the information content of the original set. However, although frequently used, the PC transform is not well suited to feature extraction for classification: it considers only the data distribution, not the classes of interest, and therefore may not produce the optimal subspace for classification. For this reason, we use a genetic algorithm (GA) for feature subset selection [3, 49–54, 57–60, 63–66].
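
As a concrete illustration of PC-based reduction, here is a minimal sketch using scikit-learn (the array shape and the 99% variance threshold are assumptions for illustration, not values from the chapter):

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative feature set: one row of 45 features per pixel.
X = np.random.rand(160_000, 45)

# Keep the fewest components that explain 99% of the variance.
pca = PCA(n_components=0.99)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)

# The projection is fitted from the data alone: class labels are never
# consulted, so class-discriminative but low-variance directions can be
# discarded -- the weakness discussed above.
```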

### 3.1 Genetic algorithms

Computational studies of Darwinian evolution and natural selection have led to numerous models for computer optimization. GAs comprise a subset of these evolution-based optimization techniques, applying selection, mutation, and recombination to a population of competing problem solutions. Because a GA performs a directed rather than an exhaustive search, population members cluster near good solutions; the stochastic component of the search, however, does not rule out wildly different solutions, which may turn out to be better. Given enough time and a well-bounded problem, the algorithm can therefore find a global optimum. This makes GAs well suited to feature selection problems, where they can find near-optimal solutions using little or no prior knowledge.

There are three major design decisions to consider when implementing a GA for a particular problem: a representation for candidate solutions must be chosen and encoded on the GA chromosome, an objective (fitness) function must be specified to evaluate the quality of each candidate solution, and the GA run parameters must be set, including which genetic operators to use (crossover, mutation, selection) and their probabilities of occurrence. Fitness-dependent selection and the application of genetic operators to generate successive generations of individuals are then repeated until a satisfactory solution is found.
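
To make the second design decision concrete, here is a minimal sketch of a fitness function for feature subsets (the k-nearest-neighbour classifier and 3-fold cross-validation are illustrative assumptions; the chapter does not fix these choices here):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def fitness(chromosome, X, y):
    """Score one candidate subset: cross-validated accuracy of a classifier
    trained only on the feature columns the chromosome switches on."""
    mask = np.asarray(chromosome, dtype=bool)
    if not mask.any():          # an empty subset selects nothing
        return 0.0
    # X: one row per training pixel, one column per feature
    # (the scikit-learn convention).
    clf = KNeighborsClassifier(n_neighbors=3)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()
```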

In the problem of feature selection, feature subsets are represented as binary strings, where a value of 1 represents the inclusion of a particular feature in the training process and a value of 0 represents its absence. Since a chromosome is represented as a binary string, the genetic algorithm operates on a pool of binary strings. The mutation and crossover operators work as follows: mutation operates on a single string and generally changes one bit at random, so the string 10010 may, as a consequence of random mutation, become 10110. Crossover on two parent strings produces two offspring; with a randomly chosen crossover position of 2, the strings 01101 and 11000 yield the offspring 01000 and 11101.

If the feature set X obtained using the wavelet-based technique contains 45 features for each pixel of a 400 × 400 image, then X has dimension 45 × 160,000, where each column holds the features of one pixel. Using the GA, the feature set X of size 45 × 400 × 400 is mapped into a new feature set Y of size 17 × 400 × 400. This reduction in the feature set improves both the overall execution speed and the classification accuracy [52]. The classification results for both the full feature set and the optimal feature set are shown in Figure 5.
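
A minimal sketch of the two operators just described, reproducing the worked example and then applying a final chromosome as a feature mask (the mutation rate, the placeholder array `X`, and the particular mask are illustrative assumptions):

```python
import random
import numpy as np

def mutate(bits, p=0.02):
    """Flip each bit independently with probability p.
    E.g. 10010 may become 10110 when the third bit happens to flip."""
    return [b ^ 1 if random.random() < p else b for b in bits]

def crossover(a, b, point):
    """One-point crossover: exchange the parents' tails after `point`."""
    return a[:point] + b[point:], b[:point] + a[point:]

# The worked example above: position 2, parents 01101 and 11000.
c1, c2 = crossover([0, 1, 1, 0, 1], [1, 1, 0, 0, 0], point=2)
print(c1, c2)   # [0, 1, 0, 0, 0] [1, 1, 1, 0, 1] -> 01000 and 11101

# Applying the best chromosome found by the GA: a 0/1 mask over the 45
# wavelet features keeps 17 rows of X, as in the 45 -> 17 mapping above.
X = np.random.rand(45, 160_000)                    # placeholder feature set
best = np.array([1] * 17 + [0] * 28, dtype=bool)   # illustrative mask
Y = X[best, :]                                     # shape (17, 160000)
```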

The accuracy assessments are made using four accuracy indices, namely overall accuracy, producer's accuracy, user's accuracy, and the kappa coefficient, and are listed in Table 1.

Figure 5. Classified output using DB2 with (a) full feature set and (b) optimal feature set.

Table 1. Accuracy indices for various feature sets.

| Number of features | Overall accuracy (%) | Kappa | Producer's accuracy (%) | User's accuracy (%) |
|---|---|---|---|---|
|  | 84.2437 | 0.7717 | 81.8685 | 78.8469 |
|  | 84.534 | 0.7802 | 82.2341 | 79.8645 |
|  | 85.5042 | 0.7898 | 83.1250 | 80.5573 |
|  | 85.2941 | 0.7875 | 82.8616 | 80.1797 |
| **17** | **86.528** | **0.8125** | **84.5967** | **81.8991** |
|  | 85.2941 | 0.7855 | 81.7592 | 79.8890 |
|  | 84.6639 | 0.7772 | 79.9678 | 79.6327 |
|  | 85.5042 | 0.7900 | 81.0822 | 80.2417 |
|  | 85.0840 | 0.7855 | 82.3767 | 80.3079 |
|  | 85.9244 | 0.7958 | 82.0482 | 80.2008 |

Bold: the 17-feature subset gives the maximum overall accuracy, kappa coefficient, producer's accuracy, and user's accuracy.

## 4. Classification

The classification is then carried out using the features obtained so far. We use several different classifiers; a classifier is an algorithm that maps the input data to a specified category.


For classification, a convolutional neural network (CNN) is utilized [2]; the architecture of the CNN is shown in Figure 4. In the convolution layer, features are extracted by applying different filters to the input image. The ReLU layer processes the output of the convolutional layer by setting negative pixel values to zero, leaving the dimensionality of the matrix unchanged. Pooling retains the most important information while reducing the size of the feature map. The same operations are applied to every training sample, resulting in different feature sets.
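
A minimal sketch of this convolution/ReLU/pooling pipeline in PyTorch (the filter counts, kernel sizes, number of input bands, and number of classes are illustrative assumptions, not the configuration shown in Figure 4):

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, in_bands=3, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_bands, 16, kernel_size=3, padding=1),  # convolution: filter the input
            nn.ReLU(),        # negative values -> 0; matrix dimensions unchanged
            nn.MaxPool2d(2),  # keep the strongest responses; halve the feature map
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 100 * 100, n_classes)  # for 400 x 400 input

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# One 3-band 400 x 400 image -> one score per land-cover class.
logits = SimpleCNN()(torch.randn(1, 3, 400, 400))
print(logits.shape)   # torch.Size([1, 5])
```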
