### **4.3. Strategy 3: different training sets using random sampling with replacement over weighted data**

This ensemble strategy is implemented by weighted resampling of the dataset serially, focusing on difficult examples that were not correctly classified in the previous steps (i.e., *boosting*; **Figure 2c**). Boosting helps to decrease the bias of otherwise stable learners such as linear classifiers or univariate decision trees, also known as decision stumps.
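As an illustration of the reweighting idea, the following is a minimal, self-contained AdaBoost-style sketch (not the chapter's implementation): decision stumps are trained on a toy one-dimensional dataset, and misclassified examples gain weight after each round. The dataset and the stump learner are invented for illustration.

```python
import math

# Toy 1-D dataset: (feature, label) pairs with labels in {-1, +1}.
data = [(0.5, 1), (1.5, 1), (2.5, -1), (3.5, 1), (4.5, -1), (5.5, -1)]

def train_stump(weights):
    """Pick the threshold/polarity stump with the lowest weighted error."""
    best = None
    for thr in [x for x, _ in data]:
        for polarity in (1, -1):
            err = sum(w for (x, y), w in zip(data, weights)
                      if (polarity if x <= thr else -polarity) != y)
            if best is None or err < best[0]:
                best = (err, thr, polarity)
    return best

def boost(rounds=3):
    n = len(data)
    weights = [1.0 / n] * n          # start with uniform weights
    ensemble = []                    # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        err, thr, pol = train_stump(weights)
        err = max(err, 1e-10)        # guard against division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Reweight: misclassified points gain weight, correct ones lose it.
        new_w = []
        for (x, y), w in zip(data, weights):
            pred = pol if x <= thr else -pol
            new_w.append(w * math.exp(-alpha * y * pred))
        total = sum(new_w)
        weights = [w / total for w in new_w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * (p if x <= t else -p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

model = boost()
print([predict(model, x) for x, _ in data])  # → [1, 1, -1, 1, -1, -1]
```

After three rounds the weighted ensemble classifies all six points correctly, although the best single stump misclassifies one of them.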

### **4.4. Strategy 4: different algorithms**

The other ensemble strategy (i.e., *voting*; **Figure 2d**) is to use different learning algorithms to train different base learners on the same dataset, so the ensemble includes diverse algorithms, each of which takes a completely different approach. The main idea behind this kind of ensemble learning is to take advantage of the diversity of classification algorithms to face complex data.
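A toy sketch of this strategy, with invented rule-based "algorithms" standing in for real learners such as SVM or naive Bayes: three deliberately different classifiers vote on the same inputs, and the majority decision corrects their individual mistakes.

```python
from collections import Counter

# Three deliberately different (hypothetical) base classifiers for a toy
# task: decide whether an integer is "large" (label 1 means n >= 10).
def threshold_rule(n):      # crude threshold; wrong for n in {10, 11}
    return 1 if n >= 12 else 0

def digit_rule(n):          # counts decimal digits
    return 1 if len(str(abs(n))) >= 2 else 0

def comparison_rule(n):     # direct comparison; wrong for n >= 100 (say)
    return 1 if 10 <= n < 100 else 0

def majority_vote(classifiers, x):
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]

ensemble = [threshold_rule, digit_rule, comparison_rule]
print([majority_vote(ensemble, n) for n in (5, 10, 150)])  # → [0, 1, 1]
```

Each base rule errs somewhere (the threshold rule on 10, the comparison rule on 150), yet the majority vote is correct on all three inputs.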

### **4.5. Characteristics of different ensemble classifiers**

Although ensemble classifiers share a common goal, namely to construct multiple, diverse, and predictive models and finally to combine their outputs, each strategy is carried out in a different way, using different training sets, combiners, or inducers. **Table 2** summarizes the properties of the different ensemble strategies, the popular algorithms under each category, and the pros and cons of each ensemble classifier.

| Algorithm | Training set | Classifiers | Combiner | Inducer | Ensemble strategy | Advantage | Weakness |
|---|---|---|---|---|---|---|---|
| Bagging | Random resampling | Single inducer | Majority voting | Inducer independent | 1 | Minimizes variance | A relatively large ensemble size; loss of cooperation with each other |
| Random forest | Random resampling + feature subset | Single inducer | Majority voting | Inducer dependent (decision tree) | 2 | — | Limited to a single algorithm performance |
| Boosting | Weighted resampling | Single inducer | Weighted majority voting | Inducer independent | 3 | Boosts the performance of the weak learners | Degrades with noise |
| AdaBoost | Weighted resampling | Single inducer | Weighted majority voting | Inducer independent | 3 | Boosts the performance of the weak learners | Degrades with noise |
| Stacking | Resampling and k-folding | Multiinducer | Metalearning | Inducer independent | 1, 4 | Good predictive accuracy | Storage and time complexity |
| Grading | Resampling and k-folding | Multiinducer | Metalearning | Inducer independent | 1, 4 | Predictions are graded | Storage and time complexity |
| Voting | Same dataset | Multiinducer | Majority voting | Inducer independent | 4 | Increases performance; simple to understand and implement | How classifiers are selected |

**Table 2.** Characteristics of different ensemble classifiers.

Ensemble Methods in Environmental Data Mining http://dx.doi.org/10.5772/intechopen.74393
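To make the Combiner column concrete, here is a small hypothetical sketch of how majority voting and weighted majority voting can disagree on the very same set of base-classifier votes (the labels and reliability weights are made up for illustration):

```python
from collections import defaultdict

# Votes from three hypothetical base classifiers for one test instance,
# each paired with an illustrative (made-up) reliability weight.
votes = [("classA", 0.9), ("classB", 0.4), ("classB", 0.4)]

def majority(votes):
    tally = defaultdict(int)
    for label, _ in votes:
        tally[label] += 1               # one classifier, one vote
    return max(tally, key=tally.get)

def weighted_majority(votes):
    tally = defaultdict(float)
    for label, weight in votes:
        tally[label] += weight          # each vote scaled by its weight
    return max(tally, key=tally.get)

print(majority(votes), weighted_majority(votes))  # → classB classA
```

Plain majority voting follows the two weaker classifiers, while weighted majority voting lets the single more reliable classifier win, which is exactly the distinction between the bagging-style and boosting-style combiners in Table 2.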

**Figure 2.** Different ensemble strategies: (a) bagging, (b) random forest, (c) AdaBoost, and (d) voting.

### **4.6. Challenges of ensemble learning in environmental data mining**

Although ensemble-based environmental data mining is helpful because of the advantages indicated in Section 3, there are also challenges that can be overcome once researchers are aware of them. The challenges can be grouped under five main titles: selecting the ensemble strategy, determining a satisfactory architecture, computational cost, the complex nature of environmental data, and finally post-processing:

• *Selecting ensemble strategy*: it is difficult to determine the best ensemble strategy in terms of accuracy, scalability, computational cost, usability, compactness, and speed of classification. Environmental researchers should know how to construct an ensemble model and be aware of the alternative strategies and their advantages and disadvantages. To overcome this problem, environmental data mining is mostly addressed by computer and environmental scientists working together.

• *Determining a satisfactory architecture*: there are two levels of problems in designing an ensemble architecture. First, it is necessary to determine the optimal ensemble size. There are three approaches for determining the ensemble size: (i) preselection of the ensemble size, (ii) selection of the ensemble size while training, and (iii) postselection of the ensemble size (pruning). Second, how are the learning algorithms and their respective parameters selected to construct the best ensemble? The best values for the input parameters of the algorithms should be determined through a number of trials. These problems are fundamentally different and should be solved separately to improve classification accuracy. Furthermore, it is necessary to update the model when new environmental data are acquired, allowing the up-to-date model to change over time.


• *Computational cost*: increasing the number of classifiers usually increases computational cost. To overcome this problem, users may predefine a suitable ensemble size limit, or classifiers can be trained in parallel.
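For the parallel-training remedy, a minimal sketch using Python's standard library; the training routine is a placeholder, and a real one would fit a base classifier on its bootstrap sample:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for an expensive training routine; a real implementation
# would fit one base classifier per resampled training set.
def train_member(seed):
    return {"seed": seed, "trained": True}

ENSEMBLE_SIZE_LIMIT = 4   # a predefined ensemble size cap (illustrative)

# Train all ensemble members concurrently instead of one after another.
with ThreadPoolExecutor() as pool:
    members = list(pool.map(train_member, range(ENSEMBLE_SIZE_LIMIT)))

print(len(members))  # → 4
```

Capping the ensemble size bounds the total cost, and because bagging-style members are trained independently, the pool can dispatch them in parallel without changing the resulting model.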

**5.2. Comparison of ensemble strategies**

Classification accuracies, precision, recall, and F-measure values for the applied algorithms were obtained using tenfold cross-validation. A comparison of the classification accuracies of the applied algorithms for each dataset is displayed in **Figure 3**. Four weak learners (support vector machine (SVM), naive Bayes (NB), decision tree (DT) applied with the C4.5 algorithm, and K-nearest neighbor (KNN)) and four ensemble learners (bagging, random forest (RF), AdaBoost, and voting) were used to construct classification models from environmental data. The base classifier for each ensemble learner was selected as the one that gave the best classification accuracy among the applied weak learners for the respective dataset.

The experimental results were obtained with the optimum parameters (given in **Table 4**) found using grid search. The best parameters of SVM were found for the complexity parameter, *C*, and the exponent value, *E*, of the polykernel in the intervals [10<sup>k</sup> for *k* ϵ {−3, …, 3}] and [1–10], respectively. To model DT, the confidence factor, *C*, for pruning and the minimum number of objects, *M*, per leaf were obtained in the intervals of [0.05–0.95] and [1–10]. The number of neighbors, *N*, for the KNN classifier was selected in the range of [1, 25]. For the RF classifier, the number of randomly chosen attributes, *K*, and the number of iterations to be performed, *I*, were found in the intervals [0–15] and [10–100], respectively. The number of ensemble classifiers for bagging is 10 for each dataset. The weight threshold for weight pruning, *P*, and the number of iterations to be performed, *I*, were selected in the interval [10–100] for the AdaBoost classifier. Voting was performed using the optimum parameters of the SVM, NB, DT, KNN, and RF classifiers.

The objective of this experiment is to demonstrate the success of the ensemble strategies in terms of classification accuracy on environmental data. According to the experimental results, it is apparent that the number of correctly classified instances increases when ensemble strategies are applied. In particular, the AdaBoost classifier provides a significant performance gain compared to the other models. SVM has superiority over the other single learners; hence, most of the ensemble models selected it as the base learner.

**Figure 3.** Comparison of single and ensemble classifiers in terms of classification accuracies.
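The grid search described above can be sketched as an exhaustive scan over a parameter grid. The scoring function below is a made-up stand-in for cross-validated accuracy (the real experiments score each parameter pair with tenfold cross-validation), but the grid itself mirrors the SVM intervals given in the text:

```python
from itertools import product

# Hypothetical scoring function: pretend higher is better and the optimum
# lies at C = 1 (i.e., 10**0) and E = 2; stands in for CV accuracy.
def cv_score(C, E):
    return 1.0 / (1.0 + abs(C - 1.0) + abs(E - 2))

# Grid mirroring the text: C over 10**k for k in {-3, ..., 3}, E over 1..10.
C_grid = [10.0 ** k for k in range(-3, 4)]
E_grid = range(1, 11)

# Exhaustively evaluate every (C, E) pair and keep the best-scoring one.
best_params = max(product(C_grid, E_grid), key=lambda p: cv_score(*p))
print(best_params)  # → (1.0, 2)
```

Swapping `cv_score` for an actual tenfold cross-validation loop turns this into the parameter-selection procedure used for each classifier in Table 4.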

