**2. Active machine learning**

Machine learning (ML) enables a computer (machine) to learn from data. Over the past decades, learning algorithms have found widespread application in areas such as computer vision, object recognition, web search, natural language processing, and emotion recognition. The performance of different ML algorithms strongly depends on the size and structure of the dataset of the domain.

Supervised machine learning learns from the available data (experience), which is given in the form of training data (instances). The knowledge induced from the data can then be used for descriptive or predictive purposes. Supervised learning problems can be categorized as either classification or regression, depending on the output label of the data [2]. Classification problems assign a discrete class label to an input instance, while regression problems predict continuous numeric values. Classification is a function that assigns a new object (or instance) to one of a set of predefined classes [1]. The goal of classification is to accurately predict the target class for each instance in the data.

**Figure 1** shows the workflow of supervised learning. There are two distinct steps: training and prediction. During training, a feature extractor converts each input (training data instance) into a feature set. These feature sets, together with their labels, are fed into the learning algorithm to generate a classifier model. During prediction, the same feature extractor converts unseen inputs into feature sets, which are then fed into the model, and the model generates predicted labels [3, 4].

**Figure 1.** The architecture of a supervised classification.

ML, along with many other disciplines of AI, is gaining popularity and has been applied in numerous fields and industries, including finance, healthcare, education, and psychology. Since learning is an important aspect of intelligent behavior, ML is commonly used for data analysis in psychology and cognitive science. Recently, ML methods have been investigated in experimental psychology and human categorization.

In ML, active learning refers to an approach that selects the queries (instances) to be labeled from a large pool of unlabeled data [5, 6]. In most cases, an active learning algorithm outperforms random sampling and reduces the number of instances necessary to achieve comparable performance. Active learning is often used for problems where it is difficult (expensive and/or time-consuming) to obtain labeled training data [7, 8].

18 Active Learning - Beyond the Future
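The two-step workflow of supervised learning described above can be sketched in a few lines; the feature extractor, the nearest-centroid learner, and the toy data below are illustrative assumptions, not part of the chapter:

```python
# Minimal sketch of the supervised workflow in Figure 1: a feature
# extractor maps raw inputs to feature sets, a learner induces a model,
# and the SAME extractor is reused at prediction time.

def extract_features(text):
    """Convert a raw input into a feature vector (here: simple counts)."""
    return [len(text.split()), text.count("!")]

def train(instances, labels):
    """Induce a nearest-centroid model from labeled feature sets."""
    centroids = {}
    for label in set(labels):
        rows = [extract_features(x) for x, y in zip(instances, labels) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(model, text):
    """Apply the same feature extractor, then pick the nearest centroid."""
    feats = extract_features(text)
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(feats, c))
    return min(model, key=lambda label: dist(model[label]))

# Training step: labeled instances -> feature sets -> classifier model.
model = train(["great! amazing!", "awful!", "it was fine overall I guess"],
              ["excited", "excited", "neutral"])

# Prediction step: unseen input -> feature set -> predicted label.
print(predict(model, "wow! so good!"))  # -> "excited"
```

Any real extractor and learning algorithm could be substituted; the point is that training and prediction share the feature extraction step.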

Semi-supervised learning (SSL) has attracted considerable interest in ML. SSL techniques allow classifiers (learners) to learn from labeled and unlabeled data at the same time [13, 14]. Typically, they are used when a small labeled dataset is accompanied by a large unlabeled dataset. **Figure 2** intuitively shows the difference between supervised and semi-supervised learning. In fact, most real-world learning scenarios are semi-supervised. During the last two decades, SSL methods such as active learning, co-training, and co-testing have significantly improved learning performance in various applications.
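How a learner can exploit unlabeled data can be illustrated with a minimal self-training loop, one classic SSL technique (not one named in the text); the 1-D data and threshold learner are toy assumptions for illustration:

```python
# Self-training sketch of SSL: a learner trained on a few labeled points
# repeatedly labels the unlabeled point it is most confident about and
# retrains on the enlarged labeled set.

def fit_threshold(xs, ys):
    """Fit a 1-D classifier: the midpoint between the two class means."""
    lo = [x for x, y in zip(xs, ys) if y == 0]
    hi = [x for x, y in zip(xs, ys) if y == 1]
    return (sum(lo) / len(lo) + sum(hi) / len(hi)) / 2

labeled_x, labeled_y = [1.0, 9.0], [0, 1]      # small labeled set
unlabeled = [2.0, 3.0, 7.5, 8.5, 5.2]          # large unlabeled set

for _ in range(3):                             # a few self-training rounds
    t = fit_threshold(labeled_x, labeled_y)
    # Confidence = distance from the decision threshold.
    unlabeled.sort(key=lambda x: abs(x - t), reverse=True)
    if not unlabeled:
        break
    x = unlabeled.pop(0)                       # most confident unlabeled point
    labeled_x.append(x)                        # adopt its own predicted label
    labeled_y.append(1 if x > t else 0)

print(round(fit_threshold(labeled_x, labeled_y), 2))
```

Co-training and co-testing follow the same spirit but use two complementary views of the data instead of a single self-confident learner.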

When a machine learning model is trained in the usual way, learning is performed on a random subset of all the available labeled training data. We will refer to this mode of learning as passive learning (PL). In PL, the classifier (learner) does not interact with the teacher [13]. A passive learner receives a random dataset from the world and then produces a classifier or model. Thus, PL is more straightforward and easier to implement.

**Figure 2.** Supervised vs. semi-supervised learning.

Active machine learning (AML) is a popular research area in ML [5, 8, 13]. It selects the most informative instances in the training dataset of the domain for manual labeling. AML aims to produce a highly accurate classifier using as few labeled instances as possible, thereby minimizing the cost of obtaining labeled data [5]. Under this view, AML is a specialized form of SSL, and both aim to reduce the manual labeling workload. We provide a comparison of AML and PL in **Table 1** [5, 13]. Although most research in AML has tended to focus on binary classification problems and on achieving high classification accuracy, some studies have addressed multicategory classification [15]. In AML, the classifier is initially trained on a small set of instances (the labeled pool *L*); it then chooses informative instances from an unlabeled pool *U* and requests their labels from an expert or oracle (e.g., a human annotator) that can, upon request, provide a label for any instance in this pool [16]. This process is repeated until convergence. In this way, only a small number of unlabeled instances need to be labeled to obtain very good performance. **Figure 3** illustrates the AML cycle.

AML is an important technique in machine learning because labeled data is often more difficult and expensive to obtain than unlabeled data [17]. For example, when classifying web pages into categories based on their content, labeled data would likely have to be collected by hand, while unlabeled data can be gathered from the Internet automatically. Multiple studies have proposed AML algorithms and applied them in many applications. They have shown that with AML, ML models require significantly less training data and still perform well without loss of accuracy [4, 8].
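The AML cycle described above (train on *L*, pick the most informative instance from *U*, query the oracle, retrain) can be sketched as follows; the 1-D threshold classifier, the data, and the simulated oracle rule are illustrative assumptions:

```python
# Toy pool-based AML cycle: uncertainty = closeness to the decision
# threshold, and the human annotator is simulated by the rule x > 5.

def fit(xs, ys):
    """Threshold classifier: midpoint between the two class means."""
    lo = [x for x, y in zip(xs, ys) if y == 0]
    hi = [x for x, y in zip(xs, ys) if y == 1]
    return (sum(lo) / len(lo) + sum(hi) / len(hi)) / 2

def oracle(x):
    """Stands in for the expert/annotator who labels queried instances."""
    return 1 if x > 5 else 0

L_x, L_y = [0.0, 10.0], [0, 1]                 # small labeled pool L
U = [1.0, 2.5, 4.4, 4.9, 5.3, 6.0, 9.0]        # unlabeled pool U

for _ in range(3):                             # repeat until budget/convergence
    t = fit(L_x, L_y)
    query = min(U, key=lambda x: abs(x - t))   # most uncertain instance
    U.remove(query)
    L_x.append(query)
    L_y.append(oracle(query))                  # ask the oracle for its label

print(fit(L_x, L_y))
```

Note that the loop spends its three label requests on the instances nearest the decision boundary (4.9, 6.0, 5.3) and never queries the easy points far from it, which is exactly the labeling saving AML promises.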

**Figure 3.** Pool-based AML cycle.

**3. AML selection strategies**

There are a number of AML query selection strategies, which have been presented by Settles [5]: (1) Uncertainty sampling is the simplest and most commonly used strategy. It selects for labeling the instance whose label the classifier is most uncertain about. This strategy can be divided into two categories: maximum entropy of the estimated label and minimum margin (the distance of an instance to the decision boundary). (2) Expected error reduction queries the instance that minimizes the expected error of the classifier. (3) Query by Committee (QBC), in which the most informative instance is the one on which a committee of classifiers disagrees most. Bagging and boosting are used to generate committees of classifiers from the same dataset; both combine a set of weak classifiers to create a single strong classifier. While bagging creates each base classifier independently, boosting allows the classifiers to influence each other during the training process [18]. Boosting is an iterative process that initially assigns equal weight to each training sample; the weights are then modified based on the error rate of the individual classifiers.
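As a sketch of the first and third strategies in this list, the snippet below scores instances by label entropy and margin (uncertainty sampling) and by the vote entropy of a small committee (QBC); the probability vectors and committee votes are made-up examples:

```python
import math

def entropy(probs):
    """Higher entropy = the classifier is less certain about the label."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def margin(probs):
    """Smaller gap between the top two label probabilities = more uncertain."""
    top2 = sorted(probs, reverse=True)[:2]
    return top2[0] - top2[1]

def vote_entropy(votes, committee_size):
    """QBC: disagreement measured as entropy of the committee's votes."""
    counts = {}
    for v in votes:
        counts[v] = counts.get(v, 0) + 1
    return entropy([c / committee_size for c in counts.values()])

# Uncertainty sampling: query the instance with max entropy / min margin.
pool = {"a": [0.9, 0.1], "b": [0.55, 0.45], "c": [0.7, 0.3]}
print(max(pool, key=lambda k: entropy(pool[k])))   # -> "b"
print(min(pool, key=lambda k: margin(pool[k])))    # -> "b"

# QBC with a 4-member committee: the most informative instance is the
# one whose votes are most evenly split.
votes = {"a": [0, 0, 0, 0], "b": [0, 1, 0, 1], "c": [1, 1, 1, 0]}
print(max(votes, key=lambda k: vote_entropy(votes[k], 4)))  # -> "b"
```

For binary problems the two uncertainty measures agree (both pick the instance closest to 50/50), but they can rank instances differently in the multiclass case.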

Human Active Learning


http://dx.doi.org/10.5772/intechopen.81371

Typically, AML approaches select a single unlabeled instance, the one that is most informative at that iteration, and then retrain the classifier. Training in this setting is laborious and time-consuming, and repeated retraining is inefficient. Thus, a batch-mode AML strategy [9, 10] that selects multiple instances at each iteration is more appropriate under these circumstances.

The pool-based method is the most prominent technique in AML, and most recent AML research is pool based, as unlabeled data has become easier to collect [5]. Pool-based AML assumes that the model has access to the entire set of unlabeled data at selection time.

**4. Human AML**

In the real world, learners are not provided with labeled category information for every object they encounter (as in supervised category learning tasks), nor do they receive only unlabeled objects. Humans actually learn concepts in a way similar to SSL: from a limited number of labeled data (e.g., parental labeling of objects) combined with a large amount of unlabeled data (e.g., observation of objects without naming in real life) [14]. In the ML scenario, it is easy to obtain the predictions of the classifier, but it is usually expensive to obtain the actual labels for instances.


| PL | AML |
| --- | --- |
| No control over training instances | Selects training instances (queries) from a pool of unlabeled data |
| Examines the entire training data before inducing a classifier (batch process) | Learner sees one or a subset of instances at a time (iterative process) |
| Large number of required training instances | Relatively small |
| One classifier induced | Many |
| Simple stopping criteria | Complex |

**Table 1.** Comparison of active machine learning and passive learning.
