**4. Human AML**

Humans actually learn concepts in a way similar to SSL: from a limited number of labeled examples (e.g., parental labeling of objects) combined with a large amount of unlabeled data (e.g., observing objects in everyday life without anyone naming them) [14]. In the ML scenario, it is easy to obtain the predictions of a classifier, but it is usually expensive to obtain the actual labels for instances.

In the real world, learners are not provided with labeled category information for every object they encounter (as in supervised category learning tasks), nor do they receive only unlabeled information (as in unsupervised category learning tasks). People use both labeled (with feedback) and unlabeled (without feedback) information when learning categories. In supervised learning, individuals learn categories by correcting their performance based on the feedback they receive; this feedback may be either true or false. Unsupervised category learning, by contrast, provides no feedback about the category of an object: the individual learns from his/her experiences with objects of different categories without ever receiving the correct answer. In this sense, human learning of real-world categories most closely resembles an SSL technique.

Gibson et al. [19] used the equivalences between models found in human categorization and machine learning research to explain how SSL techniques can be applied to human learning. In human AML experiments, participants are usually first shown a small number of labeled instances, followed by a large set of unlabeled instances [20]. A set of experiments conducted by Gibson et al. [19] showed that SSL models are useful for explaining human behavior when both labeled and unlabeled data are used.
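The SSL setting just described can be made concrete: a learner fits a simple model to the few labeled points and then uses its own confident predictions on the unlabeled pool as additional pseudo-labels. The following is a minimal self-training sketch in plain Python; the nearest-centroid model, the data, and the confidence threshold are illustrative assumptions, not taken from any of the studies cited here.

```python
# Minimal self-training sketch: a nearest-centroid classifier on a 1D
# feature, trained from a few labeled points plus many unlabeled ones.
# All data values and the confidence threshold are illustrative.

def centroids(points, labels):
    """Mean feature value per class (classes 0 and 1)."""
    return [
        sum(x for x, y in zip(points, labels) if y == c) /
        max(1, sum(1 for y in labels if y == c))
        for c in (0, 1)
    ]

def predict(x, c0, c1):
    """Assign x to the class with the nearer centroid."""
    return 0 if abs(x - c0) <= abs(x - c1) else 1

def self_train(labeled, unlabeled, rounds=3):
    xs = [x for x, _ in labeled]
    ys = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(rounds):
        c0, c1 = centroids(xs, ys)
        # "Confident" = clearly closer to one centroid than the other.
        confident = [x for x in pool if abs(abs(x - c0) - abs(x - c1)) > 1.0]
        for x in confident:
            xs.append(x)
            ys.append(predict(x, c0, c1))   # pseudo-label
        pool = [x for x in pool if x not in confident]
    return centroids(xs, ys)

# Two labeled examples (feature, class) and many unlabeled features.
labeled = [(1.0, 0), (9.0, 1)]
unlabeled = [0.5, 1.5, 2.0, 2.5, 7.5, 8.0, 8.5, 9.5]
c0, c1 = self_train(labeled, unlabeled)
print(c0, c1)  # centroids refined by the unlabeled pool: 1.5 8.5
```

The unlabeled points sharpen the class centroids well beyond what the two labeled examples alone provide, which mirrors the claim that unlabeled observation improves human category estimates.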




Unlike a machine, a human learner gets tired as he/she answers questions, so finding out whether he/she knows a concept (i.e., obtaining the labels) is usually an expensive task. Using AML for human learning may therefore make it more efficient and effective and hence reduce the cost of teaching [21]: it might be costlier to teach an example to a human than it is to teach it to a computer program.

Although AML has been studied in various domains, such as video annotation and web page classification, its applications to human learning have received very little attention, and only a few empirical studies exist. The first such study [22] showed that humans can use unlabeled data in addition to labeled data in categorization tasks. The authors of [23] provided empirical evidence that human category learning is influenced by unlabeled data in a supervised categorization task, but they did not explain how individuals select these unlabeled examples.

There are a few studies on the applications of AML to the human category learning domain. Castro et al. [12] investigated what they refer to as "human active learning," attempting to answer the research question "Can machine learning be used to enhance human learning?" in the context of human category learning. We consider [16] the most interesting study in the human AML field. Castro et al. showed that humans learn faster and achieve greater performance when they can actively select instances from a pool of unlabeled data instead of sampling randomly, and that their performance is nearly optimal. However, they did not address how humans choose the next best instance. Moreover, they conducted their experiments with human participants in a simple binary classification task, not in a real-life situation. Participants were presented with artificial novel 3D shapes (stimuli) that varied along a single continuous dimension (spiky to smooth) and were given feedback as to which category each stimulus belonged to (see **Figure 4**). The task for each participant was to find the precise egg shape (category boundary) such that any spikier egg would hatch into a snake, while any smoother egg would hatch into a bird. The authors compared the performance of three distinct conditions for each participant. In the active learning condition, the participant could choose specific observations to test his/her beliefs based on previous queries and their noisy labels, whereas in the passive learning (PL) condition the sequence of data was generated randomly by the experimenter. They also included a "Machine-Yoked" condition, in which participants saw sequences of observations created by active learners but had no control over the sequence.

**Figure 4.** Example stimuli used in the experiment, with corresponding values.
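The one-dimensional task used by Castro et al. makes the theoretical advantage of active querying concrete: with reliable labels, an active learner can locate a boundary on a continuum by bisection, shrinking its uncertainty exponentially, while a random (passive) sampler shrinks it only linearly in the number of queries. The sketch below illustrates this with a noise-free oracle; the boundary value, query budgets, and passive-update rule are illustrative assumptions, not Castro et al.'s actual stimuli or analysis.

```python
# Locate an unknown category boundary on [0, 1] by active (bisection)
# queries versus random queries. The oracle returns the true label:
# 1 ("snake") if the stimulus is spikier than the boundary, else 0.
import random

BOUNDARY = 0.6180  # hidden ground truth; illustrative value

def oracle(x):
    return 1 if x > BOUNDARY else 0

def active_estimate(queries):
    """Bisection: each query halves the interval containing the boundary."""
    lo, hi = 0.0, 1.0
    for _ in range(queries):
        mid = (lo + hi) / 2
        if oracle(mid):      # label 1: boundary lies below mid
            hi = mid
        else:                # label 0: boundary lies above mid
            lo = mid
    return (lo + hi) / 2

def passive_estimate(queries, rng):
    """Random sampling: keep the tightest bracket seen so far."""
    lo, hi = 0.0, 1.0
    for _ in range(queries):
        x = rng.random()
        if oracle(x):
            hi = min(hi, x)
        else:
            lo = max(lo, x)
    return (lo + hi) / 2

rng = random.Random(0)
for n in (5, 10, 20):
    a = abs(active_estimate(n) - BOUNDARY)
    p = abs(passive_estimate(n, rng) - BOUNDARY)
    print(f"{n:2d} queries: active error {a:.4f}, passive error {p:.4f}")
```

After n active queries the error is at most 2^-(n+1), whereas the passive bracket shrinks roughly like 1/n, which is the exponential-versus-linear gap the human experiments probe.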

The closest research to the investigation by Castro et al. is [24] and [25], which used a slightly modified procedure. These studies successfully showed that learners benefit from instance selection in category learning. Gureckis and Markant [25] concluded that AML can be superior because it allows humans to use their prior experience and current hypotheses to select the most helpful instances (e.g., asking a question about something that is especially confusing).
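Selecting "the most confusing" item is, in ML terms, uncertainty sampling: from the pool, query the instance whose predicted label probability is closest to 0.5. A minimal sketch with a hand-rolled logistic model follows; the pool values and model weights are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def most_uncertain(pool, w, b):
    """Return the pool item whose predicted P(class=1) is closest to 0.5,
    i.e., the item the current model is most confused about."""
    return min(pool, key=lambda x: abs(sigmoid(w * x + b) - 0.5))

# Current model: decision boundary at x = 5 (where w*x + b = 0).
w, b = 1.0, -5.0
pool = [0.2, 1.0, 4.8, 5.3, 9.0]
print(most_uncertain(pool, w, b))  # picks 4.8, the item nearest the boundary
```

Items far from the boundary (0.2 or 9.0) are already classified with near certainty and would teach the learner little, which is exactly the intuition behind asking about what is most confusing.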

The work in [24] by the same authors examined the interaction of self-directed information selection and category learning. Self-directed learning in humans can be inspired by "active learning" research in the ML literature [25, 26]. In this study, participants learned about two categories of "antennas" that varied along two dimensions (circles differing in size and in the orientation of a central line segment) and received one of two television stations (CH1 or CH2). The authors compared an active learning (or self-directed learning) condition, in which participants designed the stimuli they wanted to learn about, with a passive condition, in which instances were generated from predefined distributions. Their results showed that for simple one-dimensional rules, active learners acquired the correct category rule faster than passive learners. However, the AML advantage only held for the less complex, rule-based category.

Sim et al. [27] showed that school-age children learn more effectively when they are allowed to decide what information they wish to gather than peers who could only observe samples randomly generated for them. These results lead to the conclusion that children are capable of learning from data they generate themselves, even at an early age. They also suggest that the children's information gathering was informed by uncertainty and previous feedback, leading them to sample items near the true category boundary. This result was successfully replicated in [24]. Adams and Kachergis demonstrated the effectiveness of AML for preschoolers, who used an informative sampling strategy in an active category learning task; the authors suggest that children's performance in the AML task is related to their early math and preliteracy skills.

Kachergis et al. [28] investigated whether AML is better than PL in a cross-situational word learning context. They also examined learners' strategies and found that most learners use immediate repetition to disambiguate pairings.

Research in computer science on computationally efficient AML has inspired new theoretical approaches to inquiry behavior in humans.
