**2. Learning from demonstration**

Each neuron is formed by a body or soma, an axon that sends information and thousands of dendrites for receiving data. The more dendrite connections there are, the more learning there will be. It can be seen that, at its basic functional level, learning is without a doubt a complex activity. It involves not only the brain cells and their connections but also factors such as attention, memory, motivation and stress. There are even different ways in which learning can occur, such as empiricism, innatism and constructivism [1].

In humans, learning is not only directly tied to the number of connections between neurons but is also influenced by external factors related to the subject's state. There is a need for a learning entity free of unnecessary passions and with no underpinnings that could hinder or limit its learning ability. Machine learning is the response to this paradigm: several algorithms have arisen with the main objective of making computers learn, as can be seen in [2], even though they may have different particular objectives according to their tasks. Machine learning algorithms have already demonstrated their competence at learning in different engineering and science problems [3–5]. However, there is still no algorithm that is more versatile and generalised, i.e., a "do-it-all" algorithm that can be used in several situations no matter the nature of the task itself.

In machine learning, there is no single algorithm that can solve all problems [6]. To address this, a machine learning algorithm is created from three principles: Learning from Demonstration, Reinforcement Learning and Artificial Immune Systems. The aim is to obtain an algorithm with several advantages drawn from the mixture of those techniques. The resulting algorithm keeps all the advantages of the techniques used, plus some of the Artificial Immune System characteristics from **Table 2**. **Table 1** shows the advantages of CODA.

This document is organised as follows. Section 1 contains a discussion of the theory on how learning from demonstration, reinforcement learning and Artificial Immune Systems have been used to develop the CODA algorithm, in order to help the reader understand how useful these methods are for the algorithm presented in Section 3.

254 Recent Advances in Robotic Systems

| **CODA's advantages** |
| --- |
| Low training samples |
| Short training time |
| Produces new knowledge |
| Unnecessary/unused data is deleted |
| Reward function is task specific |
| Reward function assures search for maximum reward |
| Self-organised memory mechanism reduces searching time in repertoire |

**Table 1.** Advantages that characterise the CODA algorithm.

Defining Learning from Demonstration (LfD) should not be difficult. In [7], a simple yet complete sentence clarifies the concept: Learning from Demonstration is learning from watching a demonstration of the task to be performed. Embedded in this sentence is the main goal: to learn a skill from demonstrative examples. Learning from demonstration is also known as "programming by demonstration", "imitation learning" or "teaching by showing". As the second name suggests, another goal is to replace programming that would be time-consuming and would require a specialised person to modify the programs within the robot. Programming by demonstration promises an automatic self-programming process in which the robot is simply shown the task to perform.

Learning from demonstration is not a new topic. It is a well-known discipline in robotics and has been studied in [8–12], where the approaches required direct teaching so that the entity could imitate certain human actions. Recent studies focus on developing learning-from-demonstration theory in order to produce better systems that lead to better demonstration and feedback, resulting in better teaching and learning [13]. Other studies propose a system using a Bayesian nonparametric reward in order to assign rewards to subgoals (more than one reward function) instead of a single reward for the whole task [14]. Still other studies focus on enhancing the quality of the demonstrations in order to decrease the number of demonstrations needed and learn more effectively [15].

Learning from demonstration is commonly applied in robotics, where the robot assumes the student role and the human is the expert teacher. The goal is to demonstrate the task to the robot so that it learns from watching the demonstration and can use the skill when necessary. Once a robot has acquired a certain skill, it could also teach other robots.
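The teacher–student setting described above can be sketched in code. The following toy is illustrative only and is not the CODA algorithm: the teacher's demonstration is recorded as (state, action) pairs, and the learner reproduces the skill by recalling the action of the nearest demonstrated state (a minimal form of behavioural cloning). The states, actions and function names are invented for this example.

```python
# Minimal learning-from-demonstration sketch (nearest-neighbour behavioural
# cloning). Not the CODA algorithm -- a hypothetical toy for illustration.

def record_demonstration():
    """Hypothetical teacher trace: state is a 1-D position, action is a move."""
    return [(0.0, "forward"), (1.0, "forward"), (2.0, "turn_left"), (3.0, "stop")]

def nearest_neighbour_policy(demo, state):
    """Pick the action whose demonstrated state is closest to the query state."""
    _, action = min(demo, key=lambda pair: abs(pair[0] - state))
    return action

demo = record_demonstration()
print(nearest_neighbour_policy(demo, 1.8))  # closest demo state is 2.0 -> "turn_left"
print(nearest_neighbour_policy(demo, 0.2))  # closest demo state is 0.0 -> "forward"
```

The robot never receives explicit program code: it generalises from the recorded examples, which is the essence of programming by demonstration.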

### **2.1. Learning**

Learning supposes the generalisation of a task. Suppose you have to learn what a bird looks like. The first step would probably be to list the characteristics of a bird: you should have a list describing the wings, eyes, feathers, tail and beak. Without focusing on specifics such as species, geography or colour, we can say that animals fitting the description can be called birds. The list is a guideline that lets you distinguish a bird from any other animal or object.

Once we have learned the characteristics that are uniquely assigned to certain things, it becomes easy to distinguish the objects and animals around us. They can be clustered in simple terms such as, in this particular case, a bird. Our brain learns to generalise tasks in order to obtain the desired result every time the action is performed, regardless of the variability of the environment at each try.
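The bird checklist above can be expressed as a simple feature-based classifier: anything exhibiting every feature on the learned list is classed as a bird, regardless of extra attributes such as colour. The feature names here are invented for illustration.

```python
# Feature-list generalisation sketch: an observation counts as a bird when it
# matches every feature on the learned checklist. Feature names are invented.

BIRD_FEATURES = {"wings", "feathers", "beak", "tail", "eyes"}

def is_bird(observed_features):
    """Generalise: the learned checklist must be a subset of what is observed."""
    return BIRD_FEATURES <= set(observed_features)

print(is_bird({"wings", "feathers", "beak", "tail", "eyes", "red_colour"}))  # True
print(is_bird({"wings", "fur", "tail", "eyes"}))  # False (e.g. a bat)
```

Extra observed features (the red colour) do not break the match, which mirrors how generalisation ignores variability that is irrelevant to the learned concept.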

Once a robot is able to reproduce a certain skill with the least possible error, or even no error, it can be said that the robot has learnt the skill correctly, and therefore that it has a certain embedded intelligence that lets it learn.

Robots can be controlled by different methods and techniques that have been used for several years and that produce excellent results under certain conditions and applications [16, 17]. But it is important to note that robots are also being used outside of factories, in applications and environments that require high adaptability, reliability and constant learning. These may demand robots that are capable of handling uncertainty and variability in fast, dynamic environments.

It can now be stated that learning is a valuable and almost necessary skill for robots, a need foreseen since Alan Turing wrote *Computing Machinery and Intelligence*, which concludes that "*We can only see a short distance ahead, but we can see plenty there that needs to be done*" [18].
