**2.1 Learning from demonstration**

This stage is divided into three steps: demonstration, segmentation, and recognition.

**Demonstration**

Assumptions. We assume that 1) the human teachers are well trained, 2) corresponding parts of different demonstrations have similar dynamics, and 3) all demonstrations are composed of the same number of behaviors.

$$P\_{rs} = \text{forward kinematics}(\theta\_{rs})\tag{1}$$
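Equation (1) maps recorded joint angles to an end-effector position. As an illustration only, a planar two-link forward-kinematics function might look like the following; the two-joint structure, function name, and link lengths are my assumptions, not the kinematics of the robot described here:

```python
import numpy as np

def forward_kinematics_2link(theta, l1=1.0, l2=1.0):
    """Illustrative Eq. (1): end-effector position from joint angles.

    theta: (t1, t2) joint angles in radians for a planar 2-link arm.
    l1, l2: link lengths (hypothetical values).
    """
    t1, t2 = theta
    x = l1 * np.cos(t1) + l2 * np.cos(t1 + t2)
    y = l1 * np.sin(t1) + l2 * np.sin(t1 + t2)
    return np.array([x, y])
```

With both joints at zero the arm is fully extended along x, so `forward_kinematics_2link([0.0, 0.0])` returns `[2.0, 0.0]` for unit link lengths.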

$$p\_o(\mathbf{x}, \mathbf{y}) = \frac{\sum\_{l=0}^{t} p\_{ol}}{t+1} \tag{2}$$

$$w\_o(\mathbf{x}, \mathbf{y}) = \left\{ \sqrt{\frac{\sum\_{l=0}^{t} \left(p\_{ol}(\mathbf{x}) - p\_o(\mathbf{x})\right)^2}{t}}, \sqrt{\frac{\sum\_{l=0}^{t} \left(p\_{ol}(\mathbf{y}) - p\_o(\mathbf{y})\right)^2}{t}} \right\} \tag{3}$$

$$P\left(p\_{o(t+1)}|p\_o, w\_o\right) = \sqrt{\left(p\_{o(t+1)} - p\_o(\mathbf{x},\mathbf{y})\right)^T \left(p\_{o(t+1)} - p\_o(\mathbf{x},\mathbf{y})\right)}\tag{4}$$

$$P(p\_{o(t+1)}|p\_o, w\_o) > 3|w\_o|\tag{5}$$
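The segmentation test in Eqs. (2)–(5) can be sketched as follows: maintain the running mean and standard deviation of the observed object position, and flag the first frame whose distance from the mean exceeds three standard deviations. The function name, input format, and the `min_frames` warm-up guard are my own illustrative choices:

```python
import numpy as np

def detect_motion_onset(positions, min_frames=5):
    """Flag the first frame where the object moves beyond 3 sigma.

    positions: (T, 2) array of object (x, y) positions per frame.
    Returns the index of the first outlier frame, or None if the
    object never moves. Illustrative sketch of Eqs. (2)-(5).
    """
    positions = np.asarray(positions, dtype=float)
    for t in range(min_frames, len(positions) - 1):
        window = positions[: t + 1]
        mean = window.mean(axis=0)            # Eq. (2): running mean
        std = window.std(axis=0, ddof=1)      # Eq. (3): running std dev
        dist = np.linalg.norm(positions[t + 1] - mean)  # Eq. (4): distance
        if dist > 3 * np.linalg.norm(std):    # Eq. (5): 3-sigma test
            return t + 1
    return None
```

For a stationary object followed by a sudden displacement, the function returns the index of the first displaced frame; for a trajectory with no motion it returns `None`.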

$$m(\mathbf{c}) = \frac{\sum\_{l=1}^{n} |\max(\mathbf{x}\_l) - \min(\mathbf{x}\_l)|}{n} \tag{6}$$

$$w(\mathbf{c}) = \sqrt{\frac{\sum\_{l=1}^{n} \left( |\max(\mathbf{x}\_l) - \min(\mathbf{x}\_l)| - m(\mathbf{c}) \right)^2}{n - 1}} \tag{7}$$

$$p(\mathbf{c}) = w(\mathbf{c}) / m(\mathbf{c}) \tag{8}$$

$$p(\mathbf{s}) = 1 - p(\mathbf{c}) \tag{9}$$
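Eqs. (6)–(9) score a set of trajectories by the spread of their peak-to-peak ranges. A sketch with variable names mapped onto the symbols in the equations; the list-of-arrays input format and the function name are my assumptions:

```python
import numpy as np

def range_scores(trajectories):
    """Illustrative Eqs. (6)-(9) over n 1-D trajectories x_l.

    Returns (m, w, p_c, p_s), following the symbols in the text.
    """
    # |max(x_l) - min(x_l)|: peak-to-peak range of each trajectory
    ranges = np.array([np.ptp(x) for x in trajectories])
    m = ranges.mean()            # Eq. (6): mean range m(c)
    w = ranges.std(ddof=1)       # Eq. (7): std dev of ranges w(c)
    p_c = w / m                  # Eq. (8)
    p_s = 1.0 - p_c              # Eq. (9)
    return m, w, p_c, p_s
```

When every trajectory has the same range, the spread `w` is zero, so `p_c = 0` and `p_s = 1`.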

$$P\_n(\mathbf{t}) = \frac{P\_{demo1}(\mathbf{t}) + P\_{demo2}(\mathbf{t}) + P\_{demo3}(\mathbf{t})}{3}\tag{10}$$
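Eq. (10) is a pointwise average of the demonstrations at each time step. A sketch, assuming the demonstrations have already been time-aligned and resampled to a common length:

```python
import numpy as np

def average_demos(demos):
    """Illustrative Eq. (10): pointwise average of aligned demonstrations.

    demos: (k, T, d) array-like of k demonstrations, each with T samples
    of d-dimensional positions. Assumes the demos share the same length T.
    """
    return np.mean(np.asarray(demos, dtype=float), axis=0)
```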

$$
\tau \dot{\mathbf{z}} = \alpha\_{\mathbf{z}} (\beta\_{\mathbf{z}} (\mathbf{g} - \mathbf{y}) - \mathbf{z}) \tag{11}
$$

$$\tau\dot{\mathbf{y}} = \mathbf{z} + f\tag{12}$$
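Eqs. (11)–(12) form the DMP transformation system. A minimal Euler-integration sketch in one dimension; the default gains and time step here are illustrative choices of mine (not the constants used in the text), and the forcing term `f` defaults to zero so the output simply converges to the goal `g`:

```python
import numpy as np

def dmp_rollout(y0, g, tau=1.0, alpha_z=25.0, beta_z=6.25,
                dt=0.01, steps=1000, f=None):
    """Euler-integrate the DMP transformation system, Eqs. (11)-(12).

    tau * z_dot = alpha_z * (beta_z * (g - y) - z)   # Eq. (11)
    tau * y_dot = z + f                              # Eq. (12)
    f: optional callable f(t) giving the forcing term; zero if None.
    Gains and step size are illustrative, not taken from the text.
    """
    y, z = float(y0), 0.0
    path = [y]
    for i in range(steps):
        ft = 0.0 if f is None else f(i * dt)
        z += dt * alpha_z * (beta_z * (g - y) - z) / tau  # Eq. (11)
        y += dt * (z + ft) / tau                          # Eq. (12)
        path.append(y)
    return np.array(path)
```

With `f = 0` the system behaves as a damped spring pulling `y` from its start toward `g`; a learned forcing term shapes the transient to reproduce the demonstrated dynamics.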

Implementation of a Framework for Imitation

Fig. 8. Recorded behavior 1 (panel Demo1; trajectory plotted as Position X (mm) vs. Position Y (mm)).

Fig. 9. Recorded behavior 2.


(Fig. 9 panels Demo2 and Demo3: trajectories plotted as Position X (mm) vs. Position Y (mm).)
$\dot{\mathbf{y}}$ is the generated velocity, correspondingly. $\alpha\_{\mathbf{z}}$, $\beta\_{\mathbf{z}}$, and $\tau$ are constants in the equation. Following the original DMP paper, $\alpha\_{\mathbf{z}}$, $\beta\_{\mathbf{z}}$, and $\tau$ are chosen heuristically as 1, ¼, and 1 to achieve convergence.

When the position of the Knight is given, the CEA generates a new trajectory whose dynamics are similar to the demonstration of behavior 1.

After grasping, behavior 2 is generated by strictly following the behavior obtained in the section above.
