**4. Users, poses, gestures and robot behaviour adaptation**

This section describes methods for adapting to new users, poses, gestures and robot behaviours in human-robot interaction. Suppose the robot is fixed in the same room under the same lighting conditions; in that case only the user's skin color dominates the color-based face and hand segmentation method, so it is essential for the system to cope with different persons. A new user may not have been included in the system during the training phase, in which case the person should be added through an on-line registration process. The user may also want to perform new gestures that have never been used by another person or by himself; in that case the system should include the new poses with minimal user interaction. The system learns new users and new poses using a multi-clustering approach with minimal user interaction. To adapt to new users and new hand poses, the system must be able to perceive and extract relevant properties from the unknown faces and hand poses, find common patterns among them, and formulate discrimination criteria consistent with the goals of the recognition process. This form of learning is known as clustering, and it is the first step in any recognition process where the discriminating features of the objects are not known in advance [Patterson, 1990]. Subsection 4.1 describes the multi-clustering based learning method.

#### **4.1 Multi-clustering based learning method**

Figure 10 shows the conceptual hierarchy of face and hand-pose learning using the multi-cluster approach. A pose Pi may include a number of clusters, and each cluster Cj may include a number of images (X1, X2, …, Xo) as members of that cluster.

Fig. 10. Multi-cluster hierarchies for object classification and learning

The clustering method is described by the following steps:

Step 1. Generate eigenvectors from training images that include all the known hand poses [Turk, 1991].
Step 2. Sort the eigenvalues from high to low and select the m eigenvectors corresponding to the highest eigenvalues; these selected eigenvectors are regarded as the principal components.
Step 3. Read the initial cluster image database (initialized with the known cluster images) and the cluster information table that holds the starting pointer of each cluster. Project each image onto the eigenspaces and form feature vectors using equations (1) and (2),

$$o_i^j = \left(u_i\right)^T T_j \tag{1}$$

$$\Omega_j = \left[o_1^j, o_2^j, \dots, o_m^j\right] \tag{2}$$

where $u_i$ is the $i$-th eigenvector and $T_j$ is an image (60×60) in the cluster database.

Step 4. Read the unlabeled images that should be clustered or labeled.
a. Project each unlabeled image onto the eigenspaces and form its feature vector ($\Omega$) using equations (1) and (2).
b. Calculate the Euclidean distance to each image in the known (clustered) dataset using equations (3) and (4),

$$\varepsilon_j = \left\lVert \Omega - \Omega_j \right\rVert \tag{3}$$

$$\varepsilon = \min_j \left\{ \varepsilon_j \right\} \tag{4}$$

Step 5. Find the nearest cluster (the $j$ that minimizes $\varepsilon_j$):
a. If (Ti <= ε <= Tc), then add the image to that neighbouring cluster and increment the insertion parameter.
b. If (ε < Ti), then the image is recognizable and there is no need to include it in the cluster database.
Step 6. If the insertion rate into the known clusters is greater than zero, then update the cluster information table that holds the starting pointer of all clusters.
Step 7. Repeat step 3 to step 6 until the insertion rate (α) into the known (training) clusters is zero (0).
Step 8. If the insertion rate is zero, then check the unlabeled dataset for the images that satisfy the condition (Tc < ε <= Tf), where Tf is the threshold defined for discarding an image.
Step 9. If the number of unlabeled images (for a class) is greater than N (a predefined value), then select one image (based on the minimum Euclidean distance) as the member of a new cluster, and update the cluster information table.
Step 10. Repeat from step 3 to step 9 until the number of unlabeled images is less than N.
Step 11. If the height of a cluster (the number of member images in the cluster) is greater than L, then add it as a new cluster.
Step 12. After clustering the poses, the user defines the associations of the clusters in the knowledge base. Each pose may be associated with multiple clusters; an undefined cluster has no association link.
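To make the control flow of steps 4 to 10 concrete, a minimal Python sketch follows. It assumes NumPy, an eigenvector matrix `U` with one selected eigenvector per column (e.g. from a PCA as in [Turk, 1991]), and thresholds Ti < Tc < Tf; the function and variable names are illustrative, and the chapter's flat member list with a pointer table is simplified here to a list of lists.

```python
import numpy as np

def project(U, image):
    """Eq. (1)-(2): project a flattened 60x60 gray image onto the
    selected eigenvectors (columns of U) to get a feature vector."""
    return U.T @ np.asarray(image, dtype=float).ravel()

def nearest(omega, clusters):
    """Eq. (3)-(4): minimum Euclidean distance to any clustered image,
    together with the index of the cluster that contains it."""
    best_eps, best_c = np.inf, -1
    for c, members in enumerate(clusters):
        for member in members:
            eps = np.linalg.norm(omega - member)
            if eps < best_eps:
                best_eps, best_c = eps, c
    return best_eps, best_c

def learn(U, clusters, unlabeled_images, Ti, Tc, Tf, N):
    """Steps 4-10, working directly in feature space."""
    pending = [project(U, img) for img in unlabeled_images]
    while len(pending) > N:                          # step 10
        insertions, candidates = 0, []
        for omega in pending:
            eps, c = nearest(omega, clusters)
            if eps < Ti:                             # step 5b: recognizable,
                continue                             #   nothing to store
            if eps <= Tc:                            # step 5a: absorb into
                clusters[c].append(omega)            #   the neighbour cluster
                insertions += 1                      # step 6
            elif eps <= Tf:                          # step 8: may seed a
                candidates.append((eps, omega))      #   new cluster
            # eps > Tf: discard the image (step 8)
        candidates.sort(key=lambda t: t[0])          # min distance first
        if insertions == 0 and candidates:           # steps 7-9: seed a
            clusters.append([candidates.pop(0)[1]])  #   new cluster
        pending = [omega for _, omega in candidates]
    return clusters
```

Each pass either absorbs at least one image into a known cluster or promotes the closest remaining candidate to a new cluster, so the loop always terminates.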


#### **4.2 Face recognition and user adaptation**

We have already mentioned that the robot should be able to recognize and remember people and learn about them [Aryananda, 2002]. If a new user comes in front of the robot's eye camera or the system camera, the system identifies the person as unknown and asks for registration. The face is first detected from the cluttered background using multiple feature-based approaches [Hasanuzzaman, 2007]. The detected face is filtered to remove noise and normalized so that it matches the size and type of the training images [Hasanuzzaman, 2006]: it is scaled to a square image of 60x60 pixels and converted to gray scale. The face pattern is then classified with the eigenface method [Turk, 1991] as belonging to either a known or an unknown person. The eigenvectors are calculated from the known persons' face images for all face classes, and the m eigenvectors corresponding to the highest eigenvalues are chosen to form the principal components for each class. The Euclidean distance is determined between the weight vectors generated from the training images and the weight vector generated from the detected face by projecting them onto the eigenspaces. If the minimal Euclidean distance is less than the predefined threshold value, the person is known; otherwise the person is unknown. For an unknown person, the learning process is activated based on the judge function and the system learns the new user using the multi-clustering approach [Section 4.1]. The judge function is based on the ratio of the number of unknown faces to the total number of detected faces within a specific time slot. The learning function develops a new cluster or clusters corresponding to the new person. The user defines the person's name and skin color information in the user profile knowledge base and associates them with the corresponding clusters. For a known user, person-centric skin color information (Y, I, Q components) is used to reduce the computational cost.
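As an illustration of this known/unknown decision, here is a minimal eigenface-style sketch in Python (NumPy assumed). The mean-face subtraction follows the standard eigenface formulation [Turk, 1991]; the chapter does not spell out that detail, and all function and parameter names here are illustrative.

```python
import numpy as np

def classify_face(face, U, mean_face, known_weights, threshold):
    """Return the name of the matching person, or None for an unknown face.
    face:          60x60 gray image (NumPy array)
    U:             matrix of m eigenfaces, one per column
    mean_face:     average training face (flattened), subtracted before
                   projection as in the standard eigenface method
    known_weights: dict mapping person name -> (n_images x m) array of
                   weight vectors of that person's training images
    """
    w = U.T @ (face.astype(float).ravel() - mean_face)   # project onto eigenspace
    best_name, best_dist = None, np.inf
    for name, W in known_weights.items():
        dist = np.linalg.norm(W - w, axis=1).min()       # closest training image
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else None  # None -> judge function
```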

| Cluster (starting pointer) | Association |
|---|---|
| FC1 (1), FC2 (12), FC3 (17), FC4 (27), FC5 (44) | Person-1 |
| FC6 (53), FC7 (68), FC8 (82), FC9 (86), FC10 (96), FC11 (106-117) | Person-2 |

Fig. 11. Example output of multi-clustering algorithm for learning new user (member images omitted)

Figure 11 shows an example output of the multi-clustering approach for recognizing and learning a new user. For example, the system is initially trained with the face images of person_1 (100 face images covering five head directions). The system learns and remembers this person using five clusters (FC1, FC2, FC3, FC4, FC5), shown in the first row of Figure 11, and the contents of the cluster information table (which holds the starting position of each cluster and the end position of the last cluster) are [1, 12, 17, 27, 44, 52]. In the case of face classification, if a face image matches a known member between positions 12 and 16, it is classified into face cluster_2 (FC2); if the face is classified into any of these five clusters, the person is identified as person_1. Suppose the system is then presented with 100 face images of a new person: it cannot identify the person and activates the learning function. The system develops six new clusters (FC6, FC7, FC8, FC9, FC10, FC11) and updates the cluster information table to [1, 12, 17, 27, 44, 53, 68, 82, 86, 96, 106, 116], as shown in the second row of Figure 11. The new user is registered as person_2 and the associations with the clusters are defined. Whenever a detected face image is classified into a known cluster, the corresponding person is identified.
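The lookup from a matched member image to its cluster and person can be sketched directly from the running example; the numbers below are the table contents quoted above, while the function names are our own illustration.

```python
import bisect

# Starting pointers of FC1..FC11 plus the end position of the last cluster,
# exactly as in the Figure 11 example.
cluster_table = [1, 12, 17, 27, 44, 53, 68, 82, 86, 96, 106, 116]
associations = {f"FC{i}": ("person_1" if i <= 5 else "person_2")
                for i in range(1, 12)}

def cluster_of(match_index):
    """Map the index of the best-matching member image to its cluster;
    e.g. indices 12..16 fall between table entries 12 and 17, i.e. FC2."""
    c = bisect.bisect_right(cluster_table, match_index) - 1
    if 0 <= c < len(cluster_table) - 1:
        return f"FC{c + 1}"
    return None

print(cluster_of(14))                 # -> FC2
print(associations[cluster_of(14)])   # -> person_1
```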

#### **4.3 Hand pose classification and adaptation**

It is difficult for a machine to understand new poses without prior knowledge, so it is essential to learn new poses based on a specific judge function or predefined knowledge. The judge function determines the user's intention, i.e., the intention to create a new gesture; it is based on the ratio of the number of unknown hand poses to the total number of hand poses within a specific time slot. For example, if the user shows the same hand pose continuously for 10 image frames and the pose is unknown to the system, this means he/she wants to use it as a gesture (a sketch of the judge function is given after Figure 12). In the proposed system the hand poses are classified using the multi-cluster based learning method. For an unknown pose, the learning function is activated based on the judge function and the system learns the new pose: the learning function develops a new cluster or clusters corresponding to the pose, and the user defines the pose name in the knowledge base and associates it with the corresponding cluster. If a pose is identified, the corresponding frame is activated. Figure 12 presents an example output of learning new poses using the multi-cluster approach. The system is first trained with the pose 'ONE', forming one cluster for that pose. It is then trained with the pose 'FISTUP', which forms another two clusters because the user uses both hands for that pose. Other clusters are formed similarly for each new pose.

| Cluster (starting pointer) | Associated pose |
|---|---|
| PC1 (1) | ONE |
| PC2 (12) | FISTUP |
| PC3 (21) | FISTUP |
| PC4 (27) | OK |
| PC5 (33) | TWO |
| PC6 (39-54) | TWO |

Fig. 12. Example output of multi-clustering approach for learning new pose (member images omitted)
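The judge function can be sketched as a small sliding-window check. The 10-frame slot follows the chapter's example; the ratio threshold and all names are assumed parameters of this illustration, not the chapter's implementation.

```python
from collections import deque

class JudgeFunction:
    """Activate learning when the ratio of unknown poses to all observed
    poses within the last `slot` frames reaches `ratio`."""

    def __init__(self, slot=10, ratio=1.0):
        self.window = deque(maxlen=slot)
        self.slot = slot
        self.ratio = ratio

    def observe(self, pose_is_unknown):
        """Record one frame; return True when learning should start."""
        self.window.append(bool(pose_is_unknown))
        if len(self.window) < self.slot:
            return False                    # not enough evidence yet
        return sum(self.window) / self.slot >= self.ratio

# Hypothetical usage inside the per-frame recognition loop:
# judge = JudgeFunction()
# if judge.observe(recognized_pose is None):
#     start_multi_cluster_learning()        # Section 4.1
```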

#### **4.4 Gesture recognition and adaptation**

The recognition of a gesture is carried out in two phases. In the first phase, the face and hand poses are classified from the captured image frame using the method described in the previous section. Then the combination of poses is analyzed to identify the occurrence of a gesture. For example, if a left hand palm, a right hand palm and one face are present in the input image, it is recognized as the "TwoHand" gesture [Figure 19(a)] and the corresponding gesture frame is activated. Interpretation of an identified gesture is user-dependent, since the meaning of a gesture may differ from person to person based on their culture. For example, when user 'Hasan' comes in front of 'Robovie's' eyes, 'Robovie' recognizes the person and says "Hi Hasan! How are you?"; 'Hasan' then raises his 'Thumb up' and 'Robovie' replies "Oh! You are not fine today". In the same situation with another user, 'Cho', 'Robovie' says "Hi, you are fine today?". That means 'Robovie' can understand the person-centric meaning of a gesture. To accommodate different users' desires, our person-centric gesture interpretation is implemented using a frame-based knowledge representation approach. The user predefines these frames in the knowledge base with the necessary attributes (gesture components, gesture name) for all predefined gestures. Our current system recognizes 11 static gestures. These are: 'TwoHand' (raise left hand and right hand palms), 'LeftHand' (raise left hand palm), 'RightHand' (raise right hand palm), 'One' (raise
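The frame-based lookup described above can be sketched as two small tables: one mapping pose combinations to gesture names, and one mapping (user, gesture) pairs to person-centric responses. The gesture and user names follow the chapter; the data structures and function names are our own illustration, not the system's actual knowledge base.

```python
# Gesture frames: the pose components that must co-occur in one image frame.
GESTURE_FRAMES = {
    # checked most-specific first, so "TwoHand" wins over the one-hand frames
    "TwoHand":   {"left_palm", "right_palm", "face"},
    "LeftHand":  {"left_palm", "face"},
    "RightHand": {"right_palm", "face"},
}

# Person-centric meaning frames, reproducing the 'Robovie' dialogue above.
PERSON_CENTRIC = {
    ("Hasan", "ThumbUp"): "Oh! You are not fine today",
    ("Cho",   "ThumbUp"): "Hi, you are fine today?",
}

def recognize_gesture(components):
    """Phase 2: match the classified poses of one frame against the frames."""
    for gesture, required in GESTURE_FRAMES.items():
        if required <= components:          # all required components present
            return gesture
    return None

def interpret(user, gesture):
    """Person-centric interpretation with a neutral fallback."""
    return PERSON_CENTRIC.get((user, gesture), f"{user}: {gesture}")

print(recognize_gesture({"left_palm", "right_palm", "face"}))  # -> TwoHand
print(interpret("Hasan", "ThumbUp"))  # -> Oh! You are not fine today
```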
