**4.4 Gesture recognition and adaptation**

The recognition of gestures is carried out in two phases. In the first phase, face and hand poses are classified from the captured image frame using the method described in the previous section. Combinations of poses are then analyzed to identify the occurrence of a gesture. For example, if a left hand palm, a right hand palm and one face are present in the input image, it is recognized as the "TwoHand" gesture [Figure 19(a)] and the corresponding gesture frame is activated. Interpretation of an identified gesture is user-dependent, since the meaning of a gesture may differ from person to person depending on their culture. For example, when user 'Hasan' comes in front of 'Robovie''s eyes, 'Robovie' recognizes the person and says "Hi Hasan! How are you?"; 'Hasan' then raises his thumb ('ThumbUp'), and 'Robovie' replies to 'Hasan', "Oh! You are not fine today". In the same situation, for another user, 'Cho', 'Robovie' says, "Hi, you are fine today?". That is, 'Robovie' can understand the person-centric meaning of a gesture. To accommodate different users' desires, our person-centric gesture interpretation is implemented using a frame-based knowledge representation approach. The user predefines these frames in the knowledge base with the necessary attributes (gesture components, gesture name) for all predefined gestures. Our current system recognizes 11 static gestures. These are: 'TwoHand' (raise left hand and right hand palms), 'LeftHand' (raise left hand palm), 'RightHand' (raise right hand palm), 'One' (raise

User, Gesture and Robot Behaviour Adaptation for Human-Robot Interaction 245

Figure 13 shows a sample result of the user adaptation method for normal (frontal) faces. In the first step, the system is trained using 60 face images of three persons and forms three clusters (top 3 rows of Figure 13), one per person. The contents of the cluster information table are [1, 11, 23, 29]. In this situation, for example, if an input face image matches a known face image whose index lies between 1 and 10, the person is identified as person 1.
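The role of the cluster information table, whose starting pointers map a matched training image back to a person, can be sketched as follows. The function name and the 1-based indexing convention are illustrative assumptions, not the chapter's actual implementation.

```python
import bisect

def identify_person(match_index, cluster_table):
    """Map the 1-based index of the best-matching training image to a
    person ID. `cluster_table` holds the starting index of each person's
    cluster; the final entry marks the end of the training database.
    Hypothetical helper illustrating the lookup described in the text."""
    starts = cluster_table[:-1]  # starting pointers of the clusters
    # Count how many cluster starts lie at or before the matched index;
    # that count is the (1-based) person number.
    return bisect.bisect_right(starts, match_index)

# With the table [1, 11, 23, 29]: matches at indices 1-10 identify
# person 1, indices 11-22 person 2, and indices 23-28 person 3.
```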


In the second step, a sequence of 20 face images of another person is fed to the system as input. The minimum Euclidean distances (ED) from the face images of the three known persons are shown by the upper line graph (B\_adap) in Figure 14. The system identifies these faces as an unknown person based on the predefined threshold value for the Euclidean distance and activates the user learning function. The user learning function develops a new cluster (4th row of Figure 13) and updates the cluster information table to [1, 11, 23, 30, 37]. After adaptation, the minimum Euclidean distance distribution line (A\_adap) in Figure 14 shows that, for 8 images, the minimum ED is zero; these images are included in the new cluster, so the system can recognize the person. This method is tested for 7 persons, including 2 females, and as a result of learning, 7 clusters of different lengths (number of images per cluster) for different persons (as shown in Figure 13) were formed.

The user adaptation method is also tested on 700 five-directional face images of 7 persons (sample output in Figure 11). Figure 15 shows the distribution of the 41 clusters for the 700 face images of the 7 persons. In the first step, the system is trained using 100 face images of person\_1 and forms 5 clusters based on the 5 face directions. At this point, the contents of the cluster information table (which holds the starting pointer of each cluster in the training database) are [1, 12, 17, 27, 44, 52]. After learning person\_2, the cluster information table contents are [1, 12, 17, 27, 44, 53, 68, 82, 86, 96, 106, 116]. Similarly, the other persons are adapted. Figure 16 shows examples of errors in the clustering process. In cluster 26, up-directed faces of person\_6 and frontal faces of person\_5 overlap and are treated as one cluster (Figure 16(a)). In the case of cluster 31, up-directed faces of person\_5 and normal (frontal) faces of person\_6 overlap and are grouped in the same cluster (Figure 16(b)). This problem can be reduced by using a narrower threshold, but in that case the number of iterations, as well as the discard rate of the image classification method, will increase.

Fig. 13. Sample outputs of the clustering method for frontal faces
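The unknown-person detection and user learning step described above can be sketched as follows. The feature representation, the threshold value, and all names here are illustrative assumptions under which minimum Euclidean distance drives the decision, not the chapter's actual implementation.

```python
import math

def min_euclidean_distance(face, training_db):
    """Minimum Euclidean distance from a face vector to all stored faces."""
    return min(math.dist(face, known) for known in training_db)

def adapt_user(faces, training_db, cluster_table, threshold):
    """If every incoming face is farther than `threshold` from all known
    faces, treat the sequence as an unknown person: store the images as a
    new cluster and append a new starting pointer to the cluster
    information table. Sketch only; feature vectors are assumed to be
    plain numeric sequences."""
    if all(min_euclidean_distance(f, training_db) > threshold for f in faces):
        training_db.extend(faces)                      # members of the new cluster
        cluster_table.append(cluster_table[-1] + len(faces))
        return True                                    # learning was activated
    return False                                       # person already known
```

After adaptation, the stored copies of the new images have distance zero, which mirrors the A\_adap behaviour described above.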

(Fig. 13 labels: Person 1–Person 7; clusters NFC1 (1), NFC2 (11), NFC3 (23), NFC4 (30), NFC5 (38), NFC6 (46), NFC6 (51-62))

index finger), 'Two' (form a V sign using the index and middle fingers), 'Three' (raise the index, middle and ring fingers), 'ThumbUp' (thumb up), 'Ok' (make a circle using the thumb and index finger), 'FistUp' (fist up), 'PointLeft' (point left with the index finger), 'PointRight' (point right with the index finger).
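A minimal sketch of the frame-based, person-centric interpretation described in this section is given below. The pose labels, the dictionary layout, and the reply strings are illustrative assumptions, not the system's actual knowledge base.

```python
# Hypothetical frame base: a set of gesture components maps to a gesture
# name, and a per-user table gives the person-centric interpretation.
GESTURE_FRAMES = {
    frozenset(["face", "left_palm", "right_palm"]): "TwoHand",
    frozenset(["face", "left_palm"]): "LeftHand",
    frozenset(["face", "thumb_up"]): "ThumbUp",
}

USER_MEANINGS = {  # person-centric meanings (illustrative examples)
    ("Hasan", "ThumbUp"): "Oh! You are not fine today.",
    ("Cho", "ThumbUp"): "Hi, you are fine today?",
}

def recognize_gesture(poses):
    """Map the set of face/hand poses found in a frame to a gesture name."""
    return GESTURE_FRAMES.get(frozenset(poses))

def interpret(user, poses):
    """Return the user-dependent meaning of the recognized gesture,
    falling back to the plain gesture name for unlisted users."""
    gesture = recognize_gesture(poses)
    return USER_MEANINGS.get((user, gesture), gesture)
```

This mirrors the 'Hasan'/'Cho' example above: the same 'ThumbUp' gesture yields different replies for different users.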

It is possible to recognize more gestures by adding new poses, and new rules for gestures, to this system. New poses can be included in the training image database using the interactive learning method, and a corresponding frame can be defined in the knowledge base to interpret the gesture. To teach the robot a new pose, the user should perform the pose several times (for example, over 10 image frames). The learning method then detects it as a new pose and creates one or more clusters for that pose. Subsequently, it updates the knowledge base with the cluster information.
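The interactive pose learning procedure can be sketched as follows, assuming a simple distance threshold for novelty detection. Every name here, including the pose 'Salute', is hypothetical and serves only to illustrate the flow: repeated unfamiliar frames become a new cluster, and the knowledge base gains an entry for it.

```python
import math

def learn_new_pose(frames, pose_db, knowledge_base, pose_name,
                   threshold, min_frames=10):
    """Interactive learning sketch: if the user shows an unfamiliar pose
    in at least `min_frames` captured frames, store those frames as a new
    pose cluster and register a frame for it in the knowledge base.
    Illustrative only; the chapter's actual clustering is richer."""
    unknown = [f for f in frames
               if all(math.dist(f, k) > threshold for k in pose_db)]
    if len(unknown) < min_frames:
        return False                      # not enough evidence of a new pose
    pose_db.extend(unknown)               # create the cluster for the pose
    # Record the 1-based starting pointer of the new cluster, matching the
    # cluster information table convention used earlier in the chapter.
    knowledge_base[pose_name] = {
        "cluster_start": len(pose_db) - len(unknown) + 1,
    }
    return True
```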
