**4. Motion classification**

164 Advances in Object Recognition Systems

important information to describe the actions and to differentiate between them. Spatiotemporal boxes containing detected STIPs are the most salient regions for describing human motion. The box size can be useful information to differentiate between motion done only by the hands and full-body motion (see Figure 7).

For all STIPs belonging to the same image, we determine their spatial coordinates (x1, y1), (x2, y2), ..., (xn, yn) in the image reference frame. Each spatiotemporal box can be described by a rectangle between the points (xLeft, yTop) and (xRight, yBottom); these coordinates are determined by the following equations.

$$
\begin{aligned}
x_{\text{Left}} &= \min(x_1, x_2, \ldots, x_n) - r \\
y_{\text{Top}} &= \min(y_1, y_2, \ldots, y_n) - r \\
x_{\text{Right}} &= \max(x_1, x_2, \ldots, x_n) + r \\
y_{\text{Bottom}} &= \max(y_1, y_2, \ldots, y_n) + r
\end{aligned}
\tag{9}
$$

Here r is the extension radius of the spatiotemporal boxes. Figure 7 shows spatiotemporal boxes detected on images taken from the KTH human action database.

Fig. 7. Spatiotemporal boxes detected on images taken from the KTH human action database.

Considering motion performed with the full body, we classify the STIPs into two groups: high body part STIPs (H-STIP) and low body part STIPs (L-STIP). To achieve this classification, we detect the centroid of the body silhouette in every frame of the sequence. Points located above the centroid are classified as H-STIP and points below the centroid are classified as L-STIP, as shown in Figure 8.
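The box construction of equation (9) and the H-STIP/L-STIP split can be illustrated with a minimal Python sketch. The function names, the sample coordinates, and the radius value are hypothetical and chosen only for illustration. The sketch also assumes the usual image convention in which y grows downward, so "above the centroid" means a smaller y value; points exactly at the centroid height are assigned to L-STIP here, a choice the text does not specify.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def spatiotemporal_box(points: List[Point], r: float) -> Tuple[float, float, float, float]:
    """Return (x_left, y_top, x_right, y_bottom) as in equation (9)."""
    if not points:
        raise ValueError("at least one STIP is required")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs) - r, min(ys) - r, max(xs) + r, max(ys) + r)


def split_by_centroid(points: List[Point], centroid_y: float) -> Tuple[List[Point], List[Point]]:
    """Split STIPs into (H-STIP, L-STIP) relative to the silhouette centroid.

    Assumption: y grows downward, so a point above the centroid has y < centroid_y.
    """
    h_stip = [p for p in points if p[1] < centroid_y]
    l_stip = [p for p in points if p[1] >= centroid_y]
    return h_stip, l_stip


# Hypothetical STIP coordinates for one frame (not taken from the KTH data).
stips = [(40.0, 20.0), (55.0, 60.0), (48.0, 95.0)]
box = spatiotemporal_box(stips, r=5.0)
h_stip, l_stip = split_by_centroid(stips, centroid_y=58.0)
print(box)      # (35.0, 15.0, 60.0, 100.0)
print(h_stip)   # [(40.0, 20.0)]
print(l_stip)   # [(55.0, 60.0), (48.0, 95.0)]
```

The width and height of the returned box (here 25 by 85 pixels) are the kind of size information the text suggests for separating hand-only motion from full-body motion.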

To obtain a fair judgement of the performance of the proposed approach, we compare our results with other human action recognition approaches using the same database. The performance of any approach is evaluated by measuring the accuracy of motion classification using a specified algorithm. Many algorithms can be used. The most widely used ones are described in the following subsections.

Non-Rigid Objects Recognition: Automatic Human Action Recognition in Video Sequences 167

The K-means clustering algorithm (MacQueen, 1967) partitions the set of movements into k classes {C1, C2, …, Ck}. The partition U1 of the first algorithm contains two rows and n columns, while the partition U2 of the second algorithm contains 6 rows and n columns, where n is the number of video sequences. For each sequence a vector V is generated.

$$
U_1 = \begin{pmatrix}
u_{1,1} & u_{1,2} & \cdots & u_{1,n} \\
u_{2,1} & u_{2,2} & \cdots & u_{2,n}
\end{pmatrix};
\qquad
U_2 = \begin{pmatrix}
u_{1,1} & u_{1,2} & \cdots & u_{1,n} \\
u_{2,1} & u_{2,2} & \cdots & u_{2,n} \\
\vdots & \vdots & & \vdots \\
u_{6,1} & u_{6,2} & \cdots & u_{6,n}
\end{pmatrix}
\tag{10}
$$

where $u_{i,j} \in \{0, 1\}$ indicates whether the movement $P_j$ belongs to the class $C_i$:

$$
u_{i,j} = \begin{cases}
1 & \text{if } P_j \in C_i \\
0 & \text{otherwise}
\end{cases}
\tag{11}
$$

In addition, we impose the following two constraints on each partition:

$$
\sum_{i=1}^{K} u_{i,j} = 1, \qquad j = 1, \ldots, N
\tag{12}
$$

$$
\sum_{j=1}^{N} u_{i,j} > 0, \qquad i = 1, \ldots, K
\tag{13}
$$

Here K is equal to 2 for the first algorithm and 6 for the second, and N is the number of columns of the partition (written n above). The first constraint specifies that any sample movement must belong to one and only one class of the partition, while the second specifies that a class must contain at least one sample movement.

**5. Classification results**

The KTH human action database is the largest database available. Each video contains a single action. The database contains six types of human movements (walking, jogging, running, boxing, hand waving and hand clapping). These movements are performed several times by 25 persons in different scenarios, in outdoor or indoor environments. The database contains a total of 600 long sequences, which can be divided into more than 10 short sequences of 4 seconds each.

To test our approach on the recognition task, we used 25% of the samples from the video database for the learning task. The remaining 75% of the video samples are used to validate the performance of the method. Figure 10 shows the confusion matrix of the classification results for the KTH database.

The confusion matrix in Figure 10 shows the performance obtained for the KTH human action database. In total, 450 samples were used to obtain these results (75 for each class). Each column of the matrix corresponds to an estimated class, while each row corresponds to a real class.
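The link between a partition matrix satisfying constraints (12) and (13) and a confusion matrix of the kind reported for the KTH database can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the labels are toy values invented for the example (they are not results from this chapter), and cluster indices are assumed to have already been matched to class indices.

```python
from typing import List, Sequence


def partition_matrix(labels: Sequence[int], k: int) -> List[List[int]]:
    """Build the K x N matrix U with u[i][j] = 1 if sample j is in class i (equation 11)."""
    return [[1 if label == i else 0 for label in labels] for i in range(k)]


def is_valid_partition(u: List[List[int]]) -> bool:
    """Check constraint (12), one class per sample, and (13), no empty class."""
    n = len(u[0])
    one_class_per_sample = all(sum(row[j] for row in u) == 1 for j in range(n))
    no_empty_class = all(sum(row) > 0 for row in u)
    return one_class_per_sample and no_empty_class


def confusion_matrix(real: Sequence[int], estimated: Sequence[int], k: int) -> List[List[int]]:
    """Rows index the real class, columns index the estimated class."""
    m = [[0] * k for _ in range(k)]
    for r, e in zip(real, estimated):
        m[r][e] += 1
    return m


def per_class_accuracy(m: List[List[int]]) -> List[float]:
    """Fraction of each real class (row) that was assigned to the correct column."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(m)]


# Toy labels, invented for illustration only (K = 2, N = 6).
real = [0, 0, 0, 1, 1, 1]
estimated = [0, 0, 1, 1, 1, 1]

u = partition_matrix(estimated, k=2)
m = confusion_matrix(real, estimated, k=2)
print(u)                      # [[1, 1, 0, 0, 0, 0], [0, 0, 1, 1, 1, 1]]
print(is_valid_partition(u))  # True
print(m)                      # [[2, 1], [0, 3]]
print(per_class_accuracy(m))
```

With six classes and 75 validation samples per class, the same functions would produce a 6 x 6 matrix whose rows each sum to 75.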
