**3. Motion analysis approach**

To analyse motion, we defined three parameters based on STIP detection with the Laptev detector. The first is the number of STIPs in the sequence. The second is the "activity" function, which tracks the evolution of the STIP count over the sequence. The third analyses the position of the STIPs relative to a reference point associated with the moving body.

## **3.1 Number of STIPs in a sequence**

Human body movements can be differentiated by counting the detected STIPs. We therefore developed an algorithm that computes the number of STIPs in each sequence of several human body motion databases. The STIP count is high for fast movements (running, jogging, jumping), whereas movements made only by the arms (boxing, hand clapping or hand waving) lead to a low STIP count. Table 1 shows the average number of STIPs in 100-frame sequences (4 seconds of video) for each movement class. The algorithm was tested on 450 sequences from the KTH database (75 for each movement).
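The per-class averaging described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `average_stip_count`, the `(label, stips)` input layout and the toy detections are all assumptions made for the example, and the STIPs themselves are assumed to come from an external detector as `(x, y, t)` tuples.

```python
from collections import defaultdict


def average_stip_count(sequences):
    """Return {label: mean number of STIPs per sequence}.

    `sequences` is an iterable of (label, stips) pairs, where `stips` is the
    list of (x, y, t) interest points detected in one sequence.
    (Hypothetical input layout, chosen for this sketch.)
    """
    totals = defaultdict(int)
    n_sequences = defaultdict(int)
    for label, stips in sequences:
        totals[label] += len(stips)   # STIP count of this sequence
        n_sequences[label] += 1
    return {label: totals[label] / n_sequences[label] for label in totals}


# Toy, made-up detections (not KTH data): two sequences per class.
toy_sequences = [
    ("running", [(10, 20, t) for t in range(6)]),
    ("running", [(12, 22, t) for t in range(8)]),
    ("boxing", [(30, 40, t) for t in range(2)]),
    ("boxing", [(31, 41, t) for t in range(1)]),
]
averages = average_stip_count(toy_sequences)
print(averages)
```

With real data, each entry would hold the detections of one 100-frame sequence, so that the returned averages are comparable to the values reported per movement class.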


| Movement | Average number of STIPs per 100 frames |
|---|---|
| Running | 685 |
| Jogging | 463.33 |
| Walking | 313.33 |
| Hand waving | 145 |
| Hand clapping | 114 |
| Boxing | 82 |

Table 1. Average number of STIPs per movement class for the KTH human action database.

These statistics show that the number of STIPs depends directly on the movement performed: fast whole-body movements such as running have a high STIP count, whereas boxing and hand waving have a low one. We therefore conclude that the number of STIPs in a sequence is an important parameter for human movement recognition. To extend this study, the following section presents the evolution of the STIPs over time through the "Activity" function.

Fig. 3. Reference axes (x,y,t) representation on a video sequence from the KTH human action database.

## **3.2 Activity function**

The evolution of the number of STIPs over a sequence is an important factor in human motion recognition. To capture this criterion we used the "Activity" function. Laganière et al. (Laganière et al., 2008) defined this function as the number of pixels that change between two consecutive frames of a video sequence, so that frames corresponding to local maxima of the "Activity" function mark the scenes of major movements. We adapted the function to our research by defining it as the number of STIPs in each frame of the sequence. The evolution of this number can help recognize the type of movement through its local maxima, which locate large amounts of movement, and through its distribution, which indicates where these amounts fall on the time scale. Figure 4 presents the activity function applied to sample sequences from the KTH database.

Fig. 4. Application of the Activity function on samples from KTH human action database.
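As a minimal sketch of the adapted definition (the number of STIPs in each frame), the activity function can be computed by counting detections per frame index. The function name `activity_function`, the `(x, y, t)` tuple layout and the toy points below are assumptions made for illustration only.

```python
def activity_function(stips, n_frames):
    """Number of STIPs detected in each frame (list index = frame number).

    `stips` is an iterable of (x, y, t) tuples with an integer frame index t.
    (Hypothetical layout, chosen for this sketch.)
    """
    activity = [0] * n_frames
    for _x, _y, t in stips:
        if 0 <= t < n_frames:  # ignore points outside the sequence
            activity[t] += 1
    return activity


# Toy, made-up detections: one point in frame 0, three in frame 2, one in frame 4.
toy_stips = [(5, 5, 0), (6, 7, 2), (8, 3, 2), (4, 9, 2), (7, 7, 4)]
toy_activity = activity_function(toy_stips, n_frames=5)
print(toy_activity)
```

Plotting the returned list against the frame index gives a curve of the kind the text describes, with peaks at the frames that contain the most STIPs.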

The curves in Figure 4 have repetitive peaks. These peaks are local maxima of the activity function and can be regarded as the major movement events in each class. From this analysis we can extract important information about the class of the movement performed. The curves obtained are, however, very noisy, because non-significant STIPs are detected between the local maxima. To resolve this problem we applied a smoothing algorithm to the curves to accentuate the peaks and eliminate the STIP values between the local maxima. The smoothing was done on segments of frames by adding the STIPs detected in an interval [n-2, n+2], where n is the time of a local maximum of the STIP count. Figure 5 shows the application of the smoothing algorithm to the activity function curves for samples from the KTH human action database.

Fig. 5. Application of the smoothing algorithm on the activity function curves for samples from the KTH human action database.

We note that smoothing reduces the noise of the activity function and increases the values of the local maxima of the curves. To detect the locations of the local maxima, a Gaussian model is fitted to the activity function. This model determines the number of local maxima and their times in a sequence. In addition, it contributes to motion recognition when the parameters of the fitted Gaussian model are used in the classification algorithm.

The value of the global maximum is also extracted in order to detect movements that have a single global maximum. Figure 6 shows the application of the Gaussian model to the activity function on sequences taken from the KTH human action database (from left to right and row by row, the actions are boxing, walking, hand waving, jogging, hand clapping and running).

Table 2 shows the number of local maxima, their mean value and the global maximum value for different action classes taken from the KTH human action database. The number of local maxima corresponds to the number of repetitions in a human movement such as walking or hand clapping. For fast movements such as running, the smoothing algorithm reduces the number of local maxima to one and extracts a single global maximum. The average value of the local maxima is a significant parameter in the classification of human movements: movements made only by the arms (boxing, hand waving and hand clapping) have lower values than those performed by the whole body (running, jogging and walking). The global maximum can also contribute to the classification, since its value differs from one class of motion to another.

Fig. 6. Application of the Gaussian model to the activity function on sequences taken from the KTH human action database.

| Action | Number of local maxima | Local maxima mean value | Global maximum value |
|---|---|---|---|
| Running | 1 | 42 | 43 |
| Jogging | 2 | 29 | 33 |
| Walking | 3 | 28 | 33 |
| Hand waving | 5 | 12 | 14 |
| Hand clapping | 3 | 12.33 | 14 |
| Boxing | 5 | 7.4 | 11 |


Table 2. Number of local maxima, mean value and the global maximum value for different action classes taken from the KTH human action database
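The smoothing step and the three quantities reported in Table 2 can be sketched as follows. This is one possible reading of the description, not the authors' code: the function names are hypothetical, a local maximum is taken to be a frame strictly higher than both neighbours (flat plateaus are not counted), and the values in [n-2, n+2] are summed onto each local maximum while every other frame is set to zero. Windows of nearby maxima may overlap in this sketch, and the Gaussian model fitting is not reproduced here.

```python
def local_maxima(activity):
    """Indices of frames strictly higher than both neighbours.

    Frames outside the sequence are treated as 0 (assumption of this sketch).
    """
    peaks = []
    for n, value in enumerate(activity):
        left = activity[n - 1] if n > 0 else 0
        right = activity[n + 1] if n + 1 < len(activity) else 0
        if value > left and value > right:
            peaks.append(n)
    return peaks


def smooth_activity(activity, half_width=2):
    """Sum the activity over [n - half_width, n + half_width] at each local
    maximum n and set all other frames to zero."""
    smoothed = [0] * len(activity)
    for n in local_maxima(activity):
        lo = max(0, n - half_width)
        hi = min(len(activity), n + half_width + 1)
        smoothed[n] = sum(activity[lo:hi])
    return smoothed


def summarize_peaks(smoothed):
    """Return (number of local maxima, their mean value, global maximum)."""
    values = [v for v in smoothed if v > 0]
    if not values:
        return 0, 0.0, 0
    return len(values), sum(values) / len(values), max(values)


# Toy, made-up activity curve with two bursts of movement.
toy_curve = [0, 2, 5, 1, 0, 0, 1, 4, 2, 0]
toy_smoothed = smooth_activity(toy_curve)
toy_summary = summarize_peaks(toy_smoothed)
print(toy_smoothed, toy_summary)
```

On the toy curve the two peaks grow from 5 and 4 to 8 and 7 while the values between them vanish, which matches the behaviour described in the text: smoothing increases the local maxima and removes the noise between them.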

The activity function tracks the number of STIPs over time. Its evolution was modeled with a Gaussian model to extract its local maxima, and this analysis can contribute to human motion recognition. Another important feature is based on the spatiotemporal boxes associated with human body parts.
