**Behavior Recognition Using any Feature Space Representation of Motion Trajectories**

Shehzad Khalid *Bahria University Pakistan*

## **1. Introduction**

100 Recent Developments in Video Surveillance

In recent years, there has been a growth of research activity aimed at the development of sophisticated content-based video data management techniques. This development is now especially timely given an increasing number of systems that are able to capture and store data about object motion such as those of humans and vehicles. This has acted as a spur to the development of content-based visual data management techniques for tasks such as behavior classification and recognition, detection of anomalous behavior and object motion prediction. Behavior can obviously be categorized at different levels of granularity. In far-field surveillance, we are primarily interested in trajectory-based coarse motion description involving movement direction (right/left or up/down) and motion type (walking, running or stopping). These techniques are essential for the development of next generation 'actionable intelligence' surveillance systems.

Processing of trajectory data for activity classification and recognition has gained significant interest quite recently. Various techniques have been proposed for modeling of trajectory-based motion activity patterns and using the modeled patterns for classification and anomaly detection. Much of the earlier research focus in motion analysis has been on high-level object trajectory representation schemes that are able to produce compressed forms of motion data (Aghbari et al., 2003; Chang et al., 1998; Dagtas et al., 2000; Hsu & Teng, 2002; Jin & Mokhtarian, 2004; Khalid & Naftel, 2005; Shim & Chang, 2004). This work presupposes the existence of some low-level visual tracking scheme for reliably extracting object-based trajectories (Hu, Tan, Wang & Maybank, 2004; Vlachos et al., 2002). The literature on trajectory-based motion understanding and pattern discovery is less mature but advances using Learning Vector Quantization (LVQ) (Johnson & Hogg, 1995), Self-Organising Maps (SOMs) (Hu, Xiao, Xie, Tan & Maybank, 2004; Owens & Hunter, 2000), Hidden Markov Models (HMMs) (Bashir et al., 2006; 2005b), and fuzzy neural networks (Hu, Xie, Tan & Maybank, 2004) have all been reported. These approaches are broadly categorized into statistical and neural network based approaches.

In developing a trajectory-based motion event recognition system, several questions must be answered before proposing or selecting a pattern modeling and recognition technique. These include:

1. What is the feature space representation of trajectories?

2. What is the distribution of trajectories in a given feature space representation? Do we need to cater for complex shape distributions that may exist in a given motion pattern?

3. Do we expect to have a multimodal distribution of trajectories within a given pattern?

Most trajectory-based motion recognition systems proposed in the literature (Hu, Xiao, Xie, Tan & Maybank, 2004; Hu, Xie, Tan & Maybank, 2004; Khalid & Naftel, 2005; 2006; Owens & Hunter, 2000) can operate only on feature space representations of trajectories that lie in Euclidean space with a computable mean. However, a survey of recent literature on motion feature computation for trajectory representation shows that most feature space representations are complex and do not lie in Euclidean space (Bashir et al., 2006; 2005a;b; 2007; Hamid et al., 2005; Keogh et al., 2001; Xiang & Gong, 2006; Zhong et al., 2004). A mean representation of different trajectories cannot be computed in such complex feature spaces, so these systems cannot be applied to them. These approaches also expect the trajectories in a given motion pattern to follow a standard distribution such as a Gaussian. They cannot cater for the multimodal, complex-shaped distributions of trajectories within a motion pattern that are to be expected when complex feature space representations are used. The research presented in this chapter focuses on a trajectory-based behavior recognition and anomaly detection system that addresses all of the questions raised above. The proposed approach does not impose any limitation on the representation of trajectories: it can operate using any trajectory representation in any feature space with a given distance function. It can perform modeling, classification and anomaly detection in the presence of a multimodal distribution of trajectories within a given motion pattern.

The remainder of the chapter is organized as follows. We review relevant background material in section 2. In section 3, we present a framework for multimodal modeling of activity patterns using any feature space with a computable similarity function. Soft classification and anomaly detection techniques using the multimodal *m*-Medoids model are presented in section 4. A comparative evaluation of the currently proposed multimodal *m*-Medoids approach and the previously proposed localized *m*-Medoids approach (Khalid, 2010a) for activity classification and anomaly detection is presented in section 5. Experiments have been performed to show the effectiveness of the proposed system, as compared to competitors, for trajectory-based modeling, classification and anomaly detection in the presence of a multimodal distribution of trajectories within a pattern. These experiments are reported in section 6. The last section summarizes the chapter.

## **2. Background and related work**

Motion trajectory descriptors are known to be useful candidates for video indexing and retrieval schemes. A variety of trajectory modeling techniques have been proposed to compute the features for trajectory representation. Most of the techniques for learning motion behaviour patterns and recognition from trajectories use discrete point sequence vectors as input to a machine learning algorithm. Related work within the data mining community on representation schemes for indexing time series data is also relevant to the parameterisation of object trajectories. An object trajectory can be defined as a set of points representing the ordered observations of the location of a moving object made at different points in time. A trajectory can therefore be represented as a time series, implying that indexing techniques for time series are also applicable to motion data. For example, Discrete Fourier Transforms (DFT) (Faloutsos et al., 1994), Discrete Wavelet Transforms (DWT) (Chan & Fu, 1999), Adaptive Piecewise Constant Approximations (APCA) (Keogh et al., 2001), and Chebyshev polynomials (Cai & Ng, 2004) have been used to conduct similarity search in time series data. Previous work has also sought to represent moving object trajectories through piecewise linear or quadratic interpolation functions (Chang et al., 1998; Jeanin & Divakaran, 2001), motion histograms (Aghbari et al., 2003) or discretised direction-based schemes (Dagtas et al., 2000; Shim & Chang, 2001; 2004). Spatiotemporal representations using piecewise-defined polynomials were proposed by (Hsu & Teng, 2002), although consistency in applying a trajectory-splitting scheme across query and searched trajectories can be problematic. Affine and more general spatiotemporally invariant schemes for trajectory retrieval have also been presented (Bashir et al., 2003; 2004; Jin & Mokhtarian, 2004). The importance of selecting the most appropriate trajectory model and similarity search metric has received relatively scant attention (Khalid & Naftel, 2005).
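As a rough illustration of the time series view of a trajectory, the sketch below computes a handful of DFT coefficients for a 2-D trajectory and compares two trajectories in that coefficient space. The function names `dft_features` and `feature_distance`, the choice of treating each point as a complex number, and the normalisation by trajectory length are all assumptions made for this example. They are not taken from the cited works.

```python
import cmath

def dft_features(points, k):
    """Return the first k DFT coefficients of a 2-D trajectory.

    Each (x, y) point is treated as the complex number x + iy, so a single
    complex DFT covers both coordinates. Coefficients are divided by the
    number of points (an illustrative normalisation choice).
    """
    n = len(points)
    z = [complex(x, y) for x, y in points]
    coeffs = []
    for f in range(k):
        total = sum(z[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
        coeffs.append(total / n)
    return coeffs

def feature_distance(u, v):
    """Euclidean distance between two complex coefficient vectors."""
    return sum(abs(a - b) ** 2 for a, b in zip(u, v)) ** 0.5

# A straight horizontal walk and the same walk shifted upward by one unit.
walk = [(float(t), 0.0) for t in range(8)]
shifted = [(float(t), 1.0) for t in range(8)]
fa = dft_features(walk, 3)
fb = dft_features(shifted, 3)
gap = feature_distance(fa, fb)
```

In this example the two trajectories differ only in their zero-frequency coefficient (the mean position), so the distance between their feature vectors reflects the vertical offset alone.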

Modeling of motion patterns using trajectory data to perform motion based behavior recognition and anomaly detection has gained significant interest recently. Various techniques have been proposed for modeling motion activity patterns and using the modeled patterns for classification and anomaly detection. These approaches are broadly categorized into statistical and neural network based approaches. Almost all statistical approaches dealing with anomaly detection are based on modelling the density of training data and rejecting test patterns that fall in regions of low density. There are various approaches that use Gaussian mixture models (GMM) to estimate the probability density of data (Brotherton et al., 1998; Roberts & Tarassenko, 1994; Yeung & Chow, 2002). Various techniques based on hidden Markov models (HMM) have also been proposed (Xiang & Gong, 2005; 2006; Zhang et al., 2005). (Yacoob & Black, 1999) and (Bashir et al., 2005b; 2007) have presented a framework for modeling and recognition of human motion based on a trajectory segmentation scheme. The multimodal probability density function (PDF) is estimated using a GMM, based on PCA coefficients of the sub-trajectories. Different classes of object motion are modelled by a continuous HMM per class where the state PDFs are represented by GMMs. Their technique has been shown to work for sign language recognition. However, their classification system cannot handle anomalies in test data and can only classify samples from normal patterns. (Xiang & Gong, 2005; 2006) propose a framework for behavior classification and anomaly detection in video sequences. Natural grouping of behaviour patterns is learnt through unsupervised model selection and feature selection on the eigenvectors of a normalized affinity matrix. A Multi-Observation Hidden Markov Model is used for modelling the behaviour pattern.
(Hu et al., 2006; 2007) and (Khalid & Naftel, 2006) model normal motion patterns by estimating a single multimodal Gaussian for each class. For anomaly detection in (Hu et al., 2006), the probability of a trajectory belonging to each motion pattern is calculated. If the probability of association of the trajectory with the closest motion pattern is less than a threshold, the trajectory is treated as anomalous. In (Rea et al., 2004), a semantic event detection technique based on discrete HMMs is applied to snooker videos. (Zhang et al., 2005) propose a semi-supervised model using HMMs for anomaly detection, in which temporal dependencies are modelled using HMMs and the probability density function of each HMM state is assumed to be a GMM. (Owens & Hunter, 2000) use Self Organizing Feature Maps (SOFM) to learn normal trajectory patterns. While classifying trajectories, if the distance of the trajectory to its allocated class exceeds a threshold value, the trajectory is identified as anomalous.

In our previous work (Khalid, 2010b), we proposed an *m*-Medoids based activity Modeling and Classification approach using a low dimensional feature vector representation of trajectories in Euclidean Space (MC-ES). The *m*-Medoids based approach models a pattern by a set of cluster centres of mutually disjunctive sub-classes (referred to as medoids) within the pattern. Once the *m*-Medoids models for all the classes have been learnt, the MC-ES approach performs classification of new trajectories and anomaly detection by checking the closeness of said trajectories to the models of different classes using a hierarchical classifier. The anomaly detection module required the specification of a threshold which is used globally for all the patterns. This approach left several issues unaddressed: manual specification of the threshold for anomaly detection, identification of an appropriate threshold value, and anomaly detection for motion patterns with different scale and orientation using a single global threshold. These issues are addressed by a localized *m*-Medoids based approach (LMC-ES) as proposed in (Khalid, 2010a), which automatically selects a local significance parameter for each pattern, taking into consideration the distribution of individual patterns. LMC-ES can effectively handle patterns with different orientation and scale and has been shown to give superior performance to competitors including GMM, HMM and SVM based classifiers. However, there are still open issues: (i) modeling, classification and anomaly detection in the presence of a multimodal distribution of trajectories within a pattern; (ii) soft classification in the presence of a multimodal pattern distribution to minimize misclassification; and (iii) modeling and classification in feature spaces for which we cannot compute a mean.

The contribution of this work is an extension of the *m*-Medoids based modeling approach, wherein the multimodal distribution of samples in each pattern is represented using multimodal *m*-Medoids. An approach for multimodal model-based classification and anomaly detection is also presented. The presented mechanism is based on a soft classification approach which enables the proposed multimodal classifier to adapt to the multimodal distribution of samples within different patterns. The multimodal *m*-Medoids based modeling and classification is applicable to any feature space with a computable pairwise similarity measure.
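To make the role of the pairwise measure concrete, the following minimal sketch picks a medoid from a set of trajectories encoded as direction strings, a representation for which no mean exists. The string encoding (`R` for right, `U` for up), the use of edit distance, and the function names are assumptions chosen for illustration only. The chapter itself does not prescribe them.

```python
def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]

def medoid(samples, dist):
    """Return the sample with the smallest total distance to all samples."""
    return min(samples, key=lambda s: sum(dist(s, t) for t in samples))

# Hypothetical direction-coded trajectories; averaging them is undefined,
# but a representative member can still be selected.
trajectories = ["RRRR", "RRRU", "RRUR", "RURR", "UUUU"]
centre = medoid(trajectories, edit_distance)
```

Because the medoid is always one of the observed samples, only the distance function has to be defined on the feature space, which is the property the *m*-Medoids model relies on.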

### **3. Multimodal m-Medoids based modeling**

Given a representation of trajectories in any feature space for a given motion pattern, we wish to model the underlying distribution of trajectories within a pattern using training data. A pattern is modeled by a set of cluster centers of mutually disjunctive sub-classes (referred to as medoids) within the pattern. The proposed modeling technique, referred to as *m*-Medoids modeling, models the class containing *n* members with *m* medoids known *a-priori*. Modeling of a pattern using the multimodal *m*-Medoids approach in general feature space is a three-step process: (i) identification of *m* medoids, (ii) computation of a set of possible normality ranges for the pattern and (iii) selection of a customized normality range for each medoid. The resulting models of identified patterns can then be used to classify new unseen trajectory data to one of the modeled classes or identify it as anomalous if it is significantly distant from all of the modeled patterns.
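How such a model might be used at query time can be sketched as follows. This is a deliberately simplified, hard nearest-medoid rule written for illustration; it is not the soft classification scheme of section 4. The `Medoid` structure, the `normality_range` field used as a plain distance limit, and the function `classify` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

@dataclass
class Medoid:
    sample: Any              # representative training trajectory
    normality_range: float   # hypothetical per-medoid distance limit

def classify(query: Any,
             models: Dict[str, List[Medoid]],
             dist: Callable[[Any, Any], float]) -> Optional[str]:
    """Return the label of the closest medoid, or None if the query is anomalous."""
    best_label, best_medoid, best_d = None, None, float("inf")
    for label, medoids in models.items():
        for m in medoids:
            d = dist(query, m.sample)
            if d < best_d:
                best_label, best_medoid, best_d = label, m, d
    # Anomalous if nothing was modeled or the query lies outside the
    # normality range of its closest medoid.
    if best_medoid is None or best_d > best_medoid.normality_range:
        return None
    return best_label

# Toy 1-D "feature space": the first pattern has two modes, each with its own range.
models = {
    "left_to_right": [Medoid(0.0, 1.0), Medoid(5.0, 0.5)],
    "right_to_left": [Medoid(10.0, 1.0)],
}
absdiff = lambda a, b: abs(a - b)
```

Storing one range per medoid, rather than one per class, is what lets a single pattern contain modes of different spread.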

#### **3.1 Step 1: Identification of m-Medoids**

The algorithm for identification of medoids using finite dimensional features in a general feature space with a computable similarity matrix is based on the affinity propagation clustering algorithm (Frey & Dueck, 2007). Let *DB*(*i*) be the classified training samples associated to pattern *i*. The modeling algorithm comprises the following steps:

1. Form the affinity matrix *A* ∈ ℝ<sup>*n*×*n*</sup> defined by

$$A(a,b) = \begin{cases} \exp\left(\frac{-\text{dist}(s_a, s_b)}{2\sigma^2}\right) & \text{if } a \neq b \\ P(a) & \text{otherwise} \end{cases} \tag{1}$$

Here *s<sub>a</sub>*, *s<sub>b</sub>* ∈ *DB*<sup>(*i*)</sup>, *σ* is the scaling parameter and *P*(*a*) is the preference parameter indicating the suitability of sample *a* to be selected as an exemplar (medoid). We set *P*(*a*) to the median of the affinities of sample *a* with the *n* samples. We use a dynamic value of *σ*, set to the distance from *s<sub>a</sub>* to its 6<sup>*th*</sup> nearest neighbor, to cater for variation in the local distribution of trajectory samples.

3. Update responsibility matrix ℜ as

$$\Re(a,b) = A(a,b) - \max_{\forall c \text{ s.t. } c \neq b} \left\{ \Im(a,c) + A(a,c) \right\} \tag{2}$$

The entry ℜ(*a*, *b*) in the responsibility matrix reflects the accumulated evidence for how well suited trajectory *s<sub>b</sub>* is to serve as an exemplar for trajectory *s<sub>a</sub>*, taking into account other potential exemplars for trajectory *s<sub>a</sub>*.

4. Update availability matrix ℑ as

$$\Im(a,b) = \begin{cases} \min\left\{0, \Re(b,b) + \sum_{\forall c \text{ s.t. } c \neq a \land c \neq b} \max\{0, \Re(c,b)\}\right\} & \text{if } a \neq b \\ \sum_{\forall c \text{ s.t. } c \neq a} \max\{0, \Re(c,a)\} & \text{otherwise} \end{cases} \tag{3}$$

5. Identify the exemplar for each sample as

$$\mathfrak{f}_a = \arg\max_b \left[ \Im(a, b) + \Re(a, b) \right] \tag{4}$$
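The message passing in eqs. (1) to (4) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the function name `identify_medoids`, the damping factor, the fixed iteration count and the zero initialisation of the availabilities are borrowed from common affinity propagation practice and are not given in the text. The sketch also makes no attempt to force exactly *m* medoids.

```python
import numpy as np

def identify_medoids(dist, sigma_rank=6, n_iter=200, damping=0.5):
    """Return, for each sample, the index of its exemplar (medoid).

    dist: (n, n) symmetric matrix of pairwise trajectory distances (n >= 2).
    sigma_rank, n_iter and damping are illustrative parameters (assumptions).
    """
    n = dist.shape[0]
    rows = np.arange(n)
    diag = np.diag_indices(n)

    # Local scale: distance to the sigma_rank-th nearest neighbour
    # (index 0 of each sorted row is the sample itself).
    rank = min(sigma_rank, n - 1)
    sigma = np.maximum(np.sort(dist, axis=1)[:, rank], 1e-12)

    # Affinity matrix, eq. (1); preference = median affinity to the other samples.
    A = np.exp(-dist / (2.0 * sigma[:, None] ** 2))
    off = ~np.eye(n, dtype=bool)
    A[diag] = np.array([np.median(A[a, off[a]]) for a in range(n)])

    R = np.zeros((n, n))  # responsibility matrix
    V = np.zeros((n, n))  # availability matrix (assumed to start at zero)
    for _ in range(n_iter):
        # Responsibility update, eq. (2): subtract the best competing candidate.
        S = V + A
        best = np.argmax(S, axis=1)
        first = S[rows, best]
        S[rows, best] = -np.inf
        second = S.max(axis=1)
        max_other = np.repeat(first[:, None], n, axis=1)
        max_other[rows, best] = second
        R = damping * R + (1 - damping) * (A - max_other)

        # Availability update, eq. (3).
        Rp = np.maximum(R, 0)
        Rp[diag] = np.diag(R)            # keep R(b, b) unclipped
        V_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = np.diag(V_new).copy()
        V_new = np.minimum(V_new, 0)
        V_new[diag] = self_avail         # case a == b is not clipped
        V = damping * V + (1 - damping) * V_new

    # Exemplar of each sample, eq. (4).
    return np.argmax(V + R, axis=1)
```

The distinct values of the returned array (for example via `np.unique`) are the candidate medoids of the pattern. Because only a distance matrix is required, the same sketch applies to any feature space with a computable distance.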


#### **3.2 Step 2: Computation of possible normality ranges**

After the identification of medoids **M**<sup>(*i*)</sup> for pattern *i*, we intend to identify and pre-compute a set of possible normality ranges for a given pattern. Values of normality ranges for a given pattern are determined by the inter-medoid distances within a given pattern. Hence, different patterns will have different sets of possible normality ranges depending on the distribution of samples, and in turn medoids, within a pattern. In this step, a set of possible normality ranges **D**<sup>(*c*)</sup> for the pattern *c* is computed as follows:

1. Identify the closest pair of medoids (*i*, *j*) (indexed by (*p*, *q*)) from **M**<sup>(*c*)</sup> as follows:

$$(p,q) = \arg\min_{(i,j)} \text{dist}(M_i, M_j) \quad \forall i, j \land i \neq j \tag{5}$$

where dist(.,.) is the distance function for a given feature space representation of trajectories.

2. Set *l* = 1.

3. Populate the distance array at index *l* using

$$\mathbf{D}_l^{(c)} = \text{dist}(M_p, M_q) \tag{6}$$

4. Remove the closest pair of medoids using

$$\mathbf{M}^{(c)} = \mathbf{M}^{(c)} - \{M_p, M_q\} \tag{7}$$

5. Set *l* = *l* + 1.

6. Iterate through steps 1-5 till there are no medoids left in **M**<sup>(*c*)</sup>.

#### **3.3 Step 3: Selection of customized normality range for each medoid**

After the identification of medoids and a set of possible normality ranges for a given pattern, we select a different normality range for each medoid depending on the distribution of samples from the same and different patterns around a given medoid. The normality range is selected to minimize false positives (false identification of training samples from other patterns as a normal member of the pattern that is being modeled) and false negatives (classification of normal samples of the pattern being modeled as anomalies). The algorithm for selection of a customized normality range for each medoid, to enable multimodal *m*-Medoid based modeling of a pattern, comprises the following steps:

1. Initialize significance parameter *τ* with the number of possible normality ranges for pattern *c* as computed in Step 2.

2. Sequentially input labeled training instances belonging to all classes and identify the closest medoid, indexed by *r*, using:

$$r = \arg\min_{k} \text{dist}(Q, M_k) \quad \forall k \tag{8}$$

where *Q* is the test sample.

3. Perform an anomaly test using the anomaly detection system, as proposed in section 4, assuming a one class classifier containing only pattern *c* represented by medoids set **M**<sup>(*c*)</sup>, using the current value of *τ*.

4. Increment false positive count *FP*(*r*), corresponding to closest medoid *M<sub>r</sub>*, each time when the sample is a normal member of pattern *c* but is identified as anomalous.

5. Increment false negative count *FN*(*r*), corresponding to closest medoid *M<sub>r</sub>*, each time when the sample is misclassified to pattern *c*.

6. Iterate through steps 2-5 for all the samples in *DB*.

7. Calculate Significance Parameter Validity Index (*SPVI*) to check the effectiveness of the current value of *τ* for a particular medoid using:

$$SPVI(k,\tau) = \beta \times FP(k) + (1-\beta) \times FN(k) \quad 0 \le \beta \le 1 \quad \forall k \tag{9}$$

where *β* is a scaling parameter to adjust the sensitivity of the proposed classifier to false positives and false negatives according to specific requirements.

8. Set *τ* = *τ* − 1.

9. Iterate through steps 2-8 till *τ* = 1.

10. Identify the value of significance parameter for a given medoid as:

$$\tau^{(c,k)} = \arg\min_{\tau} SPVI(k, \tau) \quad \forall M_k \in \mathbf{M}^{(c)} \tag{10}$$

where *τ*<sup>(*c*,*k*)</sup> is the dynamic significance parameter that gives a different normality range for each medoid depending on the local density.

The space complexity of the proposed modeling algorithm in general feature space is *O*(3 ∗ *n*<sup>2</sup>). The time complexity of our algorithm is the sum of time complexities of the three steps and is equivalent to *O*(*ω* ∗ (*n*<sup>2</sup> + *n*<sup>2</sup> ∗ *log*(*n*))) + *O*(#*medoids* ∗ *log*(#*medoids*)) + *O*((|*DB*|/2) ∗ #*medoids* ∗ *log*(#*medoids*)) where

• *O*(*n*<sup>2</sup>) is the time complexity of affinity matrix computation

• *O*(*n*<sup>2</sup> ∗ *log*(*n*)) is the time complexity of message passing to compute availability and responsibility matrix

• *ω* is the number of times the modeling algorithm is repeated to identify *m* medoids. It has been observed that the value of *ω* normally lies in the range 3-10.

• *m* ∗ *log*(*m*) is the time complexity of computing possible normality range

• |*DB*| ∗ *m* is the time complexity for selecting customized normality range for each medoid, where |*DB*| is the number of trajectories present in trajectory dataset *DB*.
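Steps 2 and 3 of the modeling procedure can be illustrated with a short sketch. It assumes that the pairwise medoid distances and the sample-to-medoid distances of one pattern *c* are already available as NumPy arrays. The function names `normality_ranges` and `select_ranges`, the default `beta`, and the tie-breaking rule are hypothetical choices made for illustration. The false positive and false negative counters follow the numbered steps 4 and 5 as written above. Since all medoids share the same threshold during selection, testing a sample against its closest medoid only is used here as a simplification of the one-class anomaly test.

```python
import numpy as np

def normality_ranges(medoid_dist):
    """Step 2: repeatedly take the closest remaining pair of medoids (eqs. 5-7).

    medoid_dist: (m, m) matrix of distances between the medoids of one pattern.
    Returns the list D of possible normality ranges, smallest first.
    With an odd number of medoids the last one is left unpaired.
    """
    remaining = list(range(medoid_dist.shape[0]))
    ranges = []
    while len(remaining) >= 2:
        sub = medoid_dist[np.ix_(remaining, remaining)].astype(float)
        np.fill_diagonal(sub, np.inf)                      # enforce i != j
        p, q = np.unravel_index(np.argmin(sub), sub.shape)  # eq. (5)
        ranges.append(sub[p, q])                           # eq. (6)
        remaining = [x for i, x in enumerate(remaining) if i not in (p, q)]  # eq. (7)
    return ranges

def select_ranges(dist_to_medoids, is_member, ranges, beta=0.5):
    """Step 3: per-medoid index tau (1-based) into `ranges` minimising SPVI.

    dist_to_medoids: (N, m) distances from every training sample to the medoids of pattern c.
    is_member: (N,) boolean array, True where the sample is labelled as pattern c.
    """
    n_samples, m = dist_to_medoids.shape
    closest = np.argmin(dist_to_medoids, axis=1)            # eq. (8)
    d_closest = dist_to_medoids[np.arange(n_samples), closest]
    best_tau = np.ones(m, dtype=int)
    best_spvi = np.full(m, np.inf)
    for tau in range(len(ranges), 0, -1):                   # steps 8-9
        accepted = d_closest <= ranges[tau - 1]             # one-class test
        fp = np.bincount(closest[is_member & ~accepted], minlength=m)  # step 4
        fn = np.bincount(closest[~is_member & accepted], minlength=m)  # step 5
        spvi = beta * fp + (1 - beta) * fn                  # eq. (9)
        better = spvi < best_spvi                           # ties keep the larger tau
        best_tau[better] = tau
        best_spvi[better] = spvi[better]
    return best_tau                                         # eq. (10)
```

A medoid surrounded by samples of other patterns ends up with a small *τ* (tight range), while an isolated medoid keeps a large one, which is the local-density adaptation described above.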


#### **4. Classification and anomaly detection**

Once the *m*-Medoids based models for all the classes have been learnt, the classification of new trajectories is performed by checking the closeness of said trajectory to the models of different classes. The classification of unseen samples to known classes and anomaly detection is performed using the following steps:

1. Identify *k* nearest medoids, from the entire set of medoids (**M**) belonging to different classes, to unseen sample *Q* as:

$$\begin{aligned} k\text{-NM}(Q, \mathbf{M}, k) = \{\mathbf{C} \in \mathbf{M} \mid \forall R \in \mathbf{C}, S \in \mathbf{M} - \mathbf{C}, \\ \text{Dist}(Q, R) \le \text{Dist}(Q, S) \land |\mathbf{C}| = k\} \end{aligned} \tag{11}$$

where **M** is the set of all medoids from different classes and **C** is the ordered set of *k* closest medoids starting from the nearest medoid.

2. Initialize nearest medoid index *ı* to 1.

3. Set *r* to the id of *ı*<sup>*th*</sup> nearest medoid and *c* to the index of its corresponding class.

4. Set *l* = *τ*<sup>(*c*,*r*)</sup>.

5. Identify the normality threshold *d* w.r.t. the medoid *r* using **D**<sup>(*c*)</sup> as:

$$d = \mathbf{D}_l^{(c)} \tag{12}$$

6. Test sample *Q* is considered to be a valid member of class *c* if:

$$\text{Dist}(Q, M_r) \le d \tag{13}$$

7. If the condition specified in eq. (13) is not satisfied, increment the index *ı* by 1.

8. Iterate steps 3-7 till *ı* gets equivalent to *k*. If the test trajectory *Q* has not been identified as a valid member of any class, it is considered to be an outlier and deemed anomalous.

The time complexity of MMC-GFS based classification and anomaly detection algorithm is *O*(|**M**|) + *O*(*k*) for anomalous samples where |**M**| is the total number of medoids belonging to all classes. However, for most of the normal samples the time complexity is *O*(|**M**|). The time complexity can be further reduced by using an efficient indexing structure like kd-trees to index |**M**| medoids for efficient *k*-NM search.

#### **5. Relative merits of** *m***-Medoids based modeling and classification algorithms**

In this section, we provide a comparative evaluation of the proposed multimodal *m*-Medoids (MMC-GFS) and localized *m*-Medoids (Khalid, 2010a) based frameworks (LMC-ES) for modeling, classification and anomaly detection. These frameworks can be characterized in terms of the following attributes:

• Ability to deal with multimodal distribution within a pattern

• Ability to deal with variety of feature space representation of trajectories

• Time complexity of generating *m*-Medoids based model of known patterns

• Time complexity of classification and anomaly detection using learned models of normality

• Scalability of modeling mechanism to cope with increasing number of training data

Fig. 1. *m*-Medoids based modeling of patterns using (a) LMC-ES framework (b) MMC-GFS framework.

For the ease in understanding of the comparative analysis, simulation of the working of proposed modeling and classification algorithms for arbitrary shaped patterns having multimodal distributions is presented in Fig. 1. In the left image of Fig. 1, each point represents the training sample and instances belonging to the same class are represented with same color and marker. Squares superimposed on each group of instances represent the medoids used for modeling the pattern. Normality region generated using different frameworks for classification and anomaly detection is depicted in the right image of Fig. 1. Test sample is considered to be a normal member of the class if it lies within the normality region, else it is marked as anomalous. Visualization of LMC-ES and MMC-GFS based modeling is provided in Fig. 1(a) and Fig. 1(b) respectively.
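The membership test behind these normality regions follows the classification steps of Section 4 (eqs. (11) to (13)). A minimal sketch is given below, assuming the distances from the test sample to all medoids have already been computed. The function name `classify`, the argument layout and the use of `None` for an anomalous sample are hypothetical choices for illustration only.

```python
import numpy as np

def classify(dists, medoid_class, medoid_tau, ranges, k=3):
    """Soft classification with anomaly detection.

    dists:        (M,) distances from test sample Q to every medoid of every class.
    medoid_class: (M,) class index of each medoid.
    medoid_tau:   (M,) 1-based significance parameter selected for each medoid.
    ranges:       dict mapping class index -> list of normality ranges D^(c).
    Returns the class index, or None if Q is anomalous.
    """
    nearest = np.argsort(dists)[:k]           # k nearest medoids, eq. (11)
    for r in nearest:                         # steps 3-8, nearest medoid first
        c = int(medoid_class[r])
        d = ranges[c][medoid_tau[r] - 1]      # normality threshold, eq. (12)
        if dists[r] <= d:                     # membership test, eq. (13)
            return c
    return None                               # not a valid member of any class
```

For instance, a sample at distance 2.0 from a dense medoid with range 1.0 and at distance 4.0 from a sparser medoid with range 5.0 is rejected by the first medoid and then accepted by the second, rather than being discarded after the first test.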

Multimodal modeling using MMC-GFS frameworks caters for the multimodal distribution within a pattern. On the other hand, the LMC-ES framework always assumes a unimodal distribution within a pattern and hence cannot cater for the dynamic distribution of samples within a pattern. It is apparent from Fig. 1 that MMC-GFS frameworks have generated more accurate models that have accommodated the variation in sample density within a given pattern. The LMC-ES framework performs a hard classification of an unseen sample. A sample is classified to the pattern represented by the majority of medoids from a set of *k* nearest medoids. The sample may not lie in the normality region of the pattern to which it is classified and is hence deemed anomalous. However, it is likely that it may still fall in the normality region of the second closest but less dense pattern having a larger normality range. The hardness of the LMC-ES based classification algorithm will result in the misclassification of such samples. However, the MMC-GFS based classification and anomaly detection algorithm does not give a hard decision and checks for the membership of test trajectories w.r.t. different patterns until the trajectory is identified as a valid member of some pattern or it has been identified as anomalous w.r.t. *k* nearest medoids. This relatively softer approach enables the MMC-GFS based classification algorithm to adapt to the multimodal distribution of samples within different patterns. This phenomenon is highlighted in Fig. 2. The samples, represented by the 'x' marker, will be classified to the blue pattern but are marked as anomalous using the LMC-ES classifier as they fall outside the normality range of dense medoids belonging to the closest pattern. On the other hand, the soft classification technique as proposed in MMC-GFS frameworks will correctly classify the samples as normal members of the green pattern. Another benefit of the MMC-GFS framework is that it can be applied to any feature space representation of trajectories with a given distance function. On the other hand, LMC-ES can only operate in feature spaces with a computable mean.

Fig. 2. Scenario for evaluating the adaptation of classification algorithms as proposed in different *m*-Medoids based frameworks.

The algorithm to generate the *m*-Medoids model, as proposed in the LMC-ES framework, is efficient and scalable to large datasets. On the other hand, the modeling algorithm of MMC-GFS is not scalable to very large datasets due to the requirement of affinity matrix computation. The space and time complexity is quadratic, which is problematic for patterns with a large number of training samples. However, this problem can be addressed by splitting the training samples into subsets and selecting candidate medoids in each subset using the algorithm as specified in section 3.1. The final selection of medoids can be done by applying the same algorithm again but now using the candidate medoids instead of all the training samples belonging to a given pattern. The classification algorithm of the MMC-GFS framework is more efficient as compared to the LMC-ES framework. This efficiency gain is due to the non-iterative unmerged anomaly detection with respect to a given medoid. The anomaly detection is done by applying a single threshold to the distance of the test sample from its *ı*<sup>*th*</sup> closest medoid as specified in eq. (13). On the other hand, LMC-ES implements iterative merged anomaly detection, which is more accurate but time consuming as compared to the algorithm proposed in the MMC-GFS framework. The time complexity of merged anomaly detection is *O*(*m* ∗ *log*(*m*) − *τ* ∗ *log*(*τ*)).

#### **6. Experimental results**

In this section, we present some results to analyze the performance of the proposed multimodal *m*-Medoids based modeling, classification and anomaly detection as compared to competitive techniques.

#### **6.1 Experimental datasets**

Experiments are conducted on synthetic SIM2 and real life LAB (Khalid, 2010a;b; Khalid & Naftel, 2006), HIGHWAY (Khalid & Naftel, 2006) and ASL (Bashir et al., 2006; 2005a;b; 2007; Khalid, 2010b; Khalid & Naftel, 2006) datasets. Details of these datasets can be found in Table 1.

| Dataset | Description | # of trajectories | Extraction method | Labelled (Y/N) |
|---|---|---|---|---|
| SIM2 | Simulated dataset comprising of two dimensional coordinates. | arbitrary # | Simulation. | Y |
| LAB | Realistic dataset generated in the laboratory controlled environment for testing purposes. Trajectories can be categorised into 4 classes. | 152 | Tracking moving object and storing motion coordinates. | Y |
| HIGHWAY | Realistic vehicle trajectory dataset generated by tracking vehicles in a highway traffic surveillance sequence. | 355 | Tracking vehicles using PTMS (Melo et al., 2004) tracking algorithm. | Y |
| ASL | Trajectories of right hand of signers as different words are signed. Dataset consists of signs for 95 different word classes with 70 samples per word. | 6650 | Extracting (*x*, *y*) coordinates of the mass of right hand from files containing complete sign information. | Y |

Table 1. Overview of datasets used for experimental evaluation

#### **6.2 Experiment 1: Evaluation of** *m***-Medoids based frameworks for classification and anomaly detection**

The purpose of this experiment is to evaluate the performance of proposed MMC-GFS and LMC-ES based frameworks for classification of unseen data samples to one of the known patterns. The effectiveness of the proposed frameworks to perform anomaly detection is also demonstrated here. The experiment has been conducted on simulated SIM2 dataset. Training
