**A Construction Method for Automatic Human Tracking System with Mobile Agent Technology**

Hiroto Kakiuchi, Kozo Tanigawa, Takao Kawamura and Kazunori Sugahara *Melco Power Systems Co., Ltd/Graduate School of Tottori University Japan* 

## **1. Introduction**


Human tracking systems that can follow a specific person are being actively researched and developed, since such systems are useful for security and for flexible services such as the analysis of human behaviour. For example, Terashita, Kawaguchi and others propose methods for tracking an object captured by a simple active video camera (Terashita et al. 2009; Kawaguchi et al. 2008), and Yin and others propose a solution to the blurring problem of active video cameras (Yin et al. 2008). Tanizawa and others propose a mobile agent framework that can serve as the base of a human tracking system (Tanizawa et al. 2002). These are component technologies that are useful in constructing a human tracking system. Tanaka and others propose a human tracking system that combines information from video cameras and sensors (Tanaka et al. 2004), and Nakazawa and others propose a human tracking system that uses a recognition technique to identify the same person across multiple video cameras at the same time (Nakazawa et al. 2001). Although these proposed systems work as human tracking systems, they are constructed under fixed camera positions and an unchanging photography range. There are also several studies that track people with active cameras. Wren and others propose a class of hybrid perceptual systems that builds a comprehensive model of activity in a large space, such as a building, by merging contextual information from a dense network of ultra-lightweight sensor nodes with video from a sparse network of high-capability sensors; they explore the task of automatically recovering the relative geometry between an active camera and a network of one-bit motion detectors. Takemura and others propose view planning of multiple cameras for tracking multiple persons for surveillance purposes (Takemura et al. 2007).
They develop a multi-start local search (MLS) based planning method that iteratively selects fixation points of the cameras so that the expected number of tracked persons is maximized. Sankaranarayanan and others discuss the basic challenges in detection, tracking, and classification using multiview inputs (Sankaranarayanan et al. 2008); in particular, they discuss the role of the geometry induced by imaging with a camera in estimating target characteristics. Sommerlade and others propose a consistent probabilistic approach to concertedly control multiple, diverse active cameras observing a scene (Sommerlade et al. 2010); the cameras react to objects moving about, arbitrating the conflicting interests of target resolution and trajectory accuracy, and anticipate the appearance of new targets. Porikli and others propose an automatic object tracking and video summarization method for multi-camera systems with a large number of non-overlapping field-of-view cameras (Porikli et al. 2003); in this framework, video sequences are stored for each object rather than for each camera.

Thus, these studies are efficient methods for tracking targets. In an automatic human tracking system, however, the tracking function must remain robust even if the system loses a target person. Present image processing is not perfect: a feature extraction method such as SIFT (Lowe 2004) has high accuracy but takes much processing time, so such algorithms face a trade-off between accuracy and processing time. In addition, people walk at various speeds, and a person may not be captured correctly by the cameras. Therefore, the tracking function must be able to re-detect a target person even after the system loses the target. In this chapter, a construction method for a human tracking system that includes such detection methods is proposed for a realistic environment with active cameras like those mentioned above, and a system constructed by this method can continuously track several people at the same time. The detection methods compensate for the above weakness of feature extraction at the system level, and they utilize a "neighbor node determination algorithm" to detect the target efficiently. This algorithm can determine neighbor camera/server location information without knowing the location and view distance of each video camera; neighboring cameras/servers are called "neighbor camera nodes" in this chapter. By knowing the neighbor camera node locations, the mobile agent (Lange et al. 1999; Cabri et al. 2000; Valetto et al. 2001; Gray et al. 2002; Motomura et al. 2005; Kawamura et al. 2005) can detect the target person efficiently. An algorithm that can determine the neighbor nodes even if the view distance of a video camera changes is also proposed in this chapter.

The system configuration and processing flow of the proposed system are shown in Fig. 1. (i) First, a system user selects an entity on the screen of the agent monitoring terminal and extracts the feature information of the entity to be tracked. (ii) Next, the feature information is used to generate one mobile agent per target, which is registered into the agent management server. (iii) The mobile agent is then launched from the terminal to the first feature extraction server. (iv) When the mobile agent catches the target entity on the feature extraction server, it transmits information such as the video camera number, the discovery time, and the mobile agent identifier to the agent management server. (v) Finally, the mobile agent deploys a copy of itself to the neighboring feature extraction servers and waits for the person to appear. If a copy identifies the person, it notifies the agent management server of the information, removes the original and the other copies, and again deploys copies of itself to the neighboring feature extraction servers. Continuous tracking is realized by repeating this flow. The feature extraction server extracts the features of each captured entity and compares them with the features already stored by the agent; if the features are equivalent, the entity is located by the mobile agent.

Fig. 1. System configuration and processing flow.

Fig. 2. System architecture.
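As one way to make the per-target agent lifecycle of steps (i)–(v) concrete, the following minimal Python sketch models an agent that reports sightings to the agent management server and redeploys copies to neighbor camera nodes. All names here (`Agent`, `ManagementServer`, `NODE_NEIGHBORS`) are hypothetical, and the feature comparison is reduced to a simple equality test; the chapter does not prescribe this API.

```python
# Hypothetical sketch of the per-target mobile-agent flow (i)-(v).
# Feature matching is reduced to equality; a real system would use a
# feature extraction method such as SIFT with a matching threshold.

NODE_NEIGHBORS = {            # neighbor camera nodes (assumed topology)
    "cam1": ["cam2"],
    "cam2": ["cam1", "cam3"],
    "cam3": ["cam2"],
}

class ManagementServer:
    """Agent management server: records sightings reported by agents."""
    def __init__(self):
        self.log = []

    def notify(self, agent_id, node, time):
        # (iv) camera number, discovery time, and agent identifier
        self.log.append((agent_id, node, time))

class Agent:
    """Mobile agent carrying the target's feature information."""
    def __init__(self, agent_id, features, server):
        self.agent_id = agent_id
        self.features = features
        self.server = server
        self.copies = []          # copies waiting on neighbor nodes

    def matches(self, observed):
        # Compare observed features with the stored ones.
        return observed == self.features

    def sighted(self, node, observed, time):
        if not self.matches(observed):
            return False
        self.server.notify(self.agent_id, node, time)
        # (v) remove the now-stale copies, then redeploy to neighbors
        self.copies = list(NODE_NEIGHBORS[node])
        return True

server = ManagementServer()
agent = Agent("agent-1", features="red-coat", server=server)

agent.sighted("cam1", "red-coat", time=0)   # target found on cam1
agent.sighted("cam2", "red-coat", time=5)   # reappears on neighbor cam2
print(server.log)        # two sightings recorded
print(agent.copies)      # copies now wait on cam2's neighbors
```

In this sketch the redeployment step simply replaces the waiting-copy list; an actual implementation would migrate agent code and state between feature extraction servers.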
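The neighbor node determination algorithm itself is proposed later in this chapter; purely to illustrate the kind of neighbor information the agents rely on, the following hypothetical sketch derives a neighbor map from the sighting log kept by the agent management server, treating two camera nodes as neighbors when targets are seen on one directly after the other. This transition-learning stand-in is not the chapter's algorithm, but like it, it needs neither camera locations nor view distances.

```python
# Hypothetical stand-in: learn neighbor camera nodes from time-ordered
# sighting logs instead of from camera locations or view distances.
from collections import defaultdict

def derive_neighbors(sightings):
    """sightings: one list of (node, time) tuples per tracked target."""
    neighbors = defaultdict(set)
    for track in sightings:
        track = sorted(track, key=lambda s: s[1])   # order by time
        for (a, _), (b, _) in zip(track, track[1:]):
            if a != b:                # a hand-off between two cameras
                neighbors[a].add(b)
                neighbors[b].add(a)   # adjacency is symmetric
    return {node: sorted(ns) for node, ns in neighbors.items()}

tracks = [
    [("cam1", 0), ("cam2", 5), ("cam3", 9)],
    [("cam3", 2), ("cam2", 6)],
]
print(derive_neighbors(tracks))
# cam2 borders cam1 and cam3; cam1 and cam3 are not direct neighbors
```

With such a map, a mobile agent only deploys copies to the nodes a target can actually reach next, rather than to every feature extraction server in the system.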
