**2. Object retrieval for surveillance videos**

This section aims to give an overview of existing approaches for object retrieval in surveillance videos.

**2.1 Architecture**

Just as video analysis systems have two main architectures, centralized and decentralized (Senior 2009), object video retrieval for surveillance systems also has two main modes: late fusion and early fusion. In the late fusion mode (cf. Fig. 2), object detection and tracking are performed on the video stream of each camera. The object matching then compares the query with the objects detected by each camera, and the per-camera matching results are fused to form the retrieval results. In the early fusion mode (cf. Fig. 3), the data fusion is done in the object detection and tracking module. The early fusion mode therefore has more opportunities to obtain a good result: if an object is only partially observed by one camera, it may be well captured by other cameras. Most state-of-the-art work belongs to the early fusion mode; however, the fusion strategy is not explicitly discussed except in the work of Calderara et al. (Calderara, Cucchiara et al. 2006).
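The late fusion mode can be sketched as follows. This is only an illustrative toy, not the method of any cited work: the 3-D descriptors, the inverse-L1-distance similarity, and the fusion-by-global-ranking step are all assumptions made for the example.

```python
# Hypothetical sketch of the late-fusion retrieval mode (Fig. 2).
# Each camera's detections are matched against the query independently;
# the fusion step only merges and ranks the per-camera match scores.

def match_score(query_feat, obj_feat):
    """Similarity between a query descriptor and a detected object descriptor
    (here: inverse L1 distance; a real system would use richer appearance models)."""
    dist = sum(abs(q - o) for q, o in zip(query_feat, obj_feat))
    return 1.0 / (1.0 + dist)

def late_fusion_retrieval(query_feat, per_camera_objects, top_k=3):
    """Match the query against each camera's detections separately,
    then fuse the per-camera results into one globally ranked list."""
    fused = []
    for cam_id, objects in per_camera_objects.items():
        for obj_id, feat in objects:
            fused.append((match_score(query_feat, feat), cam_id, obj_id))
    # Fusion step: merge all per-camera matches and rank them globally.
    fused.sort(reverse=True)
    return fused[:top_k]

# Toy example: two cameras, colour-histogram-like 3-D descriptors.
query = [0.9, 0.1, 0.0]
detections = {
    "cam1": [("obj1", [0.8, 0.2, 0.0]), ("obj2", [0.1, 0.1, 0.8])],
    "cam2": [("obj3", [0.9, 0.1, 0.1])],
}
for score, cam, obj in late_fusion_retrieval(query, detections):
    print(f"{cam}/{obj}: {score:.2f}")
```

Note that each camera is matched in isolation: an object poorly observed by one camera gets a low score there regardless of how well another camera sees it, which is the weakness the early fusion mode addresses.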

40 Recent Developments in Video Surveillance

Fig. 1. Indexing and retrieval facility in a surveillance system. Videos coming from the cameras are interpreted by the video analysis module. The analysed results are used in two modes: (1) the corresponding alarms are sent to the security staff to inform them about the situation; (2) the analysed results are stored for future use.


Fig. 2. Late fusion object retrieval approach: object detection and tracking are performed on the video stream of each camera. The object matching then compares the query with the objects detected by each camera, and the matching results are fused to form the retrieval results.

Fig. 3. Early fusion object retrieval approach.
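The early fusion mode of Fig. 3 can be sketched in the same toy setting. Here the fusion happens before matching: detections from all cameras are first merged into global objects, and the query is matched once against the fused descriptors. The greedy descriptor-merging step and its threshold are deliberately naive assumptions for illustration; real systems rely on camera calibration, geometry, and learned appearance models.

```python
# Hypothetical sketch of the early-fusion retrieval mode (Fig. 3).
# Cross-camera association is done naively: detections whose descriptors
# are close are merged into one global object (running-average feature).

def fuse_detections(per_camera_objects, merge_threshold=0.5):
    """Merge detections from all cameras into global objects by greedy
    descriptor matching; each global object keeps the averaged feature."""
    global_objects = []  # list of (member_ids, fused_feature)
    for cam_id, objects in per_camera_objects.items():
        for obj_id, feat in objects:
            for members, fused in global_objects:
                dist = sum(abs(f - x) for f, x in zip(fused, feat))
                if dist < merge_threshold:
                    members.append(f"{cam_id}/{obj_id}")
                    # Running average over all member descriptors.
                    n = len(members)
                    for i in range(len(fused)):
                        fused[i] = (fused[i] * (n - 1) + feat[i]) / n
                    break
            else:
                global_objects.append(([f"{cam_id}/{obj_id}"], list(feat)))
    return global_objects

def early_fusion_retrieval(query_feat, per_camera_objects):
    """Match the query once, against the fused multi-camera objects."""
    scored = []
    for members, fused in fuse_detections(per_camera_objects):
        dist = sum(abs(q - f) for q, f in zip(query_feat, fused))
        scored.append((1.0 / (1.0 + dist), members))
    scored.sort(reverse=True)
    return scored

# Toy example: the same object seen from two cameras is fused into one
# global object before matching, so neither partial view is matched alone.
query = [0.9, 0.1, 0.0]
detections = {
    "cam1": [("obj1", [0.8, 0.2, 0.0])],
    "cam2": [("obj2", [0.9, 0.1, 0.1])],
}
for score, members in early_fusion_retrieval(query, detections):
    print(f"{members}: {score:.2f}")
```

Because the two partial views are fused before matching, the query is compared against a more complete description of the object, which is the advantage of the early fusion mode noted above.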
