Definition 2: Object representation

In surveillance applications, an object is generally detected and tracked over a number of frames. In other words, a set of object blobs is defined for each object. Therefore, an object can be represented as:

$$O = \{B_i\}, \quad i \in \{1, \dots, N\} \tag{1}$$


Fig. 6. Examples of object detection quality (a) The object is not present in the blob; (b) The object is partially present in the blob; (c) and (d) The object is totally present in the blob.

However, results obtained on several video surveillance benchmarks show that current object tracking performance is still limited (the object ID persistence and object ID confusion metrics are generally much greater than 1). Fig. 7 shows an example of the object ID persistence problem: two tracked objects are created for a single ground-truth object, so the object ID persistence equals 2. Fig. 8 illustrates an example of object ID confusion: three ground-truth object IDs are associated with a single detected object (object ID confusion = 3).
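The two counts used in these examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `id_persistence_and_confusion` and the input format (one `(ground_truth_id, tracked_id)` pair per matched blob) are hypothetical, and the way a given benchmark matches blobs or averages these counts over a sequence is not specified here.

```python
from collections import defaultdict


def id_persistence_and_confusion(associations):
    """Count ID persistence and ID confusion from matched blobs.

    `associations` is an iterable of (ground_truth_id, tracked_id) pairs,
    one pair per matched blob (hypothetical input format).
    """
    tracks_per_gt = defaultdict(set)   # ground-truth ID -> tracked IDs seen
    gts_per_track = defaultdict(set)   # tracked ID -> ground-truth IDs seen
    for gt_id, track_id in associations:
        tracks_per_gt[gt_id].add(track_id)
        gts_per_track[track_id].add(gt_id)
    # Persistence: number of tracked objects created per ground-truth object.
    persistence = {gt: len(tracks) for gt, tracks in tracks_per_gt.items()}
    # Confusion: number of ground-truth IDs associated with each tracked object.
    confusion = {tr: len(gts) for tr, gts in gts_per_track.items()}
    return persistence, confusion


# Situation of Fig. 7: one ground-truth object split into two tracked objects.
persistence, _ = id_persistence_and_confusion(
    [("gt1", "t1"), ("gt1", "t1"), ("gt1", "t2")]
)

# Situation of Fig. 8: three ground-truth objects merged into one tracked object.
_, confusion = id_persistence_and_confusion(
    [("gtA", "t9"), ("gtB", "t9"), ("gtC", "t9")]
)
```

With ideal tracking, every value in both dictionaries would be 1; values above 1 indicate split or merged identities.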

Fig. 7. An example of the object ID persistence problem: two tracked objects created for one sole ground-truth object (object ID persistence = 2).

Based on the above analysis, the main challenge in surveillance object indexing and retrieval is the poor quality of object detection and tracking. An object indexing and retrieval algorithm is robust if it can work with varying levels of object detection and tracking quality.

With the object representation defined in Eq. 1, we believe that object indexing and retrieval methods can cope with poor object detection and tracking if they provide effective object signature building and robust object matching.

In Eq. 1, $O$ is the object, $B_i$ is the $i$-th object blob, and $N$ is the total number of blobs of object $O$.

It is worth noting that object blobs can be non-consecutive, since an object may not be detected in certain frames, and that the value of N varies with the object's lifetime in the scene. Fig. 5 gives an example of an object represented by its blobs. As the figure shows, with poor object detection, several object blobs do not cover the object's appearance well.

Fig. 5. An object is represented by its blobs.
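The representation in Eq. 1 can be sketched as a simple data structure. The class and field names below (`Blob`, `TrackedObject`, `frame`, `bbox`) are illustrative assumptions, not names from the original work; the sketch only shows that an object is a variable-length set of blobs whose frame indices need not be consecutive.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Blob:
    """One detection B_i of an object in a single frame."""
    frame: int
    bbox: Tuple[int, int, int, int]  # assumed layout: (x, y, width, height)


@dataclass
class TrackedObject:
    """An object O = {B_i}, i in 1..N, as in Eq. 1."""
    object_id: int
    blobs: List[Blob] = field(default_factory=list)

    @property
    def n_blobs(self) -> int:
        # N in Eq. 1; it depends on how long the object stays in the scene.
        return len(self.blobs)

    def missing_frames(self) -> List[int]:
        # Frames between the first and last blob where no blob was detected.
        if not self.blobs:
            return []
        present = {b.frame for b in self.blobs}
        first, last = min(present), max(present)
        return [f for f in range(first, last + 1) if f not in present]


# An object detected in frames 10, 11 and 14, but missed in frames 12 and 13.
obj = TrackedObject(
    object_id=1,
    blobs=[
        Blob(10, (5, 5, 20, 40)),
        Blob(11, (6, 5, 20, 40)),
        Blob(14, (9, 6, 21, 41)),
    ],
)
```

Here `obj.n_blobs` plays the role of N, and `obj.missing_frames()` lists the frames in which the object was not detected, which is why the blobs of one object can be non-consecutive.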
