1. Text detection and determining the starting/ending frames of each text line for the output video sequence. The starting and ending frames of a text line are determined by text tracking and matching. This method has low computational cost and is robust to missing and false detections [23].

2. Video text segmentation. In the video text segmentation, text regions can be extracted by the foreground and background integrated method [23].

3. Global motion estimation. Because text is added during editing, the text regions do not undergo global motion, so they can be viewed as outliers in global motion estimation.

4. Text occluded region recovery using GM/LM information. From the estimated global motion parameters **m** and the text occluded regions $TR_i(x, y)$, $i = 1, \dots, N$, where *N* is the total […] pixels in the text occluded region of the current frame.

The corresponding diagram of a pixel in the text occluded region is shown in Fig.7. Text occluded region recovery (TORR) is carried out bi-directionally and iteratively. The bi-directional approach means that a pixel in the text occluded region of the current frame *j* can be recovered by forward replacement from its previous frame *i* and backward replacement from its next frame *k* (with *i*<*j*<*k*). From Fig.7 we find that the first pixel can be recovered (denoted by the solid lines) from its previous frame *i* but cannot be recovered (denoted by the dashed lines) from its next frame *k*. For the second pixel, however, its replacement in frame *i* is also in a text occluded region, and its replacement in frame *k* is in a local motion region (LMR), so both directional replacements are invalid. TORR therefore needs to be carried out iteratively for the video frame. The iteration stops when all pixels in the text occluded region are recovered. Alternatively, the replacement can be carried out using more than one frame. It is likely that the second pixel in frame *j* can find a correct replacement in frames *i*-*n* or *k*+*n* (with *n*>0).

Fig.8 and Fig.9 show the subjective text occluded region recovery results. The text occluded frames in Fig.8(a) and Fig.9(a) are from the MPEG-7 test video sequence *News1* and the National Geographic documentary film *Foxes of the Kalahari*. Fig.8(a) and Fig.9(a) are the video frames with detected text lines. Fig.8(b) and Fig.9(b) show the video frames after carrying out TORR using the GM/LM based method. From the recovery results we find that the detail information of the anchorperson is kept well. This further shows the effectiveness of our GM/LM based text occluded recovery method.

**6. Conclusion**

In this chapter, a systematic review of the pixel domain based global motion estimation approaches is presented. To address their shortcomings in noise filtering and computational cost, improved approaches including hierarchical global motion estimation, partial pixel set based global motion estimation and compressed domain based global motion estimation are described. Four global motion based applications are then covered: GMC/LMC in the MPEG-4 video coding standard, global motion based sport video shot classification, GM/LM based error concealment and text occluded region recovery. The applications show the effectiveness of global motion based approaches.

**7. Acknowledgement**

This work is supported in part by the National Natural Science Foundation of China (NSFC) under grants No.60903121 and No.61173109, and by the Foundation of Microsoft Research Asia.

**8. References**
[17] Y. Tan, D. Saur, S. Kulkarni, P. Ramadge, "Rapid estimation of camera motion from compressed video with application to video annotation," IEEE Trans. Circuits Syst. Video Techn., vol. 10, no. 1, 2000, pp. 133-146.

[18] X. Qian, H. Wang, G. Liu, and X. Hou, "HMM Based Soccer Video Event Detection Using Enhanced Mid-Level Semantic," Multimedia Tools and Applications, 2011.

[19] J. Suh and Y. Ho, "Error concealment based on directional interpolation," IEEE Transactions on Consumer Electronics, vol. 43, pp. 295-302, August 1997.

[20] M. Chen, C. Chen and M. Chi, "Temporal error concealment algorithm by recursive block-matching principle," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 11, Nov. 2005, pp. 1385-1393.

[21] X. Qian, G. Liu, and H. Wang, "Recovering Connected Error Region Based on Adaptive Error Concealment Order Determination," IEEE Trans. Multimedia, vol. 11, no. 4, pp. 683-695, 2009.

[22] X. Qian and G. Liu, "An Effective GM/LM Based Video Error Concealment," Signal Image and Video Processing, vol. 6, no. 1, 2012, pp. 9-17.

[23] X. Qian, G. Liu, H. Wang, and R. Su, "Text detection, localization and tracking in compressed videos," Signal Processing: Image Communication, vol. 22, 2007, pp. 752-768.

**Part 3**

**Quality**

**6**

**Human Attention Modelization and Data Reduction**

Matei Mancas, Dominique De Beul, Nicolas Riche and Xavier Siebert

*IT Department, Faculty of Engineering (FPMs), University of Mons (UMONS), Mons*

*Belgium*

**1. Introduction**

Attention is so natural and so simple: every human, every animal and even every tiny insect is perfectly able to pay attention. As William James, the father of psychology, said: "Everybody knows what attention is". It is precisely because everybody "knows" what attention is that few people tried to analyze it before the 19th century. Even though the study of attention was initially developed in the field of psychology, it quickly spread into new domains such as neuroscience, to understand its biological mechanisms, and, most recently, computer science, to model attention mechanisms. There is no common definition of attention, and one can find variations depending on the domain (psychology, neuroscience, engineering, ...) or the approach which is taken into account. But, to remain general, human attention can be defined as the natural capacity to selectively focus on part of the incoming stimuli, discarding less "interesting" signals. The main purpose of the attentional process is to make best use of the parallel processing resources of our brains to identify as quickly as possible those parts of our environment that are key to our survival.

This natural tendency in data selection shows that raw data is not even used by our multi-billion-cell brain, which prefers to focus on restricted regions of interest instead of processing the whole data. Human attention is thus the first natural compression algorithm. Several attempts towards the definition of attention state that it is very closely related to data compression and focuses resources on the less redundant, thus less compressible, data. Tsotsos suggested in Itti et al. (2005) that the one core issue which justifies attention regardless of the discipline, methodology or intuition is "information reduction". Schmidhuber (2009) stated that "... we pointed out that a surprisingly simple algorithmic principle based on the notions of data compression and data compression progress informally explains fundamental aspects of attention, novelty, surprise, interestingness ...". Attention modeling in the engineering and computer science domains has very wide applications such as machine vision, audio processing, HCI (Human Computer Interfaces), advertising assessment, robotics and, of course, data reduction and compression.

In section 2, an introduction to the notions of saliency and attention will be given and the main computational models working on images, video and audio signals will be presented. In section 3, the ideas which aim at either replacing or complementing classical compression algorithms are reviewed. Saliency-based techniques to reduce the spatial and/or temporal resolution of non-interesting events are listed in section 4. Finally, in section 5, a discussion on the use of attention-based methods for data compression will conclude the chapter.
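The link made in this introduction between attention and "less redundant, thus less compressible data" can be illustrated with a toy sketch. It is not one of the attention models reviewed in this chapter. It assumes, purely for illustration, that the compressed size of a block under a general-purpose compressor (here Python's standard `zlib`) is a crude stand-in for how little redundancy the block contains. The function names `incompressibility` and `rank_blocks` and the synthetic test signal are hypothetical.

```python
import random
import zlib


def incompressibility(block: bytes) -> float:
    """Compressed size divided by raw size; higher means less redundant."""
    if not block:
        return 0.0
    return len(zlib.compress(block, 9)) / len(block)


def rank_blocks(signal: bytes, block_size: int) -> list[tuple[int, float]]:
    """Return (block_index, score) pairs, least compressible block first."""
    blocks = [signal[i:i + block_size] for i in range(0, len(signal), block_size)]
    scores = [(idx, incompressibility(b)) for idx, b in enumerate(blocks)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy signal (an assumption for this sketch): two flat, highly redundant
# blocks surrounding one noisy block that carries the "novel" content.
rng = random.Random(0)
flat = bytes([128]) * 256
noisy = bytes(rng.randrange(256) for _ in range(256))
ranking = rank_blocks(flat + noisy + flat, block_size=256)
```

Under this proxy the noisy middle block ranks first, which is where a compression-driven notion of attention would spend its resources. Pure noise also scores highly, so an actual attention model needs more than raw incompressibility to decide what is interesting.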
