shooting style which has some relationship with video contents [18]. The global motion information is especially useful in sports video content analysis [13]-[18].

From the definition, we find that the global motion is consistent over the whole frame, as shown in Fig. 1. The global motion in Fig. 1 (a) is a zoom-out and that in Fig. 1 (b) is a translation. In Fig. 1 (a), the motion direction is from the outer to the inner regions, which means that the coordinates of the current frame *t* are mapped into the inner regions of the reference frame *v* (*t* > *v*). In Fig. 1, the motion vectors in the motion field correspond to the global motion vectors at the coordinates.

Fig. 1. Global motion fields. (a) Zoom-out and (b) Translation.

*Global motion vector is the motion vector calculated from the estimated global motion parameters.* The global motion vector (*GMVx<sub>t</sub>*, *GMVy<sub>t</sub>*) for the current pixel with coordinates (*x<sub>t</sub>*, *y<sub>t</sub>*) is determined as

$$\begin{cases} GMVx_t = x'_t - x_t \\ GMVy_t = y'_t - y_t \end{cases} \tag{1}$$

where (*x'<sub>t</sub>*, *y'<sub>t</sub>*) are the warped coordinates in the reference frame, obtained by applying the global motion parameters to the coordinates (*x<sub>t</sub>*, *y<sub>t</sub>*).

**3. Global motion models**

Global motion can be represented by global motion models with several parameters. The simplest global motion model is translation, with only two parameters. The most complex global motion model is the quadratic model, with 12 parameters. Generally, higher order models have more parameters to be estimated and can represent more complex motions. The lower order models are special cases of the higher order ones. The most widely used global motion model is the perspective model with 8 parameters, which is expressed as follows

$$\begin{cases} x' = \dfrac{m_0 x + m_1 y + m_2}{m_6 x + m_7 y + 1} \\ y' = \dfrac{m_3 x + m_4 y + m_5}{m_6 x + m_7 y + 1} \end{cases} \tag{2}$$

where (*x*, *y*) and (*x'*, *y'*) are the coordinates in the current and the reference image, respectively, and the parameter set **m** = [*m*<sub>0</sub>, ..., *m*<sub>7</sub>] denotes the global motion parameters to be estimated. If *m*<sub>6</sub> = *m*<sub>7</sub> = 0, the model becomes an affine model with 6 parameters, and Eq. (2) simplifies to

$$\begin{cases} x' = m_0 x + m_1 y + m_2 \\ y' = m_3 x + m_4 y + m_5 \end{cases} \tag{3}$$

When *m*<sub>0</sub> = *m*<sub>4</sub> = 1 and *m*<sub>1</sub> = *m*<sub>3</sub> = *m*<sub>6</sub> = *m*<sub>7</sub> = 0, the perspective model reduces to a translation model:

$$\begin{cases} x' = x + m_2 \\ y' = y + m_5 \end{cases} \tag{4}$$
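The nesting of the three models can be made concrete with a small sketch. The function names below are hypothetical and not part of the chapter. The global motion vector is taken here as the warped position minus the current position, which is an assumed sign convention, and the zoom example assumes the coordinate origin sits at the image centre.

```python
from typing import List, Sequence, Tuple


def warp_perspective(m: Sequence[float], x: float, y: float) -> Tuple[float, float]:
    """Map current-frame (x, y) to reference-frame (x', y') with the 8-parameter model, Eq. (2)."""
    m0, m1, m2, m3, m4, m5, m6, m7 = m
    d = m6 * x + m7 * y + 1.0  # shared denominator of Eq. (2)
    return (m0 * x + m1 * y + m2) / d, (m3 * x + m4 * y + m5) / d


def affine_params(m0: float, m1: float, m2: float,
                  m3: float, m4: float, m5: float) -> List[float]:
    """Eq. (3): the affine model is Eq. (2) with m6 = m7 = 0."""
    return [m0, m1, m2, m3, m4, m5, 0.0, 0.0]


def translation_params(tx: float, ty: float) -> List[float]:
    """Eq. (4): translation is Eq. (2) with m0 = m4 = 1 and m1 = m3 = m6 = m7 = 0."""
    return [1.0, 0.0, tx, 0.0, 1.0, ty, 0.0, 0.0]


def global_motion_vector(m: Sequence[float], x: float, y: float) -> Tuple[float, float]:
    """Displacement from the current pixel to its warped position (assumed sign convention)."""
    xw, yw = warp_perspective(m, x, y)
    return xw - x, yw - y


if __name__ == "__main__":
    # Pure translation: every pixel gets the same vector (m2, m5).
    print(global_motion_vector(translation_params(3.0, -2.0), 10.0, 20.0))
    # Uniform scaling by 0.5 about the origin: vectors point towards the origin.
    print(global_motion_vector(affine_params(0.5, 0.0, 0.0, 0.0, 0.5, 0.0), 8.0, -4.0))
```

Because the lower order models are expressed as parameter vectors of the perspective model, a single warping routine serves all three cases.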

### **4. Global Motion Estimation (GME) approaches**

Intuitively, global motion estimation can be carried out in the pixel domain. In pixel domain based approaches, all the pixels are involved in the estimation of the global motion parameters. These approaches have two shortcomings: 1) they are computationally intensive; 2) they are often sensitive to noise (local object motions).

To improve convergence and speed up the calculation, a coarse-to-fine search is often adopted. Moreover, the subset of pixels having the largest gradient magnitudes can be used to estimate the global motion parameters [6]. Such sub-point based global motion estimation approaches are very effective in reducing computational cost. To guarantee the accuracy of global motion estimation, the key step is determining the optimal subset. Besides pixel domain based global motion estimation, compressed domain based approaches are also very popular.

Robust global motion estimation is usually carried out by identifying the pixels (blocks or regions) that undergo local motion. Fig. 2 shows the global motion and local motions. If the local motion blocks can be identified as outliers, then the global motion estimation performance can be improved significantly.
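The gradient-based subset selection mentioned above can be sketched as follows. This is a minimal illustration and not the procedure of [6]: the central-difference gradient, the function names and the `fraction` parameter are all assumptions made for the example.

```python
from typing import List, Tuple


def gradient_magnitude_sq(img: List[List[float]]) -> List[List[float]]:
    """Squared gradient magnitude from central differences (border pixels stay at 0)."""
    h, w = len(img), len(img[0])
    g = [[0.0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            gx = (img[i][j + 1] - img[i][j - 1]) / 2.0
            gy = (img[i + 1][j] - img[i - 1][j]) / 2.0
            g[i][j] = gx * gx + gy * gy
    return g


def select_top_gradient_pixels(img: List[List[float]],
                               fraction: float = 0.1) -> List[Tuple[int, int]]:
    """Return (row, col) of the `fraction` of pixels with the largest gradient magnitude."""
    g = gradient_magnitude_sq(img)
    coords = [(i, j) for i in range(len(img)) for j in range(len(img[0]))]
    # Ranking all pixels is the extra cost paid for using fewer pixels later.
    coords.sort(key=lambda p: g[p[0]][p[1]], reverse=True)
    keep = max(1, int(round(fraction * len(coords))))
    return coords[:keep]


if __name__ == "__main__":
    # 5 x 10 test image with a vertical step edge between columns 4 and 5.
    image = [[0.0 if j < 5 else 100.0 for j in range(10)] for _ in range(5)]
    print(select_top_gradient_pixels(image, fraction=0.1))
```

Only the selected coordinates would then enter the error sum, so the per-iteration cost scales with the subset size rather than with the full frame.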

#### **4.1 Pixel domain based GME**

In GME involving two image frames *I<sub>k</sub>* and *I<sub>v</sub>* (with *k* < *v*), one seeks to minimize the following sum of squared differences between *I<sub>v</sub>* and its predicted image *I<sub>k</sub>*(*x*(*i*, *j*), *y*(*i*, *j*)), which is obtained by transforming all the pixels in *I<sub>k</sub>*:

$$E = \sum_{i} \sum_{j} e(i, j)^2 \tag{5}$$

where *e*(*i*, *j*) denotes the error of predicting the pixel located at (*i*, *j*) of frame *I<sub>v</sub>* by using the pixel at location [*x*(*i*, *j*), *y*(*i*, *j*)] of the previous frame *I<sub>k</sub>*:

$$e(i, j) = I_v(i, j) - I_k(x(i, j), y(i, j)) \tag{6}$$

The transform mapping functions *x*(*i*, *j*) and *y*(*i*, *j*) (with respect to the global motion parameters **m**) should be chosen so that *E* in Eq. (5) is minimized. The well-known Levenberg-Marquardt algorithm (LMA), or a least-squares approach, can be used to find the optimal global motion parameters **m** iteratively by minimizing the energy function in Eq. (5) as follows

$$\mathbf{m}^{(n+1)} = \mathbf{m}^{(n)} + \Delta\mathbf{m}^{(n)} \tag{7}$$

where **m**<sup>(*n*)</sup> and Δ**m**<sup>(*n*)</sup> are the global motion parameters and the updating vector at iteration *n* [8].

In the traditional LMA algorithm, all the pixels are involved in the optimization of the global motion parameters [9]. This is computationally intensive and impractical for real-time applications. Moreover, local motions in the video frame may bias the estimated global motion parameters. Improvements are therefore made by using hierarchical global motion estimation, partial pixel sets and compressed domain based approaches.

#### **4.2 Hierarchical global motion estimation**

In MPEG-4, GME is performed by a hierarchical approach to reduce computational cost [1]. It is an improvement of the pixel domain based approach and consists of the following steps. Firstly, spatial pyramid frames are constructed. Secondly, global motion parameters with the coarsest global motion model are estimated at the top layer of the pyramid images. Then, the global motion parameters estimated at the coarsest level are projected to the next higher resolution level to obtain the refined global motion parameters. Finally, the refined global motion parameters are iteratively updated using a least-squares based approach, and the process continues until convergence [1]. Fig. 2 illustrates the hierarchical global motion estimation approach. The original image and its motion field, and the second layer and third layer pyramid images and their motion fields, are shown in Fig. 2 (a), (b) and (c), respectively. In Fig. 2, the local motion region (LMR) and the global motion region (GMR) of each layer are labeled.

Global motion parameters at each layer are estimated by minimizing the sum of weighted squared errors over all corresponding pairs of pixels (*x<sub>i</sub>*, *y<sub>i</sub>*) and (*x'<sub>i</sub>*, *y'<sub>i</sub>*) within the current image *f* and the reference image *R* as follows.

$$E = \sum_{i} \sum_{j} w(i, j)\, e(i, j)^2 \tag{8}$$

$$E = \sum_{i} \sum_{j} w(i, j) \left[ R(x(i, j), y(i, j)) - f(i, j) \right]^2 \tag{9}$$

where *w*(*i*, *j*) is the weight of the pixel at coordinate (*i*, *j*), with *w*(*i*, *j*) ∈ {0, 1}. Local object motion may create outliers and therefore bias the estimation of the global motion parameters. To reduce the influence of such outliers, a robust histogram based technique is adopted to reject the pixel points with large matching errors by setting their weights to "0".

The hierarchical global motion estimation approach has the following advantages: 1) estimating the coarse global motion parameters at the top layer of the pyramid is effective for noise filtering; 2) the computational cost of coarse global motion estimation is very low at the top layer of the pyramid, because only low-resolution images are involved in GME and the global motion model is of low order, which converges easily; 3) the model is determined adaptively with respect to the precision of the global motion parameters, which also helps reduce computational cost. In the enhanced layers, it is only necessary to update the global motion parameters on the basis of the parameters estimated in the previous layers. The advantages of hierarchical global motion estimation over the traditional pixel domain based approach are illustrated in Fig. 2.

Fig. 2. Illustration of hierarchical global motion estimation approach. (a) Original image and its motion field, (b) and (c) correspond to the second layer and third layer pyramid images and their motion fields.

#### **4.3 Partial pixel points based GME**

As the name implies, partial pixel points based GME approaches use only a subset of the pixels for estimating the global motion parameters. In [6], the subset used for GME is selected based on gradient magnitude information. The top 10% of pixels with the largest gradient magnitudes are selected and serve as reliable points for GME. This method divides the whole image into 100 sub-regions and selects the top 10% of pixels as feature points, which avoids numerical instability. This subset selection approach reduces the computational cost by reducing the number of pixels, at the cost of calculating the gradient image and ranking the gradients of all the pixels. To further reduce the computational
