**1.1.1 Block matching algorithm**

The Block Matching Algorithm (BMA) (IEG Richardson, 2003) is the most popular motion estimation algorithm. BMA calculates a motion vector for an entire block of pixels instead of for individual pixels, and the same motion vector applies to all the pixels in the block. This reduces the computational requirement and also results in a more accurate motion vector, since objects are typically clusters of pixels. The BMA is illustrated in Figure 1.

Fig. 1. Block Matching Algorithm

The current frame is divided into pixel blocks, and motion estimation is performed independently for each pixel block. Motion estimation is done by identifying the pixel block in the reference frame that best matches the current block, whose motion is being estimated. The reference pixel block is obtained by a displacement from the current block's location in the reference frame. This displacement is given by the Motion Vector (MV), which is a pair (x, y) of horizontal and vertical displacement values. Various criteria are available for measuring how well two blocks match.
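One common matching criterion is the sum of absolute differences (SAD), which the later sections also use as the example measure. The following is a minimal sketch, assuming frames are stored as lists of rows of luma values; the function name `sad` and its parameter names are illustrative and do not come from the text.

```python
def sad(cur, ref, bx, by, mvx, mvy, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced from that
    position by the motion vector (mvx, mvy)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[by + mvy + j][bx + mvx + i])
    return total


# Toy 4x4 frames: the current frame is the reference shifted right by one pixel.
ref = [[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11],
       [12, 13, 14, 15]]
cur = [[0, 0, 1, 2],
       [4, 4, 5, 6],
       [8, 8, 9, 10],
       [12, 12, 13, 14]]

zero_mv_cost = sad(cur, ref, 1, 1, 0, 0, 2)    # no displacement
true_mv_cost = sad(cur, ref, 1, 1, -1, 0, 2)   # displacement that undoes the shift
```

In this toy case the candidate with motion vector (-1, 0) matches the 2x2 block exactly, so its SAD is lower than that of the zero motion vector.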

The reference pixel blocks are generated only from a region known as the search area. The search area defines the boundary for the motion vectors and limits the number of blocks to evaluate. Its height and width depend on the amount of motion in the video sequence and on the available computing power: a bigger search area requires more computation because more candidates are evaluated. Typically the search area is kept wider than it is tall, since many video sequences exhibit panning motion. The search area can also be changed adaptively, depending on the detected motion. The horizontal and vertical search ranges, Sx and Sy, define the search area (±Sx and ±Sy), as illustrated in Figure 1.

#### **1.1.2 Full search block matching**


there is also the phase plane correlation technique, which generates motion vectors via correlation between the current frame and the reference frame. However, the most popular technique is block matching, which is the main topic of discussion here.


The full search block matching algorithm (Alois, 2009) evaluates every possible pixel block in the search region. Hence, it generates the best block matching motion vector and gives the least possible residue for video compression. However, the required computation is prohibitively high because of the large number of candidates in a defined search region. The number of candidates to evaluate is (2\*Sx + 1)\*(2\*Sy + 1), which is much higher than for any of the other search algorithms. Several fast block matching algorithms reduce the number of evaluated candidates while trying to keep good block matching accuracy (Yu-Wen, 2006). Note that since these algorithms test only a limited set of candidates, they might select a candidate corresponding to a local minimum, unlike full search, which always finds the global minimum. Some of these algorithms are described below.
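The exhaustive procedure can be sketched as follows. This is an illustrative sketch rather than a reference implementation: it assumes SAD as the matching criterion, frames stored as lists of rows, and candidates that fall outside the frame being skipped. The names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, mvx, mvy, n):
    """SAD between the n x n current block at (bx, by) and the reference
    block displaced by (mvx, mvy)."""
    return sum(abs(cur[by + j][bx + i] - ref[by + mvy + j][bx + mvx + i])
               for j in range(n) for i in range(n))


def full_search(cur, ref, bx, by, n, sx, sy):
    """Evaluate every candidate in the +/-sx by +/-sy search area and return
    (best motion vector, its SAD, number of candidates evaluated)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost, evaluated = None, None, 0
    for mvy in range(-sy, sy + 1):
        for mvx in range(-sx, sx + 1):
            x, y = bx + mvx, by + mvy
            # Assumption: candidates that leave the frame are not evaluated.
            if x < 0 or y < 0 or x + n > width or y + n > height:
                continue
            cost = sad(cur, ref, bx, by, mvx, mvy, n)
            evaluated += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mvx, mvy), cost
    return best_mv, best_cost, evaluated


# Synthetic 16x16 reference frame in which every sample value is unique.
ref = [[16 * y + x for x in range(16)] for y in range(16)]
# Current frame: the content found at (x + 2, y - 1) in the reference,
# with coordinates clamped at the frame border.
cur = [[ref[min(max(y - 1, 0), 15)][min(max(x + 2, 0), 15)]
        for x in range(16)] for y in range(16)]

mv, cost, count = full_search(cur, ref, bx=6, by=6, n=4, sx=3, sy=3)
```

With Sx = Sy = 3 and a block far enough from the frame border, all (2\*3 + 1)\*(2\*3 + 1) = 49 candidates are evaluated, and the search recovers the displacement (2, -1) used to build the synthetic current frame.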

#### **1.1.3 Fast search algorithms**

Many other block matching algorithms (Nuno, 2002) and their variants are available; they differ in how they select the candidates for comparison and in the motion vector resolution. Although the full search algorithm is the best in terms of the quality of the predicted image and the resolution of the motion vector, it is very computationally intensive. With the realization that motion estimation is the most computationally intensive operation in the coding and transmission of video streams, researchers started looking for more efficient algorithms. However, there is a trade-off between the efficiency of the algorithm and the quality of the predicted image, and many algorithms have been developed with this trade-off in mind. These algorithms are called sub-optimal (Alois, 2009) because, although they are computationally more efficient than full search, they do not give as good a quality.

#### **1.1.4 Three step search**

In a three-step search (TSS) algorithm (Alan Bovik, 2009), the first iteration evaluates nine candidates, as shown in Figure 2. The candidates are centred on the current block's position, and the step size for the first iteration is typically set to half the search range. Like other fast algorithms, TSS (sometimes described as N-step search) calculates the matching measure (e.g. SAD) at only a subset of locations within the search window. SAD is calculated at position (0, 0) (the centre of the figure) and at eight locations $\pm 2^{N-1}$ away from it (for a search window of $\pm(2^{N}-1)$ samples). These first nine search locations are numbered '1'. The search location that gives the smallest SAD is chosen as the new search centre, and a further eight locations are searched, this time at half the previous distance from the search centre (numbered '2' in the figure). Once again, the best location is chosen as the new search origin, and the algorithm repeats until the search distance cannot be subdivided further; in a three-step search this is the third iteration. The best matching candidate from this last iteration is selected as the final candidate, and its motion vector is selected for the current block.

TSS evaluates far fewer candidates than full search: $8N + 1$ compared with $(2^{N+1} - 1)^2$, which is 25 versus 225 for $N = 3$. However, TSS (and other fast search algorithms) does not usually perform as well as full search.
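The refinement loop can be sketched as below. To keep the example short and checkable, it assumes the matching measure is supplied as a hypothetical callable `cost(mvx, mvy)`; in an encoder this would be the SAD of the block displaced by that motion vector. The demonstration uses a smooth synthetic cost surface with a single minimum, which is an assumption that real video content does not always satisfy.

```python
def three_step_search(cost, n_steps=3):
    """N-step search over a window of +/-(2**n_steps - 1) samples.

    Returns (best motion vector, its cost, number of cost evaluations)."""
    centre = (0, 0)
    best_cost = cost(0, 0)
    evaluated = 1
    step = 2 ** (n_steps - 1)      # first step size: 4 for a three-step search
    while step >= 1:
        cx, cy = centre
        best = centre
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue       # the centre cost is already known
                c = cost(cx + dx, cy + dy)
                evaluated += 1
                if c < best_cost:
                    best_cost, best = c, (cx + dx, cy + dy)
        centre = best              # recentre on the best candidate so far
        step //= 2                 # halve the search distance
    return centre, best_cost, evaluated


def bowl(mvx, mvy):
    """Synthetic cost surface with its only minimum at (5, -3)."""
    return (mvx - 5) ** 2 + (mvy + 3) ** 2


result = three_step_search(bowl, n_steps=3)
```

On this surface the search visits centres (0, 0), (4, -4) and (4, -4) again before settling on (5, -3), using 8\*3 + 1 = 25 evaluations. On a cost surface with several local minima the same loop can stop at a worse candidate, which is the trade-off described above.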


During each iteration, a set of three neighboring candidates along the x-axis is tested (Fig. 2). The three-candidate set is shifted towards the best matching candidate, which forms the centre of the set for the next iteration. The process stops when the best matching candidate is the centre of the candidate set. The location of this candidate on the x-axis is used as the x-component of the motion vector. The search then continues parallel to the y-axis, where a similar procedure estimates the y-component of the motion vector. On average, one-step-at-a-time search tests a smaller number of candidates. However, the motion vector accuracy is poor.

Fig. 2. Fast Search Algorithms

**1.1.7 Sub-pixel motion estimation (Fractional Pel Motion Estimation)**

Integer pixel motion estimation (also referred to as the full search method) is carried out during motion estimation mainly to reduce the duplication (redundant data) among adjacent frames. In practice, however, the distance of real motion is not always an integer multiple of the sampling interval. The actual motion in the video sequence can be much finer, so the displaced object might not lie on the integer pixel grid (Iain Richardson, 2010). To get a better match, motion estimation needs to be performed on a sub-pixel grid, which can be at either half pixel or quarter pixel resolution.

It is therefore advantageous to use sub-pixel motion estimation to achieve high compression with a high PSNR of the reconstructed image. The motion vector can be calculated at 1/2, 1/4 or 1/8 sub-pixel positions (Young et al., 2010). A motion vector calculated at 1/4 pixel positions gives more detailed information than one calculated at 1/2 pixel positions. Since the image is effectively enlarged, interpolation must be used to generate the pixel values at the new positions.

Fig. 3. 6-tap Directional interpolation filter for luma
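As an illustration of the interpolation needed for sub-pixel motion estimation, the sketch below applies the 6-tap filter with weights (1, -5, 20, 20, -5, 1)/32 that H.264 uses for luma half-sample positions, in one dimension only. The function names, the 8-bit clipping range and the edge handling by sample replication are assumptions made for this example. The quarter-sample value is approximated as the rounded average of the two nearest samples.

```python
TAPS = (1, -5, 20, 20, -5, 1)   # H.264 luma half-sample filter weights, sum = 32


def half_pel(row, x):
    """Half-sample value between row[x] and row[x + 1] (one dimension)."""
    acc = 0
    for k, tap in enumerate(TAPS):
        # Assumption: samples beyond the row ends are replicated edge samples.
        idx = min(max(x - 2 + k, 0), len(row) - 1)
        acc += tap * row[idx]
    # Round, divide by 32 and clip to the 8-bit sample range.
    return min(max((acc + 16) >> 5, 0), 255)


def quarter_pel(row, x):
    """Quarter-sample value between row[x] and the following half sample,
    taken as the rounded average of those two values."""
    return (row[x] + half_pel(row, x) + 1) >> 1


flat = [100] * 8
edge = [0, 0, 0, 0, 255, 255, 255, 255]
ramp = [0, 10, 20, 30, 40, 50, 60, 70]

flat_half = half_pel(flat, 3)       # a flat area stays flat
edge_half = half_pel(edge, 3)       # value halfway up the step edge
ramp_half = half_pel(ramp, 3)       # value between 30 and 40
ramp_quarter = quarter_pel(ramp, 3)
```

A full implementation would also filter vertically and handle two-dimensional half-sample positions, which this sketch omits.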
