**2. Related works**

to the new High Efficiency Video Coding (HEVC) standard [1, 2]. HEVC was developed in 2013 by the Joint Collaborative Team on Video Coding (JCT-VC) of the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG). It was designed to cope with the enormous amount of UHD video content. Compared to the earlier H.264/AVC standard [3], and at identical visual quality, HEVC achieves a high encoding performance, with a bitrate reduction of about 50% [4]. This encoding performance comes at the cost of a high computational complexity. Motion estimation (ME) represents a large part of the encoding process, occupying around 70% of the total inter prediction time, as Jungho [5] indicates in **Figure 1**.

This large consumption is principally due to the new hierarchy of block coding based on coding tree units (CTUs). This new concept is analogous to the macroblocks of the earlier compression standard. Each picture frame is divided into square blocks, called coding units (CUs) [6], with a maximum size of 64 × 64, which are recursively subdivided down to 8 × 8 blocks. Prediction and

*Digital Imaging*

**Figure 1.** *Encoding time distribution.*

**Figure 2.** *CTU tree structure in the HEVC standard.*
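The recursive subdivision of a 64 × 64 block down to 8 × 8 CUs described above can be illustrated with a minimal sketch. The function names `cu_sizes` and `count_leaf_cus`, and the `should_split` callback, are hypothetical helpers introduced here for illustration; they are not part of the HM reference software, and a real encoder decides each split from a rate-distortion cost rather than from the block size alone.

```python
def cu_sizes(ctu_size=64, min_cu_size=8):
    """Return the CU edge length at each quadtree depth, largest first."""
    sizes = []
    size = ctu_size
    while size >= min_cu_size:
        sizes.append(size)
        size //= 2  # each split halves the edge, giving four sub-blocks
    return sizes


def count_leaf_cus(size, min_cu_size, should_split):
    """Count leaf CUs when should_split(size) decides every quadtree split.

    Assumption: the split decision depends only on the block size, so all
    four sub-blocks of a split behave identically.
    """
    if size > min_cu_size and should_split(size):
        return 4 * count_leaf_cus(size // 2, min_cu_size, should_split)
    return 1


if __name__ == "__main__":
    print(cu_sizes())                             # [64, 32, 16, 8]
    print(count_leaf_cus(64, 8, lambda s: True))  # 64 leaves of 8 x 8
```

Four depths (64, 32, 16, 8) have to be evaluated per block, which gives an idea of why the partitioning contributes to the encoder's complexity.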

Aiming to reduce the HEVC encoder complexity, several works have been proposed to speed up the test zone search (TZS) motion estimation algorithm. Some focus on hardware solutions, and others on software optimizations.

In [11], two hardware diamond search architectures for HEVC video coding are proposed, using sequential and parallel techniques. These architectures achieve full HD encoding at 30 frames per second on a Virtex-7 field-programmable gate array (FPGA).

The authors of [12] proposed a parallel hardware sum of absolute differences (SAD) design for gray-scale images to reduce the motion estimation time for a block size of 4 × 4 pixels. A multiplier is exploited for addition as a partial product reduction (PPR). Results obtained on a Xilinx Virtex-2 FPGA show a maximum frequency of 133.2 MHz for the 4 × 4 block size. Nalluri et al. [13, 14] proposed two other SAD architectures on a Xilinx Virtex FPGA, without and with parallelism. The parallel architecture accelerates the SAD calculation by a factor of 3.9 compared to the serial SAD architecture. In [15], the authors proposed two implementations of the SAD and sum of squared differences (SSD) algorithms on an NVIDIA GeForce GTX 480 using CUDA in order to reduce the ME run time. The proposed architecture saved about 32% of the encoding time for class E video sequences with nonsignificant degradation in PSNR and bitrate.
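For reference, the two block-matching costs mentioned above can be written in a few lines. This is a plain software sketch of the SAD and SSD definitions on a 4 × 4 block, not a model of any of the cited FPGA or GPU designs; the sample pixel values are made up for illustration.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def ssd(block_a, block_b):
    """Sum of squared differences between two equally sized pixel blocks."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


# Hypothetical 4 x 4 gray-scale blocks (current block and reference candidate).
CURRENT = [[10, 12, 11, 9], [10, 13, 12, 9], [11, 12, 12, 10], [10, 11, 11, 9]]
REFERENCE = [[10, 11, 11, 9], [12, 13, 12, 9], [11, 12, 10, 10], [10, 11, 11, 12]]

if __name__ == "__main__":
    print(sad(CURRENT, REFERENCE))  # 8
    print(ssd(CURRENT, REFERENCE))  # 18
```

Every pixel difference is independent of the others, which is why both costs lend themselves to the parallel hardware and GPU implementations cited above.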

Regarding software solutions, Nalluri et al. [16] replaced the 8-point square and 8-point diamond patterns in the TZS ME algorithm with a 6-point hexagonal pattern, saving 50% of the encoding time without degradation in bitrate or PSNR. To replace the TZS algorithm, the authors of [17, 18] proposed small diamond pattern search (SDPS), large diamond pattern search (LDPS), and horizontal diamond search (HDS). Experiments using HM 8.0 showed that these algorithms reduce the motion estimation calculation time by 49% with a nonsignificant increase in bitrate and a slight degradation in video quality.
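To give an idea of how a pattern-based search of this kind proceeds, the following is a minimal sketch of a generic small diamond refinement: the four neighbors of the current best position are tested, the search moves to the cheapest one, and it stops when no neighbor improves the cost. It is not the exact SDPS, LDPS, or HDS algorithm of [17, 18]. The function name, the `cost` callback, and the toy cost used in the example are assumptions made for illustration.

```python
# Offsets of the four points of a small diamond around the current center.
SMALL_DIAMOND = [(0, -1), (-1, 0), (1, 0), (0, 1)]


def small_diamond_search(cost, start=(0, 0), search_range=64):
    """Greedy small-diamond refinement of a motion vector.

    cost: callable mapping a motion vector (x, y) to a distortion value,
          for example a SAD against the reference frame.
    Returns the best motion vector found and its cost.
    """
    best, best_cost = start, cost(start)
    while True:
        candidates = [
            (best[0] + dx, best[1] + dy)
            for dx, dy in SMALL_DIAMOND
            if abs(best[0] + dx) <= search_range
            and abs(best[1] + dy) <= search_range
        ]
        if not candidates:
            return best, best_cost
        next_mv = min(candidates, key=cost)
        next_cost = cost(next_mv)
        if next_cost >= best_cost:  # the center is already the local minimum
            return best, best_cost
        best, best_cost = next_mv, next_cost


def toy_cost(mv):
    """Toy distortion surface with its minimum at (3, -2)."""
    return (mv[0] - 3) ** 2 + (mv[1] + 2) ** 2


if __name__ == "__main__":
    print(small_diamond_search(toy_cost))  # ((3, -2), 0)
```

A search of this type evaluates only a handful of positions per step, which is where the time savings over an exhaustive search come from; the price is that it can stop in a local minimum on a real distortion surface.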

In [19], Liquan et al. proposed a fast mode decision algorithm that skips some depths. It saves about 21.5% of the encoding time with a slight bitrate increase and a negligible loss in coding efficiency. The algorithm proposed by Qin [20] uses the ECU algorithm with an adaptive MSE threshold value, saving time without degrading quality. Podder [21] also proposed an interesting software method to reduce the ME time: based on human visual features (HVF), it makes an efficient decision on the appropriate block partitioning mode. This work saves 41.44% of the execution time for SCVS video sequences. The work published in [22] presents a fast HEVC ME based on DS and three fast mode decisions: the ECU, ESD, and CBF modes. Simulation results show a 56.75% reduction in HEVC complexity in terms of execution time, accompanied by a slight degradation in video quality and bitrate, compared with HM 16.2 running on an Intel® Core™ i7-3770 @ 3.4 GHz processor. However, the authors of [22] tested just one sequence from each class with just two quantization parameters (QPs), QP = 22 and 37, to evaluate the fast modes.

By analyzing all these previous works, we note that fast mode decision algorithms are an interesting technique for reducing the HEVC computational complexity.

The median computation is done via the following equation:

$$\mathrm{Median}(A, B, C) = A + B + C - \mathrm{Min}\big(A, \mathrm{Min}(B, C)\big) - \mathrm{Max}\big(A, \mathrm{Max}(B, C)\big) \tag{1}$$

**Figure 4.** *MV adjacent of a current PU.*

**3.2 Initial grid search**

The first search is performed by determining the search pattern and the "searchrange." As detailed in **Figure 5(a)** and **(b)**, the main goal of this stage is to localize the search window via a square or a diamond pattern. Thus, each of these two search patterns refers to eight points per round.

The distance corresponding to the minimum distortion point is saved in the "BestDistance" variable. Currently, the diamond search pattern is used by default, but the square search pattern can also be used by modifying the HEVC configuration file through the "Diamondsearch" variable.

**Figure 5.** *Diamond/square search pattern. (a) Diamond search pattern stride length equal to 4. (b) Square search pattern stride length equal to 4.*

*DOI: http://dx.doi.org/10.5772/intechopen.86792*
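The median predictor of Eq. (1) and the initial grid search can be sketched together as follows. This is an illustration under stated assumptions, not the HM implementation: the median is applied per motion vector component, the stride doubles from 1 up to the search range, and the point offsets are modeled on the usual 8-point diamond and square layouts. All function names and the `cost` callback are hypothetical; only the roles of "searchrange," "BestDistance," and "Diamondsearch" come from the text.

```python
def median3(a, b, c):
    """Eq. (1): median of three values as sum minus minimum minus maximum."""
    return a + b + c - min(a, min(b, c)) - max(a, max(b, c))


def median_mv(mv_a, mv_b, mv_c):
    """Component-wise median of three neighboring motion vectors (assumption)."""
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))


def diamond_points(d):
    """Diamond offsets at stride d (four points when d == 1, else eight)."""
    if d == 1:
        return [(0, -1), (-1, 0), (1, 0), (0, 1)]
    h = d // 2
    return [(0, -d), (-h, -h), (h, -h), (-d, 0), (d, 0), (-h, h), (h, h), (0, d)]


def square_points(d):
    """Eight square offsets at stride d."""
    return [(-d, -d), (0, -d), (d, -d), (-d, 0), (d, 0), (-d, d), (0, d), (d, d)]


def initial_grid_search(cost, center, search_range, use_diamond=True):
    """Test the pattern at strides 1, 2, 4, ... up to search_range.

    Returns (best_mv, best_cost, best_distance), where best_distance plays
    the role of the "BestDistance" variable (0 if the center stays best).
    """
    pattern = diamond_points if use_diamond else square_points
    best_mv, best_cost, best_distance = center, cost(center), 0
    d = 1
    while d <= search_range:
        for dx, dy in pattern(d):
            mv = (center[0] + dx, center[1] + dy)
            c = cost(mv)
            if c < best_cost:
                best_mv, best_cost, best_distance = mv, c, d
        d *= 2
    return best_mv, best_cost, best_distance


def toy_cost(mv):
    """Toy distortion with its minimum at (4, 0)."""
    return abs(mv[0] - 4) + abs(mv[1])


if __name__ == "__main__":
    print(median_mv((1, 4), (5, -2), (3, 0)))                          # (3, 0)
    print(initial_grid_search(toy_cost, (0, 0), 8))                    # ((4, 0), 0, 4)
    print(initial_grid_search(toy_cost, (0, 0), 8, use_diamond=False)) # ((4, 0), 0, 4)
```

In this toy case both patterns locate the minimum at stride 4, the stride length shown in Figure 5.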
