**Abstract**

The high compression performance of the High Efficiency Video Coding (HEVC) standard makes it well suited to high-definition video. This performance, however, comes with a much higher encoding complexity than that of the earlier H.264 codec. The complexity is mainly due to the motion estimation (ME) module, which accounts for a large share of the encoding time, and much research has therefore focused on optimizing this module. Some works propose hardware solutions that exploit the parallel processing of FPGAs, GPUs, or other multicore architectures, while others focus on software optimizations that introduce fast mode decision algorithms. In this context, this article proposes a fast HEVC encoder configuration to speed up the encoding process. The fast configuration combines several options: the early skip detection (ESD), the early CU termination (ECU), and the coded block flag (CBF) fast method (CFM) modes. For ME, the diamond search (DS) algorithm is used, and the configuration is evaluated on several video resolutions. A time saving of around 46.75% is obtained, with an acceptable loss in video quality and bitrate, compared to the reference test model HM16.2. Our contribution is also compared with other works for better evaluation.

**Keywords:** HEVC, motion estimation, early skip detection (ESD), early CU termination (ECU), coded block flag (CBF) fast method (CFM)

### **1. Introduction**

The rapid development of multimedia technology and network communications has made ultrahigh-definition (UHD) and high-definition (HD) video content widely used in daily life. This rapid move to high video resolutions, which raises problems of memory storage cost and transmission bandwidth, gave birth to the new High Efficiency Video Coding (HEVC) standard [1, 2]. HEVC was developed in 2013 by the Joint Collaborative Team on Video Coding (JCT-VC), formed by the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG). It was designed to cope with the enormous amount of UHD video content. Compared to the earlier H.264/AVC standard [3], and at identical visual quality, HEVC achieves a high encoding performance, with a bitrate reduction reaching 50% [4]. This encoding performance comes at the cost of a large computational complexity. Motion estimation (ME) represents a large part of the encoding process and occupies around 70% of the total inter prediction time, as Jungho [5] indicates in **Figure 1**.

This large consumption is principally due to the new block coding hierarchy based on coding tree units (CTUs). This concept is analogous to the macroblocks of earlier compression standards. Each picture frame is divided into square blocks, called coding units (CUs) [6], which have a maximum size of 64 × 64 and are recursively subdivided down to 8 × 8 blocks. Prediction and transform units (PUs and TUs) are defined within each CU, where the PU is the principal unit in the ME process.

**Figure 1.** *Encoding time distribution.*

**Figure 2** shows the CTU tree structure in the HEVC standard, where LCU denotes the largest coding unit and SCU the smallest coding unit.

**Figure 2.** *CTU tree structure in the HEVC standard.*

Reducing the time required by the search algorithm directly reduces the ME computational complexity. Furthermore, fast mode decision algorithms based on early termination also reduce the ME computational complexity, which in turn reduces the overall HEVC execution time.

It is within this context that this article presents a fast encoding algorithm based principally on the early skip detection (ESD), the coded block flag (CBF) fast method (CFM), and the early CU termination (ECU) modes [7–9] to decrease the HEVC encoding complexity.

The remainder of this paper is structured as follows. The next section reviews some works on fast motion estimation algorithms for HEVC. Section 3 provides an overview of the motion estimation algorithm. Section 4 presents the proposed fast configuration for the HEVC encoder. Experimental results for the fast HEVC configuration, compared with those obtained with the original HM16.2 reference software [10], are discussed in Section 5. Finally, Section 6 gives conclusions and some prospects.

*Fast Motion Estimation's Configuration Using Diamond Pattern and ECU, CFM,… DOI: http://dx.doi.org/10.5772/intechopen.86792*

### **2. Related works**

Aiming to reduce the HEVC encoder complexity, several works have proposed ways to reduce the cost of the test zonal search (TZS) motion estimation algorithm. Some works are interested in hardware solutions, and others focus on software optimizations.

In [11], two hardware diamond architectures for HEVC video coding are proposed, using sequential and parallel techniques. These architectures achieve full HD encoding at 30 frames per second on a Virtex-7 field programmable gate array (FPGA) design.

The authors of [12] proposed a hardware parallel sum of absolute differences (SAD) design for gray-scale images to reduce the motion estimation time for a block size of 4 × 4 pixels. A multiplier is exploited for addition as a partial product reduction (PPR). Results obtained on a Xilinx Virtex-2 FPGA show that the maximum frequency obtained is 133.2 MHz for the 4 × 4 block size. Nalluri et al. [13, 14] proposed two other SAD architectures on a Xilinx Virtex FPGA, without and with parallelism. The proposed parallel architecture accelerates the SAD calculation by a factor of 3.9 compared to the serial SAD architecture. In [15], the authors proposed two implementations of the SAD and sum of squared differences (SSD) algorithms using an NVIDIA GeForce GTX480 with the CUDA language in order to reduce the ME run-time. The proposed architecture saved about 32% of the encoding time for class E video sequences, with nonsignificant degradation in the PSNR and the bitrate.

Regarding software solutions, Nalluri et al. [16] replaced the 8-point square and the 8-point diamond patterns with a 6-point hexagonal pattern in the TZS ME algorithm, saving 50% of the encoding time without degradation in bitrate and PSNR. To replace the TZS algorithm, the authors of [17, 18] proposed the small diamond pattern search (SDPS), large diamond pattern search (LDPS), and horizontal diamond search (HDS). Experiments using HM8.0 showed that these algorithms reduce the motion estimation calculation time by 49%, with a nonsignificant increase in bitrate and a slight degradation in video quality.
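To make the block-matching terms used above more concrete (the SAD cost and the small diamond pattern), the following is a minimal Python sketch of a greedy small-diamond search. It is not the HM implementation, nor the method of any cited work. The names `sad`, `diamond_search`, and `search_range`, as well as the choice to use only the 4-neighbour small diamond, are assumptions made for illustration.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grayscale frame stored as rows of pixel values

# Small diamond pattern: the four direct neighbours of the current centre.
# (Illustrative choice; real encoders combine several patterns.)
SMALL_DIAMOND = [(0, -1), (-1, 0), (1, 0), (0, 1)]


def sad(cur: Frame, ref: Frame, bx: int, by: int,
        dx: int, dy: int, n: int) -> int:
    """Sum of absolute differences between the n x n block at (bx, by) in
    `cur` and the block displaced by (dx, dy) in `ref`."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def diamond_search(cur: Frame, ref: Frame, bx: int, by: int, n: int,
                   search_range: int = 8) -> Tuple[Tuple[int, int], int]:
    """Greedy small-diamond search (hypothetical helper).

    Starting from the zero motion vector, move to the best of the four
    diamond neighbours until no neighbour lowers the SAD.
    Returns ((dx, dy), sad_cost)."""
    height, width = len(ref), len(ref[0])

    def in_bounds(dx: int, dy: int) -> bool:
        # Candidate must stay inside the search window and the frame.
        return (abs(dx) <= search_range and abs(dy) <= search_range
                and 0 <= bx + dx and bx + dx + n <= width
                and 0 <= by + dy and by + dy + n <= height)

    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    while True:
        centre = best
        for ox, oy in SMALL_DIAMOND:
            cand = (centre[0] + ox, centre[1] + oy)
            if not in_bounds(*cand):
                continue
            cost = sad(cur, ref, bx, by, cand[0], cand[1], n)
            if cost < best_cost:  # strict decrease guarantees termination
                best, best_cost = cand, cost
        if best == centre:  # no neighbour improved: local minimum reached
            return best, best_cost
```

Because each step only looks at direct neighbours, this search evaluates few candidates but can stop at a local minimum of the SAD surface. This is one reason why larger patterns, such as the large diamond or hexagonal patterns mentioned above, are often used for the first steps.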
