22 Video Compression

**4.3 H.264/AVC transcoding approach**

To provide fast and flexible transcoding at the H.264/AVC encoder side, two issues have to be studied: first, how the MVs generated during the SI process can help to reduce the time spent in ME; second, given that DVC and H.264/AVC can build different GOPs, how to map MVs between different GOP combinations in order to provide flexibility.

**4.3.1 Reducing motion estimation complexity**

Within the WZ decoding process, an important task is the SI generation stage, which is the first step in generating the WZ frames from the K frames. VISNET-II performs Motion Compensated Temporal Interpolation (MCTI) to estimate the SI. The first step of this method is shown in Figure 5 and consists of matching each forward frame MB with a backward frame MB inside the search area. The process checks all the possibilities inside the search area and chooses the MV that generates the lowest residual. Half of this MV gives the displacement of the interpolated MB (see Ascenso et al., 2005, for more details on the SI generation process).

Fig. 5. First step of SI generation process.

Obviously, MVs generated in the WZ decoding stage contain approximate information about the amount of motion in the frame. Following this idea, the present approach reuses these MVs to accelerate the H.264/AVC encoding stage by reducing the search area of the ME stage. Moreover, the reduction is adapted efficiently and dynamically to every combination of input DVC GOP and output H.264/AVC GOP. As shown in Figure 6, the search area for each MB is a circle whose radius depends on the incoming SI MV (*Rmv*). This search area can vary between a minimum (defined by *Rmin*) and a maximum (limited by the H.264/AVC search area). In particular, the radius varies with the type of frame and the distance to the reference frame, as explained in Section 4.3.2. A minimum area is enforced for two reasons: MVs are calculated for 16x16 MBs in the SI process, whereas H.264/AVC can work with partitions smaller than 16x16; and the SI is only an approximation of the frame, so some changes may occur when the frame is completely reconstructed. For these reasons, this minimum was set at 4 pixels.

**4.3.2 Mapping GOPs from DVC to H.264**

One desired feature of every transcoder is flexibility. To achieve it, one process that must be handled with care is GOP mapping. The second part of the transcoder therefore proposes a DVC to H.264/AVC conversion that supports every mapping combination while using techniques that reduce the time spent in the transcoding process. To extract MVs, the distance used to calculate the SI is considered first. For example, Figure 7 shows the transcoding process from a DVC GOP of length 4 to an H.264/AVC pattern IPPP (baseline profile). In step 1, DVC starts to decode the frame labeled WZ2, and the MVs generated in its SI generation are discarded because they are not closely correlated with the real motion (low accuracy). Once the WZ2 frame has been reconstructed through the entire WZ decoding algorithm (WZ'2), step 2 decodes frames WZ1 and WZ3 using the reconstructed frame WZ'2. At this point, the MVs V0-2 and V2-4 generated in this second iteration of the DVC decoding algorithm are stored. These MVs will be used to reduce the H.264/AVC ME process. For larger GOP sizes the procedure is the same: MVs are stored and reused when the distance between the SI and its two reference frames is 1. Finally, V0-2 and V2-4 are halved, because P frames have their reference frame at a distance of one whereas the MVs were calculated for a distance of two during the SI process.

Fig. 7. Mapping from DVC GOP of length 4 to H.264 GOP IPPP.
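The storage rule described above can be illustrated with a minimal sketch. It assumes a hierarchical decoding order in which each WZ frame is interpolated from the two already available frames at equal distance on either side, and a GOP length that is a power of two; the function names `si_steps`, `mvs_are_stored` and `halve_mv` are hypothetical and are not part of VISNET-II or the JM software.

```python
from typing import List, Tuple

Step = Tuple[int, int, int]   # (left reference, interpolated frame, right reference)
MV = Tuple[float, float]      # (horizontal, vertical) displacement in pixels


def si_steps(gop_len: int) -> List[Step]:
    """List the SI interpolation steps for one GOP, coarsest level first.

    Assumption: gop_len is a power of two and decoding is hierarchical.
    """
    if gop_len < 2 or gop_len & (gop_len - 1):
        raise ValueError("this sketch assumes a power-of-two GOP length >= 2")
    steps = []
    span = gop_len
    while span >= 2:
        half = span // 2
        for left in range(0, gop_len, span):
            steps.append((left, left + half, left + span))
        span = half
    return steps


def mvs_are_stored(step: Step) -> bool:
    """MVs are kept only when both references are at distance 1 from the SI."""
    left, target, right = step
    return target - left == 1 and right - target == 1


def halve_mv(mv: MV) -> MV:
    """Rescale an MV computed over a distance of 2 to a P-frame distance of 1."""
    return (mv[0] / 2, mv[1] / 2)


if __name__ == "__main__":
    for step in si_steps(4):
        action = "store" if mvs_are_stored(step) else "discard"
        print(step, action)
    # Illustrative vector only, not taken from any sequence.
    print(halve_mv((6.0, -4.0)))
```

For a GOP of length 4 the sketch lists the step that interpolates frame 2 from frames 0 and 4 as discarded, and the two steps that interpolate frames 1 and 3 as stored, which corresponds to V0-2 and V2-4 in Figure 7.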

Mobile Video Communications Based on Fast DVC to H.264 Transcoding 25

During the decoding process, the MVs generated by the SI generation stage are sent to the H.264/AVC encoder, so passing them on does not involve any increase in complexity. In the second stage, the transcoder performs a mapping from every DVC GOP to every H.264/AVC GOP using QP = 28, 32, 36 and 40. In our experiments we have chosen different H.264/AVC patterns in order to analyze the behavior for the baseline profile (IPPP GOP) and the main profile (IBBP pattern). These patterns were transcoded by the reference and the proposed transcoder. The H.264/AVC reference software used in the simulations was the JM reference software (version 17.1). As mentioned in the introduction, the framework described is focused on communications between mobile devices; therefore, a low complexity configuration must be employed. For this reason, we have used the default configuration for the H.264/AVC main and baseline profiles, only turning off the RD Optimization. The reference transcoder is composed of the whole DVC decoder followed by the whole H.264/AVC encoder. To analyze the performance of the proposed transcoder in detail, we evaluate its two halves separately and also present global results.

Furthermore, the performance of the proposed DVC parallel decoding is shown in Table 1 (for 15 and 30 fps sequences). PSNR and bitrate (BR) denote the quality and bitrate measured for the reference WZ decoding. To calculate the PSNR difference, the PSNR of each sequence was estimated before transcoding starts and after transcoding finishes. Then the PSNR of the proposed transcoding was subtracted from the reference one for each H.264/AVC RD point, as defined by Equation 3. However, Table 1 does not include results for ΔPSNR because the quality obtained by DVC parallel decoding is the same as that of the reference decoding, since it iterates until a given threshold is reached (Brites et al., 2008).

$$\Delta PSNR\,(\mathrm{dB}) = PSNR_{reference} - PSNR_{proposed} \tag{3}$$

Equation 4 was applied to calculate the bitrate increment (ΔBR) between the reference and proposed DVC decoders as a percentage. Thus a positive increment means that a higher bitrate is generated by the proposed transcoder. As the results of Table 1 show, when DVC decodes smaller and less complex parts, the turbo decoder (as part of the DVC decoder) sometimes converges faster, with fewer iterations, which implies that fewer parity bits are requested and thus the bitrate is reduced. However, generally speaking, the turbo codec performs better for longer inputs. For this reason, the bitrate increment is not consistently positive or negative. Comparing different GOP lengths, in short GOPs most of the bitrate is generated by the K frames. When the GOP length increases, the number of K frames is reduced, and the WZ frames then contribute to reducing the global bitrate in low motion sequences (like Hall) or increasing it in high motion sequences (Foreman or Soccer). Generally, decoding smaller pieces of a frame (in parallel) works better for high motion sequences, where the bitrate is similar or even lower in some cases.

$$\Delta BR\,(\%) = 100 \times \frac{BR_{proposed} - BR_{reference}}{BR_{reference}} \tag{4}$$

Concerning the time reduction (TR), it was estimated as a percentage by using Equation 5. In this case, a negative time reduction means decoding time saved by the proposed DVC decoding. As shown in Table 1, DVC decoding time is reduced by up to 70% on average. TR is similar for different GOP lengths, but it works better for more complex sequences.
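The evaluation metrics can be written as a short sketch that follows the sign conventions stated in the text: ΔPSNR is the reference PSNR minus the proposed one, and a positive ΔBR means that the proposed system spends more bits. Equation 5 is not reproduced in this section, so the form used for TR below is an assumption chosen only to match the stated convention that negative values mean time saved. The function names are hypothetical, and the numbers in the example are illustrative, not values from Table 1.

```python
def delta_psnr_db(psnr_reference: float, psnr_proposed: float) -> float:
    """PSNR difference in dB: reference minus proposed (Equation 3)."""
    return psnr_reference - psnr_proposed


def delta_bitrate_pct(br_reference: float, br_proposed: float) -> float:
    """Bitrate increment in percent (Equation 4).

    Positive values mean the proposed system generates a higher bitrate.
    """
    return 100.0 * (br_proposed - br_reference) / br_reference


def time_reduction_pct(t_reference: float, t_proposed: float) -> float:
    """Time reduction in percent.

    ASSUMED form: Equation 5 is not shown here; this expression only mirrors
    the convention that negative values mean time saved by the proposal.
    """
    return 100.0 * (t_proposed - t_reference) / t_reference


if __name__ == "__main__":
    # Illustrative inputs only.
    print(delta_psnr_db(35.0, 35.0))
    print(delta_bitrate_pct(200.0, 190.0))
    print(time_reduction_pct(10.0, 3.0))
```

Keeping the reference value in the denominator of both percentages makes the results directly comparable across sequences and GOP lengths.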

For more complex patterns, which mix P and B frames (main profile), this method can be extended in a similar way with some changes. Figure 8 shows the transcoding from a DVC GOP of length 4 to an H.264/AVC pattern IBBP. MVs are stored following the same procedure as before; what changes in this case is the way they are applied in H.264/AVC.

Fig. 8. Mapping from DVC GOP of length 4 to H.264 GOP IBBP.

For P frames, MVs are multiplied by a factor of 1.5, because the MVs were calculated for a distance of 2 whereas P frames have their references at a distance of 3. For B frames, the factor depends on the position of the frame within the pattern, and it differs between the backward and forward searches.
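The scaling rule can be sketched as a ratio of temporal distances, followed by the search-area rule of Section 4.3.1. Several points are assumptions rather than statements from the text: the radius *Rmv* is taken to be the Euclidean length of the scaled MV, the upper limit `r_max` of 32 pixels is a placeholder for the configured H.264/AVC search range, and the function names are hypothetical. The B-frame distances are not listed in this section, so they are left to the caller.

```python
import math
from typing import Tuple

MV = Tuple[float, float]  # (horizontal, vertical) displacement in pixels


def scale_mv(mv: MV, si_distance: int, ref_distance: int) -> MV:
    """Rescale an MV from the distance used in SI generation to the
    distance between an H.264/AVC frame and its reference."""
    factor = ref_distance / si_distance
    return (mv[0] * factor, mv[1] * factor)


def search_radius(mv: MV, r_min: float = 4.0, r_max: float = 32.0) -> float:
    """Search radius bounded below by Rmin and above by the encoder range.

    ASSUMPTIONS: Rmv is the Euclidean length of the MV, and r_max=32 is a
    placeholder for the encoder's configured search range. r_min=4 pixels
    is the minimum stated in Section 4.3.1.
    """
    r_mv = math.hypot(mv[0], mv[1])
    return min(max(r_mv, r_min), r_max)


if __name__ == "__main__":
    stored = (4.0, -2.0)                 # illustrative SI MV over distance 2
    p_ibbp = scale_mv(stored, 2, 3)      # P frame in IBBP: factor 1.5
    p_ippp = scale_mv(stored, 2, 1)      # P frame in IPPP: factor 0.5
    print(p_ibbp, search_radius(p_ibbp))
    print(p_ippp, search_radius(p_ippp))
```

Expressing the factor as a ratio covers both the halving used for IPPP and the factor of 1.5 used for P frames in IBBP with a single function.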

As can be observed, this procedure can be applied to both K and WZ frames. Therefore, following this method the proposed transcoder can be used for transcoding from every DVC GOP to every H.264/AVC GOP.
