**1.3.6 Slice-level parallelism**

In slice-level parallelism (Fig. 13), each frame is divided into several independent slices, so that macroblocks from different slices can be processed completely independently. In the H.264/AVC standard, a maximum of sixteen slices are allowed in each frame. This approach exploits parallelism at a finer granularity, which suits, for example, multicore computers. In H.264, as in most current hybrid video coding standards, each picture is partitioned into one or more slices. Slices were introduced to make the encoded bitstream more robust to network transmission errors and losses.
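As a rough illustration of this partitioning, the following minimal Python sketch splits the macroblocks of a frame into contiguous slices of nearly equal size in raster-scan order. The function name `partition_macroblocks` and the 1920x1088 frame size are assumptions made for the example; a real encoder is free to place slice boundaries differently.

```python
def partition_macroblocks(total_mbs: int, num_slices: int) -> list[range]:
    """Split macroblock indices 0..total_mbs-1 into contiguous slices.

    Slices follow raster-scan order and differ in size by at most one
    macroblock. This is an illustrative policy, not one mandated by H.264.
    """
    if not 1 <= num_slices <= total_mbs:
        raise ValueError("num_slices must be between 1 and total_mbs")
    base, extra = divmod(total_mbs, num_slices)
    slices = []
    start = 0
    for i in range(num_slices):
        # The first `extra` slices take one additional macroblock.
        size = base + (1 if i < extra else 0)
        slices.append(range(start, start + size))
        start += size
    return slices


# Assumed example: a 1920x1088 frame holds 120 x 68 macroblocks of 16x16 pixels.
slices = partition_macroblocks(120 * 68, 16)
```

With sixteen slices, each slice in this example covers 510 of the 8160 macroblocks.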

To achieve this robustness, the slices in a frame must be completely independent of each other: no content of a slice is used to predict elements of other slices in the same frame, and the search area of a dependent frame cannot cross the slice boundary [10, 16]. Although slice support was designed for error resilience, it can also be used to exploit thread-level parallelism (TLP), because the slices in a frame can be encoded or decoded in parallel. The main advantage of slices is that they can be processed in parallel without dependency or ordering constraints. This allows slice-level parallelism to be exploited (Rodriguez 2006) without significant changes to the code.
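Because slices share no data within a frame, the parallel structure can be sketched as a simple map over the slices. The example below is a minimal sketch under stated assumptions: `decode_slice` is a hypothetical stand-in for real slice decoding, and `decode_frame` and the worker count are illustrative names and values, not part of any codec API.

```python
from concurrent.futures import ThreadPoolExecutor


def decode_slice(slice_mbs: list[int]) -> list[int]:
    """Hypothetical stand-in for slice decoding.

    It reads only the macroblocks of its own slice, mirroring the
    independence property described in the text.
    """
    return [mb * 2 for mb in slice_mbs]


def decode_frame(slices: list[list[int]], workers: int = 4) -> list[int]:
    """Decode all slices of one frame concurrently and reassemble the frame."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in slice order, so no extra
        # synchronization or reordering is required.
        decoded = list(pool.map(decode_slice, slices))
    return [mb for decoded_slice in decoded for mb in decoded_slice]


frame = decode_frame([[1, 2], [3], [4, 5, 6]])
```

In CPython, threads show the structure rather than deliver a real speedup for CPU-bound work; a process pool or a native implementation would be needed for that.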

However, exploiting TLP at the slice level has some disadvantages. The first is that the number of slices per frame (at most sixteen in the H.264 standard) is determined by the encoder, which poses a scalability problem for parallelization at the decoder. If the decoder has no control over what the encoder does, it may receive sequences with few slices (or only one) per frame, which leaves little opportunity for parallelization. The second disadvantage is that in H.264 the encoder can decide that the deblocking filter has to be applied across slice boundaries, which greatly reduces the speedup achieved by slice-level parallelism. Another problem is load balancing: slices are created with the same number of MBs, which can cause an imbalance at the decoder because, depending on their content, some slices are decoded faster than others.

Fig. 13. Slice-level Parallelism

With GOP-level parallelism, PSNR and bit rate do not change, and the approach is easy to implement, since the independence of GOPs is assured with minimal changes to the code. However, memory consumption increases significantly, since each encoder must have its own Decoded Picture Buffer (DPB), where all of the GOP's references are stored. Moreover, real-time encoding is hard to achieve with this approach, which makes it more suitable for video storage.

The main disadvantage of frame-level parallelism is that, unlike in previous video standards, B frames in H.264 can be used as references [24]. If the decoder wants to exploit frame-level parallelism, the encoder therefore cannot use B frames as references. This might increase the bit rate but, more importantly, encoding and decoding are usually completely separated, and there is no way for a decoder to enforce its preferences on the encoder.
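The two slice-level effects noted for the decoder, load imbalance and deblocking across slice boundaries, can be illustrated with a small back-of-the-envelope model. This is a sketch under simplifying assumptions, not a measurement: it assumes one worker per slice, treats cross-boundary deblocking as a purely serial stage, and uses made-up cost figures. The name `slice_speedup` is hypothetical.

```python
def slice_speedup(slice_costs: list[float], serial_cost: float = 0.0) -> float:
    """Upper bound on speedup with one worker per slice.

    slice_costs: decode time of each slice (arbitrary units).
    serial_cost: work assumed not to parallelize, e.g. a deblocking
                 pass that crosses slice boundaries (a simplification).
    """
    if not slice_costs:
        raise ValueError("at least one slice is required")
    sequential = sum(slice_costs) + serial_cost
    # The parallel phase lasts as long as the slowest slice.
    parallel = max(slice_costs) + serial_cost
    return sequential / parallel


# Illustrative, made-up costs for four slices with equal MB counts.
balanced = slice_speedup([10.0, 10.0, 10.0, 10.0])      # 40 / 10
skewed = slice_speedup([25.0, 5.0, 5.0, 5.0])           # 40 / 25
with_filter = slice_speedup([10.0] * 4, serial_cost=10.0)  # 50 / 20
```

Under these assumptions, four perfectly balanced slices give a speedup of 4, a single expensive slice cuts it to 1.6, and a serial deblocking stage equal to one slice's cost cuts it to 2.5.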
