*3.4.1 Proposed architecture and performance of the DAG algorithm applied to ME blocks in OVP*

## **How to compute the execution time?**

To calculate the processing time of an instruction, note that the message transfer time depends on the flow of instructions executed by a processor. Each instruction requires multiple clock cycles: the instruction is executed in as many steps as there are cycles. A sequential microprocessor starts the next instruction only when it has finished the previous one. With instruction-level parallelism, the microprocessor can process several of these steps at the same time for several different instructions, because different internal resources are mobilized. In other words, the processor executes instructions both in parallel and sequentially, each at a different stage of completion. This execution queue is called a pipeline. The mechanism was first implemented by IBM in the 1960s. **Figure 10** describes the canonical example of this type of pipeline, in the form of a five-stage RISC processor. Sequencing instructions in a processor with a 5-stage pipeline requires 9 cycles to execute 5 instructions. At t = 5, all stages of the pipeline are busy, and 5 operations occur simultaneously.
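The cycle count above (9 cycles for 5 instructions on a 5-stage pipeline) follows from the classic formula cycles = stages + instructions − 1, which holds for an ideal pipeline with no stalls. A minimal sketch (the function name is ours, not from the chapter):

```python
# Minimal sketch: total cycle count of an ideal k-stage pipeline with no
# stalls or hazards, executing n instructions back to back.
def pipeline_cycles(n_instructions: int, n_stages: int) -> int:
    """The first instruction fills the pipeline (n_stages cycles);
    each following instruction completes one cycle later."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1

# The 5-stage RISC example of Figure 10: 5 instructions take 9 cycles.
print(pipeline_cycles(5, 5))  # -> 9
```

A real pipeline takes longer than this ideal count whenever hazards or branch mispredictions force stall cycles.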

## **How to compute the transfer time?**

Assume that the data transfer time is proportional to the size of the data exchanged between processors; this transfer time is the time needed to send task data between processors.


#### **Figure 10.**

*The canonical example of this type of pipeline is that of a RISC processor (5 stages).*

*Approximation Algorithm for Scheduling a Chain of Tasks for Motion Estimation… DOI: http://dx.doi.org/10.5772/intechopen.97676*

Let "Tt" denote the transfer time, "Td" the data size, and K(p1, p2) the transfer duration per unit of data between processors p1 and p2. Tt is defined by Eq. (11):

$$\text{Tt} = \text{Td} \times K(p1, p2) \tag{11}$$
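Eq. (11) can be sketched directly; the per-unit link durations below are made-up illustrative values, not data from the chapter:

```python
# Illustrative sketch of Eq. (11): Tt = Td * K(p1, p2).
# The table of per-unit transfer durations is a hypothetical example.
def transfer_time(td: float, p1: str, p2: str, k: dict) -> float:
    """Transfer time is proportional to the data size Td,
    scaled by the link factor K(p1, p2)."""
    return td * k[(p1, p2)]

k = {("p1", "p2"): 0.5}  # hypothetical duration per unit of data
print(transfer_time(1024, "p1", "p2", k))  # 1024 * 0.5 = 512.0
```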

The execution times of the instructions and of the message transfers between processors, for the tasks of the ME block, are presented in **Table 3**.

### **Virtual prototyping with OVP environment**

We build a virtual prototype of the project in the OVP simulator. The strategies for creating a project and running simulations in OVP are illustrated in [23]. The co-simulation is described in **Figure 11**, which shows the implementation of the prototype in OVP (with the thread method on the CPUs).
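The "thread method on the CPUs" can be illustrated with a minimal sketch in which each simulated CPU runs its assigned tasks in its own thread. This mimics only the structure of the co-simulation; it does not use the actual OVP API, and the task names and mapping are hypothetical:

```python
# Hedged sketch: one thread per simulated CPU, each processing its own
# task list. Structure only; not the OVP simulator API.
import threading

def run_cpu(name, tasks, results):
    # Each CPU processes its task list sequentially and records the result.
    results[name] = [f"{name}:{t}" for t in tasks]

results = {}
mapping = {"cpu0": ["ME_level1"], "cpu1": ["ME_level2"]}  # hypothetical mapping
threads = [threading.Thread(target=run_cpu, args=(c, ts, results))
           for c, ts in mapping.items()]
for th in threads:
    th.start()
for th in threads:
    th.join()
print(results)  # each CPU reports the tasks it executed
```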

### **Affinity scheduling granularity of the ME algorithm**

The following diagram shows the flow chart of the (16 × 16) mode for video coding in H.264/AVC. The difference from the flowchart of the H.264/AVC standard is the interpolation technique, used instead of the affinity method with granularities of 1/2, 1/4 and 1/8 pixel. Using the interpolation technique minimizes the number of jobs and the number of levels in the task graphs. Thereafter, the execution time of the ME block is optimized for the HEVC/H.265 video codec compared with the reported times of the older H.264/AVC standard. Thus, we obtain a one-level TPG-DAG graph. Both graphs improve accuracy and frame quality. In this section, we account for the communication times between the different tasks, whether independent tasks run on the same processor or on different processors, as shown in **Figure 12**. We define the communication delays between tasks and the lengths of the scheduled tasks. **Figure 12** presents the partitioning and scheduling algorithm for the ME block in the HEVC/H.265 video codec, together with the communications and mappings for the different processors.

**Table 4** shows the test video sequences with the theoretical results. We can see that the theoretical execution time is close to the practical execution time observed in the OVP co-simulations. We also see that our scheduling and partitioning algorithm is optimal within the approximation and modeling formalisms considered.


**Table 3.**

*The results of scheduling and partitioning levels of the model for test sequence 1.*

**Figure 11.**

*Prototype and implementation in OVP.*
