**Algorithm 1**. Stabilized Time-Average Penalty Function Minimization

```
Initialize:
 1: t ← 0;
 2: Q[t] ← 0;
 3: Decision Action: ∀α[t] ∈ A;
Time-Average Penalty Function Minimization subject to Stability:
 4: while t ≤ T do                    // T: operation time
 5:   Observe Q[t];
 6:   𝒯* ← ∞;
 7:   for α[t] ∈ A do
 8:     𝒯 ← V·P(α[t]) + Q[t]·(a(α[t]) − b(α[t]));
 9:     if 𝒯 ≤ 𝒯* then
10:       𝒯* ← 𝒯;
11:       α*[t+1] ← α[t];
12:     end if
13:   end for
14: end while
```
Finally, the dynamic control action decision-making *α*[*t*] in each unit time *t* for time-average penalty function *P*(*α*[*t*]) minimization subject to queue stability can be formulated as follows based on the Lyapunov optimization theory:

$$\alpha^\*[t+1] \leftarrow \arg\min\_{\alpha[t] \in \mathcal{A}} \left[ V \cdot P(\alpha[t]) + Q[t] \cdot \left( a(\alpha[t]) - b(\alpha[t]) \right) \right] \tag{11}$$

where *A* is the set of all possible control actions and *α*\*[*t* + 1] is the optimal control action decision-making for the next time slot.
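As a concrete illustration, the decision rule (11) can be sketched in a few lines of Python. The penalty function `P`, the arrival and departure rates `a_rate` and `b_rate`, and the numeric values below are hypothetical placeholders, not part of the original formulation:

```python
# Drift-plus-penalty decision rule of Eq. (11): pick the action that
# minimizes V * P(alpha) + Q * (a(alpha) - b(alpha)).

def decide(actions, P, a_rate, b_rate, Q, V):
    """Return the action minimizing the drift-plus-penalty expression."""
    return min(actions,
               key=lambda alpha: V * P(alpha) + Q * (a_rate(alpha) - b_rate(alpha)))

# Toy example: three actions with made-up penalty/arrival/departure values.
P      = {0: 1.0, 1: 2.0, 2: 4.0}.__getitem__   # penalty grows with action index
a_rate = {0: 3.0, 1: 3.0, 2: 3.0}.__getitem__   # arrivals are not controllable here
b_rate = {0: 1.0, 1: 3.0, 2: 6.0}.__getitem__   # departures grow with action index

# With an empty queue the cheapest-penalty action wins; with a huge
# backlog the fastest-departure action wins.
print(decide([0, 1, 2], P, a_rate, b_rate, Q=0.0, V=1.0))  # -> 0
print(decide([0, 1, 2], P, a_rate, b_rate, Q=1e9, V=1.0))  # -> 2
```

The two prints preview the limiting cases analyzed next.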

To verify that (11) works correctly, the following two example cases can be considered:

• *Case 1:* Suppose *Q*[*t*] ≈ ∞. Then

$$\alpha^\*[t+1] \leftarrow \arg\min\_{\alpha[t] \in \mathcal{A}} \left[ V \cdot P(\alpha[t]) + Q[t] \cdot \left( a(\alpha[t]) - b(\alpha[t]) \right) \right] \tag{12}$$

$$\approx \arg\min\_{\alpha[t] \in \mathcal{A}} \left[ a(\alpha[t]) - b(\alpha[t]) \right]. \tag{13}$$


Then, (13) shows that the control action decision-making works as follows: (i) the arrival process is minimized, and (ii) the departure process is maximized. Both behaviors stabilize the queue, which is exactly what is needed when *Q*[*t*] ≈ ∞.

• *Case 2:* Suppose *Q*[*t*] = 0. Then

$$\alpha^\*[t+1] \leftarrow \arg\min\_{\alpha[t] \in \mathcal{A}} \left[ V \cdot P(\alpha[t]) + Q[t] \cdot \left( a(\alpha[t]) - b(\alpha[t]) \right) \right] \tag{14}$$

$$= \arg\min\_{\alpha[t] \in \mathcal{A}} V \cdot P(\alpha[t]). \tag{15}$$

Then, (15) shows that the control action decision-making works to minimize the given penalty function. This is semantically reasonable: the algorithm can focus on the main objective because stability does not need to be considered when *Q*[*t*] = 0.

The pseudo-code of the proposed time-average penalty function minimization algorithm is presented in Algorithm 1. From line 1 to line 3, all variables and parameters are initialized. The algorithm runs in each unit time, as shown in line 4. In line 5, the current queue-backlog *Q*[*t*] is observed for use in (11). From line 7 to line 13, the main computation procedure for (11) is described.
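Algorithm 1 can also be sketched as a small simulation. The queue update `Q[t+1] = max(Q[t] + a − b, 0)` is the standard Lyapunov queueing model; the two-action system and all numbers below are illustrative assumptions:

```python
# Minimal simulation of Algorithm 1 with the standard queue dynamics
# Q[t+1] = max(Q[t] + a(alpha) - b(alpha), 0).

def algorithm1(actions, P, a_rate, b_rate, V, T):
    Q, history = 0.0, []
    for t in range(T):                        # line 4: while t <= T
        best, best_val = None, float("inf")   # line 6: T* <- infinity
        for alpha in actions:                 # lines 7-13: scan the action set
            val = V * P[alpha] + Q * (a_rate[alpha] - b_rate[alpha])
            if val <= best_val:               # lines 9-12: keep the minimizer
                best, best_val = alpha, val
        history.append((Q, best))
        # Apply the chosen action and update the queue backlog.
        Q = max(Q + a_rate[best] - b_rate[best], 0.0)
    return history

# Two actions: idle (no penalty, no service) vs. serve (penalty 1, service 2).
hist = algorithm1(actions=[0, 1],
                  P={0: 0.0, 1: 1.0},
                  a_rate={0: 1.0, 1: 1.0},    # one arrival per slot
                  b_rate={0: 0.0, 1: 2.0},    # service only when active
                  V=5.0, T=20)
print(max(q for q, _ in hist))  # -> 3.0 (the backlog stays bounded)
```

The algorithm idles while the backlog is small and serves once the queue term dominates the penalty term, keeping the queue bounded.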

Up to now, the time-average penalty function minimization has been considered. Based on the same theory, the dynamic control action decision-making *α*[*t*] in each unit time *t* for time-average utility function *U*(*α*[*t*]) maximization subject to queue stability can be formulated as follows:

*Dynamic Decision-Making for Stabilized Deep Learning Software Platforms DOI: http://dx.doi.org/10.5772/intechopen.92971*

$$\alpha^\*[t+1] \leftarrow \arg\max\_{\alpha[t] \in \mathcal{A}} \left[ V \cdot U(\alpha[t]) - Q[t] \cdot \left( a(\alpha[t]) - b(\alpha[t]) \right) \right] \tag{16}$$

where *A* is the set of all possible control actions and *α*\*[*t* + 1] is the optimal control action decision-making for the next time slot.
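The utility-maximizing form (16) only flips the sign pattern of (11); a hypothetical one-line sketch, with `U`, `a_rate`, and `b_rate` again as placeholder callables:

```python
# Eq. (16): maximize V * U(alpha) - Q * (a(alpha) - b(alpha)).

def decide_utility(actions, U, a_rate, b_rate, Q, V):
    return max(actions,
               key=lambda alpha: V * U(alpha) - Q * (a_rate(alpha) - b_rate(alpha)))

# With an empty queue, the highest-utility action is chosen.
print(decide_utility([0, 1],
                     U={0: 1.0, 1: 3.0}.__getitem__,
                     a_rate={0: 1.0, 1: 1.0}.__getitem__,
                     b_rate={0: 1.0, 1: 1.0}.__getitem__,
                     Q=0.0, V=1.0))  # -> 1
```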

#### **2.2 Example: multicore scheduling in mobile devices**


*Advances and Applications in Deep Learning*


In this section, the Lyapunov optimization-based stabilized time-average optimization algorithm is introduced with one simple toy model. In this example, a dynamic core allocation decision-making algorithm is designed for time-average energy consumption minimization subject to queue stability.

As illustrated in **Figure 1**, a mobile smartphone has a processor equipped with multiple cores. For example, ARM big.LITTLE processors contain multiple little and big heterogeneous cores.

In this system, task events are generated when users generate events, denoted by *a*[*t*] in **Figure 1**. The events are placed in the task queue (i.e., *Q*[*t*] in **Figure 1**) and then processed by the multicore processor. If more cores are allocated to process the events from the queue, the processing is accelerated, which is beneficial in terms of queue stability. However, it is not good in terms of the main objective, i.e., energy consumption minimization. On the other hand, if fewer cores are allocated, the processing becomes slow, which is harmful in terms of queue stability but beneficial in terms of energy consumption minimization. Thus, a tradeoff can be observed between energy consumption minimization (the main objective) and stability, which confirms that a Lyapunov optimization-based algorithm can be used.

The dynamic core allocation decision-making *α*[*t*] in each unit time *t* for time-average energy consumption *E*(*α*[*t*]) minimization subject to queue stability can be formulated as follows based on (11):

$$\alpha^\*[t+1] \leftarrow \arg\min\_{\alpha[t] \in \mathcal{A}} \left[ V \cdot E(\alpha[t]) + Q[t] \cdot \left( a(\alpha[t]) - b(\alpha[t]) \right) \right] \tag{17}$$

where *A* is the set of all possible core allocation combinations and *α*\*[*t* + 1] is the optimal core allocation decision-making for the next time slot. Here, it is obvious that the arrival process is not controllable (i.i.d. random events); thus, it can be ignored. Then, the final form of the dynamic decision-making algorithm can be defined as follows:

**Figure 1.**
*Mobile devices with multicore processors.*

$$\alpha^\*[t+1] \leftarrow \arg\min\_{\alpha[t] \in \mathcal{A}} \left[ V \cdot E(\alpha[t]) - Q[t] \cdot b(\alpha[t]) \right]. \tag{18}$$
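Under illustrative assumptions (energy growing quadratically with the number of active cores, service rate growing linearly — neither model is from the original text), the rule (18) can be sketched as:

```python
# Eq. (18): alpha*[t+1] = argmin over core counts of V*E(alpha) - Q*b(alpha).
# The energy and service models below are illustrative assumptions only.

def allocate_cores(max_cores, Q, V):
    def energy(n):   # assumed: energy grows quadratically with active cores
        return float(n * n)
    def service(n):  # assumed: throughput grows linearly with active cores
        return 2.0 * n
    return min(range(max_cores + 1),
               key=lambda n: V * energy(n) - Q * service(n))

print(allocate_cores(max_cores=4, Q=0.0, V=1.0))    # empty queue -> 0 cores
print(allocate_cores(max_cores=4, Q=100.0, V=1.0))  # busy queue  -> 4 cores
```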

In order to check whether the derived Eq. (18) is correct, two example cases can be considered, i.e., (i) *Q*[*t*] ≈ ∞ and (ii) *Q*[*t*] = 0:

• *Busy queue case* (*Q*[*t*] ≈ ∞): in this case,

$$\alpha^\*[t+1] \leftarrow \arg\min\_{\alpha[t] \in \mathcal{A}} \left[ V \cdot E(\alpha[t]) - Q[t] \cdot b(\alpha[t]) \right], \tag{19}$$

$$\approx \arg\min\_{\alpha[t] \in \mathcal{A}} \left[ -b(\alpha[t]) \right] = \arg\max\_{\alpha[t] \in \mathcal{A}} b(\alpha[t]). \tag{20}$$

Thus, the departure process should be accelerated, i.e., more cores should be allocated. This is semantically true because fast processing of the events from the queue is desired if overflow situations happen.

• *Empty queue case* (*Q*[*t*] = 0): in this case,

$$\alpha^\*[t+1] \leftarrow \arg\min\_{\alpha[t] \in \mathcal{A}} \left[ V \cdot E(\alpha[t]) - Q[t] \cdot b(\alpha[t]) \right], \tag{21}$$

$$= \arg\min\_{\alpha[t] \in \mathcal{A}} V \cdot E(\alpha[t]). \tag{22}$$

Thus, fewer cores should be allocated for energy consumption minimization, which is the main objective. This is semantically true because the main objective should be pursued when the system is stable, i.e., *Q*[*t*] = 0.

As discussed with these examples, the proposed Lyapunov optimization-based dynamic core allocation decision-making algorithm works as desired.

#### **2.3 Discussions in stabilized control**

The proposed dynamic super-resolution model selection algorithm is beneficial in various aspects, as follows.

#### *2.3.1 Hardware/system-independent self-adaptation*

Suppose that the proposed algorithm is implemented in supercomputer-like high-performance computing machines. In this case, the processing is fast; thus, the queue-backlog is always low. Therefore, the system has more chances to focus on the main objective, i.e., penalty function minimization or utility function maximization. On the other hand, if the hardware itself is performance/resource-limited (e.g., mobile devices), the processing speed is also limited due to the low specifications of the processors. Thus, the queue can frequently be busy because the system may not be able to process much data from the queue even when it utilizes the fastest model. Therefore, the proposed algorithm is self-adaptive, i.e., it adapts to the given hardware/system specifications. It automatically adapts the models based on the given hardware/system; thus, it does not require a system engineer's trial-and-error tuning.

Furthermore, the proposed algorithm is reliable in the sense that the self-adaptation maximizes its *utility* while maintaining *stability*.

#### *2.3.2 Low-complexity operation*

As shown in Algorithm 1, the computation procedure iteratively solves the closed-form equations (11) and (16). Thus, the computational complexity of the proposed algorithm is polynomial time, i.e., *O*(*N*), where *N* is the number of the given control actions. Thus, it guarantees low-complexity operation.

#### **3. The use of Lyapunov optimization for deep learning platforms**

As explained, the Lyapunov optimization theory is a scalable, self-configurable, low-complexity framework which can be used in many applications. In this section, the use of Lyapunov optimization for deep learning and computing platforms is discussed in two different ways, i.e., departure process control (refer to Section 3.1) and arrival process control (refer to Section 3.2). Finally, the related performance evaluation results are presented (refer to Section 3.3).

#### **3.1 Lyapunov control over departure processes**

As illustrated in **Figure 2**, stabilized real-time computer vision platforms should be equipped with queues in order to handle bursty traffic. If the queue is busy or near overflow, the departure process should be accelerated; thus, the simplest model should be used to reduce the corresponding computation. On the other hand, if the queue is empty, deep learning computation accuracy can be improved with more sophisticated models because there is enough time to conduct the computation. Thus, multiple models are desired so that one can be selected depending on the queue backlog.

**Figure 2.**
*Lyapunov control over departure processes in real-time computer vision platforms for time-average learning accuracy maximization subject to queue stability.*

In **Figure 2**, multiple models exist, and it can be seen that the simplest model (i.e., the low-resolution model) conducts fast computation but presents low learning accuracy. On the other hand, the most sophisticated model (i.e., the high-resolution model) is good for accurate learning performance but introduces computation delays. Thus, a tradeoff exists between performance and delays.
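The queue-dependent model selection of Section 3.1 can be sketched as follows. The model names, accuracy values, and per-slot processing rates are hypothetical placeholders, and the score follows the utility form (16) with the uncontrollable arrival term dropped:

```python
# Departure-process control: pick a deep learning model depending on the
# current queue backlog, trading accuracy against processing speed.
# All numbers below are hypothetical placeholders.

MODELS = [
    # (name, accuracy U, frames processed per slot b)
    ("low-res",  0.70, 30.0),
    ("mid-res",  0.85, 12.0),
    ("high-res", 0.95,  4.0),
]

def select_model(Q, V):
    """Maximize V * accuracy + Q * throughput, as in the utility form (16)."""
    return max(MODELS, key=lambda m: V * m[1] + Q * m[2])[0]

print(select_model(Q=0.0, V=1.0))   # empty queue -> "high-res"
print(select_model(Q=10.0, V=1.0))  # busy queue  -> "low-res"
```

An empty queue leaves time for the accurate model, while a large backlog forces the fast low-resolution model, mirroring the behavior described for Figure 2.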
