3. Constrained distributionally robust optimization

In the constrained case, i.e., when $\mathcal{A}$ is a strict subset of $\mathbb{R}^n$, algorithms (10) and (13) present some drawbacks: the trajectory $a(t)$ may not be feasible, i.e., $a(t) \notin \mathcal{A}$, even when it starts in $\mathcal{A}$. In order to design feasible trajectories, projected gradient methods have been widely studied in the literature. However, a projection onto $\mathcal{A}$ at each time $t$ involves solving additional optimization problems, and the computation of the projected gradient adds extra complexity to the algorithm. We restrict our attention to the following constraints:

$$\mathcal{A} = \left\{ a \in \mathbb{R}^n \;\middle|\; a_l \in \left[ \underline{a}_l, \overline{a}_l \right],\; l \in \{1, \ldots, n\},\; \sum_{l=1}^n c_l a_l \le b \right\}.$$

We impose the following feasibility condition: $\underline{a}_l < \overline{a}_l$ for $l \in \{1, \ldots, n\}$, $c_l > 0$, and $\sum_{l=1}^n c_l \underline{a}_l < b$. Under this setting, the constraint set $\mathcal{A}$ is non-empty, convex and compact.

We propose a method to compute a constrained solution with full support (whenever it exists) that does not use the projection operator. Instead, we transform the domain: $[\underline{a}_l, \overline{a}_l] = \xi([0,1])$, where $\xi(x_l) = \overline{a}_l x_l + \underline{a}_l (1 - x_l)$. The map $\xi$ is one-to-one, with

$$x_l = \xi^{-1}(a_l) = \frac{a_l - \underline{a}_l}{\overline{a}_l - \underline{a}_l} \in [0, 1],$$

and the budget constraint becomes

$$\sum_{l=1}^n c_l \left(\overline{a}_l - \underline{a}_l\right) x_l \le b - \sum_{l=1}^n c_l \underline{a}_l =: \hat{b}.$$
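For concreteness, this change of variables can be checked numerically. The bounds, weights and budget below are illustrative assumptions, not values from the text:

```python
import numpy as np

# Sketch of the change of variables xi: [0,1]^n -> prod_l [a_lo_l, a_hi_l].
a_lo = np.array([0.5, 1.0, 0.2])   # lower bounds (underline a_l), assumed
a_hi = np.array([2.0, 3.0, 1.5])   # upper bounds (overline a_l), assumed
c = np.array([1.0, 0.5, 2.0])      # constraint weights c_l > 0, assumed
b = 5.0                            # budget, chosen so that sum_l c_l a_lo_l < b

def xi(x):
    """Map x in [0,1]^n to a in the box [a_lo, a_hi], componentwise."""
    return a_hi * x + a_lo * (1.0 - x)

def xi_inv(a):
    """Inverse map: x_l = (a_l - a_lo_l) / (a_hi_l - a_lo_l)."""
    return (a - a_lo) / (a_hi - a_lo)

b_hat = b - c @ a_lo               # transformed budget b_hat = b - sum_l c_l a_lo_l

x = np.array([0.3, 0.6, 0.1])
a = xi(x)
# The original budget constraint sum_l c_l a_l <= b is equivalent to
# sum_l c_l (a_hi_l - a_lo_l) x_l <= b_hat in the new variables:
assert np.allclose(c @ a, c @ ((a_hi - a_lo) * x) + c @ a_lo)
assert np.allclose(xi_inv(a), x)
```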

The algorithm (18) satisfies

$$\mathrm{regret}_T \le \frac{c_0}{T - t_0} \int_{t_0}^{T} e^{-\beta(s)}\, ds \le \delta, \tag{17}$$

which provides the announced convergence time bound. This completes the proof.

See Table 1 for detailed parametric functions on the bound $T_\delta$:

| Convergence rate | $\alpha(t)$ | $\beta(t)$ | Time-to-reach $T_\delta$ |
|---|---|---|---|
| Triple exponential $e^{-e^{e^t}}$ | $\alpha(t) = t + e^t$ | $\beta(t) = e^{e^t}$ | $c_0 \log\log\log\frac{c_0}{\delta}$ |
| Double exponential $e^{-e^t}$ | $\alpha(t) = t$ | $\beta(t) = e^t$ | $c_0 \log\log\frac{c_0}{\delta}$ |
| Exponential $e^{-t}$ | $\alpha(t) = 0$ | $\beta(t) = t$ | $c_0 \log\frac{c_0}{\delta}$ |
| Polynomial of order $k$ | $\alpha(t) = \log k - \log t$ | $\beta(t) = k \log t$ | $c_0^{1/k} / \delta^{1/k}$ |

Table 1. Convergence rate under different sets of functions.

Figure 2. Gradient ascent vs. risk-aware Bregman dynamics for $r = -1 + \sum_{k=1}^{2} \omega_k^2 a_k^2$.

$$\begin{cases} \dot{y} = \left[\nabla^2 g\right]^{-1} \nabla_a \mathbb{E}_m h(a, \omega) =: \hat{f}(a), \\ a_l := \overline{a}_l x_l + \underline{a}_l (1 - x_l), \\ x_l = \min\left(1, \dfrac{e^{y_l}}{\sum_{k=1}^n e^{y_k}} \cdot \dfrac{\hat{b}}{c_l \left(\overline{a}_l - \underline{a}_l\right)}\right), \\ l \in \{1, \dots, n\}. \end{cases} \tag{18}$$

The algorithm (18) generates a trajectory $a(t)$ that satisfies the constraint.

Algorithm 4. The constrained learning pseudocode is as follows:

1: procedure CONSTRAINED GRADIENT($a(0)$, $e$, $T$, $g$, $m$, $h$) ⊳ The constrained learning algorithm starting from $a(0)$ within $[0, T]$
2: $a \leftarrow a(0)$
3: while regret $> e$ and $t \le T$ do ⊳ We have the answer if regret is 0
4: Compute $a(t)$ solution of (18)
5: Compute regret
6: end while
7: return $a(t)$, regret$_t$ ⊳ get $a(t)$ and the regret
8: end procedure
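A minimal numerical sketch of Algorithm 4, using a forward Euler discretization of the dynamics (18), can be written as follows. The payoff, its gradient, the bounds and the step size are toy assumptions, and we take $g(y) = \frac{1}{2}\|y\|^2$ so that $[\nabla^2 g]^{-1}$ is the identity; by construction every iterate satisfies the box and budget constraints:

```python
import numpy as np

# Toy sketch of dynamics (18): Euler steps on y, with a(t) recovered through
# the softmax/min reparametrization. All constants below are illustrative.
n = 3
a_lo = np.array([0.1, 0.1, 0.1])
a_hi = np.array([1.0, 2.0, 1.5])
c = np.ones(n)
b = 2.5
b_hat = b - c @ a_lo

targets = np.array([0.8, 1.5, 0.4])    # assumed per-channel maximizer of E_m h

def f_hat(a):
    # With g(y) = 0.5 * ||y||^2, [grad^2 g]^{-1} is the identity, so f_hat
    # is just the payoff gradient; here we assume E_m h(a) = -||a - targets||^2.
    return -2.0 * (a - targets)

def a_of_y(y):
    w = np.exp(y - y.max())            # numerically stable softmax weights
    x = np.minimum(1.0, (w / w.sum()) * b_hat / (c * (a_hi - a_lo)))
    return a_hi * x + a_lo * (1.0 - x), x

y = np.zeros(n)
dt = 0.01
for _ in range(5000):
    a, x = a_of_y(y)
    y = y + dt * f_hat(a)

a, x = a_of_y(y)
# The trajectory stays feasible at every step: box and budget constraints hold.
assert np.all(a >= a_lo - 1e-9) and np.all(a <= a_hi + 1e-9)
assert c @ a <= b + 1e-9
```

Feasibility needs no projection here: since the softmax weights sum to one, $\sum_l c_l(\overline{a}_l - \underline{a}_l)x_l \le \hat{b}$ holds at every step.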
Proposition 7. If $\hat{b} \le \min_l c_l \left(\overline{a}_l - \underline{a}_l\right)$, then Algorithm (18) reduces to

$$\begin{cases} a_l := \overline{a}_l x_l + \underline{a}_l (1 - x_l), \\ \dot{x}_l = x_l \left[ \left\langle e_l, \hat{f}(a) \right\rangle - \dfrac{1}{\hat{b}} \displaystyle\sum_{k=1}^n \left\langle e_k, \hat{f}(a) \right\rangle x_k\, c_k \left( \overline{a}_k - \underline{a}_k \right) \right], \\ l \in \{1, \ldots, n\}. \end{cases} \tag{19}$$

Proof. It suffices to check that, for $\hat{b} \le \min_l c_l \left(\overline{a}_l - \underline{a}_l\right)$, the vector $z$ defined by $z_l = \frac{e^{y_l}}{\sum_{k=1}^n e^{y_k}}$ solves the replicator equation

$$
\dot{z}_l = z_l \left[ \dot{y}_l - \langle z, \dot{y} \rangle \right].
$$

Under this condition the minimum in (18) is never active, since $z_l \le 1$ implies $z_l \hat{b} / \left[c_l \left(\overline{a}_l - \underline{a}_l\right)\right] \le 1$. Thus, $x_l = \frac{e^{y_l}}{\sum_{k=1}^n e^{y_k}} \cdot \frac{\hat{b}}{c_l \left(\overline{a}_l - \underline{a}_l\right)}$ solves $\dot{x}_l = x_l \left[ \left\langle e_l, \hat{f}(a) \right\rangle - \frac{1}{\hat{b}} \sum_{k=1}^n \left\langle e_k, \hat{f}(a) \right\rangle x_k c_k \left( \overline{a}_k - \underline{a}_k \right) \right]$. This completes the proof.

Note that the dynamics of $x$ in Eq. (19) is a constrained replicator dynamics [8], which is widely used in evolutionary game dynamics. This observation establishes a relationship between optimization and game dynamics: the replicator dynamics is the gradient flow of the expected payoff under the simplex constraint.
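The invariance property behind this observation can be checked on a toy example with an assumed constant fitness vector: the replicator flow never leaves the simplex, so no projection is needed.

```python
import numpy as np

# Replicator dynamics z_dot_l = z_l * (f_l - <z, f>) on the simplex.
# The fitness vector below is an illustrative assumption.
f = np.array([1.0, 2.0, 0.5])

z = np.array([1 / 3, 1 / 3, 1 / 3])
dt = 0.01
for _ in range(2000):
    # sum_l z_dot_l = <z, f> - <z, f> * sum_l z_l = 0 on the simplex,
    # so the Euler iterate stays (numerically) a probability vector.
    z = z + dt * z * (f - z @ f)

assert abs(z.sum() - 1.0) < 1e-6
assert z.argmax() == 1          # mass concentrates on the highest fitness
```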

The next example illustrates a constrained distributionally robust optimization in wireless communication networks.

Example 5 (Wireless communication). Consider a power allocation problem over n medium access channels. The signal-to-interference-plus-noise ratio (SINR) is

$$\text{SINR}_l = \frac{\dfrac{a_l |\omega_{ll}|^2}{\left(d^2(s_r(l),\, s_t(l)) + \varepsilon^2\right)^{\frac{o}{2}}}}{N_0(s_r(l)) + I_l(s_r(l))},$$

where

• $\omega_{ll}$ is the channel state at $l$. The channel state is unknown; its true distribution is also unknown.

• $a_l$ is the power allocated to channel $l$. It is assumed to lie between $\underline{a}_l$ and $\overline{a}_l$ with $0 \le \underline{a}_l < \overline{a}_l < +\infty$. Moreover, a total power budget constraint is imposed: $\sum_{l=1}^n a_l \le \overline{a}$, where $\overline{a} > \sum_{l=1}^n \underline{a}_l \ge 0$.

• $N_0 > 0$ is the background noise.

• $s_r(l)$ is the location of the receiver of $l$.

• $s_t(l)$ is the location of the transmitter of $l$.

• $o \in \{2, 3, 4\}$ is the pathloss exponent.

• $\varepsilon > 0$ is the height of the transmitter antenna.

• The interference on channel $l$ is denoted $I_l \ge 0$. One typical model for $I_l$ is

$$I_l = \sum_{k \neq l} \frac{a_k |\omega_{kl}|^2}{\left(d^2(s_r(l),\, s_t(k)) + \varepsilon^2\right)^{\frac{o}{2}}}.$$


It is worth mentioning that the action constraints of the power allocation problem are similar to the ones analyzed in Section 3. The admissible action space is

$$\mathcal{A} := \left\{ a \in \mathbb{R}\_+^n \,:\, \underline{a}\_l \le a\_l \le \overline{a}\_l, \sum\_{l=1}^n a\_l \le \overline{a} \right\}.$$

Clearly, $\mathcal{A}$ is a non-empty convex compact set. The payoff function is the sum-rate $r(a, \omega) = \sum_{l=1}^n W_l \log(1 + \text{SINR}_l)$, where $W_l > 0$. The mapping $(a, \omega) \mapsto r(a, \omega)$ is continuously differentiable.


• Robust optimization is too conservative: Part of the robust optimization problem [9, 7] consists of choosing the channel gain $|\omega_{ll}|^2 \in [0, \overline{\omega}_{ll}]$, where the bound $\overline{\omega}_{ll}$ needs to be carefully designed. However, the worst case is achieved when the channel gain is zero: $\inf_{\omega :\, |\omega_{ll}|^2 \in [0, \overline{\omega}_{ll}]} r(a, \omega) = 0$. Hence the robust performance is zero. This is too conservative, as several realizations of the channel may give better performance than zero. Another way is to re-design the bounds $\underline{\omega}_{ll}$ and $\overline{\omega}_{ll}$. But if $\underline{\omega}_{ll} > 0$, it means that very low channel gains are not allowed, which may be too optimistic. Below we use the distributional robust optimization approach, which eliminates this design issue.

• Distributional robust optimization: By means of a training sequence or channel estimation method, a certain (statistical) distribution $m$ is derived. However, $m$ cannot be considered as the true distribution of the channel state due to estimation error. The true distribution of $\omega$ is unknown. Based on this observation, an uncertainty set $B_r(m)$ with radius $r \ge 0$ is constructed for alternative distribution candidates. Note that $r = 0$ means that $B_0(m) = \{m\}$. The distributionally robust optimization problem is $\sup_a \inf_{\tilde{m} \in B_r(m)} \mathbb{E}_{\tilde{m}}\, r(a, \omega)$. In presence of interference, the function $r(a, \omega)$ is not necessarily concave in $a$; in absence of interference, the problem becomes concave.
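To make the example concrete, the SINRs and the sum-rate can be evaluated directly. The positions, channel gains, weights and constants below are made-up illustrative values:

```python
import numpy as np

# Toy evaluation of SINR_l and the sum-rate r(a, omega) for n = 2 channels.
n = 2
a = np.array([0.8, 1.2])                 # allocated powers, assumed
omega2 = np.array([0.9, 0.6])            # direct gains |omega_ll|^2, assumed
cross2 = np.array([[0.0, 0.1],           # cross gains |omega_kl|^2 (k -> l)
                   [0.2, 0.0]])
s_t = np.array([[0.0, 0.0], [3.0, 0.0]]) # transmitter locations s_t(l)
s_r = np.array([[1.0, 0.0], [4.0, 0.0]]) # receiver locations s_r(l)
eps = 0.5                                # antenna height epsilon
o = 2                                    # pathloss exponent, o in {2, 3, 4}
N0 = 0.05                                # background noise
W = np.array([1.0, 1.0])                 # weights W_l > 0

def path(sr, st):
    """Pathloss term (d^2(s_r, s_t) + eps^2)^(o/2)."""
    return (np.sum((sr - st) ** 2) + eps ** 2) ** (o / 2)

sinr = np.empty(n)
for l in range(n):
    signal = a[l] * omega2[l] / path(s_r[l], s_t[l])
    interf = sum(a[k] * cross2[k, l] / path(s_r[l], s_t[k])
                 for k in range(n) if k != l)
    sinr[l] = signal / (N0 + interf)

sum_rate = float(W @ np.log(1.0 + sinr))
assert np.all(sinr > 0) and sum_rate > 0
```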

We add the term $\langle a, b - a \rangle$ to both sides to obtain the following relationships:

$$\begin{aligned} &\langle a - b,\ \eta \mathbb{E}_m \nabla h(a, \omega) \rangle \ge 0 \quad \forall b \in \mathcal{A}, \\ \Leftrightarrow\ &\langle a - b,\ \eta \mathbb{E}_m \nabla h(a, \omega) \rangle + \langle a - b,\ -a \rangle \ge \langle a,\ b - a \rangle \quad \forall b \in \mathcal{A}, \\ \Leftrightarrow\ &\langle b - a,\ -[a + \eta \mathbb{E}_m \nabla h(a, \omega)] \rangle + \langle a - b,\ -a \rangle \ge 0 \quad \forall b \in \mathcal{A}, \\ \Leftrightarrow\ &\langle b - a,\ a - [a + \eta \mathbb{E}_m \nabla h(a, \omega)] \rangle \ge 0 \quad \forall b \in \mathcal{A}. \end{aligned} \tag{20}$$

Recall that the projection operator onto a convex and closed set $\mathcal{A}$ is uniquely determined by

$$z \in \mathbb{R}^n, \quad z' = \mathrm{proj}_{\mathcal{A}}[z] \;\Leftrightarrow\; \langle z' - z,\ b - z' \rangle \ge 0, \quad \forall b \in \mathcal{A}.$$

Thus

$$\langle b - a,\ a - [a + \eta \mathbb{E}_m \nabla h(a, \omega)] \rangle \ge 0,\ \forall b \in \mathcal{A} \;\Leftrightarrow\; a = \mathrm{proj}_{\mathcal{A}}\left[a + \eta \mathbb{E}_m \nabla h(a, \omega)\right]. \tag{21}$$

This completes the proof.

As a consequence we can derive the following existence result.

Proposition 9. Let the set of virtual population states $\mathcal{A}$ be non-empty, convex and compact, and let the mapping $b \mapsto \mathbb{E}_m \nabla h(b, \omega)$ be continuous. Then, there exists at least one equilibrium in $\mathcal{A}$.

Proof. This is a direct application of the Brouwer-Schauder fixed-point theorem, which states that if $\phi : \mathcal{A} \to \mathcal{A}$ is continuous and $\mathcal{A}$ is non-empty, convex and compact, then $\phi$ has at least one fixed point in $\mathcal{A}$. Here we choose $\phi(a) = \mathrm{proj}_{\mathcal{A}}\left[a + \eta \mathbb{E}_m \nabla h(a, \omega)\right]$. Clearly $\phi(\mathcal{A}) \subseteq \mathcal{A}$, and $\phi$ is continuous on $\mathcal{A}$ as the mapping $b \mapsto \mathbb{E}_m \nabla h(b, \omega)$ and the projection operator $b \mapsto \mathrm{proj}_{\mathcal{A}}[b]$ are both continuous. The announced result follows. This completes the proof.

Note that we do not need sophisticated set-valued fixed-point theory to obtain this result.

Definition 8. The virtual population state $a$ is evolutionarily stable if $a \in \mathcal{A}$ and for any alternative deviant state $b \neq a$ there is an invasion barrier $e_b > 0$ such that

$$\langle a - b,\ \mathbb{E}_m \nabla h(a + e(b - a), \omega) \rangle > 0, \quad \forall e \in (0, e_b).$$

The function $\varrho : \mathcal{A} \times \mathbb{R}^n \times \mathbb{R}^{n \times n}_+ \to \mathbb{R}^{n \times n}$ is the revision protocol, which describes how virtual agents make decisions. The revision protocol $\varrho$ takes a population state $a$, the corresponding fitness $\nabla \mathbb{E}_m h$, and the adjacency matrix $\Lambda$, and returns a matrix. Let $\varrho_{lk}(a, h, \Lambda)$ be the switching rate from the $l$-th to the $k$-th component. Then, the virtual agents selecting the strategy $l \in \mathcal{L}$ have incentives to migrate to the strategy $k \in \mathcal{L}$ only if $\varrho_{lk}(a, h, \Lambda) > 0$, and it is also possible to design switch rates depending on the topology describing the migration constraints, i.e., $\lambda_{lk} = 0 \Rightarrow \varrho_{lk}(a, h, \Lambda) = 0$. The distributed distributionally robust optimization consists of performing the optimization problem above over the distributed network that is
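The characterization (21) also suggests a simple numerical scheme: iterate the map $a \mapsto \mathrm{proj}_{\mathcal{A}}[a + \eta\, \mathbb{E}_m \nabla h(a, \omega)]$ until a fixed point is reached. The sketch below uses a box constraint set (so the projection is a componentwise clip) and an assumed concave toy payoff:

```python
import numpy as np

# Sketch of the fixed-point iteration a <- proj_A[a + eta * grad], eq. (21).
# For a box A = prod_l [a_lo_l, a_hi_l] the projection is a componentwise clip.
# The payoff and all constants below are illustrative assumptions.
a_lo = np.array([0.0, 0.0])
a_hi = np.array([1.0, 1.0])
targets = np.array([0.3, 1.7])        # assumed unconstrained maximizer of E_m h

def grad(a):
    return -2.0 * (a - targets)       # gradient of -||a - targets||^2

def proj(z):
    return np.clip(z, a_lo, a_hi)

a = np.array([0.5, 0.5])
eta = 0.1
for _ in range(200):
    a = proj(a + eta * grad(a))

# The iteration converges to the equilibrium, here the clipped target.
assert np.allclose(a, [0.3, 1.0], atol=1e-6)
```

The limit point satisfies (21) exactly: applying the projected step once more leaves it unchanged.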


Distributionally Robust Optimization http://dx.doi.org/10.5772/intechopen.76686
