4. Distributed optimization

This section presents distributed distributionally robust optimization problems over a directed graph. A large number of virtual agents can potentially choose a node (vertex) subject to constraints. The vector a represents the population state. Since a has n components, the graph has n vertices. The interactions between virtual agents are interpreted as possible connections of the graph. Let us suppose that the current interactions are represented by a directed graph G = (L, E), where E ⊆ L² is the set of links representing the possible interactions among the proportions of agents, i.e., if (l, k) ∈ E, then the l-th component of a can interact with the k-th component of a. In other words, (l, k) ∈ E means that virtual agents selecting the strategy l ∈ L could migrate to the strategy k ∈ L. Moreover, Λ ∈ {0, 1}^{n×n} is the adjacency matrix of the graph G, whose entries are λ_lk = 1 if (l, k) ∈ E, and λ_lk = 0 otherwise.

Definition 6. The distributionally robust fitness function is the marginal distributionally robust payoff function. If a ↦ E_m h(a, ω) is continuously differentiable, the distributionally robust fitness function is E_m ∇_a h(a, ω).

Definition 7. The virtual population state a is an equilibrium if a ∈ A and it solves the variational inequality

$$\langle a - b, \mathbb{E}\_m \nabla\_a h(a, \omega) \rangle \ge 0, \quad \forall b \in \mathcal{A}.$$

Proposition 8. Let the set of virtual population states A be non-empty, convex, and compact, and let b ↦ E_m∇h(b, ω) be continuous. Then the following conditions are equivalent:

• ⟨a − b, E_m∇h(a, ω)⟩ ≥ 0, ∀b ∈ A;

• the action a satisfies a = proj_A[a + η E_m∇h(a, ω)].
Proof. Let a be a feasible action that solves the variational inequality:

$$\langle a - b, \mathbb{E}\_m \nabla h(a, \omega) \rangle \ge 0, \quad \forall b \in \mathcal{A}.$$

Let η > 0. By multiplying both sides by η, we obtain

$$\langle a - b, \eta \mathbb{E}\_m \nabla h(a, \omega) \rangle \ge 0, \quad \forall b \in \mathcal{A}.$$

We add the term ⟨a, b − a⟩ to both sides to obtain the following relationships:

$$\begin{aligned} \langle a-b, \eta \mathbb{E}\_{m} \nabla h(a, \omega) \rangle &\geq 0 & \forall b \in \mathcal{A}, \\ \Leftrightarrow \quad \langle a-b, \eta \mathbb{E}\_{m} \nabla h(a, \omega) \rangle + \langle a-b, -a \rangle &\geq \langle a, b-a \rangle & \forall b \in \mathcal{A}, \\ \Leftrightarrow \quad \langle b-a, -[a+\eta \mathbb{E}\_{m} \nabla h(a, \omega)] \rangle + \langle a-b, -a \rangle &\geq 0 & \forall b \in \mathcal{A}, \\ \Leftrightarrow \quad \langle b-a, a-[a+\eta \mathbb{E}\_{m} \nabla h(a, \omega)] \rangle &\geq 0 & \forall b \in \mathcal{A}. \end{aligned} \tag{20}$$

Recall that the projection operator onto a convex and closed set A is uniquely determined by

$$z \in \mathbb{R}^n, \; z' = \operatorname{proj}\_{\mathcal{A}}[z] \Leftrightarrow \langle z' - z, b - z' \rangle \ge 0, \ \forall b \in \mathcal{A}.$$

Thus



$$\begin{aligned} &\langle b-a, a-[a+\eta\mathbb{E}\_m\nabla h(a,\omega)]\rangle \ge 0, \quad \forall b \in \mathcal{A} \\ &\Leftrightarrow a = \operatorname{proj}\_{\mathcal{A}}[a+\eta\mathbb{E}\_m\nabla h(a,\omega)]. \end{aligned} \tag{21}$$

This completes the proof.
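The fixed-point characterization just proved can be illustrated numerically. The following sketch uses a box-shaped A and a hypothetical concave payoff E_m h(a, ω) = −‖a − c‖²/2 (neither is from the chapter); it checks that the projected point satisfies both the fixed-point equation and the variational inequality:

```python
import numpy as np

def proj_box(z, lo, hi):
    """Euclidean projection onto the box A = [lo, hi]^n (convex and compact)."""
    return np.clip(z, lo, hi)

# Hypothetical concave payoff: E_m h(a, w) = -||a - c||^2 / 2, so its gradient is c - a.
c = np.array([0.3, 1.7, 0.5])
grad = lambda a: c - a
lo, hi = 0.0, 1.0
eta = 0.5

# The unconstrained maximizer c is infeasible in one coordinate; the
# equilibrium is its projection onto the box.
a_star = proj_box(c, lo, hi)

# Fixed-point characterization: a* = proj_A[a* + eta * E_m grad h(a*)].
fp = proj_box(a_star + eta * grad(a_star), lo, hi)
assert np.allclose(fp, a_star)

# Variational inequality: <a* - b, E_m grad h(a*)> >= 0 for sampled b in A.
rng = np.random.default_rng(0)
for b in rng.uniform(lo, hi, size=(100, 3)):
    assert (a_star - b) @ grad(a_star) >= -1e-12
```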

As a consequence we can derive the following existence result.

Proposition 9. Let the set of virtual population states A be non-empty, convex, and compact, and the mapping b ↦ E_m∇h(b, ω) be continuous. Then, there exists at least one equilibrium in A.

Proof. This is a direct application of the Brouwer-Schauder fixed-point theorem, which states that if ϕ : A → A is continuous and A is non-empty, convex, and compact, then ϕ has at least one fixed point in A. Here we choose ϕ(a) = proj_A[a + η E_m∇h(a, ω)]. Clearly ϕ(A) ⊆ A, and ϕ is continuous on A as the mapping b ↦ E_m∇h(b, ω) and the projection operator b ↦ proj_A[b] are both continuous. Then the announced result follows. This completes the proof.

Note that we do not need sophisticated set-valued fixed-point theory to obtain this result.
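Proposition 9 guarantees existence but prescribes no algorithm. One natural sketch is to iterate the map ϕ itself, which is a projected-gradient step; convergence for small η additionally requires the payoff to be concave and smooth, an assumption beyond the proposition. The toy objective and the scaled-simplex constraint below are illustrative choices, not from the chapter:

```python
import numpy as np

def proj_simplex_scaled(z, d):
    """Project z onto {a >= 0 : sum(a) = d} (standard sort-based algorithm)."""
    n = len(z)
    u = np.sort(z)[::-1]
    css = np.cumsum(u) - d
    rho = np.nonzero(u - css / np.arange(1, n + 1) > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(z - theta, 0.0)

# Hypothetical concave objective E_m h(a) = -sum(q * a^2)/2, gradient -q * a.
q = np.array([1.0, 2.0, 4.0])
grad = lambda a: -q * a
d, eta = 1.0, 0.2

a = np.full(3, d / 3)            # start from the uniform state
for _ in range(500):             # iterate phi(a) = proj_A[a + eta * grad(a)]
    a = proj_simplex_scaled(a + eta * grad(a), d)

# At the equilibrium the marginal payoffs equalize on the support,
# i.e. q*a is constant, so a is proportional to 1/q.
expected = (1 / q) / (1 / q).sum()
assert np.allclose(a, expected, atol=1e-6)
```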

Definition 8. The virtual population state a is evolutionarily stable if a ∈ A and for any alternative deviant state b ≠ a there is an invasion barrier ε_b > 0 such that

$$\langle a - b, \mathbb{E}\_m \nabla h(a + \epsilon(b - a), \omega) \rangle > 0, \quad \forall \epsilon \in (0, \epsilon\_b).$$
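The invasion-barrier condition can be checked directly on a toy instance. The payoff below (E_m h(a, ω) = −‖a‖²/2, so E_m∇h(a, ω) = −a) and the candidate states are hypothetical choices for illustration only:

```python
import numpy as np

# Hypothetical strictly concave payoff: E_m grad h(a, w) = -a.
grad = lambda a: -a

a = np.array([0.5, 0.5])                 # candidate evolutionarily stable state
b = np.array([0.8, 0.2])                 # deviant state with the same total mass
for eps in np.linspace(1e-4, 0.5, 50):   # here every eps in (0, 0.5] works
    mixed = a + eps * (b - a)
    assert (a - b) @ grad(mixed) > 0     # invasion-barrier inequality holds
```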

The function ϱ : A × R^n × R^{n×n}_+ → R^{n×n} is the revision protocol, which describes how virtual agents make decisions. The revision protocol ϱ takes a population state a, the corresponding fitness ∇E_m h, and the adjacency matrix Λ, and returns a matrix. Let ϱ_lk(a, h, Λ) be the switching rate from the l-th to the k-th component. Then, the virtual agents selecting the strategy l ∈ L have incentives to migrate to the strategy k ∈ L only if ϱ_lk(a, h, Λ) > 0, and it is also possible to design switching rates depending on the topology describing the migration constraints, i.e., λ_lk = 0 ⇒ ϱ_lk(a, h, Λ) = 0. The distributed distributionally robust optimization consists in performing the optimization problem above over a distributed network that is subject to communication restrictions. We construct distributed distributionally robust game dynamics to perform such a task. The distributed distributionally robust evolutionary game dynamics emerge from the combination of the (robust) fitness h and the constrained switching rates ϱ. The evolution of the portion a_l is given by the distributed distributionally robust mean dynamics

$$\dot{a}\_{l} = \sum\_{k \in \mathcal{L}} a\_{k} \varrho\_{kl}(a, h, \Lambda) - a\_{l} \sum\_{k \in \mathcal{L}} \varrho\_{lk}(a, h, \Lambda), \quad l \in \mathcal{L}, \tag{22}$$
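Whatever protocol is chosen, (22) only exchanges mass between components: summing over l, the inflow and outflow terms cancel, so Σ_l a_l is invariant along trajectories. A minimal Euler-step sketch with arbitrary (hypothetical) switching rates makes this visible:

```python
import numpy as np

def mean_dynamics_step(a, Q, dt):
    """One forward-Euler step of (22); Q[l, k] plays the role of rho_lk(a, h, Lambda)."""
    inflow = a @ Q                      # component l: sum_k a_k * rho_kl
    outflow = a * Q.sum(axis=1)         # component l: a_l * sum_k rho_lk
    return a + dt * (inflow - outflow)

rng = np.random.default_rng(1)
a = np.array([0.2, 0.5, 0.3])
Q = rng.uniform(0, 1, size=(3, 3))      # hypothetical constant rates
np.fill_diagonal(Q, 0.0)                # no self-switching

total0 = a.sum()
for _ in range(1000):
    a = mean_dynamics_step(a, Q, dt=0.01)
assert abs(a.sum() - total0) < 1e-10    # total mass is conserved by construction
```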


Since the distributionally robust function h is obtained after the transformation from the payoff function r by means of triality theory, the dynamics (22) seek a distributed distributionally robust solution.

#### Algorithm 5. The distributed distributional robust mean dynamics pseudocode is as follows:


1: procedure POPULATION-INSPIRED ALGORITHM(a(0), e, T, ϱ, g, m, h, Λ) ⊳ The population-inspired learning starting from a(0) within [0, T]

2: a ← a(0)

3: while regret > e and t ≤ T do ⊳ We have the answer if regret is 0

4: Compute a(t), solution of (22)

5: Compute regret_t

6: end while

7: return a(t), regret_t ⊳ get a(t) and the regret

8: end procedure
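The pseudocode leaves the solver for (22) and the regret measure abstract. A sketch in code, with both filled in by hypothetical placeholders (a toy averaging dynamics and a max-deviation regret, neither specified by the chapter):

```python
import numpy as np

def population_inspired_algorithm(a0, e, T, step, regret_of, dt=0.01):
    """Sketch of Algorithm 5: advance the mean dynamics (22) until the
    regret falls below e or the horizon T is reached.

    `step(a, dt)` integrates the dynamics; `regret_of(a)` is the stopping
    measure -- both are assumption-laden placeholders.
    """
    a, t, regret = a0.copy(), 0.0, np.inf
    while regret > e and t <= T:
        a = step(a, dt)            # line 4: compute a(t), solution of (22)
        regret = regret_of(a)      # line 5: compute regret_t
        t += dt
    return a, regret

# Toy instance: mass-preserving flow toward the uniform state.
step = lambda a, dt: a + dt * (a.mean() - a)
regret_of = lambda a: np.abs(a - a.mean()).max()
a, regret = population_inspired_algorithm(np.array([0.6, 0.3, 0.1]),
                                          1e-6, 100.0, step, regret_of)
assert regret <= 1e-6              # stopped because the regret target was met
```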

The next example establishes evolutionarily stable states, equilibria, and rest points of the dynamics (22) by designing ϱ.

Example 6. Let us consider a power system that is composed of 10 generators, i.e., let L = {1, …, 10}. Let a_l ∈ R_+ be the power generated by the generator l ∈ L. Each power generation should satisfy the physical and/or operational constraints a_l ∈ [a̲_l, a̅_l], for all l ∈ L. It is desired to satisfy the power demand given by d ∈ R, i.e., it is necessary to guarantee that Σ_{l∈L} a_l = d, i.e., the supply meets the demand. The objective is to minimize the quadratic generation costs for all the generators, i.e.,

$$\text{Maximize}\quad r(a,\omega) = \sum\_{l \in \mathcal{L}} r\_l(a\_l) = -\sum\_{l \in \mathcal{L}} \left(c\_{0l} + c\_{1l}a\_l + c\_{2l}a\_l^2\right),$$

$$\text{s.t.}\qquad \sum\_{l \in \mathcal{L}} a\_l = d, \quad \underline{a}\_l \le a\_l \le \overline{a}\_l, \quad l \in \mathcal{L},$$

where r : R^n → R is concave, and the parameters are possibly uncertain and selected as c_{0l} = 25 + 6l, c_{1l} = 15 + 4l + ω_{1l}, c_{2l} = 5 + l + ω_{2l}, and d = 20 + ω_{3l}. Therefore, the fitness functions for the corresponding full potential game are given by f_l(a) = −2 a_l c_{2l} − c_{1l}, for all l ∈ L, and the action space is given by

$$\mathcal{A} = \left\{ a \in \mathbb{R}\_+^n : \sum\_{l \in \mathcal{L}} a\_l = d, \ a\_l \in \left[ \underline{a}\_l, \overline{a}\_l \right], \ l \in \mathcal{L} \right\}.$$
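Before turning to the dynamics, note that with the uncertainty switched off (ω = 0) the dispatch problem is a box-constrained quadratic program, solvable by bisection on the Lagrange multiplier of the demand constraint (the classical λ-iteration for economic dispatch — a standard technique, not taken from the chapter). A sketch with the cost coefficients above:

```python
import numpy as np

l = np.arange(1, 11)                        # 10 generators
c1 = 15.0 + 4.0 * l                         # linear cost coefficients (omega = 0)
c2 = 5.0 + 1.0 * l                          # quadratic cost coefficients
d = 20.0
a_lo, a_hi = np.zeros(10), np.full(10, d)   # case-1 bounds: 0 <= a_l <= d
# (c0 is omitted: an additive constant does not affect the minimizer.)

def dispatch(lam):
    """Optimal a_l for multiplier lam: clip the stationary point of the Lagrangian."""
    return np.clip((lam - c1) / (2.0 * c2), a_lo, a_hi)

# Bisection on lam so that supply meets demand: sum_l a_l(lam) = d.
lo, hi = 0.0, 1e4
for _ in range(200):
    lam = 0.5 * (lo + hi)
    if dispatch(lam).sum() < d:
        lo = lam
    else:
        hi = lam
a_star = dispatch(lam)
assert abs(a_star.sum() - d) < 1e-6

# KKT: marginal costs c1 + 2*c2*a equalize wherever the box is inactive.
mc = c1 + 2.0 * c2 * a_star
interior = (a_star > 1e-9) & (a_star < d - 1e-9)
assert np.allclose(mc[interior], lam, atol=1e-6)
```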

The distributed revision protocol is set to

$$\varrho\_{lk}(a, h, \Lambda) = \frac{\lambda\_{lk}}{a\_l} \max(0, \overline{a}\_k - a\_k) \max(0, a\_l - \underline{a}\_l) \max(0, \mathbb{E}\_m(h\_k - h\_l)),$$

for a_l ≠ 0. We evaluate four different scenarios, i.e.,

1. a̲ = 0_n and a̅ = d 1_n,

2. a̲_l = 0 for all l ∈ L∖{9, 10}, a̲_9 = 1.1, and a̲_10 = 1; and a̅_l = d for all l ∈ L∖{1, 2}, a̅_1 = 3, and a̅_2 = 2.5,

3. Case 1 constraints and with interaction restricted to the cycle graph G = (L, E) with set of links E = ∪_{l ∈ L∖{n}} {(l, l + 1)} ∪ {(n, 1)},

4. Case 2 constraints and with interaction restricted as in Case 3.
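As a sketch, the dynamics (22) with this revision protocol can be integrated by forward Euler for scenario 1 (complete interaction graph, ω = 0); the step size and iteration count are ad-hoc choices. Note that a_l ϱ_lk cancels the 1/a_l factor in the rate, which the code exploits:

```python
import numpy as np

l = np.arange(1, 11)
c1, c2 = 15.0 + 4.0 * l, 5.0 + 1.0 * l
d = 20.0
a_lo, a_hi = np.zeros(10), np.full(10, d)     # scenario-1 bounds
Lam = 1.0 - np.eye(10)                        # complete interaction graph

fitness = lambda a: -2.0 * a * c2 - c1        # f_l(a) = -2 a_l c2_l - c1_l

def flow(a):
    """F[l, k] = a_l * rho_lk(a, h, Lam); a_l cancels the 1/a_l in the rate."""
    h = fitness(a)
    gain = np.maximum(0.0, a_hi - a)                 # room at destination k
    slack = np.maximum(0.0, a - a_lo)                # movable mass at origin l
    adv = np.maximum(0.0, h[None, :] - h[:, None])   # (h_k - h_l)^+
    return Lam * slack[:, None] * gain[None, :] * adv

a = np.full(10, d / 10)                       # feasible start: supply = demand
for _ in range(40000):
    F = flow(a)
    a = a + 5e-6 * (F.sum(axis=0) - F.sum(axis=1))   # Euler step of (22)

assert abs(a.sum() - d) < 1e-6                # demand stays satisfied throughout
assert fitness(a).std() < 1e-2                # fitness equalizes at the rest point
```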

Figure 3 presents the evolution of the generated power, the fitness functions corresponding to the marginal costs, and the total cost. For the first scenario, the evolutionary game dynamics converge to a standard evolutionarily stable state in which f̂(a⋆) = c 1_n. In contrast, for the second scenario, the dynamics converge to a constrained evolutionarily stable state.

Figure 3. Economic power dispatch. Evolution of the population states (generated power), fitness functions f̂(a) = ∇E h(a, ω), and the costs −E r(a, ω). Figures (a)-(c) for case 1, (d)-(f) for case 2, (g)-(i) for case 3, and (j)-(l) for case 4.

#### 4.1. Extension to multiple decision-makers

Consider a constrained game G in strategic form given by:

• P = {1, …, P} is the set of players. The cardinality of P is P ≥ 2.

• Player p has a decision space A_p ⊂ R^{n_p}, n_p ≥ 1. Players are coupled through their actions and their payoffs. The set of all feasible action profiles is A ⊂ R^n, with n = Σ_{p∈P} n_p. Player p can choose an action a_p in the set A_p(a_{−p}) = {a_p ∈ A_p : (a_p, a_{−p}) ∈ A}.

• Player p has a payoff function r_p : A → R.
We restrict our attention to the following constraints:

$$A\_p = \left\{ a\_p \in \mathbb{R}^{n\_p} \mid \ a\_{pl} \in \left[ \underline{a}\_{pl}, \overline{a}\_{pl} \right], \ l \in \left\{ 1, \dots, n\_p \right\}, \ \sum\_{l=1}^{n\_p} c\_{pl} a\_{pl} \le b\_p \right\}$$

The coupled constraint is

$$\mathcal{A} = \left\{ a \in \prod\_{p} A\_p, \quad \sum\_{p \in \mathcal{P}} \langle \overline{c}\_p, a\_p \rangle \le \overline{b} \right\}.$$

Feasibility condition: If a̲_pl < a̅_pl for l ∈ {1, …, n_p}, c_pl > 0, Σ_{l=1}^{n_p} c_pl a̲_pl < b_p, c̄_p ∈ R^{n_p}_{>0}, and Σ_{p∈P} ⟨c̄_p, a̲_p⟩ < b̄, then the constraint set A is non-empty, convex, and compact.
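The feasibility condition is a finite list of inequalities, so it can be checked mechanically. A sketch with per-player data stacked in lists (the numbers below are hypothetical):

```python
import numpy as np

def feasibility_ok(a_lo, a_hi, c, b, cbar, bbar):
    """Check the feasibility condition; a_lo[p], a_hi[p], c[p], cbar[p] have length n_p."""
    box_ok = all(np.all(lo < hi) for lo, hi in zip(a_lo, a_hi))
    budget_ok = all(np.all(cp > 0) and float(cp @ lo) < bp
                    for cp, lo, bp in zip(c, a_lo, b))
    coupled_ok = (all(np.all(cb > 0) for cb in cbar)
                  and float(sum(cb @ lo for cb, lo in zip(cbar, a_lo))) < bbar)
    return box_ok and budget_ok and coupled_ok

# Two players with two actions each (hypothetical data).
a_lo = [np.zeros(2), np.zeros(2)]
a_hi = [np.ones(2), 2 * np.ones(2)]
c = [np.array([1.0, 2.0]), np.array([1.5, 1.0])]
b = [5.0, 4.0]
cbar = [np.ones(2), np.ones(2)]
ok = feasibility_ok(a_lo, a_hi, c, b, cbar, bbar=3.0)
assert ok                         # this instance satisfies every inequality
```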

We propose a method to compute a constrained equilibrium that has a full support (whenever it exists). We do not use the projection operator. Indeed, we transform the domain [a̲_pl, a̅_pl] = ξ([0, 1]), where ξ(x_pl) = a̅_pl x_pl + a̲_pl (1 − x_pl) = a_pl. ξ is a one-to-one mapping and


$$x\_{pl} = \xi^{-1}\left( a\_{pl} \right) = \frac{a\_{pl} - \underline{a}\_{pl}}{\overline{a}\_{pl} - \underline{a}\_{pl}} \in [0, 1].$$

$$\sum\_{l=1}^{n\_p} c\_{pl} \left( \overline{a}\_{pl} - \underline{a}\_{pl} \right) x\_{pl} \le b\_p - \sum\_{l=1}^{n\_p} c\_{pl} \underline{a}\_{pl} =: \hat{b}\_p.$$
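The change of variables and the rescaled budget b̂_p can be verified on a small instance (the bounds and budget below are hypothetical numbers):

```python
import numpy as np

a_lo, a_hi = np.array([1.0, 0.0]), np.array([3.0, 2.0])   # hypothetical bounds
c, b = np.array([2.0, 1.0]), 10.0

xi = lambda x: a_hi * x + a_lo * (1.0 - x)        # [0,1]^n -> box
xi_inv = lambda a: (a - a_lo) / (a_hi - a_lo)     # box -> [0,1]^n

a = np.array([2.0, 0.5])
x = xi_inv(a)
assert np.allclose(xi(x), a)                      # xi is one-to-one on the box
assert np.all((0 <= x) & (x <= 1))

# The budget constraint transforms into the rescaled one with b_hat:
b_hat = b - c @ a_lo                              # b_p - sum_l c_pl * a_lo_pl
assert np.isclose(c @ a, c @ ((a_hi - a_lo) * x) + c @ a_lo)
assert (c @ ((a_hi - a_lo) * x) <= b_hat) == (c @ a <= b)
```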

The learning algorithm (23) is

$$\begin{cases} \dot{y}\_{p} = \left[\nabla\_{p}^{2} g\right]^{-1} \nabla\_{a\_p} r\_{p}(a, \omega),\\ a\_{pl} := \overline{a}\_{pl} x\_{pl} + \underline{a}\_{pl}\left(1 - x\_{pl}\right),\\ x\_{pl} = \min\left(1, \frac{e^{y\_{pl}}}{\sum\_{k=1}^{n\_{p}} e^{y\_{pk}}} \frac{\hat{b}\_{p}}{c\_{pl}\left(\overline{a}\_{pl} - \underline{a}\_{pl}\right)}\right),\\ l \in \{1, \ldots, n\_{p}\}, \end{cases} \tag{23}$$

generates a trajectory a_p(t) = (a_pl(t))_l that satisfies the constraint of player p at any time t.
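This anytime-feasibility property can be observed in a discretized sketch of (23) for a single player. The generator g is left unspecified by the display, so the sketch takes [∇²_p g]⁻¹ to be the identity, and the quadratic payoff is a hypothetical choice; only the feasibility claim is being illustrated:

```python
import numpy as np

# One player, hypothetical data: box [a_lo, a_hi] and budget <c, a> <= b.
a_lo, a_hi = np.zeros(3), np.array([2.0, 2.0, 2.0])
c, b = np.array([1.0, 1.0, 1.0]), 2.5
b_hat = b - c @ a_lo                      # rescaled budget as in the display above
target = np.array([1.5, 1.0, 0.2])        # hypothetical payoff r = -||a - target||^2
grad_r = lambda a: 2.0 * (target - a)

def x_of(y):
    """Softmax score capped so that the budget holds, as in (23)."""
    w = np.exp(y - y.max())               # numerically stabilized softmax weights
    s = w / w.sum()
    return np.minimum(1.0, s * b_hat / (c * (a_hi - a_lo)))

y = np.zeros(3)
for _ in range(5000):
    x = x_of(y)
    a = a_hi * x + a_lo * (1.0 - x)
    y = y + 0.01 * grad_r(a)              # [grad^2 g]^{-1} taken as identity (assumption)
    # feasibility holds along the whole trajectory, not just in the limit:
    assert np.all((a_lo - 1e-12 <= a) & (a <= a_hi + 1e-12))
    assert c @ a <= b + 1e-9
```

The cap min(1, ·) is what enforces the budget: each term c_pl(a̅_pl − a̲_pl)x_pl is bounded by its softmax share of b̂_p, and the shares sum to one.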

5. Notes

The work in [10] provides a nice intuitive introduction to robust optimization emphasizing the parallel with static optimization. Another nice treatment [11], focusing on the robust empirical risk minimization problem, is designed to give calibrated confidence intervals on performance and provide optimal tradeoffs between bias and variance [12, 13]. f-divergence based performance evaluations are conducted in [11, 14, 15]. The connection between risk-sensitivity measures such as the exponentiated payoff and distributional robustness can be found in [16]. Distributionally robust optimization and learning are extended to multiple strategic decision-making problems, i.e., distributionally robust games, in [17, 18].

Acknowledgements

We gratefully acknowledge support from U.S. Air Force Office of Scientific Research under grant number FA9550-17-1-0259.

Author details

Jian Gao, Yida Xu, Julian Barreiro-Gomez, Massa Ndong, Michalis Smyrnakis and Hamidou Tembine\*

Learning and Game Theory Laboratory, New York University, Abu Dhabi, United Arab Emirates

\*Address all correspondence to: tembine@ieee.org