1. Introduction

Robust optimization can be defined as the process of determining the best or most effective result, using a quantitative measurement system, under worst-case uncertain functions or parameters. The optimization may occur in terms of best robust design, net cash flows, profits, costs, benefit/cost ratio, quality of experience, satisfaction, end-to-end delay, completion time, etc. Other measurement units may be used, such as units of production or production time, and optimization may occur in terms of maximizing production units, minimizing processing or production time, maximizing profits, or minimizing costs under uncertain parameters.


There are numerous robust optimization techniques, such as robust linear programming, robust dynamic programming, robust geometric programming, queueing theory, risk analysis, etc. One of the main drawbacks of robust optimization is that the worst-case scenario may be too conservative: the bounds provided by worst-case scenarios may not be useful in many interesting problems (see the wireless communication example provided below). Distributionally robust optimization, in contrast, is not based on worst-case parameters; the distributional robustness method is based on probability distributions instead. The worst-case distribution within a carefully designed distributional uncertainty set may provide interesting features. Distributionally robust programming can be used not only to provide a distributionally robust solution to a problem when the true distribution is unknown, but it can also, in many instances, give a general solution that takes some risk into account. The methodology presented here is simple and significantly reduces the dimensionality of the distributionally robust optimization problem. We hope that the designs of distributionally robust programming presented here can help designers, engineers, cost-benefit analysts, and managers to solve concrete problems under unknown distributions.

The rest of the chapter is organized as follows. Section 2 presents some preliminary concepts of distributionally robust optimization. A class of constrained distributionally robust optimization problems is presented in Section 3. Section 4 focuses on distributed distributionally robust optimization. Afterwards, illustrative examples in distributed power networks and in wireless communication networks are provided to evaluate the performance of the method. Finally, prior works and concluding remarks are given in Section 5.

Notation: Let $\mathbb{R}$ and $\mathbb{R}_+$ denote the sets of real and nonnegative real numbers, respectively. Let $(\Omega, d)$ be a separable, completely metrizable topological space with a metric (distance) $d : \Omega \times \Omega \to \mathbb{R}_+$, and let $\mathcal{P}(\Omega)$ be the set of all probability measures over $\Omega$.

2. Preliminaries

The decision-maker runs several experimental trials and obtains statistical realizations of the random variable $\omega$ from measurements. The measurement data can be noisy, imperfect, and erroneous. An empirical distribution (or histogram) $m$ is then built from the realizations of $\omega$. However, $m$ is not the true distribution of the random variable $\omega$, and $m$ may not be a reliable measure due to statistical, bias, measurement, observation, or computational errors. Therefore, the decision-maker is facing a risk. The risk-sensitive decision-maker should choose an action that improves the performance $\mathbb{E}_{\omega \sim \tilde m}\, r(a, \omega)$ among alternative distributions $\tilde m$ within a certain level of deviation $\rho > 0$ from the distribution $m$. The distributionally robust optimization problem is therefore formulated as

$$\sup_{a \in A} \; \inf_{\tilde m \in B_\rho(m)} \mathbb{E}_{\omega \sim \tilde m}\, r(a, \omega), \qquad (1)$$

where $B_\rho(m)$ is the uncertainty set of alternative admissible distributions within a certain radius $\rho > 0$ of $m$. Different distributional uncertainty sets are presented: the f-divergence and the Wasserstein metric, defined below.
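To make problem (1) concrete, the following minimal sketch solves a small instance numerically over a finite scenario set, taking $B_\rho(m)$ to be a Kullback-Leibler ball (an instance of the f-divergence sets defined in the next subsection). The scenario values, the reward function, and the radius are illustrative assumptions, not taken from the chapter; the inner infimum is solved with SciPy and the outer supremum by a coarse grid search.

```python
import numpy as np
from scipy.optimize import minimize

# Finite scenario set and empirical distribution m (illustrative numbers).
omega = np.array([0.0, 1.0, 2.0, 3.0])   # realizations of the random variable
m = np.array([0.1, 0.4, 0.3, 0.2])       # empirical distribution built from data
rho = 0.05                               # radius of the divergence ball (assumed)

def reward(a, w):
    # Hypothetical reward r(a, omega); any bounded function works here.
    return -(a - w) ** 2

def worst_case_value(a):
    """Inner problem: inf over q in the KL ball B_rho(m) of E_{omega~q} r(a, omega)."""
    r = reward(a, omega)
    cons = (
        {"type": "eq",   "fun": lambda q: q.sum() - 1.0},
        # KL(m || q) <= rho, the f-divergence ball with f(t) = t log t
        {"type": "ineq", "fun": lambda q: rho - np.sum(m * np.log(m / q))},
    )
    res = minimize(lambda q: q @ r, x0=m.copy(),
                   bounds=[(1e-9, 1.0)] * len(m), constraints=cons, method="SLSQP")
    return res.fun

# Outer problem: sup over actions a (coarse grid search for illustration).
grid = np.linspace(0.0, 3.0, 61)
values = [worst_case_value(a) for a in grid]
best = grid[int(np.argmax(values))]
print(f"distributionally robust action a* = {best:.2f}, value = {max(values):.4f}")
```

In this sketch the worst-case distribution shifts probability mass toward low-reward scenarios, which is precisely the conservatism that the radius $\rho$ controls.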

2.1.1. f-divergence

We introduce the notion of f-divergence, which will be used to compute the discrepancy between probability distributions.

Definition 1. Let $m$ and $\tilde m$ be two probability measures over $\Omega$ such that $m$ is absolutely continuous with respect to $\tilde m$. Let $f$ be a convex function. Then, the f-divergence between $m$ and $\tilde m$ is defined as follows:

$$D_f(m \,\|\, \tilde m) = \int_\Omega f\!\left(\frac{dm}{d\tilde m}\right) d\tilde m \; - \; f(1), \qquad (2)$$

where $\frac{dm}{d\tilde m}$ is the Radon-Nikodym derivative of the measure $m$ with respect to the measure $\tilde m$.
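For discrete measures on a finite $\Omega$, the integral in Eq. (2) reduces to a weighted sum. The following sketch implements that discrete case; the generators `kl` and `chi2` are standard convex choices with $f(1) = 0$, supplied here only as illustrations.

```python
import numpy as np

def f_divergence(m, m_tilde, f):
    """Discrete version of Eq. (2): D_f(m || m~) = sum_i m~_i f(m_i / m~_i) - f(1)."""
    m, m_tilde = np.asarray(m, dtype=float), np.asarray(m_tilde, dtype=float)
    # Where m~_i = 0, absolute continuity forces m_i = 0 and the term vanishes.
    ratio = np.divide(m, m_tilde, out=np.zeros_like(m), where=m_tilde > 0)
    return float(np.sum(m_tilde * f(ratio)) - f(1.0))

# Two classical convex generators with f(1) = 0 (so the -f(1) term vanishes):
def kl(t):
    t = np.asarray(t, dtype=float)
    return np.where(t > 0, t * np.log(np.where(t > 0, t, 1.0)), 0.0)  # Kullback-Leibler

def chi2(t):
    return (np.asarray(t, dtype=float) - 1.0) ** 2                    # chi-squared

m = [0.2, 0.5, 0.3]
m_tilde = [0.3, 0.4, 0.3]
print(f_divergence(m, m_tilde, kl))    # ~0.0305, nonnegative
print(f_divergence(m, m_tilde, chi2))  # nonnegative as well
```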

By Jensen's inequality:

$$D_f(m \,\|\, \tilde m) = \int_\Omega f\!\left(\frac{dm}{d\tilde m}\right) d\tilde m - f(1) \;\ge\; f\!\left(\int_\Omega \frac{dm}{d\tilde m}\, d\tilde m\right) - f(1) = f\!\left(\int_\Omega dm\right) - f(1) = f(1) - f(1) = 0.$$

Thus, $D_f(m \,\|\, \tilde m) \ge 0$ for any convex function $f$. Note, however, that the f-divergence $D_f(m \,\|\, \tilde m)$ is not a distance (for example, it does not satisfy the symmetry property).
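A quick check of both properties, using the Kullback-Leibler divergence (the f-divergence with $f(t) = t \log t$) on two hypothetical discrete distributions:

```python
import numpy as np

# KL is an f-divergence (f(t) = t log t) but it is not symmetric.
kl = lambda p, q: float(np.sum(p * np.log(p / q)))

m = np.array([0.2, 0.5, 0.3])
m_tilde = np.array([0.3, 0.4, 0.3])
print(kl(m, m_tilde), kl(m_tilde, m))  # ~0.0305 vs ~0.0324: both >= 0, not equal
```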

Here, the distributional uncertainty set imposed on the alternative distribution $\tilde m$ is given by

$$B_\rho(m) = \left\{ \tilde m \in \mathcal{P}(\Omega) \; : \; D_f(m \,\|\, \tilde m) \le \rho \right\}.$$


