following formulation

$$\begin{aligned}
\min_{\phi} \quad & \sum_{i \in S} \sum_{u \in U(i)} g(i,u)\,\phi(i,u) \\
\text{s.t.} \quad & \sum_{u \in U(j)} \phi(j,u) - \sum_{i \in S} \sum_{u \in U(i)} p_{i,j}(u)\,\phi(i,u) = 0 \\
& \sum_{i \in S} \sum_{u \in U(i)} \phi(i,u) = 1 \\
& \sum_{i \in S} \sum_{u \in U(i)} c(i,u)\,\phi(i,u) \le \beta \\
& \phi(j,u) \ge 0
\end{aligned}\tag{23}$$

Assuming that the problem is feasible and *φ*∗ is the optimal solution of the LP problem above, the stationary randomized optimal policy *μ*∗ is generated by

$$q_{\mu^*(i)}(u) = \frac{\phi^*(i,u)}{\sum_{u' \in U(i)} \phi^*(i,u')}\tag{24}$$

for cases where the sum in the denominator is nonzero. Otherwise, the state is transient and the control is irrelevant. Note that *q*<sub>*μ*∗(*i*)</sub>(*u*) denotes the probability of choosing action *u* at state *i* under policy *μ*∗.

Using the approach above in the problems described in the previous section is straightforward:

• *Priority-based access*: in the LP problem (23) replace *g* (*i*, *u*) by *gU* (*i*, *u*) defined in (7), and *c* (*i*, *u*) by *gL* (*i*, *u*) defined in (6). For each value of *β* we obtain the point in the Pareto front corresponding to a blocking probability *β* for the licensed users.

• *Auction-based access*: in the LP problem (23) replace *g* (*i*, *u*) by *gU* (*i*, *u*) defined in (13), and *c* (*i*, *u*) by *gL* (*i*, *u*) defined in (6). As in the previous case, for each value of *β* we obtain a point in the Pareto front.

**5. Numerical results**

In this section we provide examples of the Pareto front computation procedures described in the previous section for each DSA type.

## **5.1 Priority based access**

For this DSA scheme we will consider three scenarios characterized by the asymmetry between the traffic intensity of licensed and unlicensed users. In every scenario, the average holding time is the same for every user, regardless of type, so the service rates are *μ<sup>L</sup>* = *μ<sup>U</sup>* = 5. Assuming that the time unit is an hour, this results in an average holding time of 12 minutes per connection. The total traffic (*λ* = *λ<sup>L</sup>* + *λ<sup>U</sup>*) is 40 calls/h, which results in a total incoming traffic of 8 Erlangs. In a wireless cell covering 2.5 km<sup>2</sup> of urban area (cell radius equal to 400 m), with 2000 people per km<sup>2</sup> and a 10% aggregate market penetration (licensed and unlicensed users), the number of covered users is around 500, and the resulting traffic intensity is 0.016 Erlangs per user. The number of available channels is set to *N* = 10, in order to evaluate the system in a relatively congested situation. With the assumed traffic intensity we can estimate the blocking probability of the system for the aggregate traffic by means of

the well-known Erlang's B formula (see Kleinrock (1975)):

$$E\left(n,\rho\right) = \frac{\frac{\rho^n}{n!}}{\sum\_{j=0}^{j=n} \frac{\rho^j}{j!}}\tag{25}$$

where *n* is the number of channels and *ρ* denotes the utilization factor. In our case *ρ* = *λ*/*μ<sup>L</sup>* = *λ*/*μ<sup>U</sup>*. According to this formula, if the system accepted every incoming user, the total blocking probability would be *E* (10, 8) = 0.12. As we will see, this probability is an upper bound for the blocking probability of the primary users, who are always accepted whenever the system has an available channel, and a lower bound for that of the secondary users.
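The figures quoted above can be checked numerically. The following is a minimal sketch in Python (the function and variable names are illustrative, not taken from the chapter) that evaluates Eq. (25) through the standard Erlang B recursion, which is algebraically equivalent to the closed form but avoids large factorials, and recomputes the traffic values stated in the text.

```python
def erlang_b(n_channels: int, offered_load: float) -> float:
    """Blocking probability E(n, rho) of Eq. (25).

    Uses the recursion E(0) = 1, E(k) = rho*E(k-1) / (k + rho*E(k-1)),
    which yields the same value as the closed-form expression.
    """
    blocking = 1.0
    for k in range(1, n_channels + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


# Values stated in the text.
arrival_rate = 40.0      # total calls per hour (lambda)
service_rate = 5.0       # completed calls per hour per channel (mu)
n_channels = 10          # N
covered_users = 2.5 * 2000 * 0.10  # area (km^2) * density * penetration

offered_load = arrival_rate / service_rate     # rho, in Erlangs
holding_time_min = 60.0 / service_rate         # mean holding time in minutes
load_per_user = offered_load / covered_users   # Erlangs per user
total_blocking = erlang_b(n_channels, offered_load)

print(f"offered load      = {offered_load:.1f} Erlangs")
print(f"mean holding time = {holding_time_min:.0f} min")
print(f"load per user     = {load_per_user:.3f} Erlangs")
print(f"E({n_channels}, {offered_load:.0f})          = {total_blocking:.4f}")
```

The last line evaluates to roughly 0.1217, consistent with the rounded value 0.12 used in the text.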


The three scenarios are summarized in Table 1.

Table 1. Parameter values for the three scenarios of the priority-based access problem.

First, we show in Fig. 3 the Pareto front obtained by means of an MDP where the blocking costs of licensed and unlicensed users were merged by means of a convex combination. The Pareto front was obtained by solving each MDP problem for 10000 values of the *α* parameter ranging from 0.01 to 1.

Fig. 3. Pareto fronts obtained for the priority-based access in scenario 1 (a), scenario 2 (b) and scenario 3 (c)

All three scenarios receive the same total traffic intensity. However, when the traffic intensity of the primary users is smaller, the Pareto front is closer to both axes, *i.e.* the performance of both the primary and secondary users improves. This is an expected result, since only the traffic of secondary users is controlled by the access policy. When the optimization affects a larger share of the total traffic, the improvement is also more noticeable, showing the benefits of the MDP formulation.
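The trade-off between the two blocking probabilities can be illustrated with a much simpler model than the MDP itself. The sketch below is an assumption-laden illustration, not the chapter's method: it restricts attention to channel-reservation (threshold) policies, in which a secondary user is admitted only if fewer than *N* − `reserve` channels are busy, and it uses a hypothetical even split *λ<sup>L</sup>* = *λ<sup>U</sup>* = 20 calls/h, since the values of Table 1 are not reproduced here. Because both user types share the same service rate, the number of busy channels forms a birth-death chain whose stationary distribution gives both blocking probabilities.

```python
def blocking_probabilities(n_channels, lam_l, lam_u, mu, reserve):
    """Blocking of licensed and unlicensed users under a threshold policy.

    Hypothetical policy: unlicensed arrivals are accepted only while the
    number of busy channels k satisfies k < n_channels - reserve.
    Licensed arrivals are accepted whenever a channel is free.
    """
    # Unnormalized stationary weights of the birth-death chain on k = 0..N.
    weights = [1.0]
    for k in range(n_channels):
        accept_unlicensed = k < n_channels - reserve
        birth = lam_l + (lam_u if accept_unlicensed else 0.0)
        death = (k + 1) * mu
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    pi = [w / total for w in weights]

    # Poisson arrivals see time averages (PASTA).
    p_block_licensed = pi[n_channels]
    p_block_unlicensed = sum(pi[n_channels - reserve:])
    return p_block_licensed, p_block_unlicensed


# Hypothetical scenario: even traffic split, N = 10, mu = 5 as in the text.
N, LAM_L, LAM_U, MU = 10, 20.0, 20.0, 5.0
curve = [blocking_probabilities(N, LAM_L, LAM_U, MU, r) for r in range(N + 1)]

for r, (p_l, p_u) in enumerate(curve):
    print(f"reserve={r:2d}  P_block(licensed)={p_l:.4f}  "
          f"P_block(unlicensed)={p_u:.4f}")
```

With `reserve = 0` every user is accepted and both probabilities coincide with *E* (10, 8); as more channels are reserved, the licensed blocking probability falls and the unlicensed one rises. The points produced this way are only a subset of achievable operating points; the Pareto-optimal policies obtained from the MDP or CMDP formulations need not be of this threshold form.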

The Pareto fronts obtained by means of the CMDP formulation in the previous scenarios are identical to those shown in Fig. 3, showing that both formulations are equivalent in terms of finding the Pareto front for the priority-based access problem. The only difference lies in practical considerations. The CMDP approach allows us to find a policy with a predefined blocking probability for primary users, while the MDP formulation requires exploring the Pareto front, since there is no a priori relationship between *α* and this blocking probability. On the other hand, implementing the policy that solves the CMDP problem requires randomizing at least one control (it can be shown that the number of required randomized controls equals the number of constraints). While this is technically feasible, a stationary deterministic policy is simpler to implement.

can be observed that, for the same traffic intensity (the three scenarios receive 40 calls per unit of time), when the traffic share of the secondary users is higher (scenarios with a higher number) the Pareto front moves away from the y-axis, *i.e.* the income obtained from secondary users increases, and it also approaches the x-axis, *i.e.* the blocking probability of the licensed users diminishes. It is interesting to note that, especially in scenarios 2 and 3, a very small increment in the blocking probability of licensed users can multiply the benefit obtained from spectrum leasing by a factor of 2 or 3. On the other hand, these figures also indicate that once the income surpasses a certain threshold, Pareto-optimal policies can only produce small increments of the income by dramatically raising the blocking probability.

**6. Conclusions**

This chapter has surveyed the use of the MDP formulation within the framework of cognitive radio. We have reviewed the fundamentals of MDP and its generalizations, such as CMDP, POMDP and constrained POMDP. While most previous works focus on decentralized access, we focus on centralized access. The main difference between them is that when the access relies on a central controller or *spectrum broker*, it generally has full knowledge of the spectrum occupation, while in decentralized access decisions have to be made with partial and sometimes unreliable information about channel occupation. Therefore, centralized schemes are more suitable for MDP or CMDP modeling, while decentralized ones generally require POMDP or constrained POMDP, which are intractable in many cases and require approximate or heuristic algorithms.

We consider two types of access: one where only one type of secondary user tries to access the licensed spectrum, and another where users are classified according to the price they are willing to pay for the use of the spectrum. The first one is referred to as priority-based access and the second one as auction-based access. The main issue in the problems addressed is that two conflicting objectives coexist. In priority-based access, the controller tries to reduce the blocking probability of both types of users. In auction-based access, the objectives are to reduce the blocking probability for licensed users and to increase the income received from spectrum leasing. For these problems there does not exist an *optimal* policy, but a set of *Pareto optimal* policies. The performance of these policies lies on the Pareto front, defined as the set of points where one objective cannot be improved without worsening the other one. We have shown how to compute these Pareto fronts for each access scheme by weighting the objectives in an MDP problem and by formulating a CMDP. The first approach requires solving Bellman's equation and the second requires solving a linear program.

We have obtained the Pareto fronts for several scenarios, showing the influence of traffic share on the system's performance. The Pareto front is a common tool for determining the performance threshold for each objective beyond which further improvement of this objective requires excessive degradation of the other one. MDP and CMDP are useful tools for developing centralized access policies for cognitive radio systems. One drawback is the so-called *curse of dimensionality*, which may render the problem computationally intractable as the sizes of the state and action spaces increase. In addition, although policies can be computed off-line, alleviating the computational overhead of the access controller, the system's parameters may be variable, requiring many pre-computed policies and thus imposing large memory requirements.

**7. Acknowledgments**

This research has been supported by the MICINN/FEDER project grant TEC2010-21405-C02-02/TCM (CALM) and it was also developed in the framework of
