**1. Introduction**


Cognitive radio refers to a set of technologies aiming to increase the efficiency of radio frequency (RF) spectrum use. Wireless communication systems offer increasing bandwidth to their users, and the demand for spectrum grows accordingly. However, RF spectrum is scarce, and operators gain access to it through a licensing scheme by which public administrations assign a frequency band to each operator. Currently, this allocation is static and inflexible, in the sense that a licensed band can only be accessed by one operator and its clients (the licensed users). However, it is a known fact that while some RF bands are heavily used at some locations and at particular times, many other bands remain largely underused (FCC, 2002). This is, in fact, a classical property of tele-traffic systems: traffic intensity is highly variable during the day. The consequence is a paradoxical situation: while the spectrum scarcity problem hinders the development of new wireless applications, there are large portions of unoccupied spectrum (*spectrum holes*, or spectrum opportunities).

Cognitive radio provides the mechanisms allowing unlicensed (or secondary) users to access licensed RF bands by exploiting spectrum opportunities. Cognitive radio is based on software-defined radio, which refers to a wireless communication system that can dynamically adjust transmission parameters such as operating frequency, modulation scheme, protocol and so on. It is crucial that this opportunistic access is performed with the least possible impact on the service provided to licensed users. Therefore, cognitive users should implement algorithms to detect the spectrum use (*spectrum sensing*), identify the spectrum holes (*spectrum analysis*) and decide the best action based on this analysis (*decision making*). Once the decision is made, the cognitive user performs the *spectrum access* according to a medium access control (MAC) protocol facilitating the communication among unlicensed users with minimum collision with other licensed and unlicensed users.
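The sensing-analysis-decision cycle described above can be sketched as a simple control loop. The following Python skeleton is only illustrative: the function names (`sense_spectrum`, `find_holes`, `decide`) are hypothetical stand-ins, the occupancy model is simulated, and the actual spectrum access step is left to the MAC protocol.

```python
import random

CHANNELS = range(8)  # hypothetical set of licensed channels

def sense_spectrum():
    """Spectrum sensing: measure the occupancy of each channel (simulated here)."""
    return {ch: random.random() < 0.6 for ch in CHANNELS}  # True means busy

def find_holes(occupancy):
    """Spectrum analysis: identify the spectrum holes (idle channels)."""
    return [ch for ch, busy in occupancy.items() if not busy]

def decide(holes):
    """Decision making: pick a channel; here, naively, the first hole found."""
    return holes[0] if holes else None

def cognitive_cycle():
    """One sensing -> analysis -> decision iteration of the cognitive cycle."""
    holes = find_holes(sense_spectrum())
    return decide(holes)  # the chosen channel is then handed to the MAC layer
```

In a real terminal the decision step would implement the policy discussed in the rest of the chapter rather than a first-fit rule.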

Dynamic spectrum access (DSA) refers to the mechanism that manages spectrum use in response to system changes (e.g. available channels, unlicensed user requests), according to certain objectives (e.g. maximizing spectrum usage) and subject to some constraints (e.g. a maximum blocking probability for licensed users). DSA can be implemented in a centralized or a distributed fashion. In the former, a central controller collects all the required information about current spectrum usage and the transmission requirements of secondary users in order to make the spectrum access decision, which is generally derived from the solution of some optimization problem. In distributed DSA, unlicensed users make their own decisions autonomously, according to their local information. Compared to centralized DSA, this scheme requires greater computational resources at the user terminal and generally does not achieve globally optimal solutions, but it implies a smaller communication overhead.

MAC protocols for DSA can also include spectrum trading features. In situations of low spectrum usage, the licensed operator may decide to sell spectrum opportunities to unlicensed users. To do this in real time, a protocol is required that supports negotiation of the access price, the channel holding time, *etc.*, between the spectrum owner and the secondary users. There are several models for spectrum trading. In this work, we consider the bid-auction model, in which secondary users bid for the spectrum of a single spectrum owner.

This chapter addresses the design of MAC protocols for centralized dynamic spectrum access. We explore the possibilities of a formal design based on a Markov decision process (MDP) formulation. We survey previous works on this issue and propose a design framework that balances the grade of service (e.g. blocking probability) of different user categories against the expected economic revenue. When two or more conflicting objectives are balanced in an optimization problem, there is no optimal solution in the strict sense, but rather a Pareto front, defined as the set of values of the individual objectives such that no objective can be improved without worsening the others. In this work we study the Pareto-front solutions for two possible access models. The first consists of simply giving priority to the licensed users; the second is an auction-based model, where unlicensed users offer a bidding price for the spectrum opportunities. In the priority-based access, the centralized policy should balance the blocking probabilities of the different classes of users. In the auction-based access, the trade-off appears between the blocking probability of primary users and the expected revenue.

The rest of this chapter is structured as follows. Section 2 provides a brief introduction to Markov decision processes. Section 3 reviews previous works using the MDP approach in cognitive radio systems. Section 4 explains the system model and the MDP formulation for both DSA procedures considered. Section 5 contains the performance analysis of each model, based on numerical evaluation of practical examples. Section 6 summarizes the conclusions of this work.

**2. Markov Decision Processes**

Dynamic programming (DP) deals with multi-stage decision processes. Two of their defining elements are the following:

• *Cost*: Each state-action pair is associated with a return or outcome, which we will generally refer to as cost. Sometimes the outcome has a positive meaning and is considered a benefit. Additionally, we can compute the total outcome obtained over the whole process. Depending on how it is computed, this overall cost is referred to as total discounted cost or average cost, among others.

• *Policy*: A policy is a function that relates the states with the actions taken at each stage, for the whole duration of the process considered. An optimal policy is the one that attains the best overall cost for a given objective.

As can be anticipated from the previous definitions, the goal of DP is to find the optimal policy for a given process. DP is, in fact, a decomposition strategy for complex optimization problems. In this case, the decomposition exploits the discrete-time structure of the policy.

Markov decision processes are the application of DP to systems described by controlled discrete-time Markov chains, that is, Markov chains whose transition probabilities are determined by a decision variable.

Let the integer *k* denote the *k*-th stage of an MDP. At a given stage, let *i* and *u* denote the state of the system and the action taken, respectively. The set of possible values of the state, the *state space*, is denoted by *S*, so *i* ∈ *S*. The control space *U* is defined similarly. In general, at each state *i* only a subset of actions *U*(*i*) ⊆ *U* is allowed. We restrict our attention to processes where *S*, *U* and *U*(*i*) are independent of *k*. In this case, the transition probability from state *i* to state *j* under action *u* is denoted by *p<sub>ij</sub>*(*u*). A policy takes the form *u* = *μ*(*i*), and because it does not depend on *k* it is said to be a *stationary* policy. A policy is admissible if *μ*(*i*) ∈ *U*(*i*) for all *i* ∈ *S*. At each state *i*, the policy provides the probability distribution of the next state as *p<sub>ij</sub>*(*μ*(*i*)), for *j* ∈ *S*.

The cost of each state-action pair is denoted by *g*(*i*, *u*). Sometimes the costs are associated with transitions instead of states. Let *g̃*(*i*, *u*, *j*) denote the cost of the transition from state *i* to state *j*. In this case, we use the *expected cost* per stage, defined as:

$$g(i, u) = \sum_{j \in S} \tilde{g}(i, u, j)\, p_{ij}(u) \qquad (1)$$

The objective of the MDP is to find the optimal stationary policy *μ* such that the total cost is minimized. The total cost may be defined in several ways; we will focus our attention on average-cost problems. In this case, the cost to be optimized is

$$\lambda = \lim_{N \to \infty} \frac{1}{N}\, E\!\left[\, \sum_{k=1}^{N-1} g\big(x_k, \mu(x_k)\big) \right] \qquad (2)$$

where *x<sub>k</sub>* represents the system's state at the *k*-th stage. Note that in the definition of the average cost *λ* we are implicitly assuming that its value is independent of the initial state of the system. This is not always true, but there are conditions under which the assumption holds. For example, in our scenario the per-stage cost is always bounded and both *S* and *U* are finite sets. Moreover, there is at least one state *n* that is *recurrent* under every stationary policy. Under these conditions, the limit on the right-hand side of (2) exists and the average cost does not depend on the initial state.

Sometimes the system is modeled as a continuous-time Markov chain. In this case, as we shall see, the definition of the average cost is slightly different. In order to solve it by means of the known equations for average-cost MDP problems, we construct an auxiliary discrete-time problem whose average cost equals that of the continuous-time problem.
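As a concrete illustration of the average-cost formulation of Eqs. (1) and (2), the following Python sketch solves a toy MDP by relative value iteration, a standard method for average-cost problems. The model is invented for illustration only and is not the chapter's system model: a channel that is Free (state 0) or Busy (state 1); in the Free state the user may stay idle (action 0) or transmit (action 1), and transmitting earns a reward (negative cost) on success but pays a penalty if the channel turns busy.

```python
# Toy average-cost MDP solved by relative value iteration.
# All transition probabilities and costs below are hypothetical.
P = {  # P[i][u][j]: transition probability p_ij(u)
    0: {0: [0.9, 0.1], 1: [0.6, 0.4]},
    1: {0: [0.3, 0.7]},
}
G = {  # G[i][u][j]: transition cost g~(i, u, j)
    0: {0: [1.0, 1.0], 1: [-2.0, 5.0]},
    1: {0: [1.0, 1.0]},
}
S = [0, 1]           # state space
U = {0: [0, 1], 1: [0]}  # admissible actions U(i)

def expected_cost(i, u):
    """Per-stage expected cost g(i, u), Eq. (1)."""
    return sum(G[i][u][j] * P[i][u][j] for j in S)

def relative_value_iteration(ref=0, iters=500):
    """Return the optimal average cost (lambda, Eq. (2)) and a stationary policy."""
    h = {i: 0.0 for i in S}   # relative value function, normalized at state `ref`
    lam = 0.0
    for _ in range(iters):
        T = {i: min(expected_cost(i, u)
                    + sum(P[i][u][j] * h[j] for j in S)
                    for u in U[i])
             for i in S}
        lam = T[ref]          # current estimate of the average cost
        h = {i: T[i] - lam for i in S}
    policy = {i: min(U[i], key=lambda u: expected_cost(i, u)
                     + sum(P[i][u][j] * h[j] for j in S))
              for i in S}
    return lam, policy
```

With these numbers the optimal stationary policy is to transmit in the Free state, and the iteration converges to the average cost λ = 64/70 ≈ 0.914, which can be checked against the stationary distribution of the resulting Markov chain.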
