| Parameter | Adopted values |
|---|---|
| ***DS/CDMA Power-Rate Allocation System*** | |
| Base station | BS = 1 |
| Time slot duration | *T*<sub>slot</sub> = 666.7 *µ*s, or *R*<sub>slot</sub> = 1500 slots/s |
| # mobile terminals | *U* ∈ {5; 10; 20; 100; 250} users |
| Cell geometry | rectangular, with *x*<sub>cell</sub> = *y*<sub>cell</sub> = 5 km |
| Mobile terminal distribution | ∼ U[*x*<sub>cell</sub>, *y*<sub>cell</sub>] |
| ***Fading Channel Type*** | |
| Path loss | ∝ *d*<sup>−2</sup> |
| Shadowing | uncorrelated log-normal, *σ*<sup>2</sup> = 6 dB |
| Fading | Rice: [0.6; 0.4] |
| Time selectivity | slow |
| ***User Features and QoS*** | |
| User services | [voice; video; data] |
| User rates | *r*<sub>*i*,min</sub> = [*r*<sub>*c*</sub>/128; *r*<sub>*c*</sub>/32; *r*<sub>*c*</sub>/16] bps |
| User BER | BER = [5 × 10<sup>−3</sup>; 5 × 10<sup>−5</sup>; 5 × 10<sup>−8</sup>] |
| ***RA-ACO Algorithm*** | |
| Problem dimensionality | *U* ∈ {5; 10; 20; 100; 250} users |
| File size | *Fs* ∈ [8, 25] |
| Diversity factor | *q* ∈ [0, 1] |
| Pheromone evaporation rate | *ξ* ∈ [0, 1] |
| Population size | *m* ∈ [7, 35] |
| Max. # iterations | *N* = 1000 |
| ***Monte-Carlo Simulation*** | |
| Number of trials | T = 1000 realizations |

**Table 1.** Multirate DS/CDMA system, channel and ACO input parameters

From Figure 2 one can see the smooth, monotonic convergence of the RA-ACO algorithm toward the optimal power solution **p**\*, in this case given by (16), in contrast to the non-monotonic, oscillatory convergence behavior presented by the RA-PSO algorithm from [1]. Besides, for the *U* = 5 users power allocation problem, the ACO was able to achieve convergence after ≈ 250 iterations, in contrast to the ≈ 450 iterations necessary for the RA-PSO convergence.

**Figure 2.** Power allocation for *U* = 5 users; an equal information rate among users is adopted. a) RA-ACO power convergence through iterations (*Fs* = 8, *m* = 7, *q* = 0.61, *ξ* = 1.00); b) RA-PSO power convergence through iterations (*M* = 7; *φ*<sub>1</sub> = 2; *φ*<sub>2</sub> = 2). Both panels plot Power [W] against Iterations, *N*.

Note that the population size *m* and file size *Fs* parameters (*m*, *Fs* ∈ **N**), both with entry values common to all the different {*q*, *ξ*} input parameter configurations, were chosen based on the problem dimensionality. Numerical experiments have shown that entries around the chosen ones do not affect the NMSE results as substantially as different entries for the *q* and *ξ* parameters do. It is worth noting that the PA problem in (9) presents a non-convex characteristic; hence, the entries for the population size *m* and file size *Fs* parameters assume relatively high values with respect to the dimension of the problem, meaning that both parameters are of the order of the problem dimension, {*m*, *Fs*} ≈ O[*U*]. It means that RA-ACO can solve the non-convex PA problem in DS/CDMA systems, but with relatively high input parameter loads.

Herein, a parameter calibration strategy was adopted in order to find the best tradeoff for the {*q*; *Fs*} set, given in Eq. (23). Since the parameters *Fs* and *m* are directly related to the computational complexity of the algorithm, finding a suitable parameter set with *Fs* entries as low as possible is of great interest.

On the other hand, the population size *m* has little or no influence on any other ACO input parameter (whereas *q* and *Fs* interfere with each other), although its entry value directly increases the algorithm's computational complexity. Therefore, the parameters *m* and *Fs* were fixed at low values, and the best {*q*, *ξ*} combination for them was then sought. Hence, based on the NMSE *versus* convergence speed results obtained in Fig. 3, the optimized RA-ACO input parameters *q* and *ξ* for the power control problem in DS/CDMA networks under different levels of interference could be found, as summarized in Table 2.
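The roles of the four ACO inputs can be made concrete with a continuous-domain ACO step in the spirit of the ACO_R family, which the RA-ACO resembles: the solution "file" of size *Fs* plays the pheromone-memory role, *q* controls how strongly the best-ranked solutions attract new samples (diversity), *ξ* scales the sampling spread (evaporation), and *m* ants are generated per iteration. This is only an illustrative sketch under those assumptions (hypothetical function name, NumPy), not the chapter's exact RA-ACO update:

```python
import numpy as np

def aco_r_step(archive, cost_fn, m, q, xi, rng):
    """One illustrative ACO_R-style iteration on a continuous problem.

    archive : (Fs, U) solution file, assumed sorted best-first
    m       : ants sampled this iteration (population size)
    q       : diversity factor -- small q concentrates sampling on top ranks
    xi      : evaporation-like rate -- scales the Gaussian sampling spread
    Returns the new file, sorted by cost (best first).
    """
    Fs, U = archive.shape
    ranks = np.arange(Fs)
    # Rank-based weights: the best-ranked solutions attract more ants
    w = np.exp(-ranks ** 2 / (2.0 * (q * Fs) ** 2))
    p = w / w.sum()
    ants = np.empty((m, U))
    for a in range(m):
        j = rng.choice(Fs, p=p)  # pick a guide solution from the file
        # Spread = xi times the mean distance to the rest of the file
        sigma = xi * np.abs(archive - archive[j]).sum(axis=0) / (Fs - 1)
        ants[a] = rng.normal(archive[j], sigma + 1e-12)
    # Elitist file update: keep the Fs best of old file + new ants
    pool = np.vstack([archive, ants])
    order = np.argsort([cost_fn(x) for x in pool])
    return pool[order[:Fs]]

# Example: minimize a toy sphere cost in U = 5 dimensions, Fs = 8, m = 7
rng = np.random.default_rng(0)
archive = rng.uniform(-1.0, 1.0, (8, 5))
sphere = lambda x: float(np.sum(x ** 2))
for _ in range(200):
    archive = aco_r_step(archive, sphere, m=7, q=0.61, xi=1.0, rng=rng)
```

Small *q* makes the search greedier around the incumbent best solutions, while larger *ξ* keeps the sampling spread wide for longer; this is exactly the exploration/exploitation tradeoff calibrated above.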


**Table 2.** Optimized RA-ACO input parameters and respective robustness for the Problem of Eq. (17).

Also, the robustness achieved by the RA-ACO for the power allocation problem is included in Table 2. Herein, convergence success is declared when the NMSE of the algorithm's solution drops below 10<sup>−2</sup>. Due to the non-convexity of the PA problem in (17), when the number of users grows from 10 to 20 the problem becomes substantially harder, and the algorithm's performance suffers a critical decay of 70%.
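The success criterion just described is easy to state in code. The sketch below assumes the common NMSE definition ‖**p** − **p**\*‖² / ‖**p**\*‖² (the chapter's exact expression is not reproduced here) and hypothetical helper names:

```python
def nmse(p, p_star):
    """Normalized MSE of an allocated power vector against the optimum."""
    num = sum((a - b) ** 2 for a, b in zip(p, p_star))
    den = sum(b ** 2 for b in p_star)
    return num / den

def robustness(final_solutions, p_star, threshold=1e-2):
    """Percentage of Monte-Carlo trials declared successful, i.e. whose
    final solution ended with NMSE below the 10^-2 threshold."""
    hits = sum(1 for p in final_solutions if nmse(p, p_star) < threshold)
    return 100.0 * hits / len(final_solutions)

# Toy check: two of three trials land close enough to p* = [0.1, 0.2, 0.4]
p_star = [0.1, 0.2, 0.4]
trials = [[0.1, 0.2, 0.4], [0.1, 0.2, 0.401], [0.5, 0.5, 0.5]]
rate = robustness(trials, p_star)   # -> 66.66...%
```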

**Weighted Throughput Maximization (WTM) Problem.** For the weighted throughput maximization (WTM) problem posed in Eq. (21), Figure 4 shows the different cost function evolutions when the parameters *q* and *ξ* are combined under three distinct system loadings, *U* = 20, 100 and 250 users. The average cost function evolution values were taken over T = 1000 trials. Also, the corresponding sum rate difference (∆∑rate) is shown zoomed in.

From Fig. 4-a it is clear that for *U* = 20 users, the choice *q* = 0.10 and *ξ* = 1.00 results in an average cost function value higher than the other ones. Besides, even in a relatively coarse optimization scenario for the *q* and *ξ* parameters, the importance of deploying optimized RA-ACO input parameters is clear; for instance, the best ACO input parameter configuration in Fig. 4-a achieves a total system throughput ∆∑rate = 76.8 Kb/s higher than the second-best parameter set, and 1.3 Mb/s higher than the worst set.

In Fig. 4-b, a slight cost function value difference shows that the best parameter configuration for the system load of *U* = 100 users is *q* = 0.20 and *ξ* = 1.00. In terms of system throughput, the best parameter configuration shows a difference of 38.4 Kb/s to the second-best one, and of 2 Mb/s to the worst one.

Finally, the best input parameter configuration for *U* = 250 users in Fig. 4-c is obtained as *q* = 0.20 and *ξ* = 1.00. Again, the total system throughput variations due to the different ACO input parameter configurations were not significant, ranging from 96 Kb/s to 1.24 Mb/s. This confirms a certain robustness to deviations in the input {*q*; *ξ*} values, thanks to the convexity of the optimization problem formulated in (21). In summary, the RA-ACO algorithm was able to attain reasonable convergence in fewer than *n* = 150 iterations for the WTM problem with dimension up to *U* = 250 users.

Note that in the WTM problem, given the convex characteristic of the objective function, Eq. (21), the robustness of the RA-ACO approaches R ≈ 100%, and the value entries for the population size *m* and file size *Fs* parameters are impressively smaller than the number of dimensions of the problem, i.e., {*m*, *Fs*} ≪ *U*. It means that RA-ACO can solve the WTM problem with soft parameter loads. The best input parameter configuration for the RA-ACO algorithm to solve the WTM problem is summarized in Table 3.

| | *U* = 20 | *U* = 100 | *U* = 250 |
|---|---|---|---|
| *q* | 0.10 | 0.20 | 0.20 |
| *ξ* | 1.00 | 1.00 | 1.00 |
| *m* | 3 | 7 | 7 |
| *Fs* | 5 | 5 | 5 |

**Table 3.** Optimized RA-ACO input parameters for the WTM Problem, Eq. (21).

**Figure 4.** Cost function *J* evolution from Eq. (21) across *N* = 1000 iterations for different combinations of the ACO input parameter values; the corresponding sum rate difference (∆∑rate) is shown zoomed in. a) *U* = 20 users (*Fs* = 5, *m* = 3); b) *U* = 100 users (*Fs* = 5, *m* = 7); c) *U* = 250 users (*Fs* = 5, *m* = 7).

Ant Colony Optimization for Resource Allocation and Anomaly Detection in Communication Networks
http://dx.doi.org/10.5772/53338
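The {*q*, *ξ*} calibration behind Figs. 3–4 is, in essence, a grid search: run the optimizer for every candidate pair, average the final cost over Monte-Carlo trials, and keep the best pair. A minimal sketch, with `run_aco` as a hypothetical stand-in for one full RA-ACO run returning its final cost (higher is better for WTM):

```python
import itertools
import statistics

def calibrate(run_aco, q_grid, xi_grid, trials=10):
    """Pick the {q, xi} pair with the best trial-averaged final cost."""
    best = None
    for q, xi in itertools.product(q_grid, xi_grid):
        avg = statistics.mean(run_aco(q, xi, seed=t) for t in range(trials))
        if best is None or avg > best[0]:
            best = (avg, q, xi)
    return {"avg_cost": best[0], "q": best[1], "xi": best[2]}

# Toy stand-in whose cost peaks at q = 0.2, xi = 1.0
toy = lambda q, xi, seed: -(q - 0.2) ** 2 - (xi - 1.0) ** 2
cfg = calibrate(toy, [0.1, 0.2, 0.4], [0.6, 0.8, 1.0], trials=3)
# -> cfg["q"] == 0.2, cfg["xi"] == 1.0
```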




**Figure 3.** NMSE across *N* = 1000 iterations for different ACO input parameter combinations in the power control problem. a) *U* = 5 users (*Fs* = 8, *m* = 7); b) *U* = 10 users (*Fs* = 14, *m* = 15); c) *U* = 20 users (*Fs* = 25, *m* = 35).

#### *2.6.2. WTM RA-ACO Performance Results*

Numerical results for the WTM problem with the RA-ACO algorithm under optimized input parameters are shown in Figure 5. Here, it is clear that the ACO evolves quickly for the three different system loads, finding a good solution in fewer than 100 iterations. Besides, one can note the great increase in the total system power from the 100-user case to the 250-user case, which is due to the interference increase caused by the high number of users in the system. Nevertheless, a good system throughput result is found: for 20 users, a total system throughput of 200 Mb/s is reached, which results in a remarkable average user rate of 10 Mb/s.

For the *U* = 100 users case, ≈ 340 Mb/s of system throughput is reached, with a total power consumption of ≈ 55 W. Herein, the average user rate is ≈ 3.4 Mb/s, lower due to the higher interference values under the medium system load.

Finally, for the *U* = 250 users case, a total system throughput of ≈ 400 Mb/s is reached, with a total power consumption of ≈ 125 W. Again, the average user rate decays with respect to the low and medium system loads, reaching ≈ 1.6 Mb/s.
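The per-user averages quoted above follow directly from dividing the total system throughput by the number of users:

```python
# Total system throughput per load, in Mb/s, as reported in the text
sum_throughput = {20: 200.0, 100: 340.0, 250: 400.0}

# Average user rate = total throughput / U
avg_user_rate = {u: thr / u for u, thr in sum_throughput.items()}
# -> {20: 10.0, 100: 3.4, 250: 1.6}
```

The decreasing per-user rate with growing *U* reflects the rising multiple-access interference.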

**Figure 6.** NMSE attainable by the RA-ACO and RA-PSO [1] algorithms, for *U* = 5, 10, and 20 users.


**Figure 5.** Sum Power and Sum Rate evolution for *U* = 20, 100 and 250 users under RA-ACO algorithm.

## *2.6.3. RA-ACO and RA-PSO Simulation Performance Results*

The main objective of this analysis is to put into perspective the performance of both RA heuristic algorithms regarding the non-convexity of the power allocation problem posed in (17). Simulations were conducted for different system loadings, using the best RA-ACO input parameters presented in Section 2.6.1 and the best parameters obtained in [1] for the resource (power) allocation particle swarm optimization (RA-PSO) algorithm.

Figure 6 shows the NMSE evolution for the power control problem with *U* = 5, 10 and 20 users, respectively, for the RA-PSO [1] and RA-ACO algorithms. Clearly, the NMSE ≈ 10<sup>−12</sup>, 10<sup>−10</sup> and 10<sup>−2</sup> attainable by the RA-ACO is much lower than the values reached by the RA-PSO (NMSE ≈ 10<sup>−5</sup>, 10<sup>1</sup> and 10<sup>1</sup>) after *N* = 1000 iterations. This means the ACO could surpass the various convergence problems in solving the non-convex power control problem. The associated robustness shows that the RA-ACO achieves nearly total convergence success, while the RA-PSO does not. Fig. 6 also shows a table containing the percentage of algorithm success, i.e., the percentage of trials in which the algorithm ended with an NMSE less than or equal to 10<sup>−2</sup>, showing a clear superiority of the RA-ACO scheme. Nonetheless, this robustness comes with an increase in computational complexity.

## **3. Anomaly Detection in Computer Networks**

This section presents the main concepts and related work in computer network anomaly detection. It presents anomaly types and their causes, the different techniques and methods applied to anomaly detection, as well as a compilation of recent proposals for detecting anomalies in networks using ACO, and a case study.


Management is an essential activity in enterprises, service providers and other entities for which networks have become a necessity, because the continued growth of networks has introduced an extensive range of services. The main goal of management is to ensure the fewest possible flaws and vulnerabilities, so as not to affect the operation of the networks. There are several factors that can lead to an anomaly, such as configuration errors, improper use by users, programming errors and malicious attacks, among many other causes [20].

A tool that helps in the network management task is the Anomaly Detection System (ADS), which consists of a technique or a set of techniques to detect anomalies and report them to the network administrator, helping to decide which action to perform in each situation. Anomaly detection is not an easy task and brings together a range of techniques from several areas. The quality and safety of the service provided to end users are ongoing concerns and receive special attention in network traffic.

### **3.1. Anomaly**


Thottan et al. [21] present two anomaly categories. The first one is related to network failures and performance problems, where no malicious agent plays a role. The flash crowd is a typical example of this category, where a server receives a huge amount of non-malicious client requests in the same period of time, congesting the server; e.g., a webstore promotes a product at a lower price at a certain time. If the webstore does not prepare an infrastructure to support the client access, the server may interrupt its operation and fail to process all sales transactions, causing congestion on the server and possible financial loss to the webstore. Congestion itself is another anomaly type within the first category, due to an abrupt increase in traffic at some point in the network, causing delays in packet delivery up to the saturation of the link, with packet discards. Congestion can also be generated by configuration errors, when the server does not respond adequately to client requests because of wrong settings.

In the second category one finds anomalies that arise from problems related to security. Denial of Service (DoS) is a main example of an anomaly in this category. A DoS occurs when a user is unable to obtain a particular service because some malicious agent has used attack methods that occupy the machine's resources, such as CPU and RAM. Besides DoS attacks, there are also Distributed Denial of Service (DDoS) attacks, where a master machine dominates other machines, called zombies, to perform a DoS [22]. The flash crowd is differentiated from DoS and DDoS by the absence of a malicious agent. Worms, port scans and others are usually programmed to discover vulnerabilities in networks and perform attacks [23].

## **3.2. Anomaly Detection Techniques**

The techniques implemented in ADSs are present in diverse areas such as intrusion detection, fraud detection, medical anomaly detection, prevention of industrial damage, image processing and sensor networks, among others [24]. Given so many different application domains, many tools have been developed specifically for some activity, while other solutions are more general. Chandola et al. [24] group the techniques into: based on classification, clustering, information theory, statistical and spectral theory. [24, 25] are surveys with general content in the anomaly detection field, but there are also surveys directed toward the computer network area, [20, 26]. The nomenclature and some categories may differ, but the concepts presented are consistent with each other.

Patcha et al. [20] divide the techniques for detecting anomalies into three categories: based on signature, based on the characterization of normal behavior, and hybrid techniques. The signature-based techniques rely on a set of signatures of known or constructed attack patterns; the strength of this kind of detection is the low rate of false positives. The techniques based on the characterization of normal traffic build a profile of the network traffic, and any event that deviates from the normal behavior is considered an anomaly. The hybrid techniques are a junction of the two previous ones [20].

Many authors consider the work proposed by Denning [27] a watershed between the methods based on signature and the methods that characterize the normal traffic behavior. The latter methods consist of two phases: a training phase and a test phase. In the training phase the network profile is generated, and in the test phase the obtained profile is applied for evaluation.

## *3.2.1. Detection Based on Signature*

Detection based on signature requires the construction of a database of events related to certain anomalies, thereby generating the signatures. The signatures describe the specific events that form a specific attack or anomaly; this way, when the tool monitors the traffic behavior, a comparison with the signatures is performed, and if a match occurs with the event described in the signature, an alarm is generated [20].

Using signatures, the method offers low rates of false positives, since the signature clearly describes what is required to be considered an anomaly; however, unknown attack characteristics with no formulated signatures might pass unnoticed. Another negative point is the need to constantly update the signature database [20].

## *3.2.2. Detection Based on the Characterization of Normal Behavior*

Unlike signature-based detection, the focus of this method is to detect anomalies based on the characterization of normal behavior. The first and fundamental step is to generate the normal behavior profile of the traffic, or to adopt a model that accurately describes the traffic. Consequently, any activity that deviates from the monitored normal profile will be considered an anomaly. The construction of the profile can be static or dynamic: it is static when the profile is built and replaced only when a new profile is constructed, and it is dynamic when the profile is updated according to the network behavior changes.

A positive point is the possibility of detecting new anomalies, since these new anomalies describe a behavior different from normal. Another aspect is the difficulty created for a malicious agent to devise an attack: because the agent ignores the profile, it may not be able to simulate an attack that fits the profile without generating an alarm [20]. But there are disadvantages in profile construction, such as the required training period or the amount of historical data needed. The difficulty in characterizing the traffic itself generates a high rate of false positives, since the ADS can point to many natural variations of the network as anomalous behavior.

There are several techniques; listed below are some relevant ones that enrich the discussion, with several different proposals:

• **Machine Learning:** Machine learning solutions have the ability to learn and improve performance over time, because the system changes its implementation strategy based on previous results. Bayesian networks, Markov chains and neural networks are techniques applied to the generation of normal profiles and the detection of anomalies. The main advantage of this approach is the ability to detect unknown anomalies and adapt to changes in the behavior of a monitored environment; however, this adjustment requires a large amount of data to generate a new profile [20].

• **Based on Specification:** These solutions are constructed by an expert, since the specifications of the normal behavior of the system are carried out manually. If the system is well represented, the false negative rate will be minimized by avoiding any behavior not predicted, but it may increase if some behavior is overlooked or not well described. The most widely used technique for this task is the finite state machine. A drawback of this approach is the time and complexity of developing the solutions [26].

• **Signal Processing:** The most commonly used techniques are Fourier transforms, wavelets and algorithms such as ARIMA (Autoregressive Integrated Moving Average). This approach presents the advantages of adapting to the monitored environment, detecting unknown anomalies and requiring a short training period. Complexity is presented as a disadvantage of this approach [28].

Detection based on signature requires the construction of a database of events related to certain anomalies, thereby generating the signature. The signatures describes specific events that form a specific attack or anomaly; this way, when the tool monitors the traffic behavior,

discover vulnerabilities in networks and perform attacks [23].

techniques are a junction of the two previous techniques [20].

**3.2. Anomaly Detection Techniques**

concept presented is consistent with each other.

Unlike signature-based detection, the focus of this method is to detect anomalies based on the characterization of normal behavior. The first and fundamental step is to generate a profile of normal traffic behavior, or to adopt a model that accurately describes the traffic. Any activity that deviates from the monitored normal profile is then considered an anomaly. The construction of the profile can be static or dynamic: it is static when the profile is built once and replaced only when a new profile is constructed, and dynamic when the profile is updated as the network behavior changes.

A positive point is the possibility of detecting new anomalies, since these new anomalies exhibit behavior different from normal. Another aspect is the difficulty created for a malicious agent devising an attack: it does not know the profile, and it may be unable to simulate the profile while attacking without generating an alarm [20]. But there are disadvantages in profile construction, such as the required training period and the amount of historical data needed. The difficulty of characterizing the traffic itself generates a high rate of false positives, since the ADS can flag many natural variations of the network as anomalous behavior.
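As a toy illustration of a static profile (the traffic samples and the three-standard-deviation rule below are assumptions for the example, not values from the chapter), deviation-based detection can be sketched as:

```python
import statistics

def build_profile(training):
    """Static profile: mean and standard deviation of a traffic feature
    (e.g. packets per second) measured over a training window."""
    return statistics.mean(training), statistics.stdev(training)

def is_anomalous(value, profile, k=3.0):
    """Flag any observation deviating more than k standard deviations
    from the profile mean; natural variation beyond k is exactly the
    false-positive source discussed above."""
    mean, std = profile
    return abs(value - mean) > k * std

# Profile built from a hypothetical training window of traffic rates.
profile = build_profile([100, 104, 98, 102, 96, 101, 99, 103])
```

A static profile like this is replaced wholesale when retrained; a dynamic variant would update the mean and deviation as new traffic arrives.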

There are several techniques; some relevant ones, which enrich the discussion with different proposals, are listed below:


• **Data Mining:** Data mining techniques usually deal with a huge amount of data, looking for patterns that form sets of normal data. Principal Component Analysis (PCA), clustering algorithms, Support Vector Machines (SVM) and other statistical tools are commonly employed in these solutions [20].

Ant Colony Optimization for Resource Allocation and Anomaly Detection in Communication Networks

http://dx.doi.org/10.5772/53338

## **3.3. Recent Proposals Using ACO in Computer Networks Field**

Since Dorigo et al. [29, 30] first proposed the Ant System (AS), several applications have emerged using AS itself or other algorithms derived from the ACO approach. One algorithm proposed for the network routing problem is AntNet, proposed by Di Caro et al. [31], a different approach to the adaptive learning of routing tables in communication networks. For information to travel from point A to point B, it is necessary to determine the path it will follow. This path-construction process is known as routing, and it lies at the core of the network control system, together with congestion control, admission control and other components [31]. AntNet is close to the real ants' behavior that inspired the ACO metaheuristic, because the routing problem can be characterized as a directed weighted graph on which the ants move, building paths and laying pheromone trails.

Information Retrieval is another field where ACO has found application, as proposed in [32, 33]. The information retrieval problem consists in finding a set of documents containing the information expressed in a query specifying the user's needs. The process involves a matching mechanism between the query and the documents of the collection. In [32], Drias et al. designed two ACO algorithms, named AC-IR and ACS-IR. Each term of a document has an amount of pheromone that represents the importance of its previous contribution to constructing good solutions; the main difference between AC-IR and ACS-IR lies in the design of the probabilistic decision rule and the procedure for building solutions. In [33], the ACO algorithm is applied to retrieve relevant documents in a reduced, lower-dimensionality document feature space; the probability function is built using the frequency of the terms and the total number of documents containing each term.

In [34], the authors make use of a Fuzzy Rule Based System (FRBS), a Naive Bayes Classifier (NBC) and a Support Vector Machine (SVM) to increase the interpretability and accuracy of an intrusion detection model for better classification results. The FRBS is a set of IF-THEN rules whose antecedents and consequents are composed of fuzzy statements, related by the dual concepts of fuzzy implication and the compositional rule of inference. The NBC method is based on Bayes' rule for conditional probability, as this rule provides a framework for data assimilation. The SVM is a statistical tool for data classification and one of the most robust and accurate of the well-known algorithms; its basic idea is to map data into a high-dimensional space and find a separating hyperplane with maximal margin. The authors then combined the NBC with ACO, linking a quality computation function, which ranks the best among the discovered rules, to the pheromone update.

A common approach to network intrusion detection is to produce clusters using swarm intelligence-based clustering. Traditional clustering algorithms use a simple distance-based metric with detection based on the cluster centers, which generally degrades detection accuracy and efficiency, because the centers might not be well calculated or the data may not associate with the closest center. Using ACO, it is possible to escape local optima and find the best center, or one close to it. This technique is used in [35], where Feng et al. present a network intrusion detection scheme under two assumptions: the number of normal instances vastly outnumbers the number of intrusions, and the intrusions themselves are qualitatively different from the normal instances. Three steps are then followed: 1) clustering, 2) labeling clusters and 3) detection.
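Under those two assumptions, the cluster-labeling step can be as simple as thresholding cluster sizes; the 50% threshold below is an illustrative choice of ours, not a value from [35]:

```python
def label_clusters(assignments, threshold=0.5):
    """Labeling step: clusters holding more than `threshold` of all
    points are labeled 'normal' (normal instances vastly outnumber
    intrusions); small clusters are labeled 'intrusion'."""
    counts = {}
    for a in assignments:
        counts[a] = counts.get(a, 0) + 1
    total = len(assignments)
    return {c: ("normal" if n / total > threshold else "intrusion")
            for c, n in counts.items()}

# Hypothetical cluster assignments: 8 points in cluster 0, 2 in cluster 1.
labels = label_clusters([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
```

In the detection step, any connection falling in a cluster labeled 'intrusion' is reported as anomalous.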

In [36], data mining is used for intrusion detection. Abadeh et al. proposed a method to extract fuzzy classification rules for misuse intrusion detection in computer networks, named Evolutionary Fuzzy System with Ant Colony Optimization (EFS-ACO). It consists of two stages: in the first stage, an iterative rule-learning algorithm is applied to the training data to generate a primary set of fuzzy rules; the second stage employs an ant colony optimization procedure to enhance the quality of the primary fuzzy rule set from the previous stage.

## **3.4. Applying ACO for Anomaly Detection - A Study Case**

This section provides an application of ACO for anomaly detection. The proposed approach falls under the category of detection methods based on the characterization of normal behavior, and follows two steps: 1) a Training Phase and 2) a Detection Phase. The dataset used to evaluate the method is KDD'99 [37]. In the Training Phase, the training dataset is used to generate the centroids for each class of attack; in the Detection Phase, the generated centroids are used to classify each connection and produce the final results.
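The two phases can be sketched as follows. Computing each centroid as the per-class mean of the two selected features and the fixed *ǫ* threshold are simplifying assumptions for illustration, not the chapter's exact procedure:

```python
def train(records):
    """Training Phase: compute one centroid per class as the mean of the
    feature vectors (here: source bytes, destination bytes) of that class."""
    sums, counts = {}, {}
    for features, label in records:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(s / counts[label] for s in acc)
            for label, acc in sums.items()}

def classify(features, centroids, eps):
    """Detection Phase: assign a connection to the closest centroid if it
    lies within the accepted range eps, otherwise report it as anomalous."""
    best, dist = None, float("inf")
    for label, c in centroids.items():
        d = sum((f - ci) ** 2 for f, ci in zip(features, c)) ** 0.5
        if d < dist:
            best, dist = label, d
    return best if dist <= eps else "anomalous"

# Hypothetical (src_bytes, dst_bytes) training records.
centroids = train([((100, 200), "normal"), ((120, 180), "normal"),
                   ((9000, 10), "dos"), ((11000, 30), "dos")])
```

A connection such as `(105, 195)` then classifies to the nearest centroid within *ǫ*, while a connection far from every centroid is reported as anomalous.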

## *3.4.1. KDD Cup 99 Data Set Description*


For the evaluation of anomaly detection methods, the KDD'99 dataset is commonly used. It was prepared for The Third International Knowledge Discovery and Data Mining Tools Competition, held in conjunction with KDD-99, The Fifth International Conference on Knowledge Discovery and Data Mining [37]. This dataset was built from the data captured in the DARPA'98 IDS evaluation program, used for The Second International Knowledge Discovery and Data Mining Tools Competition, held in conjunction with KDD-98, The Fourth International Conference on Knowledge Discovery and Data Mining [38].

The KDD training dataset consists of approximately 490,000 single connection vectors, each of which contains 41 features and is labeled as one of: back, buffer overflow, ftp write, guess passwd, imap, ipsweep, land, loadmodule, multihop, neptune, nmap, perl, phf, pod, portsweep, rootkit, satan, smurf, spy, teardrop, warezclient, warezmaster, normal. Depending on the label, the connection falls into one of the following four attack categories:

1. **Denial of Service (DoS)**: an attack in which the attacker makes some computing or memory resource too busy or too full to handle legitimate requests, or denies legitimate users access to a machine.

2. **User to Root (U2R)**: a class of exploit in which the attacker starts out with access to a normal user account on the system and is able to exploit some vulnerability to gain root access to the system.

3. **Remote to Local (R2L)**: occurs when an attacker who has the ability to send packets to a machine over a network, but who does not have an account on that machine, exploits some vulnerability to gain local access as a user of that machine.

4. **Probing**: an attempt to gather information about a network of computers for the apparent purpose of circumventing its security controls.


Table 4 presents the labels related to the attack categories. Among all the attack categories, the study subject will be DoS attacks. From the 41 features, this study case adopts the source bytes and destination bytes, because the main idea of the approach is to detect volume anomalies.


| Attack Category | Labels | Samples |
|---|---|---|
| Denial of Service (DoS) | back, land, neptune, pod, smurf, teardrop | 391458 (79.2391%) |
| User to Root (U2R) | buffer overflow, perl, loadmodule, rootkit | 52 (0.0105%) |
| Remote to Local (R2L) | ftp write, guess passwd, imap, multihop, phf, spy, warezclient, warezmaster | 1126 (0.2279%) |
| Probing | ipsweep, nmap, portsweep, satan | 4107 (0.8313%) |

**Table 4.** Labels related to the attack categories
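The label-to-category mapping of Table 4 translates directly into code:

```python
# Mapping each KDD'99 connection label to its attack category (Table 4).
CATEGORY = {
    "back": "DoS", "land": "DoS", "neptune": "DoS",
    "pod": "DoS", "smurf": "DoS", "teardrop": "DoS",
    "buffer overflow": "U2R", "perl": "U2R",
    "loadmodule": "U2R", "rootkit": "U2R",
    "ftp write": "R2L", "guess passwd": "R2L", "imap": "R2L",
    "multihop": "R2L", "phf": "R2L", "spy": "R2L",
    "warezclient": "R2L", "warezmaster": "R2L",
    "ipsweep": "Probing", "nmap": "Probing",
    "portsweep": "Probing", "satan": "Probing",
}

def categorize(label):
    """Return the attack category of a connection label; any label not in
    the table (i.e. 'normal') is treated as normal traffic."""
    return CATEGORY.get(label, "normal")
```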

## *3.4.2. The ACO Clustering*

ACO is composed of a population of competing, globally asynchronous agents cooperating to find an optimal solution. Although each agent has the ability to build a viable solution on its own, just as a real ant can somehow find a path between the colony and a food source, the highest-quality solutions are obtained through cooperation between the individuals of the whole colony. Like other metaheuristics, ACO comprises a set of strategies that guide the search for the solution. It makes choices based on stochastic processes, using the information acquired from previous results to guide it through the search space [39].

Artificial ants travel a search space represented by a graph *G*(*V*, *E*), where *V* is a finite set of nodes and *E* is the set of edges. The ants are attracted to the locations most favorable to optimizing an objective function, in other words, those in which the concentration of pheromone deposited by ants that previously went through the same path is higher [40]. While real ants deposit pheromone on the places they visit, artificial ants change numeric information stored locally that describes the state of the problem. This information is acquired through the historical and current performance of the ant during the construction of solutions [39].

The responsibility of holding the information gathered during the search lies with the pheromone trail, *τ*. In ACO, the trails are the communication channels between agents, and only the agents have access to them, i.e., only ants can read and modify the numeric information contained in the pheromone trails. Every new path selection produces a solution, and each ant modifies the local information in a given region of the graph. Normally, an evaporation mechanism modifies the pheromone trail information over time. This characteristic allows agents to slowly forget their history, letting them search in new directions without being constrained by past decisions, thereby avoiding premature convergence to suboptimal solutions.

The technique of clustering is a data mining tool used to find and quantify similarities between points of a given group of data. This process seeks to minimize the variance between elements of the same group and maximize it with respect to other groups [41]. The similarity function adopted is the Euclidean distance described in Eq. (28).


$$d(\mathbf{x}, \mathbf{y}) = \sqrt{\sum\_{i=1}^{m} |x\_i - y\_i|^2} = ||\mathbf{x} - \mathbf{y}||\tag{28}$$

The equation that measures the similarity between the data is called the objective function. The purpose of using clustering is to create a template from which a pattern of information can be extracted: data lying within a small distance of this pattern can be grouped into clusters, yielding distinct sets of greater representativeness. The most classical algorithm in the literature is the K-means (KM) algorithm. It is a partitional, center-based clustering method whose popularity is due to its simplicity of implementation and its competence in handling large volumes of data.

The problem of finding the *K* center locations can be defined as an optimization problem that minimizes the sum of the Euclidean distances between data points and their closest centers, described in Eq. (29). KM randomly selects *k* points and makes them the initial centres of *k* clusters, then assigns each data point to the cluster with the centre closest to it. In the second step, the centres are recomputed and the data points are redistributed according to the new centres. The algorithm stops when the maximum number of iterations is reached or there is no change in the membership of the clusters over successive iterations [42]. One issue found in KM is its sensitivity to initialization: owing to the partitioning strategy, locally dense data can result in a too-strong association between data points and centers [43].

$$\mathbf{KM}(\mathbf{x}, \mathbf{c}) = \sum\_{i=1}^{n} \min\_{1 \le j \le k} ||\mathbf{x}\_{i} - \mathbf{c}\_{j}||^{2} \tag{29}$$

where **x** is the data and **c** is the set of centers. The parameter *n* is the total number of elements in **x** and *k* is the number of centers in **c**.
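The KM loop just described can be sketched compactly; this is an illustrative toy implementation (2-D points, seeded random initialization), not the one evaluated in this chapter:

```python
import math
import random

def kmeans(x, k, iters=100, seed=0):
    """K-means on 2-D points: pick k random points as initial centres,
    assign each point to its closest centre, recompute centres, and stop
    when membership no longer changes or iters is reached. Eq. (29) is
    the objective being minimized."""
    rng = random.Random(seed)
    centres = rng.sample(x, k)
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda j: math.dist(p, centres[j]))
                      for p in x]
        if new_assign == assign:          # no membership change: stop
            break
        assign = new_assign
        for j in range(k):                # recompute each centre
            members = [p for p, a in zip(x, assign) if a == j]
            if members:
                centres[j] = tuple(sum(c) / len(members)
                                   for c in zip(*members))
    return centres, assign
```

The initialization sensitivity noted above shows up here directly: a different `seed` changes the starting centres and can change the final partition.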

The ACO described in this section aims to optimize the efficiency of the clustering by minimizing the objective function value described by Eq. (29). This ensures that each point *i* will be grouped to the best cluster *j*, where *j* = 1, ..., *K*. In addition, it enables the construction of solutions that are not trapped in local optima, which is the problem existing in most clustering algorithms. The proposed ACO algorithm is described in Algorithm 2, and the functions performed in Algorithm 2 are briefly described in the following.


#### **Algorithm 2** ACO Clustering

    Objective function f(x), x = (x_1, ..., x_d)^T
    Initialize the ant population x_m (m = 1, 2, ..., M)
    Set the parameters γ, β, ρ
    WHILE (the end conditions are not met)
        FOR m = 1 to M
            CalculateFitnessFunction()
        end FOR
        SortAnts()
        UpdatePheromone()
    end WHILE
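Algorithm 2 can be prototyped as follows. This is only a sketch under simplifying assumptions of ours, not the authors' implementation: each ant carries a full candidate set of *K* centres, pheromone is tracked per ant and follows the evaporation of Eq. (30) plus a rank-based deposit, and new solutions are Gaussian perturbations of pheromone-selected parents.

```python
import math
import random

def km_objective(x, centres):
    """Eq. (29): sum of squared distances from each point to its closest centre."""
    return sum(min(math.dist(p, c) ** 2 for c in centres) for p in x)

def aco_clustering(x, k, m=10, iters=50, rho=0.1, seed=0):
    """Sketch of Algorithm 2: evaluate the m ants, rank them, evaporate
    pheromone, reinforce the better-ranked ants, and build new candidate
    solutions biased toward high-pheromone ants."""
    rng = random.Random(seed)
    ants = [rng.sample(x, k) for _ in range(m)]   # initial candidate centre sets
    tau = [1.0] * m                               # pheromone per ant index
    best, best_fit = None, float("inf")
    for _ in range(iters):
        fits = [km_objective(x, a) for a in ants]        # CalculateFitnessFunction()
        order = sorted(range(m), key=fits.__getitem__)   # SortAnts()
        if fits[order[0]] < best_fit:
            best, best_fit = ants[order[0]], fits[order[0]]
        tau = [(1.0 - rho) * t for t in tau]             # evaporation, Eq. (30)
        for rank, i in enumerate(order[: m // 2]):
            tau[i] += 1.0 / (1 + rank)                   # reinforce good ants
        # Build new solutions around pheromone-weighted parents.
        ants = [[tuple(ci + rng.gauss(0, 0.5) for ci in c)
                 for c in ants[rng.choices(range(m), weights=tau)[0]]]
                for _ in range(m)]
    return best, best_fit
```

The per-ant pheromone and the perturbation step are placeholders for the point-cluster trail structure described next; they keep the sketch short while preserving the evaporate-and-reinforce dynamics.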

a) *CalculateFitnessFunction()*: for each ant *m*, the fitness function is calculated based on Eq. (29). As each ant represents a possible solution, each ant is a candidate set of centers to clusterize the data **x**, and the best ant is the one describing the lowest value of KM(**x**, **c***m*).

b) *SortAnts()*: this function sorts and ranks the ants according to *CalculateFitnessFunction()*.

c) *UpdatePheromone()*: this function directs the algorithm in the search for new solutions, using promising paths that were previously found. The point-cluster links that showed better results are intensified and are expected to be used in the construction of increasingly better solutions. In contrast, unsuccessful point-cluster links are expected to be forgotten by the algorithm through the pheromone evaporation process. The pheromone update can be described by:

$$
\tau\_{ij}(t+1) = (1 - \rho)\,\tau\_{ij}(t) \tag{30}
$$

Ant Colony Optimization for Resource Allocation and Anomaly Detection in Communication Networks
http://dx.doi.org/10.5772/53338

**Algorithm 2** ACO Clustering

Objective function *f*(**x**), **x** = (*x*1, . . . , *xd*)*<sup>T</sup>*

Initialize the ants population **x***m* (*m* = 1, 2, . . . , *n*)

Set the parameters *γ*, *β*, *ρ*

**WHILE** (*the end conditions aren't met*)

**FOR** *m* = 1 to *M*: *CalculateFitnessFunction()*; **end FOR**

*SortAnts()*; *UpdatePheromone()*;

**end WHILE**

After the ACOClustering generates the centers from the training dataset, the method is applied to the whole dataset. A parameter *ǫ* is then adopted, describing the accepted range around a center within which a data point is clustered to that center or not. Figure 7 illustrates the idea.

**Figure 7.** The area generated by the *ǫ* parameter.

### *3.4.3. Numerical Results*

A question of paramount importance when working with clustering is the optimal number of clusters with which to group the dataset well. We adopted the following clustering quality criteria: Dunn's index [44] and the Davies-Bouldin index [45].

The Dunn's index is based on the ratio between the minimal intercluster distance and the maximal intracluster distance, and the main idea is to identify cluster sets that are compact and well separated. Eq. (31) describes it:

$$D = \frac{d\_{\rm min}}{d\_{\rm max}}\tag{31}$$

where *d*min is the smallest distance between two objects from different clusters, and *d*max is the largest distance between two objects in the same cluster. *D* is limited to the interval [0, ∞), and higher values are desirable. Figure 8 presents the values obtained in the tests for *K* = [2, . . . , 10].

**Figure 8.** Dunn's index.
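Eq. (31) is easy to compute directly from a labeled partition. The minimal one-dimensional sketch below is our illustration (not the chapter's code); it shows that compact, well-separated clusters yield a higher Dunn value:

```python
import itertools

def dunn_index(clusters):
    """Dunn's index (Eq. 31): smallest inter-cluster distance divided by the
    largest intra-cluster distance, for 1-D points. Higher is better."""
    d_min = min(
        abs(a - b)
        for c1, c2 in itertools.combinations(clusters, 2)
        for a in c1 for b in c2
    )
    d_max = max(
        abs(a - b)
        for c in clusters
        for a, b in itertools.combinations(c, 2)
    )
    return d_min / d_max

# compact, well-separated clusters score high; overlapping ones score low
good = dunn_index([[0.0, 0.1], [5.0, 5.2]])   # d_min = 4.9, d_max = 0.2
bad  = dunn_index([[0.0, 2.0], [2.5, 4.5]])   # d_min = 0.5, d_max = 2.0
```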

The Dunn's index aims to identify clusters that are compact, well separated, and with a low variance among the members of the same cluster. The results indicate that *K* = 2 is the best number of centers to adopt, because it yields the highest Dunn's index, indicating better clustering. However, before adopting *K* = 2, we tested another index, the Davies-Bouldin index [45]. It is a function of the ratio of the within-cluster scatter to the between-cluster separation and is described by Eq. (32):

$$DB = \frac{1}{n} \sum\_{i=1}^{n} \max\_{j \neq i}\left[\frac{\sigma\_i + \sigma\_j}{d(c\_i, c\_j)}\right] \tag{32}$$

where *n* is the number of clusters, *σ<sup>i</sup>* is the average distance of all objects in cluster *i* to their cluster center *ci*, *σ<sup>j</sup>* is the average distance of all objects in cluster *j* to their cluster center *cj*, and *d*(*ci*, *cj*) is the distance between centers *ci* and *cj*. If *DB* results in low values, the clusters are compact and their centers are far from each other. Figure 9 presents the results for *K* = [2, . . . , 10].


**Figure 9.** Davies-Bouldin index.
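Eq. (32) can likewise be evaluated in a few lines. The sketch below is our illustration (using the cluster mean as center and the average absolute deviation as scatter); compact, well-separated clusters give a low DB value:

```python
def davies_bouldin(clusters):
    """Davies-Bouldin index (Eq. 32) for 1-D clusters; lower is better."""
    centers = [sum(c) / len(c) for c in clusters]
    scatter = [sum(abs(x - m) for x in c) / len(c)
               for c, m in zip(clusters, centers)]
    n = len(clusters)
    total = 0.0
    for i in range(n):
        # worst-case similarity of cluster i against every other cluster j
        total += max(
            (scatter[i] + scatter[j]) / abs(centers[i] - centers[j])
            for j in range(n) if j != i
        )
    return total / n

compact = davies_bouldin([[0.0, 1.0], [10.0, 11.0]])  # tight and far apart
loose   = davies_bouldin([[0.0, 4.0], [5.0, 9.0]])    # wide and close together
```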

The ratio of the within-cluster scatter to the between-cluster separation constrains the index to be symmetric and non-negative; thus a lower value is expected for a fair clustering. In figure 9, the src\_bytes curve (red line) presents a value close to 0 at *K* = 2, while the dst\_bytes curve (blue line) shows a regular behavior around 0.5 for all the tested values of *K*. Since *K* = 2 presents the better Dunn's index and the best Davies-Bouldin result for src\_bytes, it is adopted for all the following tests.

Besides the number of centers, to measure the efficiency of the proposed case study, we adopted the following variables [46]:

• TRUE POSITIVE: the instance is an anomaly and it is classified as an anomaly;

• FALSE NEGATIVE: the instance is an anomaly and it is classified as normal;

• FALSE POSITIVE: the instance is normal and it is classified as an anomaly;

• TRUE NEGATIVE: the instance is normal and it is classified as normal.

Hence, through the declaration of these variables, the following equations can be calculated:

$$\text{False Alarm Rate (FAR)} = \frac{\text{FALSE NEGATIVE}}{\text{TOTAL OF NORMAL DATA}} \tag{33}$$

$$\text{Accuracy (ACC)} = \frac{\text{TRUE POSITIVE} + \text{TRUE NEGATIVE}}{\text{TOTAL NORMAL DATA} + \text{TOTAL ANOMALY DATA}} \tag{34}$$

$$\text{Precision (PRE)} = \frac{\text{TRUE POSITIVE}}{\text{TRUE POSITIVE} + \text{FALSE POSITIVE}} \tag{35}$$

Eq. (33) describes how much of the normal data the method wrongly classified, Eq. (34) measures how close the method's classifications are to the real values, and Eq. (35) describes the percentage of correctly classified instances among all the instances classified as anomalous.
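Under these definitions, Eqs. (33)–(35) reduce to simple ratios over the confusion counts. The sketch below is ours; note that the numerator of Eq. (33) is FALSE NEGATIVE exactly as printed in the chapter, which differs from the more common false-positive-based false-alarm rate:

```python
def detection_metrics(tp, fn, fp, tn):
    """FAR, ACC and PRE as defined in Eqs. (33)-(35) of the chapter."""
    total_normal = fp + tn           # all normal instances
    total_anomaly = tp + fn          # all anomalous instances
    far = fn / total_normal                              # Eq. (33)
    acc = (tp + tn) / (total_normal + total_anomaly)     # Eq. (34)
    pre = tp / (tp + fp)                                 # Eq. (35)
    return far, acc, pre

# made-up confusion counts, for illustration only
far, acc, pre = detection_metrics(tp=30, fn=10, fp=5, tn=55)
```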

It was decided to test four rules to capture the anomalies: 1) using only src\_bytes; 2) using only dst\_bytes; 3) using src\_bytes or dst\_bytes; and 4) using src\_bytes and dst\_bytes. Figure 10 shows the results for the FAR.

**Figure 10.** False alarm rate.



Figure 10 shows that the studied method achieves low FAR rates, meaning the method does not classify normal data as anomalous. As the value of *ǫ* increases, the FAR starts increasing, but it only reaches high rates when *ǫ* is close to 1; at that point the created area is large enough to capture anomalous data and classify it wrongly.
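The *ǫ*-range rule behind these curves can be sketched as follows (our reading of the method, with made-up centers; not the authors' code). A small *ǫ* flags points far from every center as anomalous, while a large *ǫ* swallows anomalies into the normal region:

```python
def classify(point, centers, eps):
    """Label a 1-D point 'normal' if it lies within eps of some cluster
    center, and 'anomaly' otherwise (illustrative epsilon-range rule)."""
    return "normal" if min(abs(point - c) for c in centers) <= eps else "anomaly"

centers = [100.0, 5000.0]   # hypothetical src_bytes cluster centers

# with a small eps, the distant point is correctly flagged as an anomaly
small_eps = [classify(p, centers, eps=50) for p in (120.0, 9000.0)]

# with a large eps, the same anomalous point is absorbed as normal
large_eps = [classify(p, centers, eps=4500) for p in (120.0, 9000.0)]
```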

Figure 11 shows the results for ACC. The method is not very close to the real values, as expressed by a rate around 30%. This rate arises because the method does not classify the anomalies correctly; on the other hand, it classifies the normal data correctly.

**Figure 11.** Accuracy rate.

To finally demonstrate that the method does not classify the anomalies well, figure 12 shows the precision results. Higher rates are observed when *ǫ* < 0.15, meaning the method can distinguish anomalies from normal data when the adopted area is small. This makes sense because, in computer networks, traffic behavior follows a regular pattern and an anomaly is usually an abrupt change in this behavior. When *ǫ* > 0.15, the PRE rate decreases at a high pace, meaning the method accepts more anomalies as normal data.

As for the four rules, we can conclude that, when used separately, they show different results, with src\_bytes giving the better results. When used together, they express similar results, and the src\_bytes results suppress the dst\_bytes results.

**Figure 12.** Precision rate.

## **4. Conclusion Remarks**

## **4.1. ACO Resource Allocation in CDMA Networks**


The ACO algorithm proved itself robust and efficient in solving both RA problems presented in this chapter. In fact, the ACO performance exceeded the PSO performance discussed in [1].


In terms of solution quality the ACO power control scheme was able to achieve much better solutions than the PSO approach. For the weighted throughput maximization problem, the numerical results show a fast convergence in the first hundred iterations and a solution improvement after that. This fast convergence behavior is an important feature due to the system dynamics, i.e. the algorithm must update the transmission power every 666.7*µ*s if we consider power control applications in 3G wireless cellular communication CDMA systems.

Future work includes analyses over dynamic channels, i.e. where the channel gain matrix is constant only over a single time slot.

## **4.2. ACO Anomaly Detection in Computer Networks**

The method presented in the case study does not achieve excellent results because, out of all 41 features of the KDD dataset, it only works with 2: src\_bytes and dst\_bytes. From all the anomalies presented, we focused on Denial of Service (DoS), and the method presented good False Alarm Rates (FAR): assuming the parameter *ǫ* is normally in 0 < *ǫ* < 2, the FAR stays below 5%.

The Accuracy and Precision rates can be increased by using other features, such as flag and land. Flag is the status of the connection, and land assumes 1 if the connection is from/to the same host/port and 0 otherwise. By adding more rules to the anomaly detection method, it is possible to increase these rates.

The ACOClustering plays an important role, helping to cluster the data while preventing the search from getting stuck in locally optimal centers. In the task of managing a computer network, which can handle thousands of connections per second, it is easy to get stuck in some local optimum center; here, each ant is a possible set of centers, and the search is always guided by the best answer found so far.

## **Author details**


Mateus de Paula Marques<sup>1</sup>, Mário H. A. C. Adaniya<sup>1</sup>, Taufik Abrão<sup>1</sup>, Lucas Hiera Dias Sampaio<sup>2</sup> and Paul Jean E. Jeszensky<sup>2</sup>

1 State University of Londrina (UEL), Londrina, PR, Brazil

2 Polytechnic School of University of São Paulo (EPUSP), São Paulo, SP, Brazil

## **References**

[1] T. Abrão, L. D. H. Sampaio, M. L. Proença Jr., B. A. Angélico, and P. J. E. Jeszensky. *Multiple Access Network Optimization Aspects via Swarm Search Algorithms*, volume 1, chapter 13, pages 261–298. InTech Open, 2011.

[2] Marco Dorigo and Gianni Di Caro. Ant colony optimization: A new meta-heuristic. In *Evolutionary Computation, CEC 99*. IEEE, 1999.

[3] Gerhard Fettweis and Ernesto Zimmermann. ICT energy consumption – trends and challenges. In *WPMC'08 – 11th International Symposium on Wireless Personal Multimedia Communications*, Sept. 2008.

[4] G. Foschini and Z. Miljanic. A simple distributed autonomous power control algorithm and its convergence. volume 42, pages 641–656. IEEE, November 1993.

[5] M. Moustafa, I. Habib, and M. Naghshineh. Genetic algorithm for mobiles equilibrium. MILCOM 2000, October 2000.

[6] M. Moustafa, I. Habib, and M. Naghshineh. Wireless resource management using genetic algorithm for mobiles equilibrium. volume 37, pages 631–643, November 2011. Elsevier.

[7] H. Elkamchouchi, H. Elragal, and M. Makar. Power control in CDMA system using particle swarm optimization. pages 1–8, March 2007.

[8] K. Zielinski, P. Weitkemper, R. Laur, and K. D. Kammeyer. Optimization of power allocation for interference cancellation with particle swarm optimization. volume 13, pages 343–353. IEEE, Jan 2008.

[9] J. W. Lee, R. R. Mazumdar, and N. B. Shroff. Downlink power allocation for multi-class wireless systems. volume 13, pages 854–867. IEEE, August 2005.

[10] J. Dai, Z. Ye, and X. Xu. Mapel: Achieving global optimality for a non-convex wireless power control problem. volume 8, pages 1553–1563. IEEE, March 2009.

[11] T. J. Gross, T. Abrão, and P. J. E. Jeszensky. Algoritmo de controle de potência distribuído fundamentado no modelo populacional de Verhulst. volume 20, pages 59–74. Revista da Sociedade Brasileira de Telecomunicações, 2010.

[12] J. H. Ping Qian and Ying Jun Zhang. Mapel: Achieving global optimality for a non-convex wireless power control problem. volume 8, pages 1553–1563, March 2009.

[13] Lucas Dias H. Sampaio, Moisés F. Lima, Bruno B. Zarpelão, Mario Lemes Proença Junior, and Taufik Abrão. Swarm power-rate optimization in multi-class services DS/CDMA networks. 28th Brazilian Symposium on Computer Networks and Distributed Systems, May 2010.

[14] M. Elmusrati and H. Koivo. Multi-objective totally distributed power and rate control for wireless communications. volume 4, pages 2216–2220. VTC'03-Spring, Apr. 2003.

[15] C. E. Shannon. The mathematical theory of communication. *The Bell System Technical Journal*, 27:379–423, 623–656, July, October 1948 (reprinted with corrections 1998).

[16] M. Elmusrati, H. El-Sallabi, and H. Koivo. Applications of multi-objective optimization techniques in radio resource scheduling of cellular communication systems. volume 7, pages 128–150, February 2009.

[17] E. Seneta. *Non-Negative Matrices and Markov Chains*, volume 2. Springer-Verlag, 1981.

[18] N. T. H. Phuong and H. Tuy. A unified monotonic approach to generalized linear fractional programming. pages 229–259, 2003.

[19] Krzysztof Socha and Marco Dorigo. Ant colony optimization for continuous domains. In *European Journal of Operational Research*, pages 1155–1173, Brussels, Belgium, 2008.

[20] Animesh Patcha and Jung-Min Park. An overview of anomaly detection techniques: Existing solutions and latest technological trends. *Computer Networks: The International Journal of Computer and Telecommunications Networking*, 51:3448–3470, August 2007.

[21] M. Thottan and Chuanyi Ji. Anomaly detection in IP networks. *IEEE Transactions on Signal Processing*, 51(8):2191–2204, August 2003.

[22] Wentao Liu. Research on DoS attack and detection programming. In *Proceedings of the 3rd International Conference on Intelligent Information Technology Application, IITA'09*, volume 1, pages 207–210, Piscataway, NJ, USA, November 2009. IEEE Press.

[23] J. Gadge and A. A. Patil. Port scan detection. In *16th IEEE International Conference on Networks*, ICON, pages 1–6, USA, December 2008. IEEE Press.

[24] Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detection: A survey. *ACM Computing Surveys*, 41(3), 2009.

[25] Victoria J. Hodge and Jim Austin. A survey of outlier detection methodologies. *Artif. Intell. Rev.*, 22(2):85–126, 2004.

[26] Juan M. Estévez-Tapiador, Pedro Garcia-Teodoro, and Jesús E. Díaz-Verdejo. Anomaly detection methods in wired networks: a survey and taxonomy. *Computer Communications*, 27(16):1569–1584, 2004.

[27] D. E. Denning. An intrusion-detection model. *IEEE Transactions on Software Engineering*, SE-13(2):222–232, February 1987.

[28] Bruno Bogaz Zarpelão. *Detecção de Anomalias em Redes de Computadores*. PhD thesis, Universidade Estadual de Campinas (UNICAMP), Faculdade de Engenharia Elétrica e de Computação (FEEC), 2010.

[29] Marco Dorigo, Vittorio Maniezzo, and Alberto Colorni. Positive feedback as a search strategy. Technical Report No. 91-016, Politecnico di Milano, Italy, 1991.

[30] Marco Dorigo, Vittorio Maniezzo, and Alberto Colorni. Ant system: optimization by a colony of cooperating agents. *IEEE Transactions on Systems, Man, and Cybernetics, Part B*, 26(1):29–41, 1996.

[31] Gianni Di Caro and Marco Dorigo. AntNet: Distributed stigmergetic control for communications networks. *J. Artif. Intell. Res. (JAIR)*, 9:317–365, 1998.

[32] Habiba Drias, Moufida Rahmani, and Manel Khodja. ACO approaches for large scale information retrieval. In *World Congress on Nature and Biologically Inspired Computing (NaBIC)*, pages 713–718. IEEE, December 2009.

[33] Wang Ziqiang and Sun Xia. Web document retrieval using manifold learning and ACO algorithm. In *Broadband Network Multimedia Technology, 2009. IC-BNMT '09. 2nd IEEE International Conference on*, pages 152–155, Oct. 2009.

[34] Namita Shrivastava and Vineet Richariya. Ant colony optimization with classification algorithms used for intrusion detection. *International Journal of Computational Engineering and Management, IJCEM*, 7:54–63, January 2012.

**Chapter 6**

**Optical Network Optimization Based on Particle Swarm Intelligence**

Fábio Renan Durand, Larissa Melo, Lucas Ricken Garcia, Alysson José dos Santos and Taufik Abrão

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/52225

© 2013 Durand et al.; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**1. Introduction**

Modern optical communication networks are expected to meet a broad range of services with different and variable demands of bit rate, connection (session) duration, frequency of use, and set-up time [1]. Thus, it is necessary to build flexible all-optical networks that allow dynamic resource sharing between different users and clients in an efficient way. The all-optical network is able to implement ultrahigh-speed transmitting, routing and switching of data in the optical domain, presenting transparency to data formats and protocols, which increases network flexibility and functionality such that future network requirements can be met [2]. Optical code division multiple access (OCDMA) based technology has attracted a lot of interest due to its various advantages, including asynchronous operation, high network flexibility, protocol transparency, simplified network control and potentially enhanced security [3]. Therefore, recent developments and research on OCDMA have experienced an expansion of interest, from short-range networks, such as access networks, to high-capacity medium/large networks.

The optical network presents two promising scenarios: the transport (backbone) networks with optical code division multiplexing/wavelength division multiplexing (OCDM/WDM) technology and the access networks with OCDMA technology. In both transport OCDM/WDM and access OCDMA networks, each different code defines a specific user or logic channel transmitted in a common channel. In a common channel, the interference that may arise between different user codes is known as multiple access interference (MAI), and it can limit the number of users utilizing the channel simultaneously [3]. In this work we