**4.2 Experiment**

We evaluated HSTCP using a synthetic dataset generated by combining three sources: the data from the 1999 DARPA intrusion detection project; the 2000 DARPA "Scenario Datasets," which were crafted to provide examples of multiple-component attack scenarios instead of the atomic attacks found in past evaluations [24]; and a trace from the Münchner Wissenschaftsnetz (MWN), Germany. The MWN provides Internet connectivity to two major universities and a number of research institutes. Overall, the network contains about 50,000 individual hosts and 65,000 registered users. The trace "mwn-cs-full" that we analyzed is a 2-h trace including the full payload of all packets to/from one of the CS departments in MWN, with some high-volume servers excluded.

The 1999 DARPA dataset that we used consisted of 5 weeks of TCPdump data. Weeks 1 and 3 have normal attack-free network traffic. Week 2 consists of network traffic with labeled attacks, while weeks 4 and 5 contain 201 instances of 58 different attacks, 177 of which are visible in the TCPdump data. The 2000 DARPA dataset includes two recently created scenario datasets that address the needs of mid-level correlation systems. Each includes several hours of background traffic and a complete attack scenario. Attacks and background traffic were run on the same testbed used in the 1999 evaluation, but with the addition of a commercial off-the-shelf firewall and demilitarized zone (DMZ) network separating the Internal and Internet networks and a Solaris 2.7 victim host.

The experimental evaluation of HSTCP has been divided into two steps. In Section 4.2.1, we evaluate the performance of the sampling algorithm and compare it with the stratified random sampling algorithm. Then, in Section 4.2.2, we evaluate the performance of HSTCP for intrusion detection/prevention.

## **4.2.1 Evaluation of the sampling algorithm**

Experiments were conducted to compare and evaluate the performance of the proposed adaptive sampling algorithm with the stratified random sampling algorithm.

packet data between NIC memory and the PHY. Finally, the DDR controller required modifications to function in this specific development board with its unique wiring.

The processors, memories, and hardware units are interconnected on the Virtex FPGA by a processor local bus (PLB). The PLB is a high-performance memory-mapped 100-MHz 64-bit-wide split-transaction bus that can provide a maximum theoretical bandwidth of 12.5 Gbits/s in a full duplex mode. The PLB allows for burst transmissions of up to 128 bytes in a single operation, which is used by the DMA and MAC units to improve memory access efficiency.

Also attached to the PLB is a small memory-mapped control module used to route descriptors to or from the MAC and DMA hardware assist units. This module also provides a central location for hardware counters, event thresholds, and other low-level control functions. By attaching this central control unit to the PLB, either PowerPC processor can manipulate these important NIC functions. The control unit takes 18 PowerPC cycles to read and 12 cycles to write a 32-bit word, primarily due to bus arbitration delays.

In 1994, Leland *et al*. showed that Ethernet traffic consists of slowly decaying packet count bursts across all time scales. Time series that exhibit such a pattern are said to have the property of long-range dependence and are termed "self-similar." Similar self-similar behavior has also been observed in wide-area Internet traffic by other researchers. One important characteristic of a self-similar process is that its degree of self-similarity can be expressed with a single parameter, namely the Hurst parameter, which can be derived from the rescaled adjusted range (R/S) statistic. It is defined as follows.

For a given set of observations $X_1, X_2, \dots, X_n$ with sample mean $\bar{X}(n)$ and sample variance $S^2(n)$, the rescaled adjusted range, or R/S statistic, is given by

$$R(n)/S(n) = \frac{1}{S(n)}\Big[\max(0, W_1, W_2, \dots, W_n) - \min(0, W_1, W_2, \dots, W_n)\Big],$$

with $W_k = (X_1 + X_2 + \dots + X_k) - k\bar{X}(n)$, $k = 1, 2, \dots, n$. Hurst (1955) found that many naturally occurring time series appear to be well represented by the relation $E[R(n)/S(n)] \sim c\,n^{H}$ as $n \to \infty$, with Hurst parameter $H$ "typically" about 0.73. On the other hand, if the observations $X_k$ come from a short-range dependent model, then Mandelbrot and Van Ness (1968) showed that $E[R(n)/S(n)] \sim d\,n^{0.5}$ as $n \to \infty$. Here $c$ and $d$ are finite positive constants that do not depend on $n$. This discrepancy is generally referred to as the Hurst effect or Hurst phenomenon.
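The R/S definition above can be turned into a small estimator. The following is a minimal sketch, not the procedure used in the experiments: the function names `rs_statistic` and `estimate_hurst`, the 1/n form of the sample variance, the doubling block sizes, and the least-squares fit of log(R/S) against log(n) are all assumptions made for illustration.

```python
import math
import random


def rs_statistic(xs):
    """Return R(n)/S(n) for one block of observations xs."""
    n = len(xs)
    mean = sum(xs) / n
    # Assumption: S(n) uses the 1/n form of the sample variance.
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    if std == 0.0:
        raise ValueError("R/S is undefined for a constant block")
    # W_k = (X_1 + ... + X_k) - k * mean, for k = 1..n
    partial = 0.0
    w = []
    for k, x in enumerate(xs, start=1):
        partial += x
        w.append(partial - k * mean)
    return (max(0.0, max(w)) - min(0.0, min(w))) / std


def estimate_hurst(xs, min_block=8):
    """Estimate H as the least-squares slope of log(R/S) versus log(n)."""
    points = []
    size = min_block
    while size <= len(xs) // 2:
        # Non-overlapping blocks of the current size; constant blocks are skipped.
        blocks = [xs[i:i + size] for i in range(0, len(xs) - size + 1, size)]
        values = [rs_statistic(b) for b in blocks if len(set(b)) > 1]
        if values:
            points.append((math.log(size), math.log(sum(values) / len(values))))
        size *= 2
    if len(points) < 2:
        raise ValueError("series too short for a log-log fit")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


if __name__ == "__main__":
    # Synthetic uncorrelated noise, used only to exercise the functions.
    rng = random.Random(1)
    noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
    print("estimated H for uncorrelated noise:", round(estimate_hurst(noise), 3))
```

Running the same estimator on an original packet-count series and on its sampled version gives the two Hurst estimates whose relative difference is the sampling error. For short series the R/S estimate is known to be biased, so the values are best compared with each other rather than read as absolute.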

For various anomaly detection techniques, it is important that the traffic data sampled by the proposed scheme retain the self-similar property, since losing it may directly affect the accuracy and efficiency of detection. We verify this using two parameters: the mean of the packet count and the Hurst parameter. The peak-to-mean ratio (PMR) can be used as an indicator of traffic burstiness. The PMR is calculated by comparing the peak value of the measured entity with its average value over the population. However, this statistic depends heavily on the size of the measurement intervals and therefore may or may not represent the actual traffic characteristics. A more accurate indicator of traffic burstiness is the Hurst parameter (*H*), which measures the degree of self-similarity. In this paper we use the R/S statistical test to obtain an estimate for the Hurst parameter, and we run the test on both the original and the sampled data.
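The dependence of the PMR on the interval size can be seen with a few lines of code. This is an illustrative sketch only: the helper names `peak_to_mean_ratio` and `aggregate` and the packet counts are hypothetical and are not taken from the datasets used here.

```python
def peak_to_mean_ratio(counts):
    """Peak value divided by the average value of a packet-count series."""
    mean = sum(counts) / len(counts)
    if mean == 0:
        raise ValueError("PMR is undefined for an all-zero series")
    return max(counts) / mean


def aggregate(counts, factor):
    """Merge every `factor` consecutive intervals into one larger interval."""
    usable = len(counts) - len(counts) % factor  # drop an incomplete tail
    return [sum(counts[i:i + factor]) for i in range(0, usable, factor)]


if __name__ == "__main__":
    # Hypothetical packet counts per 1-ms interval (illustrative values only).
    per_ms = [0, 0, 10, 0, 0, 0, 2, 0]
    for factor in (1, 2, 4):
        series = aggregate(per_ms, factor)
        print(f"{factor}-ms intervals: PMR = {peak_to_mean_ratio(series):.2f}")
```

The same traffic yields a smaller PMR as the intervals grow, because a burst is averaged together with the quiet periods around it. This is why the PMR alone is a weak burstiness indicator.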

In our sampling scheme, simple random sampling is conducted within every time block (stratum); this is what we refer to as stratified random sampling.
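A minimal sketch of this baseline is given below, assuming fixed-size time blocks and a fixed number of samples per block. The names `stratified_random_sample` and `percent_error`, and both parameters, are hypothetical choices for illustration and not the settings used in the experiments.

```python
import random


def stratified_random_sample(series, block_size, per_block, rng):
    """Draw a simple random sample inside every time block (stratum)."""
    sample = []
    for start in range(0, len(series), block_size):
        block = series[start:start + block_size]
        k = min(per_block, len(block))
        # Sample positions without replacement, then restore time order.
        chosen = sorted(rng.sample(range(len(block)), k))
        sample.extend(block[i] for i in chosen)
    return sample


def percent_error(estimate, truth):
    """Absolute error of an estimate as a percentage of the true value."""
    return abs(estimate - truth) / abs(truth) * 100.0


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic packet counts per interval (illustrative values only).
    counts = [rng.randint(50, 150) for _ in range(1000)]
    sampled = stratified_random_sample(counts, block_size=100, per_block=10, rng=rng)
    true_mean = sum(counts) / len(counts)
    sampled_mean = sum(sampled) / len(sampled)
    print("percent error of the mean:", round(percent_error(sampled_mean, true_mean), 3))
```

Because every block contributes the same number of samples, no time period is left unrepresented, which tends to give a good estimate of the mean.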

In Figures 13 and 14, we show the average sampling error for the Hurst parameter and the sample mean, respectively. As Figure 13 shows, the stratified random sampling algorithm resulted in a higher average percentage error for the Hurst parameter than adaptive sampling. This could be the result of missing data spread out over a number of sampling intervals. In Figure 14, the average percentage error for the mean statistic was marginally lower for our sampling algorithm than for the stratified random sampling algorithm, although the difference was insignificant. One possible reason for this marginal difference is the inherent randomness of the stratified random sampling algorithm: packets are sampled uniformly at random within each stratum, which already yields a good estimate of the mean.

Intrusion Detection and Prevention in High Speed Network 77

Fig. 13. Average percentage error for the Hurst parameter (stratified random sampling vs. adaptive sampling).

Fig. 14. Average percentage error for the mean statistic (stratified random sampling vs. adaptive sampling).

## **4.2.2 Evaluation of the platform for intrusion detection/prevention**

The IXIA1600T (a dedicated network test facility from IXIA Corp.) is used to transmit the synthetic dataset over 1000-Mbps Ethernet. ISS RealSecure for gigabit networks is selected to detect attacks (intrusions), and HSTCP is used to collect network traffic for it. As the first commercial IDS, RealSecure has played an important role in this field, and it provides network intrusion detection and response capabilities for monitoring gigabit networks.

The ROC (receiver operating characteristic) technique is used to evaluate the detection performance of ISS RealSecure. The ROC approach analyzes the tradeoff between false alarm rates and detection rates for detection systems. It was originally developed in the field of signal detection and has more recently become the standard approach for evaluating IDSs. In this paper, we mainly consider the detection rates of RealSecure when its false alarm rate is under 0.2. Too many false alarms force administrators to spend unnecessary time and energy analyzing them, which compromises the usability and validity of the system.
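The ROC tradeoff can be illustrated with a short sketch. It assumes that each analyzed event carries a numeric alert score and a ground-truth label, which is a simplification and not a description of how RealSecure reports alarms. The function names `roc_points` and `best_detection_rate` and the sample scores are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false_alarm_rate, detection_rate) pairs, one per threshold.

    labels use 1 for an attack and 0 for normal traffic; an event is
    flagged when its score is at or above the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def best_detection_rate(points, max_false_alarm_rate):
    """Highest detection rate among points below the false-alarm limit."""
    return max((dr for far, dr in points if far < max_false_alarm_rate), default=0.0)


if __name__ == "__main__":
    # Hypothetical alert scores and ground truth (illustrative values only).
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01]
    labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    points = roc_points(scores, labels)
    print("detection rate with false alarm rate under 0.2:",
          best_detection_rate(points, 0.2))
```

Restricting attention to false alarm rates under 0.2 corresponds to reading only the left part of the ROC curve, which is the operating region considered in this evaluation.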


Figure 15 shows that HSTCP was able to cope with the 1-Gbps network traffic, so elephant flow sampling was not initiated. The detection rate of RealSecure increased markedly in the initial phase and then stabilized.

Fig. 15. The ROC curves at 1-Gbps speed (x-axis: false alarm rate, 0 to 0.2; y-axis: detection rate, 0 to 1).

To evaluate the performance of HSTCP for intrusion detection in a high-speed network, we used the IXIA1600T to transmit the synthetic dataset at 10 Gbps. Under these conditions, HSTCP could not capture all of the traffic and tended to drop packets. Figure 16 shows that the detection rates decline sharply without sampling. When HSTCP initiated elephant flow sampling to cope with the network traffic, RealSecure could still respond to the high-speed traffic: as the false alarm rate increased, detection rates remained high and stable.

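Fig. 16. The ROC curves at a 10-Gbps speed (x-axis: false alarm rate, 0 to 0.2; y-axis: detection rate, 0 to 1; curves: without sampling, random sampling, adaptive sampling).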

We compared the performance of the adaptive sampling technique and the random sampling technique. As expected, the adaptive sampling technique showed superior performance.


**Application Layer**

The application layer mainly refers to the client and the server; in this paper it represents the firewall and the IDS. During transmission, both the firewall and the IDS can act as client or server. The firewall and the IDS are both in the network layer. The firewall provides the data source and a place to process the final data, while the IDS is responsible for receiving data requests from the firewall, analyzing them, and returning the processed results to the firewall.

**XML parsing layer**

This layer primarily encapsulates and analyzes the communicated data.

**Message transaction layer**

In this paper, a security protocol called TLS (Transport Layer Security) is applied to support secure and reliable data communication between the parties. It can also protect the privacy of applications and users during network communication. When the server and client communicate, TLS ensures that important messages cannot be sniffed or stolen by a third party. TLS is the successor to SSL.

**CORBA security service**

The CORBA security service (CORBASec) is an important public object service in CORBA. It constructs a secure language environment between client objects and service objects, and also provides better security service [10].

**5.3 Design of data exchange format based on XML**

In a distributed intrusion prevention system, the data filter module is composed of a firewall and other components, and the network data processing module generally refers to the IDS. In this paper, data transmitted between the firewall and the IDS are classified into four categories: event data, rule data, analysis result data, and action response data, with reference to the function units in the CIDF framework [11].

The relationship among these data is as follows. Data generated by the firewall when it filters network packets are called event data. Whether the described event is an intrusion depends on the matching or analysis performed by the IDS based on rule data. If it is a real intrusion event, analysis result data are generated. The firewall then responds to the analysis result data based on the corresponding strategies and generates action response data.

**Design of event data**

Event data are the data that the firewall filters from the original network packets according to the security strategy. This kind of data must therefore contain a complete description of the original network data, so that the IDS can detect or analyze it using the matching information in the rules. In addition, it needs to contain the firewall name and the time at which the event happened. The firewall name is included because multiple firewalls may be deployed in the network; more than one firewall may then report the same event, so we need to distinguish and analyze them. In some cases, the IDS needs to examine what happened during certain phases before determining whether an intrusion has occurred, so the data type also contains the time of the event. Finally, to support data expansion, event data include additional data for extra description, which can be used as a reserved interface. The illustration of event data is as follows:
