68 Security Enhanced Applications for Information Systems

We estimate the average packet length of the population by systematic sampling, requiring a confidence of 95% for this estimate. With given confidence $1-\alpha$, the confidence interval satisfies

$$P\left\{\bar{X} - t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}}\right\} = 1-\alpha, \qquad (3)$$

that is, the confidence interval of $\mu$ is $\left(\bar{X} - t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}},\; \bar{X} + t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}}\right)$.

The measured value of the average packet length with systematic sampling is 134.0631, the standard variance is 73.607, and the total number of packets $n$ is 92647. From the distribution table of $t$, the value of $t(n-1)$ approaches a constant as $n$ tends to infinity. From (3), we get that the confidence interval of the average packet length of the population with confidence 95% is (134.0634 ± 0.47). If the error is no more than 0.94, the confidence of this error is 95%, taking any value in this interval as the estimate of the average packet length in the population. From the preceding experimental results, the average packet length in the population is 134.4601, which lies in the interval (134.0634 ± 0.47).

It can be seen from experiments that the average packet length in normal traffic tends to be stable. When an ICMP sweep attack appears, there are many packets of short length, and the average packet length varies noticeably [15]. So the average length may be considered as the measure to detect intrusions.

With changes in user behavior and network topology, the characteristics of the network may also vary. So the data related to the behavior of a certain network during some time period may be chosen to characterize the normal behavior, rather than all past data. Assume the granularity of time is $T$, and $L_i$ is the average packet length in the $i$th time interval. Anomalous actions are detected by comparing the current sampled value with the values of the preceding $i-1$ time intervals. That is, if the current average packet length is within the range of normal values calculated from the preceding sampled data, there is no intrusion, and the preceding data can be updated. Otherwise, an intrusion has appeared, and the current data must not be updated.

**3.3 Summary**

With the rapid development of network technology come more severe challenges to information security, and IDS has become an indispensable part of computer security. However, IDS suffers packet drops, especially in a high-speed network environment. In this chapter, we apply a packet selection model based on statistical sampling methods to the data collection procedure of IDS. Experimental results show that the selected sample (packets) can be applied to detection and analysis for IDS within a certain precision. In short, our method has the following advantages: first, it greatly strengthens the processing performance of IDS by replacing passive packet dropping with active packet sampling, especially in large-scale high-speed networks; second, it has better extensibility, and various sampling strategies may be applied in different implementations.

**4.1 Design and implementation of STAMP**

In this paper, we describe the design and implementation of a uniform high-speed traffic collection platform for intrusion detection/prevention based on sampling on FPGAs. To achieve this goal, HSTCP's architecture integrates elephant flow identification and adaptive elephant flow sampling into an FPGA prototyping board, which is a gigabit Ethernet network interface card with open hardware and software specifications.

A flow is a sequence of packets that share certain common properties (called the flow specification) and have some temporal locality as observed at a given measurement point. Depending on the application and measurement objectives, flows may be defined in various manners, such as by source/destination IP addresses, port numbers, protocols, or combinations thereof. They can be further grouped and aggregated into various granularity levels such as network prefixes or autonomous systems. In this paper, we present flow statistics and experimental results using flows defined by the 5-tuple (source/destination IP addresses, port numbers, and the protocol number) with a 60-s timeout value as our basic flow definition.

As many measurement-based studies have revealed, flow statistics exhibit strong heavy-tail behaviors in various networks (including the Internet). This characteristic is often referred to as the elephant and mice phenomenon (aka the vital few and trivial many rule), i.e., most flows (mice flows) only have a small number of packets, while a very few flows (elephant flows) have a large number of packets. A noticeable attribute of elephant flows is that they contribute a large portion of the total traffic volume despite being relatively few in the number of flows. In this paper, we define an elephant flow as a flow that contributes more than 0.1% of all unsampled packets.
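As an illustration, the 0.1% rule above can be expressed directly over a table of per-flow packet counts. The sketch below is a simplified software model; the flow keys and counts are hypothetical:

```python
# Classify flows as elephants or mice using the threshold from the text:
# an elephant flow contributes more than 0.1% of all unsampled packets.
def classify_flows(flow_packet_counts, threshold=0.001):
    """flow_packet_counts: dict mapping a flow key (e.g. a 5-tuple) -> packet count."""
    total = sum(flow_packet_counts.values())
    elephants = {k for k, c in flow_packet_counts.items() if c > threshold * total}
    mice = set(flow_packet_counts) - elephants
    return elephants, mice

# Hypothetical counts: one very large flow among many small ones.
counts = {"flowA": 5000, "flowB": 3, "flowC": 4, "flowD": 2}
elephants, mice = classify_flows(counts)
# elephants == {"flowA"}; the few elephants carry most of the volume
```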

The elephant flow identification module maintains an array of counters shared by all flows; the counter at a given index contains the total number of packets belonging to all of the flows that hash to that index.

At fixed time intervals (60 s), the adaptive elephant flow sampling module adjusts the sampling rate according to traffic load changes in the identified elephant flows. The sampling rate is based on the packet count; an AR model is used to predict the number of packets of a certain elephant flow in the next time interval.

HSTCP is built on the Avnet Virtex-II Pro Development Board shown in Figure 11. This FPGA prototyping board includes all of the components necessary for a gigabit Ethernet network interface with embedded processors and on-board memory.

Fig. 11. HSTCP PCI card

Intrusion Detection and Prevention in High Speed Network 71


#### **4.1.1 Elephant flow identification**

Identifying elephant flows is very important in developing effective and efficient traffic engineering schemes. In addition, obtaining the statistics of these flows is also very useful for network operation and management. On the other hand, with the rapid growth of the link speed in recent years, packet sampling has become a very attractive and scalable means to measure flow statistics.

To identify elephant flows, traditionally we have to collect all packets in the concerned network, and then extract their flow statistics. As many previous studies have indicated, however, such an approach lacks scalability. For very high speed links (say, OC-192+), directly measuring all flows is beyond the capability of measurement equipment (i.e., the requirements for CPU power, memory/storage capacity, and access speed are overwhelming).

Fig. 12. Elephant flow volume estimation.

Because flows are dynamic in their arrival time and active duration, it is very hard to define a sampling interval that is valid for all elephant flows while still allowing us to adjust the sampling rate in accordance with changing traffic conditions to ensure estimation accuracy. We tackle this problem by using stratified sampling. Predetermined, nonoverlapping time blocks in sequence are called strata. In each block, systematic count-based sampling is applied; that is, every $C$th packet of the parent process is deterministically selected for sampling, starting from some starting sampling point, and the other packets are directly dropped. At the end of each block, flow statistics are estimated. A flow's volume is then naturally summarized into a single estimation record at the end of the last time block enclosing the flow. Notice that, from each flow's point of view, its duration is divided, or stratified, into fixed time blocks. The predetermined time blocks enable us to estimate the flow volume without knowing dynamic flow arrival times and durations, while adjusting the sampling rate according to dynamic traffic changes.
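The per-stratum selection described above can be sketched as follows. This is a simplified software model of the hardware path: the packet stream, block boundaries, and the count $C$ are parameters, and scaling each sample by $C$ as a volume estimate is our illustrative reading of the scheme:

```python
def systematic_sample(packets, C, start=0):
    """Systematic count-based sampling within one time block (stratum):
    every Cth packet is deterministically selected, beginning at index
    `start`; all other packets are dropped."""
    return [p for i, p in enumerate(packets) if i >= start and (i - start) % C == 0]

def estimate_block_volume(samples, C):
    """Each retained sample represents C packets of the parent process,
    so the block's packet count is estimated as len(samples) * C."""
    return len(samples) * C

block = list(range(100))                # 100 packets in one stratum
samples = systematic_sample(block, C=10)
# len(samples) == 10; estimate_block_volume(samples, 10) == 100
```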

The elephant flow identification module maintains an array of counters. Upon the arrival of a packet, its flow specification is hashed to generate an index into this array, and the counter at this index is incremented by 1. Collisions due to hashing might cause two or more flow labels to be hashed to the same index; the counter at such an index then contains the total number of packets belonging to all of the flows colliding into it. We do not have any explicit mechanism to handle collisions, as any such mechanism would impose additional processing and storage overheads that are unsustainable at high speeds. This makes the encoding process very simple and fast. Efficient implementations of hash functions allow the online streaming module to operate at speeds as high as OC-768 (40 Gbps) without missing any packets.
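The counter-array update can be modeled as below. This is a software sketch only: Python's built-in `hash` and the array size stand in for the hardware hash function and SRAM sizing, and colliding flows sharing a counter is intentional, as in the text:

```python
NUM_COUNTERS = 1 << 16  # illustrative array size, not the hardware value

counters = [0] * NUM_COUNTERS

def flow_index(five_tuple, num_counters=NUM_COUNTERS):
    """Hash a flow specification (5-tuple) to a counter index."""
    return hash(five_tuple) % num_counters

def on_packet(five_tuple):
    # No collision handling: flows hashing to the same index share a counter.
    counters[flow_index(five_tuple)] += 1

# src IP, dst IP, src port, dst port, protocol number (TCP = 6)
flow = ("10.0.0.1", "10.0.0.2", 1234, 80, 6)
for _ in range(3):
    on_packet(flow)
# counters[flow_index(flow)] == 3
```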

Internet traffic is known to have the property that a few flows can be very large, while most other flows are small. Thus, the counters in our array need to be large enough to accommodate the largest flow size. On the other hand, the counter size needs to be made as small as possible to save precious SRAM. Recent work on the efficient implementation of statistical counters provides an ideal mechanism to balance these two conflicting requirements, which we leverage in our scheme. For each counter in the array, say 32 bits wide, this mechanism uses 32 bits of slow memory (DRAM) to store a large counter and maintains a smaller counter, say 7 bits wide, in fast memory (SRAM). When a counter in SRAM exceeds a certain threshold value (say 64) due to increments, the mechanism increments the value of the corresponding counter in DRAM by 64 and resets the counter in SRAM to 0. There is a 2-bit-per-counter overhead that covers the cost of keeping track of counters above the threshold, bringing the total number of bits per counter in SRAM to 9. For suitable choices of parameters, this scheme allows an efficient implementation of wide counters using a small amount of SRAM. This technique can be applied seamlessly to implement the array of counters required in our data streaming module. In our algorithm, the size of each counter in SRAM is 9 bits and in DRAM 32 bits. Also, since the scheme in [23] incurs very little extra computational and memory access overhead, our streaming algorithm running on top of it can still achieve high speeds such as OC-768.
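The hybrid counter scheme can be sketched in software as follows. SRAM and DRAM are modeled as plain arrays; the widths and flush threshold follow the text (a 7-bit SRAM counter flushed into a 32-bit DRAM counter at 64), while the class and method names are our own:

```python
THRESHOLD = 64          # flush point for the 7-bit SRAM counter
SRAM_MASK = 0x7F        # 7 data bits per counter in SRAM
DRAM_MASK = 0xFFFFFFFF  # 32 bits per counter in DRAM

class HybridCounters:
    """Small fast counters (SRAM) backed by wide slow counters (DRAM)."""
    def __init__(self, n):
        self.sram = [0] * n  # 7-bit counters (+2 bookkeeping bits in hardware)
        self.dram = [0] * n  # 32-bit counters

    def increment(self, i):
        self.sram[i] = (self.sram[i] + 1) & SRAM_MASK
        if self.sram[i] >= THRESHOLD:
            # Flush: add 64 to the DRAM counter, reset the SRAM counter to 0.
            self.dram[i] = (self.dram[i] + THRESHOLD) & DRAM_MASK
            self.sram[i] = 0

    def read(self, i):
        return self.dram[i] + self.sram[i]

c = HybridCounters(4)
for _ in range(200):
    c.increment(0)
# c.read(0) == 200: three flushes of 64 into DRAM plus 8 still in SRAM
```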

#### **4.1.2 Adaptive elephant flow sampling**

Traffic measurement and monitoring serves as the basis for a wide range of IP network operations and engineering tasks such as troubleshooting, accounting and usage profiling, routing weight configuration, load balancing, capacity planning, etc. Traditionally, traffic measurement and monitoring is done by capturing every packet traversing a router interface or a link. With today's high-speed (e.g., gigabit or terabit) links, such an approach is no longer feasible due to the excessive overheads it incurs on line-cards or routers. As a result, packet sampling has been suggested as a scalable alternative to address this problem. In this paper, we have investigated two sampling techniques, namely, simple random packet sampling and adaptive weighted packet sampling.

Given the dynamic nature of network traffic, static sampling does not always ensure the accuracy of estimation, and tends to oversample at peak periods when efficiency and timeliness are most critical. More generally, static random sampling techniques do not take traffic dynamics into account; thus they cannot guarantee that the sampling error in each block falls within a prescribed error tolerance level.

In other words, under some traffic loads, static count-based sampling may be poorly suited to the monitoring task. During periods of idle activity or low network loads, a big sampling count provides sufficient accuracy at a minimal overhead. However, bursts of high activity require a small sampling count to accurately measure the network status at the expense of increased sampling overhead. To address this issue, adaptive sampling techniques can be employed to dynamically adjust the sampling count and optimize accuracy and overhead.

In this paper, we investigated adaptive sampling techniques to intelligently sample the incoming elephant flows. A key element in adaptive sampling is the prediction of future behavior based on the observed samples. The AR model described in this section predicts the packet count of the next sampling interval based on the past samples. Inaccurate predictions indicate a change in the elephant flow load and require an increased/decreased sampling count to determine the new value.

In any case, we cannot accurately choose the sampling count when the population size (the total packet count of the observation time block) is unknown. We can, however, compute the sampling probability at the beginning of a block by predicting the total packet count of a certain elephant flow. We employ an AR model for predicting the total packet count $m_h^f$ of the $h$th block of elephant flow $f$, rather than other time series models, since it is easier to understand and computationally more efficient. In particular, using the AR model, the model parameters can be obtained by solving a set of simple linear equations, making it suitable for online implementation.

We will now briefly describe how the total packet count $m_h^f$ of the $h$th block of elephant flow $f$ can be estimated from the past packet counts using the AR($u$) model, where $u$ is the lag length. Using the AR($u$) model, $m_h^f$ can be expressed as

$$m_h^f = \sum_{i=1}^{u} a_i m_{h-i}^f + e_h$$

where $a_i$, $i = 1, \ldots, u$, are the model parameters, and $e_h$ is the uncorrelated error (which we refer to as the prediction error).

The model parameters $a_i$, $i = 1, \ldots, u$, can be determined by solving a set of linear equations in terms of $v$ past values of the $m_i^f$'s, where $v \ge 1$ is a configurable parameter independent of $u$, typically referred to as the memory size.

Let $\hat{m}_h^f$ denote the predicted packet count of the $h$th block of elephant flow $f$. Using the AR($u$) prediction model, we have

$$\hat{m}_h^f = \sum_{i=1}^{u} a_i m_{h-i}^f.$$

Using the AR prediction model, at the end of each block, the model parameters $a_i$ are computed. The complexity of the AR prediction model parameter computation is only $O(v)$, where $v$ is the memory size.
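One concrete way to obtain the parameters is a least-squares fit over the past values, solved through the normal equations. This is an illustrative software model of "solving a set of simple linear equations," not the platform's implementation, and the helper names are our own:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small linear system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_ar(history, u):
    """Least-squares fit of AR(u) coefficients a_1..a_u to past packet counts."""
    X = [[history[t - i] for i in range(1, u + 1)] for t in range(u, len(history))]
    y = [history[t] for t in range(u, len(history))]
    # Normal equations (X^T X) a = X^T y.
    A = [[sum(r[i] * r[j] for r in X) for j in range(u)] for i in range(u)]
    b = [sum(r[i] * yt for r, yt in zip(X, y)) for i in range(u)]
    return solve(A, b)

def predict_next(history, u):
    """m̂_h = sum_i a_i * m_{h-i}, using the fitted coefficients."""
    a = fit_ar(history, u)
    return sum(a[i] * history[-1 - i] for i in range(u))

history = list(range(1, 11))   # a linear ramp is exactly AR(2) with a = (2, -1)
# predict_next(history, u=2) ≈ 11.0
```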

The predicted $\hat{m}_h^f$ is then compared with the actual value of the sample $m_h^f$. A set of rules is applied to adjust the current sampling count, $\Delta C_{Curr} = c(h) - c(h-1)$, to a new value, $\Delta C_{New}$, by comparing the rate of change in the predicted sample value, $\hat{m}_h^f - m_{h-1}^f$, to the actual rate of change, $m_h^f - m_{h-1}^f$. The ratio between the two rates is defined as $R$, where

$$R = \left| \frac{\hat{m}_h^f - m_{h-1}^f}{m_h^f - m_{h-1}^f} \right|.$$

The value of $R$ will be equal to 1 when the predicted behavior is the same as the observed behavior. We define a range of values $R_{MIN} \le R \le R_{MAX}$, such that:

if $R < R_{MIN}$, that is $\Delta C_{New} < \Delta C_{Curr}$, then

$$\Delta C_{New} = \lfloor R \times \Delta C_{Curr} \rfloor.$$

If $R_{MIN} \le R \le R_{MAX}$, then

$$\Delta C_{New} = 2 \times \Delta C_{Curr}.$$

If $R > R_{MAX}$, that is $\Delta C_{New} > \Delta C_{Curr}$, then

$$\Delta C_{New} = \lceil (1+R) \times \Delta C_{Curr} \rceil.$$

Otherwise, if $R$ is undefined (the observed rate of change is zero),

$$\Delta C_{New} = 2 \times \Delta C_{Curr}.$$
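Collected into one routine, the adjustment rules might read as follows. This is a direct transcription of the rules in the text; the particular $R_{MIN}$/$R_{MAX}$ values are illustrative, not values from the chapter:

```python
import math

def new_sampling_count(m_pred, m_curr, m_prev, dc_curr, r_min=0.8, r_max=1.2):
    """Adjust the sampling count ΔC from the ratio R between the predicted
    and observed rates of change (r_min/r_max are illustrative bounds)."""
    if m_curr == m_prev:
        # R is undefined (zero observed change): ΔC_New = 2 * ΔC_Curr.
        return 2 * dc_curr
    R = abs((m_pred - m_prev) / (m_curr - m_prev))
    if R < r_min:
        return math.floor(R * dc_curr)          # ΔC_New = ⌊R * ΔC_Curr⌋
    if R > r_max:
        return math.ceil((1 + R) * dc_curr)     # ΔC_New = ⌈(1+R) * ΔC_Curr⌉
    return 2 * dc_curr                          # R in range: ΔC_New = 2 * ΔC_Curr

# Prediction overshoots the observed change by 2x -> R = 2 -> ceil(3 * 10) = 30
# new_sampling_count(m_pred=110, m_curr=100, m_prev=90, dc_curr=10) == 30
```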


#### **4.1.3 FPGA implementation**

HSTCP is built on the Avnet Virtex-II Pro Development Board, and its architecture satisfies the constraints of the Avnet board while efficiently integrating the components of a gigabit Ethernet network interface. The MAC unit, DMA unit, inter-FPGA bridge, hardware event management unit, and PCI interface were custom designed for HSTCP. The remaining components were provided by Xilinx and used with little or no modification. Both the MAC unit and the PCI interface are built around low-level interfaces provided by Xilinx; however, those units are still mostly custom logic to integrate them into the rest of the system and to provide flexible software control over the hardware functionality.

The Xilinx Virtex-II Pro FPGA on the Avnet development board contains most of the NIC logic, including the PowerPC processors, on-chip memories, MAC controller, DMA unit front-end, and DDR memory controller. The smaller Spartan-IIE FPGA contains the PCI controller, the back-end DMA controller, and a SRAM memory controller. The SDRAM, although connected to a shared data bus between the Spartan and Virtex FPGAs, was not used because the entire bus bandwidth was needed to efficiently use the PCI interface.

To save development time, prebuilt Xilinx cores were used for several of the hardware modules, including the PCI interface, DDR controller, and a low-level MAC. However, these cores cannot be connected directly to form a working NIC. For example, although Xilinx provides a PCI core, it must be wrapped within a custom DMA unit to allow the PowerPC to initiate and manage high-performance burst transfers to/from the host system across the FPGA bridge. Similarly, although Xilinx provides a low-level MAC core, it must be outfitted with an advanced descriptor-based control system, data buffers, and a DMA unit to transfer packet data between NIC memory and the PHY. Finally, the DDR controller required modifications to function in this specific development board with its unique wiring.

The processors, memories, and hardware units are interconnected on the Virtex FPGA by a processor local bus (PLB). The PLB is a high-performance memory-mapped 100-MHz 64-bit-wide split-transaction bus that can provide a maximum theoretical bandwidth of 12.5 Gbit/s in full-duplex mode. The PLB allows for burst transmissions of up to 128 bytes in a single operation, which is used by the DMA and MAC units to improve memory access efficiency.

Also attached to the PLB is a small memory-mapped control module used to route descriptors to or from the MAC and DMA hardware assist units. This module also provides a central location for hardware counters, event thresholds, and other low-level control functions. By attaching this central control unit to the PLB, either PowerPC processor can manipulate these important NIC functions. The control unit takes 18 PowerPC cycles to read and 12 cycles to write a 32-bit word, primarily due to bus arbitration delays.

In 1994, Leland *et al*. showed that Ethernet traffic consists of slowly decaying packet count bursts across all time scales. Time series with such a pattern are said to exhibit long-range dependence and are termed "self-similar." Similar self-similar behavior has also been observed in wide-area Internet traffic by other researchers. One important characteristic of a self-similar process is that its degree of self-similarity can be expressed with a single parameter, namely the Hurst parameter, which can be derived from the rescaled adjusted range (R/S) statistic. It is defined as follows.

For a given set of observations $X_1, X_2, \ldots, X_n$ with sample mean $\bar{X}(n)$ and sample variance $S^2(n)$, the rescaled adjusted range, or the R/S statistic, is given by

$$R(n)/S(n) = \frac{1}{S(n)} \left[ \max(0, W_1, W_2, \ldots, W_n) - \min(0, W_1, W_2, \ldots, W_n) \right],$$

with $W_k = X_1 + X_2 + \cdots + X_k - k\bar{X}(n)$, $k = 1, 2, \ldots, n$. Hurst (1955) found that many naturally occurring time series appear to be well represented by the relation $E[R(n)/S(n)] \sim c\,n^H$ as $n \to \infty$, with Hurst parameter $H$ "typically" about 0.73. On the other hand, if the observations $X_k$ come from a short-range dependent model, then Mandelbrot and Van Ness (1968) showed that $E[R(n)/S(n)] \sim c\,n^{0.5}$ as $n \to \infty$. This discrepancy is generally referred to as the Hurst effect or Hurst phenomenon.

It is very important whether the traffic data sampled by the proposed sampling scheme retains the self-similar property, since this may directly affect the accuracy and efficiency of various anomaly detection techniques. So we verify this based on two different parameters: the mean of the packet count and the Hurst parameter. The peak-to-mean ratio (PMR) can be used as an indicator of traffic burstiness. The PMR is calculated by comparing the peak value of the measured entity with the average value from the population. However, this statistic is heavily dependent on the size of the intervals, and therefore may or may not represent the actual traffic characteristic. A more accurate indicator of traffic burstiness is given by the Hurst parameter (*H*), a measure of the degree of self-similarity. In this paper we use the R/S statistical test to obtain an estimate of the Hurst parameter. We run the test on both the original and the sampled data.

In our sampling scheme, simple random sampling is conducted in every time block (stratum), and this is referred to as stratified random sampling.

In Figures 13 and 14, we show the average sampling error for the Hurst parameter and the sample mean, respectively. As one can see in Figure 13, the stratified random sampling algorithm resulted in a higher average percent error for the Hurst parameter when compared to adaptive sampling. This could be the result of missing data spread out over a number of sampling intervals. In Figure 14, the average percentage error for the mean statistic was marginally lower for our sampling algorithm than for the stratified random sampling algorithm, albeit the difference was insignificant. One possible reason for this marginal difference is the inherently random nature of the stratified random sampling algorithm, i.e., the weighted mean packets are sampled randomly, which results in a good estimation of the mean.
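The R/S computation used in the self-similarity discussion can be sketched as follows. This is a single-block illustration of the definition; production estimators instead fit $\log R(n)/S(n)$ against $\log n$ over many block sizes, so the one-line Hurst estimate here is only a rough reading of $E[R/S] \sim c\,n^H$:

```python
import math

def rs_statistic(X):
    """Rescaled adjusted range R(n)/S(n) for observations X_1..X_n."""
    n = len(X)
    mean = sum(X) / n
    S = math.sqrt(sum((x - mean) ** 2 for x in X) / n)  # sqrt of sample variance
    W, cum = [0.0], 0.0
    for k, x in enumerate(X, start=1):
        cum += x
        W.append(cum - k * mean)        # W_k = X_1 + ... + X_k - k * mean
    return (max(W) - min(W)) / S        # max/min taken over {0, W_1, ..., W_n}

def hurst_estimate(X):
    """Rough single-block estimate from E[R/S] ~ c * n^H, ignoring c."""
    return math.log(rs_statistic(X)) / math.log(len(X))

# For the alternating series 1,2,1,2,... the R/S statistic is exactly 1.0.
# rs_statistic([1, 2] * 4) == 1.0
```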
