**3. Results and discussions**

#### **3.1. Network traffic data 2D and 3D presentation**

**Figure 3** shows the LSBU 1-year total network traffic data in 3D format (top) and 2D format (bottom). The X axis represents the time of day, from 01:00 to 24:00, the Y axis represents the day of the year, from 1 to 365, and the Z axis represents the total traffic in Gbits per second. The data were recorded at 1 h intervals over a period of 1 year, from November 2016 to November 2017. Presenting the network traffic data in 2D and 3D form makes the network usage and its characteristics easier to understand.
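The day-by-hour layout behind such a presentation can be sketched as follows. This is only an illustration in plain Python: the function name `to_day_hour_grid` and the synthetic traffic values are assumptions, not the measured LSBU data.

```python
def to_day_hour_grid(hourly, hours_per_day=24):
    """Split a flat hourly series into one row per day (rows = days, columns = hours)."""
    n_days = len(hourly) // hours_per_day  # any trailing partial day is dropped
    return [hourly[d * hours_per_day:(d + 1) * hours_per_day] for d in range(n_days)]


# Synthetic stand-in for one year of hourly traffic (hypothetical values in Gbit/s):
# a constant base load plus extra traffic during working hours.
hourly = [1.0 + 0.5 * (9 <= h % 24 < 18) for h in range(365 * 24)]

grid = to_day_hour_grid(hourly)

# Mean traffic for each hour of the day, averaged over all days.
hourly_profile = [sum(day[h] for day in grid) / len(grid) for h in range(24)]
```

Each row of `grid` corresponds to one position on the Y axis (day of the year) and each column to one position on the X axis (hour of the day). The averaged `hourly_profile` is one simple way to summarise the within-day variation.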

The results show that the total network traffic varies from season to season throughout the year, and from hour to hour throughout the day. Understanding the total network traffic pattern helps us to plan network operations better, optimise network usage, and identify potentially suspicious traffic.

**Figure 4** shows the LSBU 1-year World Wide Web (WWW) traffic data in 3D format (top) and the corresponding 2D presentation (bottom). The results show that WWW traffic is highly seasonal. It has a strong weekday and weekend effect, which agrees well with our previous studies [16, 17]. It is also strongly affected by the Christmas, Easter and summer holiday periods. The WWW traffic varies significantly within a day: it is highest between 10:00 and 19:00 and lowest between 06:00 and 09:00, not at midnight. There also seems to be more traffic during the autumn semester (September–January) than during the spring semester (February–June).

**Figure 5** shows the 1-year Email traffic data in 3D format (top) and the corresponding 2D format (bottom). Similar to the WWW data, the Email data also show a weekday and weekend effect, as well as a seasonal effect. However, unlike the WWW data, the majority of the Email traffic occurred between 09:00 and 18:00, with very little traffic in the evening and early morning. So in these periods, people browsed the web but did not send many emails. The massive peak in the middle of the graph is due to the Email upgrade, during which a large number of emails were sent and received.

#### **3.2. Network traffic data and Fourier transform**

**Figure 6** shows the original 1-year LSBU total network traffic data (top) and the corresponding Fourier transforms (bottom). **Figure 7** shows the original 1-year LSBU WWW data (top) and the corresponding Fourier transforms (bottom). **Figure 8** shows the original 1-year LSBU Email data (top) and the corresponding Fourier transforms (bottom).


Wavelet Transform for Educational Network Data Traffic Analysis

http://dx.doi.org/10.5772/intechopen.76455






158 Wavelet Theory and Its Applications


**Figure 3.** The 3D presentation (top) and the corresponding 2D presentation (bottom) of 1-year total network data in a daily usage pattern (Nov 2016–Nov 2017).

**Figure 4.** The 3D presentation (top) and the corresponding 2D presentation (bottom) of 1-year WWW traffic data in a daily usage pattern (Nov 2016–Nov 2017).


**Figure 5.** The 3D presentation (top) and the corresponding 2D presentation (bottom) of 1-year Email data in a daily usage pattern (Nov 2016–Nov 2017).


**Figure 6.** The original 1-year total network traffic data (top) and its corresponding FFT spectrum (bottom) (Nov 2016–Nov 2017).

**Figure 7.** The original 1-year WWW data (top) and its corresponding FFT spectrum (bottom) (Nov 2016–Nov 2017).

The total network traffic data, the WWW data and the Email data have completely different patterns, and therefore different FFT spectra. In the total network traffic data, there are many small peaks throughout the FFT spectrum, indicating recurring events. In the WWW and Email data, by contrast, the FFT peaks occur mainly in the lower frequency range. With FFT, however, it is not possible to identify when these events happened.
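As a small illustration of what a Fourier spectrum does and does not reveal, the sketch below computes a naive discrete Fourier transform of a synthetic hourly series with a daily cycle. The series, the two-week length and the function name `dft_magnitude` are assumptions made for the example. An FFT routine would give the same magnitudes far more efficiently.

```python
import cmath
import math


def dft_magnitude(x):
    """One-sided DFT magnitude, computed directly in O(N^2) after removing the mean."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]  # remove the constant (DC) offset
    mags = []
    for k in range(n // 2 + 1):
        s = sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(centred))
        mags.append(abs(s) / n)
    return mags


dt = 3600.0   # sampling interval in seconds (1 h, as in the recorded data)
n = 24 * 14   # two weeks of hourly samples (synthetic)
signal = [2.0 + math.sin(2 * math.pi * t / 24) for t in range(n)]

mags = dft_magnitude(signal)
peak_bin = max(range(len(mags)), key=mags.__getitem__)
peak_freq_hz = peak_bin / (n * dt)  # frequency of the strongest component in Hz
```

The strongest component lands at the daily frequency of 1/86,400 Hz. The spectrum contains no time axis, so it gives no indication of when during the two weeks the cycle was present.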

#### **3.3. Wavelet decomposition**


Wavelet decomposition [18] is a powerful tool that decomposes the original network traffic into a low frequency component (A) and a high frequency component (D). The low frequency component (A) is also called the approximation coefficient, and the high frequency component (D) is called the detail coefficient. By performing the decomposition several times, we obtain a multilevel wavelet decomposition, see **Figure 9**. Multilevel wavelet decomposition allows us to gradually separate and eliminate the high frequency components, which are mostly noise. Through wavelet decomposition we can reduce the data noise and therefore observe the data trend better.

**Figure 10** shows the wavelet decomposition of the WWW network traffic data. The wavelet used is the "sym4" wavelet (see **Figure 1**) and the level of decomposition is 4. The key in wavelet decomposition is to choose the right wavelet and the right level of decomposition. The results show that the low frequency component (A4) better reflects the trend of the network traffic data, whereas the high frequency component (D4) mostly reflects the traffic noise.
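The repeated split into approximation and detail coefficients can be sketched with the Haar wavelet, which is used here only because it fits in a few lines of plain Python. It is not the "sym4" wavelet used for Figure 10; a wavelet library such as PyWavelets would be needed for that. The function names and the test signal are assumptions for illustration.

```python
import math


def haar_step(x):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    r = math.sqrt(2.0)
    # Pairs of neighbouring samples; an odd trailing sample would be dropped.
    a = [(x[i] + x[i + 1]) / r for i in range(0, len(x) - 1, 2)]
    d = [(x[i] - x[i + 1]) / r for i in range(0, len(x) - 1, 2)]
    return a, d


def haar_wavedec(x, level):
    """Multilevel decomposition: returns [A_level, D_level, ..., D_1]."""
    details = []
    a = list(x)
    for _ in range(level):
        a, d = haar_step(a)  # split the current approximation again
        details.append(d)
    return [a] + details[::-1]


# Synthetic series: slow upward trend plus fast alternating "noise".
x = [t / 64 + 0.1 * (-1) ** t for t in range(64)]
coeffs = haar_wavedec(x, level=4)
a4, d4, d3, d2, d1 = coeffs
```

Each level halves the number of coefficients, so a level 4 decomposition of 64 samples leaves 4 approximation coefficients (A4) together with detail coefficients of length 4, 8, 16 and 32 (D4 to D1).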

**Figure 8.** The original LSBU 1-year Email data (top, Nov 2016–Nov 2017) and its corresponding FFT spectrum (bottom).


**Figure 9.** Multilevel wavelet decomposition, where approximation coefficient A is the low frequency component and detail coefficient D is the high frequency component.

**Figure 10.** One-year WWW data in an hourly usage pattern (top, Nov 2016–Nov 2017), the corresponding level 4 low frequency component (A4) (middle) and level 4 high frequency component (D4) (bottom).

#### **3.4. Wavelet denoising**

Building on wavelet decomposition, a very useful feature of wavelet analysis is denoising, which is particularly valuable for noisy data. The steps are as follows. First, choose a wavelet and a level of decomposition N, and compute the wavelet decomposition of the data at levels 1 to N. Second, for each level, select a threshold and apply it to the detail coefficients (D). Finally, compute the wavelet reconstruction using the original approximation coefficients (A) of level N and the modified detail coefficients (D) of levels 1 to N.
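The three steps above can be sketched as follows. The sketch makes several simplifying assumptions that are not taken from the text: it uses the Haar wavelet instead of "sym8", soft thresholding, and a single hand-picked threshold shared by all levels. The input length must be divisible by 2^N.

```python
import math

R2 = math.sqrt(2.0)


def haar_step(x):
    """One Haar analysis step: returns (approximation, detail)."""
    a = [(x[i] + x[i + 1]) / R2 for i in range(0, len(x), 2)]
    d = [(x[i] - x[i + 1]) / R2 for i in range(0, len(x), 2)]
    return a, d


def haar_inverse_step(a, d):
    """Inverse of haar_step: rebuilds the signal one level up."""
    x = []
    for ai, di in zip(a, d):
        x.extend([(ai + di) / R2, (ai - di) / R2])
    return x


def soft_threshold(coeffs, thr):
    """Shrink each coefficient towards zero by thr; small ones become zero."""
    return [math.copysign(max(abs(v) - thr, 0.0), v) for v in coeffs]


def denoise(x, level, thr):
    a = list(x)
    details = []
    for _ in range(level):                       # step 1: decompose, levels 1..N
        a, d = haar_step(a)
        details.append(d)
    details = [soft_threshold(d, thr) for d in details]  # step 2: threshold D
    for d in reversed(details):                  # step 3: reconstruct from A_N and D
        a = haar_inverse_step(a, d)
    return a


# Synthetic example: a step change in traffic plus small alternating noise.
clean = [1.0] * 32 + [3.0] * 32
noisy = [c + 0.1 * (-1) ** t for t, c in enumerate(clean)]
denoised = denoise(noisy, level=3, thr=0.2)
```

In this toy case the noise lives entirely in the level 1 detail coefficients, which fall below the threshold and are set to zero, so the step is recovered. With a threshold of zero the reconstruction returns the noisy input unchanged, which illustrates the trade-off between noise removal and signal integrity.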

**Figure 11** shows the 1-year total network traffic data in an hourly usage pattern (Nov 2016–Nov 2017) and the corresponding denoised results. In this case, the wavelet used is the "sym8" wavelet (see **Figure 1**) and the level of decomposition was chosen as N = 3. As with wavelet decomposition, the key in wavelet denoising is to choose the right wavelet and the right level of decomposition, in order to balance noise removal against signal integrity. Denoising the WWW and Email data yields similar results.

The quality of the denoised results is good: the trends of the original network traffic data are well preserved. Selecting the right wavelet and the right level of decomposition is very important for achieving maximum denoising while preserving the useful information.

#### **3.5. Continuous wavelet analysis (CWT)**


With the continuous wavelet transform (CWT), we can analyse the data and show how the frequency content of the data changes over time. This time-dependent frequency information, which is lacking in other techniques such as FFT, is very useful for network traffic analysis. In the CWT calculation there are several parameters to choose, i.e. the type of wavelet, the smallest scale (S<sub>0</sub>), the spacing between scales (ds) and the number of scales (N<sub>s</sub>). The scales (S) can be converted to pseudo frequencies (*f<sub>p</sub>*) by using the following formula,

$$f_p = \frac{f_c}{S\,dt} \tag{2}$$

**Figure 11.** One-year total network traffic data in an hourly usage pattern (Nov 2016–Nov 2017) and the corresponding denoised results.

Here, *f<sub>c</sub>* is the centre frequency of the wavelet and *dt* is the sampling time. Scales are inversely proportional to frequencies, i.e. small scales represent high frequencies, and vice versa.
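Eq. (2) can be evaluated directly, as in the short sketch below. The function name and the numbers are assumptions for illustration: the centre frequency of 0.8125 is only an example value, and the scale is treated as a dimensionless multiple of the sampling time.

```python
def scale_to_pseudo_frequency(scale, centre_frequency, dt):
    """Pseudo frequency in Hz from Eq. (2): f_p = f_c / (S * dt)."""
    return centre_frequency / (scale * dt)


dt = 3600.0    # sampling time in seconds (hourly data)
f_c = 0.8125   # illustrative centre frequency, not a value from the text

f_small_scale = scale_to_pseudo_frequency(24, f_c, dt)
f_large_scale = scale_to_pseudo_frequency(48, f_c, dt)
```

Doubling the scale from 24 to 48 halves the pseudo frequency, which is the inverse proportionality described above.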


**Figure 12** shows the continuous wavelet transform (CWT) of the 1-year total traffic using different wavelets, namely the Morlet wavelet (analytic), Mexican hat wavelet (nonanalytic), bump wavelet (analytic) and Paul wavelet (analytic). The X axis is time over 1 year, and the Y axis is pseudo frequency. Different wavelets give different results. Based on these results, we decided to use the Morlet wavelet to analyse the network traffic data, as it provides more detail on daily, weekly and monthly events; this is discussed further below.
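To make the idea of a Morlet CWT concrete, the sketch below evaluates the wavelet power at a single scale by direct convolution on a synthetic series. It is not the implementation used for the figures. The Morlet parameter w0 = 6, the factor 1.033 relating scale to Fourier period for that choice, the normalisation and the test signal are all assumptions of this example.

```python
import cmath
import math


def morlet(u, w0=6.0):
    """Complex Morlet mother wavelet (w0 = 6 is a common choice, assumed here)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * u) * math.exp(-0.5 * u * u)


def cwt_power(x, scale, dt=1.0):
    """Wavelet power |W(scale, t)|^2 at every sample, by direct summation."""
    n = len(x)
    norm = math.sqrt(dt / scale)
    power = []
    for t in range(n):
        w = sum(x[k] * morlet((k - t) * dt / scale).conjugate() for k in range(n))
        power.append(abs(norm * w) ** 2)
    return power


# Synthetic hourly series: a daily (24 h) cycle that exists only in the second half.
n = 240
x = [math.sin(2 * math.pi * t / 24) if t >= n // 2 else 0.0 for t in range(n)]

# For a Morlet wavelet with w0 = 6 the Fourier period is about 1.033 times the scale.
daily_scale = 24 / 1.033   # in units of dt (hours)

p = cwt_power(x, daily_scale)
first_half = sum(p[: n // 2]) / (n // 2)
second_half = sum(p[n // 2:]) / (n // 2)
```

The mean power at the daily scale is much larger in the second half of the series than in the first, so the transform shows not only that a daily cycle exists but also when it occurs. Repeating the calculation over a range of scales gives a time-frequency map of the kind shown in the figures.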

**Figure 13** shows the continuous wavelet transform (CWT) using different parameters. The smallest scale (S<sub>0</sub>) determines the highest frequency, ds determines the resolution of the results, and N<sub>s</sub> determines the frequency range. Balancing result resolution, frequency range and calculation time, we decided to perform the CWT using the following values: S<sub>0</sub> = 6 × 3600 = 21,600, i.e. a six-hourly event, ds = 0.025 and N<sub>s</sub> = 300.

**Figure 12.** The continuous wavelet transform (CWT) using different wavelets. (a) CWT with Morlet wavelet, (b) CWT with Mexican hat wavelet, (c) CWT with Paul wavelet and (d) CWT with bump wavelet.

**Figure 14** shows the CWT results of the original 1-year total traffic data using the Morlet wavelet, with S<sub>0</sub> = 21,600, ds = 0.025 and N<sub>s</sub> = 300 as CWT parameters. The X axis is time over 1 year, and the Y axis is pseudo frequency. Each pseudo frequency can be mapped to an event. **Table 1** shows the pseudo frequencies in Hz of hourly, daily, weekly, two weekly, monthly and quarterly events. Using these pseudo frequencies we can then identify the corresponding hourly, daily, weekly, two weekly, monthly and quarterly events in **Figure 14**. The hot spot at the lower left corner corresponds to the time when the system was upgraded. By using CWT, we can easily identify events that are otherwise difficult to identify in the original time domain.
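The mapping between events and pseudo frequencies can be sketched by taking each frequency as the reciprocal of the event period in seconds. This reading is an assumption, as are the 30-day month and 90-day quarter, but it reproduces the hourly, half daily, daily, weekly, two weekly, monthly and quarterly entries of Table 1. The quarter daily entry is left out because its period convention is not stated. The helper `nearest_event` is hypothetical.

```python
import math

HOUR = 3600
DAY = 24 * HOUR

# Event periods in seconds (30-day month and 90-day quarter are assumptions).
EVENT_PERIODS_S = {
    "Hourly": HOUR,
    "Half daily": 12 * HOUR,
    "Daily": DAY,
    "Weekly": 7 * DAY,
    "Two weekly": 14 * DAY,
    "Monthly": 30 * DAY,
    "Quarterly": 90 * DAY,
}

# Frequency in Hz is the reciprocal of the period in seconds.
EVENT_FREQS_HZ = {name: 1.0 / period for name, period in EVENT_PERIODS_S.items()}


def nearest_event(freq_hz):
    """Name of the event whose frequency is closest to freq_hz on a log scale."""
    return min(
        EVENT_FREQS_HZ,
        key=lambda name: abs(math.log(EVENT_FREQS_HZ[name] / freq_hz)),
    )


for name, freq in EVENT_FREQS_HZ.items():
    print(f"{name:<12}{freq:.2E}")
```

A lookup such as `nearest_event(1.2e-5)` returns `"Daily"`, which is how a band in the time-frequency plot can be labelled with the corresponding event.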

**Figure 15** shows the CWT results of the WWW traffic data using the same wavelet and the same parameters. The results show that half daily and daily events happen throughout the year. They are highly seasonal, as the summer, Christmas and Easter gaps are clearly identifiable. The half daily and daily events also show clear day and night effects, as well as weekday and weekend effects, while weekly, two weekly and monthly events are patchy, with no seasonal effects.

**Figure 13.** The continuous wavelet transform (CWT) using different parameters. (a) S<sub>0</sub> = 3600, ds = 0.025, N<sub>s</sub> = 300, (b) S<sub>0</sub> = 21,600, ds = 0.025, N<sub>s</sub> = 100, (c) S<sub>0</sub> = 21,600, ds = 0.025, N<sub>s</sub> = 300 and (d) S<sub>0</sub> = 21,600, ds = 0.025, N<sub>s</sub> = 500.


**Figure 14.** The CWT time-frequency 2D results of the 1-year total network traffic data (Nov 2016–Nov 2017). The hot spot at the lower left corner corresponds to the time when the system was upgraded.

**Figure 15.** The CWT time-frequency 2D results of the 1-year WWW data (Nov 2016–Nov 2017).


**Figure 16.** The CWT time-frequency 2D results of the 1-year Email data (Nov 2016–Nov 2017).



**Figure 16** shows the CWT results of the Email traffic data using the same wavelet and the same parameters. The results show that hourly and quarter daily events happen throughout the year. The Christmas gap is obvious, whilst the summer and Easter gaps are not. They also show clear weekday and weekend effects. The half daily, daily and weekly events are very patchy, with no seasonal effects. Time-frequency results of this kind can help us to understand the traffic characteristics better.

| Time | Pseudo frequency (Hz) |
|---|---|
| Hourly | 2.78E−04 |
| Quarter daily | 1.39E−04 |
| Half daily | 2.31E−05 |
| Daily | 1.16E−05 |
| Weekly | 1.65E−06 |
| Two weekly | 8.27E−07 |
| Monthly | 3.86E−07 |
| Quarterly | 1.29E−07 |

**Table 1.** The pseudo frequencies (Hz) of different events.
