**2.3.2 Probabilistic forecasting of earthquakes**

(Kagan & Jackson, 2000) developed both short- and long-term forecast approaches and tested both with a likelihood function against earthquakes of magnitude 5.8 or larger. Although the long-term approach (see Table 1) is not completely developed, it is suitable for estimating the occurrence of earthquakes and is derived from statistical, physical and intuitive arguments, while the short-term forecast seismicity model is based on a specific stochastic model and is updated daily (see Table 1).

The research assumes that the rate density (probability per unit area and time) is proportional to a smoothed version of past seismicity; it depends approximately on a negative power of the epicentral distance and linearly on the magnitude of past earthquakes.
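Only as a schematic sketch (the exact smoothing kernel of (Kagan & Jackson, 2000) differs in detail, and the symbols below are notational assumptions, not the authors'), the verbal description above corresponds to a rate density of roughly the form:

$$\Lambda(\mathbf{x}) \propto \sum\_{i} m\_i \, (r\_i + r\_0)^{-\nu}$$

where the sum runs over past earthquakes, m_i and r_i are the magnitude of and epicentral distance to event i, ν > 0 sets the power-law decay with distance, and r_0 is a small constant that smooths the singularity at zero distance.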

The model (Kagan & Jackson, 2000) does not rely on retrospective evaluation of seismic data. The parameters of the long-term approach are evaluated on the basis of success in forecasting seismic activity, also indicating possible earthquake perturbations. For the short-term approach, a maximum likelihood procedure is applied to infer optimal parameter values; this approach can be incorporated into real-time seismic networks to provide seismic hazard estimates.

Regarding the scientific results, (Kagan & Jackson, 2000) concluded that the research depicted the statistical relationship between successive earthquakes in a quantitative way that facilitates hypothesis testing. Regarding the practical results, the quantitative predictive assessment can be adopted in mitigation strategies.


Table 1. Example of long- and short-term forecast, 1999 February 11, north of the Philippines (Kagan & Jackson, 2000).

**Earthquake Prediction: Analogy with Forecasting** (Pontes et al, 2011)

#### **2.4 Forecasting for cyber attacks**

The forecasting approaches in IDPS rely mainly on stochastic methods (Ramasubramanian & Kannan, 2004), (Alampalayam & Kumar, 2004), (Chung et al, 2006). Without addressing prediction, references (Ye et al, 2001), (Ye et al, 2003), (Wong et al, 2006) applied diverse probabilistic techniques (decision tree, Hotelling's T² test, chi-square multivariate, Markov chain and Exponentially Weighted Moving Average (EWMA)) to audit data as a way to analyze three properties of the UIT: frequency, duration, and ordering. References (Ye et al, 2001), (Ye et al, 2003) came to the following findings: 1) a sequence of events is necessary for IDPS, as a single audit event at a given time is not sufficient; 2) ordering (transaction (Wong et al, 2006)) provides an additional advantage over the frequency property, but it is computationally intensive. According to (Ye et al, 2001), (Ye et al, 2003), (Wong et al, 2006), the frequency property by itself provides good intrusion detection. References (Ye et al, 2001), (Ye et al, 2003), (Wong et al, 2006) did not approach correlation for IDPS.

Moving averages (simple, weighted, EWMA, or central) over time series data are regularly used to smooth out fluctuations and highlight trends (NIST, 2009). EWMA may be applied to both autocorrelated and uncorrelated data for detecting cyber attacks that manifest themselves through significant changes in the intensity of occurring events (Ye et al, 2001). Both variants (EWMA for autocorrelated and for uncorrelated data) have shown good efficiency for detecting attacks. EWMA applies weighting factors that decrease over time, giving much more importance to recent observations while still not discarding older observations entirely. The statistic that is calculated is (NIST, 2009):

$$EWMA\_t = \alpha Y\_t + (1 - \alpha)EWMA\_{t-1} \quad \text{for } t = 1, 2, \dots, n. \tag{7}$$

Where: EWMA_t is the mean of the historical data at time t; Y_t is the observation at time t; n is the number of observations to be monitored, including EWMA_0; and 0 < α < 1 is a constant that determines the depth of memory of the EWMA.

The parameter α determines the rate at which older data enter into the calculation of the EWMA statistic. A large value of α gives more weight to recent data and less weight to older data; a small value of α gives more weight to older data.
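As a minimal sketch of Eq. (7) (the function and its sample counts below are illustrative, not taken from the cited works), the EWMA statistic can be computed as:

```python
# Minimal sketch of the EWMA statistic from Eq. (7), applied to a series
# of hourly event counts. The counts are made up for illustration.

def ewma(observations, alpha, ewma0=None):
    """Return the EWMA series: EWMA_t = alpha*Y_t + (1 - alpha)*EWMA_{t-1}."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in (0, 1)")
    # A common initialization is the first observation (or a target mean).
    smoothed = ewma0 if ewma0 is not None else observations[0]
    series = []
    for y in observations:
        smoothed = alpha * y + (1 - alpha) * smoothed
        series.append(smoothed)
    return series

# Hypothetical hourly counts of suspicious events:
counts = [4, 5, 3, 40, 6, 5, 4]
print(ewma(counts, alpha=0.3))  # the spike at t=3 is damped but visible
```

A larger `alpha` lets the statistic react faster to the spike; a smaller one keeps it closer to the long-run mean, matching the depth-of-memory behavior described above.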

Reference (Cisar and Cisar, 2007) gives an overview of adopting EWMA with adaptive thresholds, based on a normal profile of network traffic. The analysis of thresholds with EWMA may summarize huge amounts of network traffic data (Zhay et al, 2006), (Pontes & Zucchi, 2010). Diverse moving averages, combined with a Fibonacci-sequence forecasting approach, were also used by (Zuckerman et al, 2010) to spot trends of cyber attacks in the (DARPA, 1998) datasets.

A simple moving average (SMA) is the unweighted mean of the previous n data points. For example, a 10-hour SMA of intrusive event X (e.g. DoS) is the mean of the previous 10 hours' observations of event X. If those observations are e_M, e_{M-1}, ..., e_{M-9}, then the formula is (NIST, 2009), (Roberts, 1959):

$$SMA = \frac{e\_M + e\_{M-1} + \dots + e\_{M-9}}{10} \tag{8}$$
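A minimal sketch of Eq. (8), with hypothetical hourly DoS counts (not from the cited datasets):

```python
# Sketch of the 10-hour SMA from Eq. (8): the unweighted mean of the
# previous n observations of an intrusive event X.

def sma(events, n=10):
    """Mean of the last n observations; None until n values are available."""
    if len(events) < n:
        return None
    return sum(events[-n:]) / n

hourly_dos = [2, 3, 1, 2, 4, 2, 3, 5, 2, 3, 30]
print(sma(hourly_dos[:10]))  # SMA over the first 10 hours
print(sma(hourly_dos))       # window slides: hour 11 replaces hour 1
```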

When calculating successive values, a new value comes into the sum and an old value drops out, meaning a full summation each time is unnecessary:

$$SMA\_{current\,\,hour} = SMA\_{last\,\,hour} - \frac{e\_{M-n}}{n} + \frac{e\_M}{n} \tag{9}$$
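The incremental update of Eq. (9) can be sketched as follows; the function and data are illustrative only:

```python
# Incremental form of Eq. (9): update the SMA in O(1) by subtracting the
# value leaving the window (e_{M-n}) and adding the one entering it (e_M),
# instead of re-summing all n values each hour.

from collections import deque

def rolling_sma(events, n=10):
    window = deque()
    total = 0.0
    out = []
    for e in events:
        window.append(e)
        total += e
        if len(window) > n:
            total -= window.popleft()  # old value e_{M-n} drops out
        if len(window) == n:
            out.append(total / n)
    return out

print(rolling_sma([2, 3, 1, 2, 4, 2, 3, 5, 2, 3, 30], n=10))
```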

The versatility of the methodology based on forecasts is evident in this work, presenting significant results. This scenario shows that quite different methods (e.g. those that use and those that do not use historical data) can be used in conjunction with an approach that uses DIFSA.

Nevertheless, the forecasting approaches which use moving averages to cope with cyber attacks in IDPS are limited to analyzing cyber attacks individually, e.g. in just one IDPS. Therefore, there is no collaboration among the forecasters. Besides, the concept of sensors is not adopted in (Pontes et al, 2008), (Pontes & Guelfi, 2009a), (Pontes & Guelfi, 2009b), (Pontes & Zucchi, 2010), (Ishida et al, 2005), (Viinikka et al, 2006), (Ye et al, 2003).

**3. The distributed intrusion forecasting system with the two-stage system**

Intrusion Forecasting Systems (IFS) can work proactively in cyber security contexts, as early warning systems, in order to indicate or identify UIT (incidents, threats, attacks) in advance. IFS can also represent an improvement over IDPS, which are based on postmortem approaches (UIT are identified and/or blocked only after they can inflict serious damage on the computer systems). IFS predict UIT through different forecasting techniques (for instance, moving averages, the Fibonacci sequence etc.), applied in either local or distributed environments. Additionally, in distributed environments, e.g. DIFS, the use of cooperative sensors can improve the accuracy of incident predictions.

Fig. 6 depicts the proposal of this chapter, i.e. the DIFS and the forecasting levels. Similarly to forecasting methodologies used in other fields (e.g. meteorology), DIFS also spreads agents and/or sensors widely to make predictions about the different kinds of UIT (spam, virus, intrusion, abnormal network traffic). There are four levels in the IFS: level 1 - independent security devices of hosts; level 2 - integrated security devices of hosts; level 3 - the network level; and level 4 - the backbone level. All levels have some degree of communication with each other. In other words, the forecasts obtained from level 1 are shared and correlated with the forecasts of the other levels. Lower levels work as sensors for higher levels; consequently, feedback about the UIT trends may be exchanged from one level to another.

Level 1 concerns the trend analysis of incidents, alerts and diagnoses reported independently by the hosts' security devices (antivirus, antispyware, host-based IDPS and other anomaly detection systems). For each security device, individual forecasts may be provided, e.g. the trend of spam for the next hour or the next day, or the trend of virus infection. The next step of IFS level 1 is to help the hosts' security devices to determine whether or not they should adopt countermeasures to stop UIT.

Level 2 involves correlation of the forecasts about the hosts' security devices. At this level, the analysis relies on two databases: a) all the historical data generated by each of the hosts' security devices are processed individually by the IFS first level and then stored in a database; b) the network flow may also be recorded for further forecasting analysis. The next step for IFS level 2 is to query and analyze the trends (forecasts) in these databases. After analyzing them, IFS level 2 returns feedback to IFS level 1. It is important to notice that the databases of IFS level 1 work as sensors for IFS level 2.
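As a purely hypothetical sketch of the level-1/level-2 relationship described above (class, method and field names are assumptions for illustration, not from the cited works), the per-device forecasters of level 1 act as sensors whose trends a level-2 correlator aggregates and feeds back:

```python
# Hypothetical sketch: level-1 forecasters per security device feed a
# level-2 correlator, mirroring the sensor/feedback relationship above.

class DeviceForecaster:               # IFS level 1: one security device
    def __init__(self, name):
        self.name = name
        self.history = []             # per-device incident counts

    def forecast(self):
        # In practice an SMA/EWMA over self.history (Eqs. 7-9);
        # a flat mean keeps the sketch short.
        return sum(self.history) / len(self.history) if self.history else 0.0

class Level2Correlator:               # IFS level 2: correlates level-1 trends
    def __init__(self, sensors):
        self.sensors = sensors        # level-1 databases act as sensors

    def correlate(self):
        forecasts = {s.name: s.forecast() for s in self.sensors}
        network_trend = sum(forecasts.values())
        # The feedback returned to level 1 could, e.g., adjust per-device
        # countermeasure thresholds.
        return forecasts, network_trend

antivirus = DeviceForecaster("antivirus")
antivirus.history = [1, 0, 2, 5]
hids = DeviceForecaster("host-IDPS")
hids.history = [0, 1, 1, 9]
print(Level2Correlator([antivirus, hids]).correlate())
```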

