**3. Study on the time interval of earthquakes in the Longmenshan fault zone**

#### **3.1 Data and samples**

This paper uses earthquake occurrence time data from January 9, 2012, to September 24, 2021, taken from the earthquake catalog of the China Seismological Network (http://www.ceic.ac.cn). The occurrence time, geographic coordinates, magnitude, and other attributes of the earthquake event time series are characterized by volatility and nonlinearity. Therefore, we preprocess this information into data restricted to a specific region and magnitude range; that is, the time series of events with Ms2.4 and above in the longitude and latitude range of the Longmenshan fault zone (29.5° N-33.5° N, 102° E-107° E). After further screening for whether each earthquake is located in the fault zone, 437 experimental samples are obtained, with occurrence times from January 9, 2012, to August 2021 and Ms ∈ (2.5, 7).

The earthquake time interval sequence is obtained by differencing the occurrence times of the selected samples. The sample has the following characteristics:


*Analysis and Prediction of the SARIMA Model for a Time Interval of Earthquakes… DOI: http://dx.doi.org/10.5772/intechopen.109174*

3. Nonstationarity and strong volatility. Many factors affect the data and may interact in complex ways, and the earthquake sequence has complex periodic characteristics.

In addition, the 58th earthquake recorded in the sequence was the 2013 Ya'an Lushan Ms 7.0 earthquake, which was followed by several very closely spaced aftershocks. As a result, the time intervals between the 58th and 203rd earthquakes in the monitoring data are short, and the sequence stays close to zero over this period. The original sequence is denoted Xt, and its sequence diagram is shown in **Figure 2**.
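The difference operation that turns occurrence times into an interval sequence can be sketched as follows. This is a minimal illustration, not the authors' code: the timestamp format, the choice of days as the unit, and the four sample timestamps are assumptions made for the example and are not taken from the catalog.

```python
from datetime import datetime


def time_intervals(occurrence_times, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the gaps between consecutive events, in days (assumed unit)."""
    stamps = sorted(datetime.strptime(t, fmt) for t in occurrence_times)
    return [
        (later - earlier).total_seconds() / 86400.0
        for earlier, later in zip(stamps, stamps[1:])
    ]


# Hypothetical occurrence times, used only to show the shape of the output.
catalog_times = [
    "2012-01-09 10:00:00",
    "2012-01-10 10:00:00",
    "2012-01-10 22:00:00",
    "2012-01-13 22:00:00",
]
intervals = time_intervals(catalog_times)
print(intervals)  # [1.0, 0.5, 3.0]
```

A catalog with N events yields N - 1 intervals, and a burst of aftershocks appears as a run of values close to zero.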

#### **3.2 Model determination**

The characteristics of the data are analyzed, and the trend, seasonal, and residual components are extracted through seasonal decomposition. The results, shown in **Figure 3**, indicate that the series has a certain trend and seasonality. We preliminarily infer that the minimum period is approximately 5 days.
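The text does not state which decomposition method is used. As a hedged illustration, the sketch below implements a classical additive decomposition: the trend is a centered moving average, the seasonal component is the mean detrended value at each position in the cycle, and the residual is what remains. The function names, the synthetic series, and the period of 5 are assumptions chosen for the example.

```python
def centered_moving_average(x, m):
    """Centered moving average with span m; None where the window does not fit."""
    half = m // 2
    out = [None] * len(x)
    for t in range(half, len(x) - half):
        window = x[t - half : t + half + 1]
        total = sum(window)
        if m % 2 == 0:
            # Even span: use m + 1 points and give the two end points half weight.
            total -= 0.5 * (window[0] + window[-1])
        out[t] = total / m
    return out


def decompose_additive(x, m):
    """Split x into trend, seasonal, and residual parts (x = trend + seasonal + residual)."""
    trend = centered_moving_average(x, m)
    detrended = [xi - ti if ti is not None else None for xi, ti in zip(x, trend)]
    phase_means = []
    for p in range(m):
        vals = [d for i, d in enumerate(detrended) if i % m == p and d is not None]
        phase_means.append(sum(vals) / len(vals))
    centre = sum(phase_means) / m  # force the seasonal pattern to average to zero
    pattern = [v - centre for v in phase_means]
    seasonal = [pattern[i % m] for i in range(len(x))]
    resid = [
        xi - ti - si if ti is not None else None
        for xi, ti, si in zip(x, trend, seasonal)
    ]
    return trend, seasonal, resid


# Synthetic series: linear trend plus a zero-mean pattern of period 5.
true_pattern = [2.0, -1.0, 0.0, -2.0, 1.0]
series = [0.5 * t + true_pattern[t % 5] for t in range(30)]
trend, seasonal, resid = decompose_additive(series, 5)
print(seasonal[:5])
```

On this noise-free series the recovered seasonal pattern matches the one used to build it, and the residual is zero wherever the trend is defined.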

**Figure 2.** *Sequence diagram of earthquake time intervals.*

**Figure 3.** *Trend and seasonal decomposition of earthquake time interval series.*

The SARIMA model requires the sequence to be stationary, or stationary after differencing. After the stationarity test of Xt, we set the number of differences to d = 2. The differenced sequence is denoted Xd, and its sequence diagram is shown in **Figure 4**. The ADF test is used to determine whether the differenced data still have a trend. The test checks whether the sequence has a unit root: if the sequence is stationary, there is no unit root. After testing, the ADF test statistic of sequence Xd is 15.4634, and the p value is less than 0.05, indicating that the sequence is stationary.
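The idea behind the unit-root check can be illustrated with a simplified Dickey-Fuller regression. The sketch below is not the augmented test used in the paper: it omits the lagged difference terms, uses only a constant, and compares the t statistic with the asymptotic 5% critical value of about -2.86 for that case. The function names and the simulated series are assumptions for the example.

```python
import math
import random


def difference(x, d=1):
    """Apply the first-difference operator d times."""
    for _ in range(d):
        x = [b - a for a, b in zip(x, x[1:])]
    return x


def dickey_fuller_t(x):
    """t statistic of g in the regression  dx_t = a + g * x_{t-1} + e_t."""
    lagged = x[:-1]
    dx = difference(x, 1)
    n = len(dx)
    mean_l = sum(lagged) / n
    mean_d = sum(dx) / n
    sxx = sum((l - mean_l) ** 2 for l in lagged)
    sxy = sum((l - mean_l) * (d - mean_d) for l, d in zip(lagged, dx))
    g = sxy / sxx
    a = mean_d - g * mean_l
    rss = sum((d - a - g * l) ** 2 for l, d in zip(lagged, dx))
    se_g = math.sqrt(rss / (n - 2) / sxx)
    return g / se_g


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(300)]  # stationary by construction
walk = []
level = 0.0
for e in noise:
    level += e
    walk.append(level)  # random walk: has a unit root

t_noise = dickey_fuller_t(noise)
t_walk = dickey_fuller_t(walk)
print("white noise:", round(t_noise, 2), "random walk:", round(t_walk, 2))
print("second difference of t^2:", difference([t * t for t in range(6)], 2))
```

A statistic well below the critical value rejects the unit-root hypothesis. Production code would normally rely on a tested statistics library that implements the augmented version with lag selection and tabulated p values.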

Sequence Xd is split into a training set and a test set at a ratio of 9 to 1, and the order of the model is determined from the ACF and PACF of the data. The ACF measures the correlation between the current value of the series and its lagged values. The PACF measures the correlation between the current value and a lagged value after removing the influence of the intermediate lags. The nonseasonal order of the model is preliminarily determined from the ACF and PACF charts in **Figure 5**. The PACF can be regarded as either tailing off or cutting off after lag 9.
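For readers who want to see how the two functions are computed, the sketch below evaluates the sample ACF directly and derives the PACF from it with the Durbin-Levinson recursion. The function names and the toy series are assumptions for illustration. The band of 2/√N is a common rule of thumb for judging significance, not a value reported in this paper.

```python
import math


def acf(x, nlags):
    """Sample autocorrelations r_0 .. r_nlags."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(nlags + 1)
    ]


def pacf(x, nlags):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(x, nlags)
    out = [1.0]
    phi_prev = []  # phi_prev[j] holds phi_{k-1, j+1}
    for k in range(1, nlags + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        out.append(phi_kk)
    return out


toy = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy series, small enough to check by hand
r = acf(toy, 2)
p = pacf(toy, 2)
band = 2.0 / math.sqrt(len(toy))  # approximate two-standard-deviation band
print(r, p, band)
```

For the toy series, r_1 = 0.4 and r_2 = -0.1, so the lag-2 partial autocorrelation is (r_2 - r_1²) / (1 - r_1²) ≈ -0.31.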

The optimal order is then selected by comparing the adjusted R<sup>2</sup> of the candidate models. The fitting results of ARIMA models with different orders are shown in **Table 1**. From these results, the parameters p = 9 and q = 1 are determined.

Then, we analyze the ACF and PACF at lags that are integer multiples of s after a first-order, s-step difference of X. Taking s = 22 as an example, as shown in **Figure 6**, the ACF and PACF coefficients at lag 22 lie outside the band of two standard deviations, and the coefficients then gradually converge. We preliminarily assume that P = 1 and Q = 1 in the model. We further carry out parameter tests and LB (Ljung-Box) tests for the hypothesized model. The parameter test is used to verify the validity of the model; the LB test is used to check whether the residual sequence after model fitting is white noise.

The test results are shown in **Table 2**. The above model contains an insignificant parameter (the first-order seasonal autoregressive coefficient). Through comparison, the best seasonal orders of the model are determined to be P = 0 and Q = 1.
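The white-noise check on the residuals can be sketched with the Ljung-Box statistic, Q = N(N + 2) Σ r_k² / (N - k), summed over lags k = 1 to h. The code below computes Q only. The function name, the choice of h = 10, and the example series are assumptions, and the critical value 18.307 is the 5% point of a chi-square distribution with 10 degrees of freedom. For residuals of a fitted ARMA model, the degrees of freedom are usually reduced by the number of estimated parameters.

```python
def ljung_box_q(residuals, h):
    """Ljung-Box Q statistic using autocorrelations at lags 1..h."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [v - mean for v in residuals]
    c0 = sum(v * v for v in dev)
    total = 0.0
    for k in range(1, h + 1):
        r_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


CHI2_95_DF10 = 18.307  # 5% critical value, chi-square with 10 degrees of freedom

# A trending series is strongly autocorrelated, so it should fail the test.
trending = [float(t) for t in range(50)]
q_trend = ljung_box_q(trending, 10)
print("Q =", round(q_trend, 2), "white noise rejected:", q_trend > CHI2_95_DF10)
```

A value of Q below the critical value means the white-noise hypothesis is not rejected, which is the outcome a well-specified model should produce for its residuals.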

**Figure 4.** *Sequence diagram of the earthquake time interval after secondary difference.*

**Figure 5.** *ACF and PACF diagrams of the earthquake time interval sequence.*



**Table 1.** *Comparison of ARIMA models with different autoregressive orders.*

**Figure 6.** *ACF and PACF diagrams of the earthquake time interval sequence after secondary difference.*

**Table 2.** *Comparison of SARIMA models with different seasonal orders.*

#### **3.3 Model evaluation**

The earthquake sequences used in the study contain complex overlapping cycles of different periods. Therefore, when analyzing the periodicity of the data, different seasonal periods are selected for short-, medium-, and long-term prediction. The centered moving average of sequence Xd with span n is calculated, and the resulting sequence is denoted Xn, n ∈ (4, 45). Xn is compared with Xd to observe how well each period removes the seasonality, and models with s ∈ (4, 45) are fitted in turn. To better compare the prediction performance of different models, this paper uses the RMSE to represent the fitting error of the model:

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left(x_t - y_t\right)^2} \tag{3}$$

where x<sub>t</sub> is the true value, y<sub>t</sub> is the fitted value, and N is the number of observations. For each model that passes the parameter test and LB test, we analyze its adjusted R<sup>2</sup> value and fitting RMSE to determine the fitting effect. The AIC is also commonly used to measure the complexity and goodness of fit of statistical models; the smaller the AIC value, the better the model. The principle of model selection in this paper is to prioritize the model with a high R<sup>2</sup> value. When the difference between the R<sup>2</sup> values is not significant, the fitting RMSE values are compared to determine the optimal model, with the AIC value taken as an auxiliary reference. **Table 3** lists the fitting results of some models with different periods that have passed the parameter test and LB test.
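The three selection criteria can be written out as a short sketch. The RMSE follows Eq. (3). The adjusted R<sup>2</sup> and AIC use their textbook definitions, 1 - (1 - R²)(N - 1)/(N - k - 1) and 2k - 2 ln L, which are assumptions here because the paper does not state the exact formulas its software applies. The function names and sample values are hypothetical.

```python
import math


def rmse(true_values, fitted_values):
    """Root mean squared error as defined in Eq. (3)."""
    n = len(true_values)
    return math.sqrt(
        sum((x - y) ** 2 for x, y in zip(true_values, fitted_values)) / n
    )


def adjusted_r2(true_values, fitted_values, k):
    """Adjusted R^2 for a model with k estimated parameters (textbook form)."""
    n = len(true_values)
    mean = sum(true_values) / n
    ss_res = sum((x - y) ** 2 for x, y in zip(true_values, fitted_values))
    ss_tot = sum((x - mean) ** 2 for x in true_values)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


# Hypothetical true and fitted values, chosen so the results are easy to verify.
x_true = [1.0, 2.0, 3.0, 4.0]
y_fit = [1.5, 2.0, 3.0, 3.5]
print(rmse(x_true, y_fit))            # sqrt(0.125), about 0.3536
print(adjusted_r2(x_true, y_fit, 1))  # R^2 = 0.9, adjusted to 0.85
print(aic(-10.0, 3))                  # 26.0
```

Because the adjusted R<sup>2</sup> penalizes additional parameters, it can rank models of different orders more fairly than the raw R<sup>2</sup>.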

Based on the comparison in **Table 3**, we determine that the short-, medium-, and long-term prediction models of the seismic data are ARIMA(9, 2, 1)(0, 1, 1)<sub>6</sub>, ARIMA(9, 2, 1)(0, 1, 1)<sub>22</sub>, and ARIMA(9, 2, 1)(0, 1, 1)<sub>42</sub>, respectively. The model-fitting effect is as follows (**Figure 7**):


**Table 3.** *Comparison of SARIMA models with different periods.*

**Figure 7.** *Fitting effect of models with different periods.*
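Under the usual Box-Jenkins reading of the notation ARIMA(p, d, q)(P, D, Q)<sub>s</sub>, the orders d = 2 and D = 1 of the selected models mean that two ordinary differences and one seasonal difference at lag s are applied before the ARMA terms are fitted. The sketch below shows only this differencing step, with hypothetical function names and a synthetic series. It does not estimate any model coefficients.

```python
def lag_difference(x, lag=1):
    """Return x_t - x_{t-lag} for every t where both terms exist."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


def sarima_difference(x, d, D, s):
    """Apply (1 - B^s)^D followed by (1 - B)^d, where B is the backshift operator."""
    for _ in range(D):
        x = lag_difference(x, s)
    for _ in range(d):
        x = lag_difference(x, 1)
    return x


# Synthetic series: quadratic trend plus a repeating pattern of period 6.
pattern = [3, -1, 4, 1, -5, 9]
series = [t * t + pattern[t % 6] for t in range(40)]
stationary_part = sarima_difference(series, d=2, D=1, s=6)
print(len(stationary_part), set(stationary_part))
```

The seasonal difference removes the repeating pattern, and the two ordinary differences remove the remaining polynomial trend, so this noise-free example reduces to zeros. Each ordinary difference shortens the series by one observation and each seasonal difference by s, which is worth keeping in mind when the period is long relative to the sample size.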

