## 4. Simulation studies

In this section, we investigate through simulation two salient issues pertaining to the proposed modeling frameworks. In the first part, we explore the convergence of the MCEM algorithm through simulated examples, and investigate the finite sample distributional properties of the parameter estimators through a comprehensive simulation study. In the second part, we present a simulation study to compare the performance of the proposed ZIB models to their counterpart ZIP models in characterizing zero-inflated count time series.

### 4.1. Evaluation of the MCEM algorithm

We consider time series data simulated from four different parameter-driven models: ZIB + AR(2), binomial + AR(2), ZIB + AR(1), and binomial + AR(1). The sample size is set to 300 and the number of cases $n\_t$ for each time point is set to 30. All of the models feature the following linear predictor:

$$\operatorname{logit}(\pi\_t) = \beta\_0 + \beta\_1 x\_{1,t} + z\_t, \tag{45}$$

where $x\_{1,t}$ is a covariate series generated from a standard uniform distribution. The true parameters for the most complicated model, ZIB + AR(2), are as follows:

$$
\omega = 0.3, \beta\_0 = 2, \beta\_1 = -3, \phi\_1 = 0.8, \phi\_2 = -0.6, \text{and } \sigma = 0.5. \tag{46}
$$

For the remaining models, any parameter that does not appear in the model is set to 0. The autoregressive (AR) coefficients are chosen to ensure stationarity of the series. In fitting the models, the number of particles is set to 500 for the particle filter (N) and to 300 for the particle smoother (R). We stop the MCEM algorithm after 300 iterations. Table 1 presents the parameter estimates for the simulated data corresponding to the four parameter-driven models.
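To make the simulation design concrete, the following Python sketch generates one series from the ZIB + AR(2) setting of Eqs. (45) and (46). It is an illustration under stated assumptions rather than the code used for the study: the function name `simulate_zib_ar`, the burn-in length, the seed, and the reading of ω as the probability of a structural zero and of σ as the standard deviation of the AR innovations are all assumptions made here.

```python
import numpy as np

def simulate_zib_ar(T=300, n_trials=30, omega=0.3, beta=(2.0, -3.0),
                    phi=(0.8, -0.6), sigma=0.5, burn_in=200, seed=0):
    """Simulate a zero-inflated binomial series driven by a latent AR(p) process."""
    rng = np.random.default_rng(seed)
    p = len(phi)
    # Latent AR(p) process z_t, started at zero and run through a burn-in (assumed).
    z = np.zeros(T + burn_in + p)
    eps = rng.normal(0.0, sigma, size=z.size)
    for t in range(p, z.size):
        z[t] = sum(phi[j] * z[t - 1 - j] for j in range(p)) + eps[t]
    z = z[-T:]
    x = rng.uniform(0.0, 1.0, size=T)          # covariate x_{1,t}
    eta = beta[0] + beta[1] * x + z            # linear predictor, Eq. (45)
    pi = 1.0 / (1.0 + np.exp(-eta))            # inverse logit
    structural_zero = rng.random(T) < omega    # zero-inflation indicator (assumed form)
    y = rng.binomial(n_trials, pi)
    y[structural_zero] = 0
    return y, x, z

y, x, z = simulate_zib_ar()
```

Setting `omega=0.0` gives the plain binomial models, and passing `phi=(0.8,)` gives the AR(1) variants. The AR(2) coefficients in Eq. (46) satisfy the usual stationarity conditions, since φ₁ + φ₂ = 0.2 < 1, φ₂ − φ₁ = −1.4 < 1, and |φ₂| = 0.6 < 1.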

138 Time Series Analysis and Applications

| Model | $\omega$ | $\beta\_0$ | $\beta\_1$ | $\phi\_1$ | $\phi\_2$ | $\sigma$ |
|---|---|---|---|---|---|---|
| True | 0.300 | 2.000 | −3.000 | 0.800 | −0.600 | 0.500 |
| Binomial + AR(1) | | 1.984 | −2.968 | 0.800 | | 0.540 |
| ZIB + AR(1) | 0.283 | 2.124 | −2.930 | 0.781 | | 0.563 |
| Binomial + AR(2) | | 1.989 | −3.012 | 0.852 | −0.620 | 0.499 |
| ZIB + AR(2) | 0.293 | 1.992 | −2.872 | 0.831 | −0.576 | 0.506 |

Table 1. True and estimated parameters for the simulated examples.

Figure 1. Trace plots of the log-likelihood for fitted parameter-driven models based on simulated data.

Figure 1 shows the trace plots of the log-likelihood for the four fitted parameter-driven models. Note that the log-likelihood of the MCEM algorithm is not strictly increasing at each iteration due to the introduction of Monte Carlo errors. However, the log-likelihood stabilizes after a few dozen iterations with slight fluctuations around the maximal value. Figure 2 shows the trace plots for the parameter estimates from the most complex fitted model, ZIB + AR(2). The plots indicate that the parameter estimates converge to the MLEs quickly with negligible fluctuations. The trace plots of the parameter estimates for the other three models exhibit similar patterns (results not shown). In practice, we recommend always checking the trace plots of the estimates to assess convergence of the MCEM algorithm.
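Because Monte Carlo error makes the trace non-monotone, a stopping rule based on a single-step increase can be misleading. As a complement to visual inspection, the sketch below compares the averages of the last two windows of a trace. The function name `has_stabilized`, the window length, and the tolerance are illustrative assumptions, and the toy trace is synthetic rather than output from the fitted models.

```python
import numpy as np

def has_stabilized(trace, window=20, rel_tol=1e-3):
    """Return True if the means of the last two windows of a noisy trace agree."""
    trace = np.asarray(trace, dtype=float)
    if trace.size < 2 * window:
        return False                                   # too few iterations to judge
    recent = trace[-window:].mean()
    previous = trace[-2 * window:-window].mean()
    return bool(abs(recent - previous) <= rel_tol * max(abs(previous), 1.0))

# Synthetic trace: a fast rise to a plateau plus small Monte Carlo noise.
rng = np.random.default_rng(1)
iterations = np.arange(300)
toy_trace = -500.0 - 80.0 * np.exp(-iterations / 10.0) + rng.normal(0.0, 0.2, 300)
print(has_stabilized(toy_trace[:45]), has_stabilized(toy_trace))
```

Averaging over windows smooths out the iteration-to-iteration noise, so the check reacts to a drift in level rather than to individual fluctuations. The same function can be applied to each parameter trace.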

We next investigate the finite sample distributional properties of the parameter estimators from the MCEM algorithm. We consider the same parameter-driven models presented in the preceding simulated example. For each model structure, 500 replications are generated based on sample sizes of 200 and 500. We employ the proposed MCEM algorithm to fit models based on these replications, and record the subsequent parameter estimates and their standard errors. As the MCEM algorithm is computationally expensive, we set the number of particles for both filters and smoothers to 200, and the stopping iteration for the MCEM algorithm at 100. In Tables 2–3, we provide the simulation results based on the most complex model, ZIB + AR(2).

Figure 2. Trace plots of the estimated parameters for the fitted ZIB + AR(2) model.

In general, the mean and median of the estimates are close to the true parameter values, with a minor degree of negative bias associated with the estimation of the AR coefficients. The empirical standard deviations (ESDs) are reasonably close to the average asymptotic standard errors (ASEs), which suggests that the standard errors calculated by Louis's method are adequate. As the sample size increases from 200 to 500, the bias in the estimation of the AR coefficients attenuates, and the standard errors tend to diminish. These two behaviors are consistent with weak convergence of the estimators. The results for the other three parameter-driven models are analogous to those presented in Tables 2–3: Tables 4–5, 6–7, and 8–9 show the simulation results for the binomial + AR(2), ZIB + AR(1), and binomial + AR(1) models, respectively.
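The columns reported in the summary tables (true value, mean, median, ESD, and ASE) can be computed from the replicated fits along the following lines. This is a minimal sketch: the function name `summarize_replications` is hypothetical, the use of the sample standard deviation (`ddof=1`) for the ESD is an assumption, and the inputs in the usage example are synthetic placeholders, not results from the study.

```python
import numpy as np

def summarize_replications(estimates, std_errors, true_values, names):
    """Tabulate mean, median, ESD and ASE across replications (one row each)."""
    estimates = np.asarray(estimates, dtype=float)     # shape (replications, parameters)
    std_errors = np.asarray(std_errors, dtype=float)   # same shape as estimates
    rows = {}
    for j, name in enumerate(names):
        rows[name] = {
            "true": float(true_values[j]),
            "mean": float(estimates[:, j].mean()),
            "median": float(np.median(estimates[:, j])),
            "esd": float(estimates[:, j].std(ddof=1)),  # empirical SD of the estimates
            "ase": float(std_errors[:, j].mean()),      # average reported standard error
        }
    return rows

# Synthetic placeholder inputs: 500 "replications" scattered around the true values.
rng = np.random.default_rng(0)
true_values = np.array([0.3, 2.0, -3.0])
placeholder_se = np.array([0.03, 0.2, 0.2])
est = true_values + rng.normal(0.0, placeholder_se, size=(500, 3))
ses = np.tile(placeholder_se, (500, 1))
table = summarize_replications(est, ses, true_values, ["omega", "beta0", "beta1"])
```

Comparing the `esd` and `ase` entries for each parameter mirrors the check described above: close agreement suggests that the model-based standard errors reflect the actual sampling variability.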

The normality of the parameter estimators is assessed by Q-Q plots based on the sets of replicated estimates (figures not shown). For the most complex ZIB + AR(2) model, approximate normality holds for the finite sample distribution of the parameter estimators, with slightly non-normal tail behavior (thick or thin) evident for the estimated AR coefficients. As the sample size is increased from 200 to 500, this non-normal behavior is attenuated. Similar patterns are observed for the other three parameter-driven models.

#### State-Space Models for Binomial Time Series with Excess Zeros http://dx.doi.org/10.5772/intechopen.71336 141

Table 2. Summary statistics for replicated parameter estimates from fitted ZIB + AR(2) models with sample size 200.

Table 3. Summary statistics for replicated parameter estimates from fitted ZIB + AR(2) models with sample size 500.

Table 4. Summary statistics for replicated parameter estimates from fitted binomial + AR(2) models with sample size 200.

Table 5. Summary statistics for replicated parameter estimates from fitted binomial + AR(2) models with sample size 500.

| Parameter | True | Mean | Median | ESD | ASE |
|---|---|---|---|---|---|
| $\omega$ | 0.300 | 0.299 | 0.299 | 0.031 | 0.032 |
| $\beta\_0$ | 2.000 | 1.971 | 1.971 | 0.208 | 0.251 |
| $\beta\_1$ | −3.000 | −2.982 | −2.969 | 0.199 | 0.210 |
| $\phi\_1$ | 0.800 | 0.763 | 0.770 | 0.073 | 0.067 |
| $\sigma$ | 0.500 | 0.500 | 0.502 | 0.056 | 0.063 |

Table 6. Summary statistics for replicated parameter estimates from fitted ZIB + AR(1) models with sample size 200.

| Parameter | True | Mean | Median | ESD | ASE |
|---|---|---|---|---|---|
| $\omega$ | 0.300 | 0.299 | 0.299 | 0.020 | 0.021 |
| $\beta\_0$ | 2.000 | 1.984 | 1.989 | 0.135 | 0.168 |
| $\beta\_1$ | −3.000 | −2.992 | −2.991 | 0.133 | 0.132 |
| $\phi\_1$ | 0.800 | 0.781 | 0.785 | 0.041 | 0.040 |
| $\sigma$ | 0.500 | 0.500 | 0.499 | 0.035 | 0.040 |

Table 7. Summary statistics for replicated parameter estimates from fitted ZIB + AR(1) models with sample size 500.

| Parameter | True | Mean | Median | ESD | ASE |
|---|---|---|---|---|---|
| $\beta\_0$ | 2.000 | 2.006 | 2.024 | 0.192 | 0.233 |
| $\beta\_1$ | −3.000 | −2.987 | −2.988 | 0.165 | 0.167 |
| $\phi\_1$ | 0.800 | 0.782 | 0.788 | 0.054 | 0.056 |
| $\sigma$ | 0.500 | 0.497 | 0.496 | 0.051 | 0.052 |

Table 8. Summary statistics for replicated parameter estimates from fitted binomial + AR(1) models with sample size 200.

| Parameter | True | Mean | Median | ESD | ASE |
|---|---|---|---|---|---|
| $\beta\_0$ | 2.000 | 1.997 | 1.996 | 0.125 | 0.168 |
| $\beta\_1$ | −3.000 | −2.997 | −2.994 | 0.106 | 0.106 |
| $\phi\_1$ | 0.800 | 0.787 | 0.789 | 0.035 | 0.035 |
| $\sigma$ | 0.500 | 0.499 | 0.499 | 0.030 | 0.033 |

Table 9. Summary statistics for replicated parameter estimates from fitted binomial + AR(1) models with sample size 500.

### 4.2. Model comparison

As previously mentioned, based on a Poisson mixture distribution, extensive methodology has been published to deal with count time series with excess zeros. In addition, the Poisson distribution provides an accurate approximation to the binomial distribution when the sample size is large and the success probability is small. Therefore, one may question whether Poisson-type models are sufficient for approximating binomial-type models when data are generated from a binomial mixture distribution. In this section, we try to address this question through a simulation study.
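To illustrate when the Poisson approximation to the binomial can be expected to hold, the following sketch computes the total variation distance between a Binomial(n, p) distribution and a Poisson distribution with the same mean np. The function name `total_variation` and the truncation point `tail` are assumptions made for this example; the calculation is a side illustration and not part of the simulation study.

```python
from math import comb, exp

def total_variation(n, p, tail=200):
    """Total variation distance between Binomial(n, p) and Poisson(n * p)."""
    lam = n * p
    pois = exp(-lam)                 # Poisson pmf at k = 0
    total = 0.0
    for k in range(tail + 1):
        binom = comb(n, k) * p**k * (1.0 - p)**(n - k) if k <= n else 0.0
        total += abs(binom - pois)
        pois *= lam / (k + 1)        # recursion: P(k + 1) = P(k) * lam / (k + 1)
    return 0.5 * total

for p in (0.01, 0.1, 0.5):
    print(p, round(total_variation(30, p), 4))
```

With n = 30, the distance is small for p = 0.01 and grows as p increases. For reference, with β₀ = 2, β₁ = −3, and a covariate in [0, 1], the linear predictor in Eq. (45) lies between −1 and 2 when the latent term is zero, which corresponds to success probabilities between about 0.27 and 0.88. These values are not small, so the approximation cannot be taken for granted in this setting.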

Two different types of ZIB models are proposed in this work: the parameter-driven ZIB model, and the observation-driven ZIB model. To evaluate the propriety of the binomial-type models, we consider two corresponding Poisson-type counterparts: the parameter-driven ZIP model, and the observation-driven ZIP model. We assess the performance of the four models under two scenarios: first, where data are generated from the parameter-driven ZIB model, and second, where data are generated from the observation-driven ZIB model.

To denote the parameter-driven ZIB/ZIP model with an AR(p) latent process, we use PDZIB(p)/PDZIP(p). Similarly, we use ODZIB(p)/ODZIP(p) to denote the observation-driven ZIB/ZIP model with p lagged responses employed as covariates.

In the first scenario, data are generated from a PDZIB(2) model having the same form as that provided in Section 4.1. To reduce the computational burden associated with fitting the models, 100 replicated series of length 200 are generated. We fit four different zero-inflated models to each of the series. For the two parameter-driven models, we specify a latent autoregressive process of order two, and employ the MCEM algorithm to fit the models. For the two observation-driven models, we incorporate the lagged responses $y\_{t-1}$ and $y\_{t-2}$ to account for the temporal correlation, and employ the Newton–Raphson algorithm to fit the models.

In the second scenario, data are generated from an ODZIB(2) model featuring the following structures:

$$\operatorname{logit}(\pi\_t) = \beta\_0 + \beta\_1 x\_{1,t} + \phi\_1 y\_{t-1} + \phi\_2 y\_{t-2}, \text{ and } \operatorname{logit}(\omega) = \gamma\_0. \tag{47}$$

Here, $x\_{1,t}$ is a covariate series generated from a standard uniform distribution, and $\phi\_1$ and $\phi\_2$ are the autoregressive coefficients for the lagged responses $y\_{t-1}$ and $y\_{t-2}$, respectively. The values of the true parameters are the same as those for the parameter-driven model.

Again, we generate 100 replications of length 200 based on the preceding model. The same four zero-inflated models are fit to each of the replications. The Akaike information criterion (AIC) [30] is used to guide the selection of an optimal model in both scenarios. To evaluate the magnitude of the absolute difference in AIC values, Burnham and Anderson [31] provide the following guidelines (Table 10).

Thus, a difference in AIC values of two or more is considered meaningful, and a difference of 10 or more is considered pronounced.
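The comparison can be sketched in a few lines of Python. The AIC formula is the standard one, and the labels follow the thresholds of 2 and 10 quoted above. The function names, the log-likelihood values, and the parameter counts in the usage example are hypothetical and serve only to show the calculation.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 * log-likelihood + 2 * number of parameters."""
    return -2.0 * log_likelihood + 2.0 * n_params

def classify_aic_difference(delta):
    """Label an absolute AIC difference using the thresholds of 2 and 10."""
    delta = abs(delta)
    if delta >= 10.0:
        return "pronounced"
    if delta >= 2.0:
        return "meaningful"
    return "not meaningful"

# Hypothetical maximized log-likelihoods and parameter counts for one replicated series.
fits = {"PDZIB(2)": (-410.0, 6), "PDZIP(2)": (-440.0, 6)}
reference = aic(*fits["PDZIB(2)"])
for name, (loglik, k) in fits.items():
    delta = aic(loglik, k) - reference   # target model minus reference model
    print(name, delta, classify_aic_difference(delta))
```

A positive difference indicates that the target model has a larger AIC than the reference model and is therefore less supported by the data.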


| Difference in AIC | Level of empirical support for model with larger AIC |
|---|---|
| 0–2 | Substantial |
| 4–7 | Considerably less |
| >10 | Essentially none |

Table 10. Guidelines for assessing AIC differences.

Figure 3 illustrates the performance of the four zero-inflated models, in terms of AIC differences, when data are generated from a PDZIB(2) model. The PDZIB(2) model serves as the reference for model comparison. Each point represents the difference in the AIC value between the target model and the reference model. As evident from the figure, the PDZIB(2) model markedly outperforms the other three models for all 100 replications, with AIC differences over 50. Although vastly inferior to the PDZIB(2) model, the PDZIP(2) model performs better than the two observation-driven models. The ODZIB(2) performs the worst among the four models considered. Parameter-driven models clearly exhibit a substantial advantage over observation-driven models when the underlying data arise via a parameter-driven approach.

Figure 3. AIC differences of zero-inflated fitted models relative to parameter-driven ZIB fitted models.

Figure 4 shows the performance of the four zero-inflated models, in terms of AIC differences, when data are generated from an ODZIB(2) model. Similarly, the ODZIB(2) model serves as the reference. The ODZIB(2) model easily performs the best among all four models for all 100 replications, reflecting a substantial improvement in model fit over the other three models based on AIC differences (>20). Compared to the two parameter-driven models, the ODZIP(2) model accommodates the data much more appropriately. Between the two parameter-driven models, the PDZIB(2) model is substantially favored over the PDZIP(2) model. Thus, observation-driven models markedly outperform parameter-driven models when the underlying data arise via an observation-driven approach.

Figure 4. AIC differences of zero-inflated fitted models relative to observation-driven ZIB fitted models.

We close this section with a brief discussion of issues germane to model selection. These issues are relevant not only in evaluating the results of the preceding simulations, but also in facilitating the choice of a model in practice.

First, one may question which class of models should be considered when coping with binomial time series data with excess zeros. In the simulation sets, the fitted parameter-driven models markedly outperform the fitted observation-driven models when data are generated via a parameter-driven approach. Although parameter-driven models are computationally expensive to fit, observation-driven models do not appear to provide an adequate characterization of the data in such settings. Additionally, unlike observation-driven models, parameter-driven models provide a description of the underlying latent processes that govern the temporal correlation and zero inflation. Observation-driven models, in contrast, outperform parameter-driven models when the underlying data are generated via an observation-driven approach. In general, the selection of the class of models depends on the conceptualization of the model structure and the perceived value of recovering and investigating the underlying latent processes. However, in the context of zero-inflated count time series, since an understanding of the phenomenon that gives rise to the data will rarely inform the practitioner as to whether the parameter-driven or observation-driven conceptualization is more appropriate, we recommend the use of AIC or an alternate likelihood-based selection criterion in choosing between these two model classes.

Second, one may question which distribution should be used when dealing with count time series with excess zeros. The Poisson-type model with an offset is often considered an appropriate approximating model for a binomial-type model when the sample size is large and the success probability is low. However, in the presence of zero inflation, our simulation results indicate the necessity of using binomial-type models over their Poisson counterparts when the underlying distribution is actually a binomial mixture. In practice, if the dynamics of the phenomenon that gives rise to the data do not inform the underlying data generating distribution, we again recommend the use of AIC or another likelihood-based criterion in choosing an appropriate distribution.
