**2. Theoretical framework**

In time series analysis, the values of a variable observed at different time points or over different intervals can reasonably be regarded as a finite realization of a real-valued random process, denoted by {*X<sub>t</sub>*, *t* ∈ *T* ⊆ **R**}.

Besides the common second-order moments used to describe the random process {*X<sub>t</sub>*, *t* ∈ *T*}, such as the autocovariance and autocorrelation functions, the variogram can also be considered, and may even be preferred to the covariance function [10, 19].

Given a stochastic process {*X<sub>t</sub>*, *t* ∈ *T*} over a temporal domain *T* ⊆ **R**, the corresponding variogram is defined as

$$\gamma(t, t + h_t) = 0.5\,\operatorname{Var}\left[X_t - X_{t + h_t}\right], \qquad t,\ t + h_t \in T. \tag{1}$$

Note that a function *γ*(·) is a variogram if and only if it is conditionally strictly negative definite [23].

As is well known in the literature [3–5], time series analysis is based on the theory of stationary processes. It is worth highlighting that second-order stationarity implies intrinsic stationarity, but the converse is not true [18, 22].
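The implication can be checked in a few lines. As a sketch, write *m* for the constant mean and *C*(·) for the covariance function of a second-order stationary process (both symbols are introduced here for illustration and do not appear in the source). Then

$$
\begin{aligned}
E\left[X_t - X_{t+h}\right] &= m - m = 0, \\
0.5\,\operatorname{Var}\left[X_t - X_{t+h}\right] &= 0.5\left[C(0) + C(0) - 2\,C(h)\right] = C(0) - C(h),
\end{aligned}
$$

so both quantities depend on the lag *h* only. For the converse, a standard counterexample is the random walk $X_t = X_{t-1} + \varepsilon_t$ with $X_0 = 0$ and uncorrelated zero-mean innovations of variance $\sigma^2$: its increments give $\gamma(h) = 0.5\,\sigma^2 |h|$ for integer lags, whereas $\operatorname{Var}[X_t] = t\,\sigma^2$ grows with *t*, so no stationary covariance function exists.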

In particular, the stochastic process {*X<sub>t</sub>*, *t* ∈ *T*} is intrinsically stationary if its variogram *γ*(*t*, *t* + *h<sub>t</sub>*) depends solely on the temporal lag *h<sub>t</sub>* and the expected value of the difference (*X<sub>t</sub>* − *X*<sub>*t*+*h<sub>t</sub>*</sub>) is constant.
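Under the intrinsic hypothesis the variogram can be estimated from a single observed series by averaging squared increments at each lag. The following is a minimal sketch of this method-of-moments estimator, assuming NumPy, a regularly spaced series, and missing values coded as NaN; the function name `empirical_variogram` and its arguments are hypothetical and are not taken from the source.

```python
import numpy as np

def empirical_variogram(x, max_lag):
    """Method-of-moments variogram estimate for a regularly spaced series.

    x       : 1-D sequence of observations (NaN marks a missing value)
    max_lag : largest lag, in sampling steps, at which gamma is estimated
    Returns (lags, gamma); gamma is NaN where no complete pair exists.
    """
    x = np.asarray(x, dtype=float)
    lags = np.arange(1, max_lag + 1)
    gamma = np.full(max_lag, np.nan)
    for k, h in enumerate(lags):
        d = x[h:] - x[:-h]        # increments X_{t+h} - X_t
        d = d[~np.isnan(d)]       # drop pairs with a missing value
        if d.size > 0:
            gamma[k] = 0.5 * np.mean(d ** 2)
    return lags, gamma

# Usage on a short illustrative series (not data from the case study)
lags, gamma = empirical_variogram([0.0, 1.0, 2.0, 3.0, 4.0], max_lag=2)
print(lags, gamma)
```

For the series 0, 1, 2, 3, 4 every lag-1 increment equals 1 and every lag-2 increment equals 2, so the estimate is 0.5 at lag 1 and 2.0 at lag 2.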

The variogram, widely used in geostatistics, can also be applied effectively in time series analysis [14, 15], since

• it can describe a wider class of stochastic processes, i.e. the class of intrinsic stochastic processes, which includes the class of second-order stationary stochastic processes,





Regarding this last aspect, geostatistical techniques provide several parametric and nonparametric prediction methods, among them simple and ordinary kriging, universal kriging and indicator kriging. Further details can be found in the specialized literature [7, 12, 19]. Thus, the estimation of the unknown value *x<sub>t</sub>* of the stochastic process {*X<sub>t</sub>*, *t* ∈ *T*}, using either the data observed in the past (extrapolation mode) or the data observed before and after the time point *t* (interpolation mode), can readily be supported by geostatistical tools.
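The difference between the two modes comes down to which observations are allowed to enter the predictor. A minimal sketch, assuming NumPy and a simple nearest-in-time rule; the function `select_neighbours`, its `mode` and `k` arguments, and the selection rule itself are hypothetical illustrations and not the search strategy of any specific kriging routine:

```python
import numpy as np

def select_neighbours(times, t0, mode="interpolation", k=4):
    """Return sorted indices of up to k observed time points closest to t0.

    mode="extrapolation": only observations strictly before t0 are used.
    mode="interpolation": observations before and after t0 are allowed.
    """
    times = np.asarray(times, dtype=float)
    if mode == "extrapolation":
        candidates = np.flatnonzero(times < t0)
    elif mode == "interpolation":
        candidates = np.flatnonzero(times != t0)
    else:
        raise ValueError("mode must be 'extrapolation' or 'interpolation'")
    # Rank candidates by temporal distance to the target time point
    order = np.argsort(np.abs(times[candidates] - t0), kind="stable")
    return np.sort(candidates[order[:k]])

# Usage: the value at t = 4 is missing from the observed time points
times = [1, 2, 3, 5, 6, 7]
print(select_neighbours(times, 4, mode="extrapolation", k=2))
print(select_neighbours(times, 4, mode="interpolation", k=2))
```

In extrapolation mode the two selected points are the last two before *t* = 4, while in interpolation mode they are the points immediately before and after it.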

In the following, the ordinary kriging method and the indicator kriging approach are briefly reviewed, since these geostatistical tools are used for analyzing the variable under study.

Let *X̂<sub>t</sub>* be the linear predictor of the intrinsically stationary process {*X<sub>t</sub>*, *t* ∈ *T*}:

$$\widehat{X}_t = \sum_{i=1}^{n} \lambda_i(t)\, X_{t_i}, \tag{2}$$

where *λ<sub>i</sub>*(*t*), *i* = 1, 2, . . . , *n*, are unknown real coefficients and *X*<sub>*t<sub>i</sub>*</sub> are the random variables of the process *X* at the sampled time points *t<sub>i</sub>*. The unknown weights *λ<sub>i</sub>*(*t*), *i* = 1, 2, . . . , *n*, in (2) are obtained by solving the following kriging system

$$
\begin{bmatrix}
\gamma_{11} & \dots & \gamma_{1n} & -1 \\
\gamma_{21} & \dots & \gamma_{2n} & -1 \\
\vdots & \ddots & \vdots & \vdots \\
\gamma_{n1} & \dots & \gamma_{nn} & -1 \\
1 & \dots & 1 & 0 \\
\end{bmatrix}
\begin{bmatrix}
\lambda_{1} \\
\lambda_{2} \\
\vdots \\
\lambda_{n} \\
\mu \\
\end{bmatrix} =
\begin{bmatrix}
\gamma_{10} \\
\gamma_{20} \\
\vdots \\
\gamma_{n0} \\
1 \\
\end{bmatrix} \tag{3}
$$

where *γ<sub>ij</sub>* = 0.5 *Var*(*X*<sub>*t<sub>i</sub>*</sub> − *X*<sub>*t<sub>j</sub>*</sub>), *γ*<sub>*i*0</sub> = 0.5 *Var*(*X*<sub>*t<sub>i</sub>*</sub> − *X<sub>t</sub>*), and *µ* is the Lagrange multiplier. If *γ* is conditionally strictly negative definite, then the above system has one and only one solution.
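A minimal numerical sketch of system (3), assuming NumPy and an exponential variogram model chosen purely for illustration. The names `exponential_variogram` and `ordinary_kriging`, as well as the parameters `sill`, `time_range` and `nugget`, are hypothetical; this is not the *GSLib* routine used in the case study.

```python
import numpy as np

def exponential_variogram(h, sill=1.0, time_range=10.0, nugget=0.0):
    """Illustrative variogram model (an assumption); gamma(0) = 0."""
    h = np.abs(np.asarray(h, dtype=float))
    g = nugget + sill * (1.0 - np.exp(-3.0 * h / time_range))
    return np.where(h > 0, g, 0.0)

def ordinary_kriging(times, values, t0, variogram):
    """Build and solve system (3); return (prediction, weights, mu)."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    n = times.size
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = variogram(times[:, None] - times[None, :])  # gamma_ij
    A[:n, n] = -1.0   # Lagrange multiplier column, signed as in (3)
    A[n, :n] = 1.0    # constraint row: the weights sum to one
    b = np.append(variogram(times - t0), 1.0)               # gamma_i0, 1
    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]
    return float(weights @ values), weights, mu

# Usage with made-up observations (not data from the case study)
times = [0, 1, 2, 4]
values = [10.0, 12.0, 11.0, 15.0]
pred, w, mu = ordinary_kriging(times, values, 3.0, exponential_variogram)
print(pred, w.sum())
```

With a zero nugget the sketch reproduces an observed value exactly when the target coincides with a sampled time point, since the right-hand side of (3) then equals one column of the system matrix.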

Ordinary kriging [22] requires only knowledge of the variogram model and is used when the expected value of the process is constant and unknown. Since the kriging system can be expressed in terms of the variogram, as in (3), the kriging predictor can be used even when the stochastic process under study satisfies only the intrinsic hypothesis. Moreover, using a predictor based on the variogram, rather than on the covariance, avoids the estimation of the expected value when the latter is unknown.
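The role of the constraint in the last row of (3) can be made explicit with a short derivation, given here as a sketch. Because the weights sum to one, the prediction error is a linear combination of increments,

$$\widehat{X}_t - X_t = \sum_{i=1}^{n} \lambda_i(t)\left(X_{t_i} - X_t\right),$$

so its moments involve the variogram only and never the mean of the process. When the mean is constant, the error has zero expectation, and its variance is

$$\operatorname{Var}\left[\widehat{X}_t - X_t\right] = -\sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j \gamma_{ij} + 2\sum_{i=1}^{n} \lambda_i \gamma_{i0} = \sum_{i=1}^{n} \lambda_i \gamma_{i0} - \mu,$$

where the last equality follows from the first *n* rows of (3), that is $\sum_{j} \lambda_j \gamma_{ij} = \gamma_{i0} + \mu$, with the sign convention for *µ* used in that system.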

The usefulness of geostatistical techniques in time series analysis can be appreciated through nonparametric estimation of the variable under study.

The kriging approach, based on knowledge of the variogram, leads naturally to nonparametric estimation [17]. Indicator kriging is a nonparametric approach for estimating the posterior cumulative distribution function (c.d.f.) of the variable under study at an unsampled point [16, 25, 26].

In this context, given the observed time series *x*<sub>*t<sub>i</sub>*</sub>, *t<sub>i</sub>*, *i* = 1, 2, ..., *n*, the conditional probability *Prob*{*X<sub>t</sub>* ≤ *x* | ℋ<sub>*n*</sub>}, with ℋ<sub>*n*</sub> = {*x*<sub>*t<sub>i</sub>*</sub>, *t<sub>i</sub>*, *i* = 1, 2, ..., *n*}, is interpreted as the conditional expectation of an indicator random field *I*(*t*; *x*) [27], that is

$$\operatorname{Prob}\left\{X_t \le x \mid \mathcal{H}_n\right\} = E\left[I(t; x) \mid \mathcal{H}_n\right]$$

where

$$I(t; x) = \begin{cases} 1, & \text{if } X_t \le x \\ 0, & \text{if } X_t > x. \end{cases}$$
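The indicator approach can be sketched numerically as follows, assuming NumPy. The observations are coded as indicators at each threshold, ordinary kriging weights are applied to them, and the resulting values are corrected so that they form a valid c.d.f. For brevity a single variogram is shared by all thresholds, in the spirit of median indicator kriging; this simplification, the linear variogram in the usage lines, and the names `indicator`, `ok_weights` and `indicator_kriging_cdf` are assumptions made for illustration, not the procedure of the case study.

```python
import numpy as np

def indicator(values, threshold):
    """I(t; x): 1 where the observation is <= threshold, 0 otherwise."""
    return (np.asarray(values, dtype=float) <= threshold).astype(float)

def ok_weights(times, t0, variogram):
    """Ordinary kriging weights from a system of the form (3)."""
    times = np.asarray(times, dtype=float)
    n = times.size
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = variogram(times[:, None] - times[None, :])
    A[:n, n] = -1.0
    A[n, :n] = 1.0
    b = np.append(variogram(times - t0), 1.0)
    return np.linalg.solve(A, b)[:n]

def indicator_kriging_cdf(times, values, t0, thresholds, variogram):
    """Estimate Prob{X_t0 <= x} for each x in thresholds (ascending)."""
    w = ok_weights(times, t0, variogram)
    probs = np.array([w @ indicator(values, x) for x in thresholds])
    probs = np.clip(probs, 0.0, 1.0)       # keep estimates in [0, 1]
    return np.maximum.accumulate(probs)    # enforce a nondecreasing c.d.f.

# Usage with made-up values and a linear variogram gamma(h) = |h|
times = [0, 1, 2, 3]
values = [5.0, 20.0, 35.0, 50.0]
cdf = indicator_kriging_cdf(times, values, 1.5, [10.0, 30.0, 60.0], np.abs)
print(cdf)
```

The clipping and the running maximum are a simple way to handle order-relation deviations, which can occur because kriging weights are not constrained to be nonnegative.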

In the case study presented hereafter, ordinary kriging and indicator kriging are applied to interpolate and predict an environmental variable. Note that a *GSLib* routine for kriging, named "KT3DP" [12], has been used in order to define appropriate temporal search neighborhoods in the presence of periodicity, since environmental time series, such as those for air pollution data, are usually characterized by periodic behavior. Hence, the use of periodic and nonperiodic variogram models has been proposed through two different approaches:

