**2. The mixture transition distribution (MTDg) and B-MTD models: mathematical formulations in brief**

#### **2.1 The MTDg model with interactions between the covariates**

The MTD model with covariates was discussed in [6] and developed in [15, 19] to incorporate interactions between the covariates (e.g. rainfall, temperature

*Mixture Transition Distribution Modelling of Multivariate Time Series of Discrete State… DOI: http://dx.doi.org/10.5772/intechopen.88554*

variants in the case study discussed). The high-order MTD transition probabilities are computed as follows:

$$P(X\_t = i\_0 | X\_{t-1} = i\_1, ... \\ X\_{t-f} = i\_f, \mathcal{C}\_1 = c\_1, ... \\ \mathcal{C}\_\epsilon = c\_\epsilon, \mathcal{M}\_1 = m\_1, ... \\ \mathcal{M}\_l = m\_l)$$

$$= \sum\_{g=1}^f \lambda\_{g \neq i\_f \dot{o}} + \sum\_{h=1}^\epsilon \lambda\_{f+h} d\_{h \dot{y}\_h \dot{o}\_0} + \sum\_{u=1}^l \lambda\_{f+\epsilon+u} \mathfrak{s}\_{uv\_u \dot{o}\_0} \tag{1}$$

where *λf*þ*e*þ*<sup>u</sup>* is the weight for the interaction term, *qig <sup>i</sup>*<sup>0</sup> is the transition probability from modality *ig* observed at time *t*�*g* and modality *i*<sup>0</sup> observed at time *t* in the transition matrix *Q*, *suvui*<sup>0</sup> is transition probability between covariate *h*<sup>1</sup> and covariate *h*<sup>2</sup> interaction term (*vu* ¼ *dh*1*<sup>j</sup> h*1 � *dh*2*<sup>j</sup> h*2 ) and *Xt*, and where P*f*þ*e*þ*<sup>l</sup> <sup>g</sup>*¼<sup>1</sup> *<sup>λ</sup><sup>g</sup>* <sup>¼</sup> <sup>1</sup> and where *λ<sup>g</sup>* ≥ 0.

We refer the reader to [6], and further works in the seminal book by Hudson and Keatley [7] for further mathematical details.

### **2.2 The bivariate mixture transition distribution (B-MTD)**

Let f g *Xt* and f g *Yt* be sequences of random variables (say two (flowering intensity) time series) taking values in the finite set *N* = {1, …, *k*}. In a *f* th-order Markov chain, the probability that *Xt* f g *; Yt* ¼ *i*0*; i* 0 0 � �, (*i*0*, i*<sup>0</sup> <sup>0</sup> ∈ *N*) depends on the combination of values taken by *Xt*�*<sup>f</sup> ,* …*, Xt*�<sup>1</sup>*, Yt*�*<sup>f</sup> ,* …*, Yt*�1. In the MTD model, the contributions of the different lags are combined additively. Then a bivariate MTD model, which we denote by B-MTD, has the following formulation:

$$P\left(\left\{\mathbf{X}\_{t},\mathbf{Y}\_{t}\right\}=\left\{i\_{0},i'\_{0}\right\}\middle|\mathbf{X}\_{t-1}=i\_{1},\ldots,\mathbf{X}\_{t-f}=i\_{f},\mathbf{Y}\_{t-1}=i'\_{1},\ldots,\mathbf{Y}\_{t-f}=i'\_{f}\right)\tag{2}$$

$$=\sum\_{\mathbf{g}=1}^{f}\lambda\_{\mathbf{g}}q\_{i\_{\mathbf{i}},i'\_{\mathbf{j}},i\_{0},i'\_{0}}$$

where *if ,* …*, i*0*, i*<sup>0</sup> *<sup>f</sup> ,* …*, i*<sup>0</sup> <sup>0</sup> ∈ *N*, the probabilities *qig ,i* 0 *<sup>g</sup> ,i*0*,i* 0 0 are elements of an *<sup>m</sup>*<sup>2</sup> � *<sup>m</sup>*<sup>2</sup> transition matrix *<sup>Q</sup>* <sup>¼</sup> � *qig ,i* 0 *<sup>g</sup> ,i*0*,i* 0 0 � , each row of which is a probability distribution (i.e. each row sums to 1 and the elements are nonnegative) and *λ* ¼ *λ<sup>f</sup> ;* …*; λ*<sup>1</sup> � �<sup>0</sup> is a vector of lag parameters. To ensure that the results of the model are probabilities, that is, 0≤ P*<sup>f</sup> <sup>g</sup>*¼<sup>1</sup> *<sup>λ</sup>gqig <sup>i</sup>* 0 *<sup>g</sup> i*0*i* 0 0 ≤1 the vector *λ* is subject to the constraints P*<sup>f</sup> <sup>g</sup>*¼<sup>1</sup> *<sup>λ</sup><sup>g</sup>* <sup>¼</sup> <sup>1</sup> and *λ<sup>g</sup>* ≥0.

Covariates and interaction terms can be added to the bivariate MTD (B-MTD) as follows:

$$P(\{X\_{1,t},...,X\_{n,t}\} = \{i\_{1,0},...,i\_{n,0}\}|X\_{t-1} = i\_1,...,X\_{t-f} = i\_f, Y\_{t-1} = i'\_1,...,Y\_{t-f} = i'\_f,$$

$$\mathbf{C}\_1 = c\_1,...,\mathbf{C}\_\epsilon = c\_\sigma,\\M\_1 = m\_1,...,M\_l = m\_l)$$

$$= \sum\_{g=1}^f \lambda\_g q\_{i\_{1,t},...,i\_{n,t},i\_{1,0},...,i\_{n,0}} + \sum\_{h=1}^\epsilon \lambda\_{f+h} d\_{hj\_s,i\_{t,0},...,i\_{n,0}} + \sum\_{u=1}^l \lambda\_{f+\epsilon+u'} s\_{w\_\*i\_{t,0},...,i\_{n,0}}\tag{3}$$

overlap within or between species, due to the shortage of phenological data in Australia [3, 4, 7]. This chapter examines flowering synchrony over a 31 year period, 1940–1970, at the population level among four eucalypts species—*Eucalyp-*

A new approach to assess synchronicity developed in this chapter is a novel bivariate extension of the of the MTDg model [6, 9] (we coin this B-MTD). The aim of this chapter is to test mixture transition distribution (MTD) and an extended MTDg with interactions; and a novel bivariate extension of MTD (B-MTD) to investigate synchrony in phenological data. The MTDg model [6, 9] was the first approach developed to study the multivariate relationship between the probability of flowering with two states of rain and mean temperature via a mixture transition distribution (MTD), assuming, however a different transition matrix from each lag to the present time (our MTDg analysis), thus generalising the MTD approach in [13], (see also [10]) which led to the development of the MARCH software to perform MTD without covariates [11, 12]. The MTDg model is different to MARCH not only in terms of incorporating interactions between the covariates but also in its minimization process, namely using the AD Model BuilderTM [14], which uses auto-differentiation as a minimisation tool. This is shown to be computationally less intensive than MARCH. The assumption Berchtold's MTD model, namely the assumed equality of the transition matrices among different lags, was a strong assumption, so the idea of the mixture transition distribution model was to consider independently the effect of each lag to the present instead of considering the effect of the combination of lags as in pure Markov chain processes. Specifically, an extended model for MTDg analysis which accommodates interactions was developed in [6], and applied to MTDg modelling of the flowering of

*tus leucoxylon, E. microcarpa, E. polyanthemos* and *E. tricarpa* [3–8].

*Probability, Combinatorics and Control*

four eucalyptus species studied in this chapter, as multivariate time series.

differently in that *E. microcarpa* flowers at high temperature.

statistic, incorporating MTDg residuals [17–19].

**mathematical formulations in brief**

**54**

This work extends both MARCH and the work in [15, 16] to allow for differing transition matrices among the lags, i.e. our B-MTD method builds on this approach of the MTDg with interaction model [6, 9]. The MTDg model with interactions showed that the flowering of *E. leucoxylon* and *E. tricarpa* behave similarly with temperature (both flower at low temperature) and both have a positive relationship with flowering intensity 11 months ago. The flowering of *E. microcarpa* behaves

Our B-MTD approach developed in this chapter allows us the derive rules of thumb for synchrony and asynchrony between pairs of species. The latter B-MTD rules are based on transition probabilities between all possible on and off flowering states from previous to current time. Synchronisation is also tested using residuals from the resultant models via an adaptation of Moran's [17, 18] classical synchrony

We also apply the earlier MTDg modelling in [6] using climate covariates and lagged flowering states as predictors to model flowering states (on/off) and thus assess synchronisation using an adaptation of the approach of Moran to the resultant MTDg model and fitted residuals. We compare these MTDg (with covariates) based synchrony measures with our B-MTD results in addition to those using the extended Kalman filter (EKF) [15, 19], based residuals obtained earlier in [21].

**2. The mixture transition distribution (MTDg) and B-MTD models:**

to incorporate interactions between the covariates (e.g. rainfall, temperature

The MTD model with covariates was discussed in [6] and developed in [15, 19]

**2.1 The MTDg model with interactions between the covariates**

where *λf*þ*e*þ*<sup>u</sup>* is the weight for the interaction term, *suvui*1*,*0*,*…*,in,*<sup>0</sup> is the transition probability between covariate *h*<sup>1</sup> and covariate *h*<sup>2</sup> interaction term (*vu* ¼ *dh*1*<sup>j</sup> h*1 � *dh*2*<sup>j</sup> h*2 ) and *Xt* ð Þ *; Yt* , and where <sup>P</sup>*f*þ*e*þ*<sup>l</sup> <sup>g</sup>*¼<sup>1</sup> *<sup>λ</sup><sup>g</sup>* <sup>¼</sup> 1. For example, if both *Xt* and *Yt* are time series that constitute random realizations of two states {0, 1} and the covariates *C*1*,* …*, Ce* are also defined by bivariate states {0, 1}, then the set of all possible states for *Xt* ð Þ *; Yt* is {(0, 0), (0, 1), (1, 0), (1, 1)}. Hence the transition matrix *Q* ¼ *qig <sup>i</sup>* 0 *<sup>g</sup> i*0*i* 0 0 h i is a 4 � 4 matrix as specified below.

• Calculate the residuals for say, *E. leucoxylon* (Leu) using its AR (2) model.

*Mixture Transition Distribution Modelling of Multivariate Time Series of Discrete State…*

• Calculate residuals for say, *E. tricarpa* (Tri) using this same model. We

• Calculate the Pearson correlation coefficient between the residual series R1

Further details on how Moran's method is used and adapted in the case of the MTDg-based models are given in [15] (see also Section 4.8). We use the functionals and parameterisations from the mixture transition distribution (MTD) analysis as the basis of our EKF modelling approach. EKF is likewise a method to estimate the past, present and future status of non-linear time series data by minimising the mean square error. We will also test whether EKF better detects asynchronous species pairs, given EKF estimates the Kalman gain and covariance matrix at each

Flowering data were sourced from the Box-Ironbark Forest near Maryborough,

Flowering intensity scores were dichotomised into two discrete states, namely on and off (1/0) flowering (**Figure 1**) as in [6]. One temperature variant, mean monthly diurnal temperature (MeanT), in addition to the monthly rainfall (Rain) were included as climate covariates in the MTDg models; along with the temperature by rain interaction effect. We used discrete state low/high (lower than median temperature *vs* higher than median temperature) for the temperature variable dichotomies and less/more (less than the median rainfall *vs* more than the median

Victoria, in particular the flowering records of *E. leucoxylon*, *E. microcarpa*, *E. polyanthemos* and *E. tricarpa* (1940 and 1971). Flowering intensity was calculated by using a rank score (from 0 to 5) based on the quantity and distribution of flowering

We denote this residual series by R1.

and R2 and test for its significance at *p* < 0.05.

denote this residual series by R2.

*DOI: http://dx.doi.org/10.5772/intechopen.88554*

time point [15, 19].

**3. Data**

[4, 20, 23].

**Figure 1.**

**57**

*Flowering of the four eucalypts species.*


The transition matrices *Dh* ¼ *dhjhi*0*<sup>i</sup>* 0 0 h i, *<sup>h</sup>* = 1,…, *<sup>e</sup>*, are 2 � 4 matrices as below.


### **2.3 Synchrony analysis using Moran's approach**

Moran in [17, 18] suggested that if two series xt and *yt* are synchronous, and if *xt* can be estimated by a model *f*(*x*), the residuals from series *xt* fitted to *f*(*x*), and the residuals from series *yt*, fitted with the same model, but with observations, *yt*, then, *f* (*y*) will be positively correlated. The synchrony of two series can then be examined by testing the significance of the correlation of these two series of residuals (using the same model). Moran used an autoregressive integrated moving average (ARIMA) model to test synchrony. Moran's theorem suggests that if two (or more) populations sharing a common linear density-dependence (in a so-called renewal process) are disturbed with correlated noise, they will become synchronised with a correlation matching the noise correlation (see details in [4], and also [6, 15, 21]).

In this chapter we adopt the *k*th order linear stochastic difference to assess synchrony. Goodness of fit of the second order AR (*k* = 2) model is obtained. The series of residuals can then be found by subtracting the predicted (fitted species) value from the observed series. In summary, synchrony (or otherwise) of two series can be established by performing a test of significance on the correlation coefficient calculated from the two series of residuals as follows:

*Mixture Transition Distribution Modelling of Multivariate Time Series of Discrete State… DOI: http://dx.doi.org/10.5772/intechopen.88554*


Further details on how Moran's method is used and adapted in the case of the MTDg-based models are given in [15] (see also Section 4.8). We use the functionals and parameterisations from the mixture transition distribution (MTD) analysis as the basis of our EKF modelling approach. EKF is likewise a method to estimate the past, present and future status of non-linear time series data by minimising the mean square error. We will also test whether EKF better detects asynchronous species pairs, given EKF estimates the Kalman gain and covariance matrix at each time point [15, 19].
