We can identify two main groups in time series econometrics: univariate time series analysis, concerned with techniques for the analysis of dependence in adjacent observations, which has grown in importance since 1970 based on the main ideas underlying [1]; and multivariate time series analysis, based on the vector autoregressive (VAR) models made popular by [2]. In the first group, we find the autoregressive integrated moving average (ARIMA) models and the related generalized autoregressive conditional heteroscedasticity (GARCH) models developed by [3]. The second group is a generalization of the AR models, within which we find two important developments: cointegration, proposed by [4], which focuses on finding a statistical relationship between variables; and the noncausality test developed by [5], which builds on the concept of predetermination to test whether one variable causes another. Much of the development in time series econometrics is found in books such as [6–18].

In summary, dependence and causation are two important topics in time series econometrics and time series analysis. These topics are related to the importance of inference and forecasting in social sciences. Econometrics has focused on developing powerful tests that take into account the small samples typically available. Most of these developments are based on linear models, even if there are some developments considering nonlinearities; see for instance [19, 20].

Time series analysis in econometrics is mostly based on observations belonging to the set of real numbers. Some variables can be categorical, such as dummy variables. However, in this chapter, we discuss a different approach known as symbolic time series analysis (STSA). It was originally applied in physics and engineering as a statistical methodology to detect the underlying dynamics of highly noisy time series. Its application to social sciences such as economics or finance is very recent, and there are some novel developments.

As mentioned before, the application of STSA in social sciences requires a different approach due to data limitations. In this sense, the design of powerful tests considering the availability of data is crucial. Since dependence and causation are two central topics, we review an independence test and a first approach to testing noncausality, both based on STSA. Information theory was adopted as the framework for analyzing the symbolic time series, with an approximation of the Shannon entropy as the key measure applied to test design.

The chapter is organized as follows. Section 2 presents the symbolic time series approach and its relation to symbolic dynamics. In Section 3, we review some of the literature on STSA applied to the sciences. In Section 4, the information theory approach and the Shannon entropy measure are explained. Section 5 presents a review of the symbolic independence test. Section 6 focuses on a causality test based on STSA. Section 7 discusses the difference between the proposed symbolic noncausality test and the traditional and well-known Granger noncausality test. Finally, in Section 8, we draw some conclusions and present some future lines of research.

108 Time Series Analysis and Applications

2. Symbolic time series analysis

The concept of symbolization has its roots in dynamical systems theory, particularly in the study of nonlinear systems, which can exhibit bifurcation and chaos. In [21], it is asserted that

3. STSA in applied sciences

In [23], the applications of STSA techniques to different fields of science are reviewed. According to the authors, the different applications suggest that symbolization can increase the efficiency of finding and quantifying information from the systems. Mechanical systems were one of the first areas where symbolic analysis was successfully used to characterize complex dynamics. In [24–26], symbolic methods are applied to the analysis of experimental combustion data from internal combustion engines; the objective was to study the onset of combustion instabilities as the fueling mixture was leaned. STSA has also been applied in astrophysics and geophysics. For instance, [27] analyzes weak reflected radar signals from the planet Venus to measure its rotational period, and [28] utilizes a binary symbolization to analyze solar flare events. Biology and medicine is another field where STSA has been applied, most notably for laboratory measurements of neural systems and for the clinical diagnosis of neural pathologies. In [29, 30], equal-sized interval partitions are applied to electroencephalogram (EEG) signals to identify seizure precursors. [31] proposes a new damage localization method based on STSA to detect and localize a gradually evolving deterioration in a system, asserting that the method could be suitable for real-time observation applications such as structural health monitoring. In [32], STSA is used to study human gait dynamics; the results can have implications for modeling physiological control mechanisms and for quantifying human gait dynamics in physiological and stressed conditions. In [33], heart-rate dynamics is studied using partitions aligned on the data mean and 1 and 2 sample standard deviations, for a symbol-set size of 6. In [34], the prevalence of irreversibility in the human heartbeat is analyzed applying STSA.

The application of symbolization to fluid flow measurements has spanned a wide range of data types, from global measurements of flow and pressure drop, to the formation and coalescence of bubbles and drops, to spatiotemporal measurements of turbulence. In [35], an approach for transforming images of complex flow fields (as well as other textured fields) into a symbolic representation is developed. In [36], STSA is applied to networks of genes, which are important for the normal development and function of organisms; information about the structure of the genome of humans and other organisms is increasing exponentially. In [37], equiprobable symbols are used for analyzing measurements from free liquid jets in order to readily discriminate between random and nonrandom behavior. In [38], STSA is applied to the detection of incipient faults in commercial aircraft gas turbine engines. In [39], combustion instability in a swirl-stabilized combustor is investigated using STSA. Chemistry-related applications of symbolic techniques have been developed for chemical systems involving spontaneous oscillations or propagating reaction fronts. In [40], a type of symbolization is applied to improve the performance of Fourier-transform ion-cyclotron mass spectrometry. Artificial intelligence, control, and communication are other fields where symbolization has been incorporated. In [41], a phase-space partitioning is used to model communication. An example application of symbolization to communication is found in [42], utilizing small perturbations to encode messages in oscillations of the Belousov-Zhabotinsky (BZ) reaction. In robotics, a statistical learning method based on symbolic time series has been developed in [43] to construct generative models of the gaits (i.e., the modes of walking) of a robot; the efficacy of the proposed algorithm is demonstrated by laboratory experimentation to model and then infer the hidden dynamics of different gaits of the T-hex walking robot. In [44], an algorithm to intuitively cluster groups of agent trails from networks based on STSA is proposed. The authors assert that temporal trails generated by agents traveling to various locations at different time epochs are becoming more prevalent in large social networks. The algorithm was applied to real-world network trails obtained from merchant marine ships' GPS locations; it is able to intuitively detect and extract the underlying patterns in the trails and form clusters of similar trails.

The methods of data symbolization have also been applied to data mining, classification, and rule discovery. In [45], rule discovery techniques are applied to real-valued time series via a process of symbolization. Finally, we find some applications of STSA in social science. In [46–48], STSA and the minimal spanning tree (MST) are applied to construct clusters of financial assets, with application to portfolio theory. Utilizing a similar methodology, in [49], the dynamics of the exchange market is studied, and in [50], the international hotel industry in Spain is analyzed. In [51, 52], STSA and entropy are applied to measure informational efficiency in financial markets.

4. Information theory and Shannon entropy

The term entropy was first used by Rudolf Clausius in [53] in relation to the second law of thermodynamics. Subsequently, communication theory [54] used the Shannon entropy as a measure of uncertainty, where the maximum entropy corresponds to the maximum degree of uncertainty. In this sense, a random process takes the maximum entropy value. In fact, the English language is not a random process; patterns such as "THE" are more probable than sequences such as "DXC". Note that in a random process, the two sequences would have the same probability. This principle is very relevant, because if a symbolic string is random, its entropy should be at the maximum.

The entropy measure (H) must meet the following conditions, as stated in [54]: (1) H should be continuous in the probabilities pi; (2) if all the pi are equal (pi = 1/n), H should be a monotonically increasing function of n; and (3) if a choice is broken down into two successive choices, the original H should be the weighted sum of the individual values of H.

In [54], the Shannon entropy function is proposed:

$$H\_n(P) = -\sum\_{i=1}^{n} p\_i \log\_2(p\_i) \tag{1}$$

The entropy is frequently measured in bits by using log base 2, satisfying all the properties already mentioned. Note that the maximum property is confirmed by solving the Lagrangian expression (2).

$$-\sum\_{i=1}^{n} p\_i \log\_2(p\_i) - \lambda \left(\sum\_{i=1}^{n} p\_i - 1\right) \tag{2}$$
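Differentiating the Lagrangian with respect to any pi and setting the result to zero confirms the maximum property:

$$\frac{\partial}{\partial p\_i}\left[-\sum\_j p\_j \log\_2(p\_j) - \lambda\Big(\sum\_j p\_j - 1\Big)\right] = -\log\_2(p\_i) - \frac{1}{\ln 2} - \lambda = 0$$

The first-order condition is identical for every i, so all the pi take the same value; the constraint that the probabilities sum to 1 then forces pi = 1/n, the equiprobable case.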

The Shannon entropy is concave, with a global maximum when all the probabilities are equal. In addition, when pi = 0, the convention that 0·log(0) = 0 is used; thus, adding zero-probability terms does not change the entropy value.

In order to clarify the concept of Shannon entropy, consider two possible events with respective probabilities p and q = 1 − p. The Shannon entropy is then defined by Eq. (3).

$$H = -(p\log\_2\left(p\right) + q\log\_2\left(q\right))\tag{3}$$

Figure 1 shows the shape of the function; note that the maximum is obtained when each event has probability 0.5. This case corresponds to a random event; on the other hand, a certain event (when the probability of one event is 1) produces entropy equal to 0.

Figure 1. Shape of the Shannon entropy function. Note that maximum happens when the process is random (p = 0.5).
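The curve in Figure 1 can be reproduced numerically; a minimal sketch (the function name `binary_entropy` is ours, not from the chapter), using the 0·log(0) = 0 convention:

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits, of a two-event process with probabilities p and 1 - p."""
    # Convention: terms with zero probability contribute nothing.
    terms = [x * math.log2(x) for x in (p, 1.0 - p) if x > 0]
    return -sum(terms)

# The maximum over a grid of probabilities is reached at p = 0.5,
# and a certain event (p = 1) yields zero entropy.
grid = [i / 100 for i in range(1, 100)]
best = max(grid, key=binary_entropy)
```

Evaluating the function at the extremes reproduces the two landmarks of the figure: one bit of uncertainty at p = 0.5 and no uncertainty for a certain event.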

In general, [55] showed that any measure satisfying all the properties must take the following form:

$$-c\sum\_{i=1}^{n} p\_i \log\_2(p\_i) \tag{4}$$

In order to normalize the Shannon entropy, c usually takes the value 1/log2(n), allowing comparison of event sets of different sizes.
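The normalized measure can be sketched directly from a probability vector (`normalized_entropy` is an illustrative name of ours); with c = 1/log2(n), the result lies between 0 and 1, reaching 1 only for the uniform distribution:

```python
import math

def normalized_entropy(probs):
    """Shannon entropy scaled by c = 1/log2(n), so a uniform
    distribution over n events gives exactly 1."""
    n = len(probs)
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    return h / math.log2(n)

uniform = [0.25] * 4            # maximum uncertainty over 4 events
skewed = [0.7, 0.1, 0.1, 0.1]   # one dominant event: less uncertainty
```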

5. Symbolic independence test

STSA seems to perform well when testing for independence in time series. A variety of dynamical processes are present in economics: linearity, nonlinearity, deterministic chaos, and stochastic models have all been applied when modeling a complex reality. In [56], a runs test is designed, noting that the problem of testing randomness arises frequently in the quality control of manufactured products. Detecting dependence in time series is an essential task for econometricians and applied economists. In [57], the well-known BDS test is introduced, considered a powerful test to detect nonlinearity. In [58], a simple and powerful test based on STSA is proposed and the results are compared with the BDS and runs tests. On the one hand, it is found that BDS is not able to detect processes such as the chaotic Anosov map and the stochastic nonlinear sign (NLSIGN), nonlinear autoregressive (NLAR), and nonlinear moving average (NLMA) models. On the other hand, the runs test cannot detect the chaotic Anosov map, the logistic process, the bilinear process, or the NLAR and NLMA stochastic processes. The experiments show that the test based on STSA has no problem detecting all these dynamics. It is concluded that the proposed test is simple, easy to compute, and powerful with respect to the other two tests. In particular, for small samples, it is the only one able to detect models such as the chaotic Anosov map and the NLMA. Besides, the test is applied to financial time series to detect nonlinearity in the residuals after fitting a GARCH model. In this case, the BDS test rejects independence few times, whereas the SRS test still detects nonlinearity in the residuals. It seems that BDS considers the GARCH(1,1) model a good model most of the time; however, the symbolic test suggests that GARCH(1,1) would not capture all the nonlinear components.

Here, we briefly review the test and repeat some experiments, comparing the results with the well-known BDS and runs tests. First, let us consider a finite time series {xt}, t = 1,2,…,T\*, generated by an independent or random process. Define a partition of the series into "a" equiprobable regions, obtaining the symbolized time series {st}, t = 1,2,…,T\*, where each symbol st takes a symbolic value from the alphabet A = {A1,A2,…,Aa}. Since we want to derive a general statistic for different alphabet sizes a and different subsequence lengths w, we have to make two considerations: (1) from now on, we call n the quantity of possible events, that is, n = a<sup>w</sup>, where the simplest case (w = 1) implies n = a, so the quantity of events equals the symbol-set size; (2) in practice, we have a finite sample size T\*; there is no problem for w = 1, but when we compute subsequences or time windows of w consecutive symbols, we lose observations. For example, when we compute the frequency of two consecutive symbols, we have a total sample size T\* − 1. In general, we can define the sample size T = T\* − w + 1; again, for the trivial case w = 1, T = T\*.

Note that, defining Si for i = 1,2,…,n as the total count of event i in the time series, we can derive the multidimensional variable S = {Si/T}, distributed as a multinomial with E(Si/T) = 1/n, Var(Si/T) = (1/n)(n − 1)/(nT), and Cov(Si/T, Sj/T) = −(1/n)(1/(nT)) ∀ i ≠ j. As we will see, the frequencies of the events are central to the statistic, and the vector of the n frequencies Si/T can be approximated by a multivariate normal distribution N(1/n, σ<sup>2</sup>Σ), where σ<sup>2</sup> = 1/(nT) and Σ is an idempotent matrix as in (5)


$$
\Sigma\_{n \times n} \equiv \begin{bmatrix}
(n-1)/n & -1/n & \dots & -1/n \\
-1/n & (n-1)/n & \dots & -1/n \\
\vdots & \vdots & \ddots & \vdots \\
-1/n & -1/n & \dots & (n-1)/n
\end{bmatrix} \tag{5}
$$
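The multinomial moments of the event frequencies can be checked by simulation; in this sketch (all names are ours) we draw i.i.d. symbolic series and compare the empirical mean and variance of S1/T with E(Si/T) = 1/n and Var(Si/T) = (n − 1)/(n²T):

```python
import random

random.seed(42)
alphabet = list("ABCD")              # a = n = 4 in the simplest case, w = 1
n, T, reps = len(alphabet), 500, 2000

freqs = []                           # empirical S_1 / T across replications
for _ in range(reps):
    series = random.choices(alphabet, k=T)   # i.i.d. uniform symbols
    freqs.append(series.count("A") / T)

mean = sum(freqs) / reps
var = sum((f - mean) ** 2 for f in freqs) / reps
# theory: mean close to 1/n = 0.25, var close to (n - 1)/(n**2 * T) = 0.000375
```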

For convenience, we can define the normalized vector variable {εi} = {(Si/T) − (1/n)}, i = 1,2,…,n, having a multivariate normal distribution N(ø, σ<sup>2</sup>Σ), where ø is the null vector. Then, the statistic can be defined as a quadratic form in random normal variables (6).

$$\left\{\frac{\sum\_{i=1}^{n} \varepsilon\_i^2}{\sigma^2}\right\}\tag{6}$$

In [58], the distribution of quadratic forms in normal variables presented in [59] is applied. X = (ε1/σ, ε2/σ, …, εn/σ) is distributed as multivariate normal N(ø, Σ). The theorem indicates that tr(ΑΣ) = n − 1, and thus X′ΑX is distributed chi-square with (n − 1) degrees of freedom. In this case, Α is the identity matrix I, and Σ is symmetric, singular, and idempotent. Remembering that σ<sup>2</sup> = 1/(nT), we obtain the distribution of the symbolic randomness statistic (SRS) as in (7).

$$\text{SRS} \equiv Tn \sum\_{i=1}^{n} \left( \frac{S\_i}{T} - \frac{1}{n} \right)^2 \ \text{asymptotically distributed as } \chi^2\_{n-1} \tag{7}$$

Note that in practice computing the statistic is very simple. We just have to choose the symbol-set size (a) and the subsequence length (w) and compute the frequencies of each of the n = a<sup>w</sup> events in the time series.

The algorithm to compute the test is as follows:

Step 1: Considering the time series {xt}, t = 1,2,…,T\*, compute the empirical distribution, and define equiprobable regions according to the quantity of symbols, or alphabet size.

Step 2: According to the partition, translate {xt}, t = 1,2,…,T\* into {st}, t = 1,2,…,T\*, the symbolic time series for w = 1.

Step 3: Compute different symbolic time series for different lengths w; remember that the series obtained in step 2 corresponds to w = 1.

Step 4: For each w, compute the frequency of the n different events Si/T for i = 1,2,…,n.

Step 5: For each w, compute SRS(a,w) = Tn·{Σ(Si/T − 1/n)<sup>2</sup>}, as shown in Eq. (7).

Step 6: Compare SRS(a,w) with the chi-square with n − 1 degrees of freedom at the 0.05 significance level, under the independence null hypothesis. When SRS(a,w) is larger than the critical value, we reject the null hypothesis.
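The six steps can be sketched end to end in a few lines. This is an illustrative implementation, not the chapter's own code: the names `symbolize` and `srs` are ours, and an AR(1) series stands in as an obviously dependent process (the chapter's Monte Carlo designs are not reproduced here):

```python
import random

def symbolize(x, a):
    """Steps 1-2: partition the series into `a` equiprobable regions
    using empirical quantiles; symbols are 0, 1, ..., a-1."""
    srt = sorted(x)
    cuts = [srt[len(x) * k // a] for k in range(1, a)]
    return [sum(v > c for c in cuts) for v in x]

def srs(x, a, w):
    """Steps 3-5: frequencies of the n = a**w events over windows of
    w consecutive symbols, then SRS = T * n * sum((S_i/T - 1/n)**2)."""
    s = symbolize(x, a)
    T = len(s) - w + 1                  # sample size T = T* - w + 1
    counts = {}
    for t in range(T):
        key = tuple(s[t:t + w])
        counts[key] = counts.get(key, 0) + 1
    n = a ** w
    dev = sum((c / T - 1 / n) ** 2 for c in counts.values())
    dev += (n - len(counts)) / n ** 2   # events that never occurred
    return T * n * dev

random.seed(1)
iid = [random.gauss(0, 1) for _ in range(500)]   # independent process
ar1 = [random.gauss(0, 1)]                       # dependent AR(1) process
for _ in range(499):
    ar1.append(0.8 * ar1[-1] + random.gauss(0, 1))

# Step 6: with a = 2 and w = 4 (n = 16), compare against the chi-square
# critical value with 15 degrees of freedom at 5%, approximately 25.0.
```

For the dependent AR(1) series, long runs of identical symbols make blocks such as 0000 and 1111 far more frequent than 1/n, so the statistic explodes past the critical value, while for the independent series it stays in the range of a chi-square with n − 1 degrees of freedom.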

In [58], it is found that the statistic introduced in (7) is related to the Shannon entropy (H). We can derive the approximation expressed in Eq. (8).

$$\text{SRS} \approx (1 - H)\,T\ln\left(n\right) \tag{8}$$

Note that the generalization implied in STSA permits the study of different dynamical processes. For instance, consider a string of the first 3000 letters from the book "A Christmas Carol", s<sub>1</sub> = {marleywasdeadtobeginwith…scroogecar}, and a random string of 3000 letters from an alphabet of 26, s<sub>2</sub> = {iskynbmhjp…vbbihjfkk}. Imagine testing this kind of process with the BDS or runs test. However, it is easy to test these dynamics with the symbolic test: we simply define an alphabet of 26 letters. On the one hand, applying SRS(26,1) and SRS(26,2) to s<sub>1</sub>, we obtain the values 2102.40 and 12331.26, respectively. On the other hand, SRS(26,1) and SRS(26,2) for the string s<sub>2</sub> are 25.79 and 690.26, respectively. A chi-square with 25 degrees of freedom at 95% is 37.65, and a chi-square with 675 degrees of freedom (26<sup>2</sup> − 1) at 95% is 736.55. Since the statistics for s<sub>1</sub> are larger than the critical values, we conclude that the process is not random. However, since the statistics for s<sub>2</sub> are less than the critical values, we cannot reject the hypothesis of independence.
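The chapter's exact letter strings cannot be reproduced here, so this sketch (`srs_letters` is our name) uses a repeated pangram as a stand-in for structured English text, compared against i.i.d. random letters over the same 26-letter alphabet:

```python
import random
import string

def srs_letters(text, w=1):
    """SRS for an already-symbolic series of lowercase letters (a = 26),
    counting the frequencies of the n = 26**w blocks of w letters."""
    a = 26
    n = a ** w
    T = len(text) - w + 1
    counts = {}
    for t in range(T):
        block = text[t:t + w]
        counts[block] = counts.get(block, 0) + 1
    dev = sum((c / T - 1 / n) ** 2 for c in counts.values())
    dev += (n - len(counts)) / n ** 2    # blocks that never occurred
    return T * n * dev

random.seed(7)
english_like = ("thequickbrownfoxjumpsoverthelazydog" * 90)[:3000]
random_letters = "".join(random.choices(string.ascii_lowercase, k=3000))
# critical values: chi-square(25) at 5% is 37.65; chi-square(675) at 5% is 736.55
```

As in the chapter's example, the structured string blows past both critical values while the random string behaves like an independent process.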

In [58], it is shown that the test is conservative, rejecting the null hypothesis fewer times than expected. However, it is powerful in detecting nonrandom and nonlinear processes. Considering the four sample sizes, selecting two symbols and length 4 presents decent results in most cases, while selecting three symbols seems to be a relatively good option for sample sizes of 200 or larger, and especially for 500 or larger. The best result is given for a sample of 2000 applying three symbols and length 4. Table 1 presents the experiments using 1000 Monte Carlo simulations on normal, logistic, NLMA, Anosov, and NLSIGN processes, reproducing the experiments in [58].

Note that the symbolic test is more conservative than the BDS and runs tests when rejecting independence in a normal random process. However, the symbolic test is powerful in detecting nonlinearities in the studied processes. For a sample of 50, the logistic model is detected 100% of the time by the symbolic test, while BDS detects it 68% of the time and the runs test rejects independence 23.90% of the time. The logistic model remains hard to detect with the runs test even when the sample increases to 2000. Note that the NLMA model is detected by the symbolic test when the sample is 500 or larger, but it is not detected by the BDS or runs tests. It is interesting that the chaotic Anosov process is detected by the symbolic test for samples larger than 500, while both the BDS and runs tests reject independence in less than 6% of the cases. NLSIGN is hard to detect: for a sample of 2000, the symbolic test detects it in more than 90% of the cases and the runs test in 84% of the cases, whereas BDS cannot detect the NLSIGN process. In [58], similar results are obtained; the proposed SRS is the only test able to detect the chaotic Anosov and the nonlinear NLMA process when T = 2000.


Table 1. Simulated size of the SRS, Runs and BDS statistics.
