**Image Segmentation and Time Series Clustering Based on Spatial and Temporal ARMA Processes**

Ronny Vallejos and Silvia Ojeda

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/50513

## **1. Introduction**

During the past decades, image segmentation and edge detection have been two important and challenging topics. The main idea is to produce a partition of an image such that each category or region is homogeneous with respect to some measure. The processed image can then be useful for subsequent image processing treatments.

Spatial autoregressive moving average (ARMA) processes have been used extensively in image and signal processing applications. In particular, these models have been used for image segmentation, edge detection and image filtering. Image restoration algorithms based on robust estimation of a two-dimensional process have been developed (Kashyap & Eom, 1988). The two-dimensional autoregressive model has also been used to perform unsupervised texture segmentation (Cariou & Chehdi, 2008). Generalizations of the previous algorithms using generalized M estimators to deal with the effect caused by additive contamination were also addressed (Allende et al., 2001). Later on, robust autocovariance (RA) estimators for two-dimensional autoregressive (AR-2D) processes were introduced (Ojeda, 2002). Several theoretical contributions have appeared in the literature, including the asymptotic properties of a nearly unstable sequence of stationary spatial autoregressive processes (Baran et al., 2004). Other contributions and applications of spatial ARMA processes have been considered in many publications (Basu & Reinsel 1993, Bustos et al. 2009a, Francos & Friedlander 1998, Guyon 1982, Ho 2011, Illig & Truong-Van 2006, Martin 1996, Vallejos & Mardesic 2004).

A new approach to perform image segmentation based on the estimation of AR-2D processes has recently been suggested (Ojeda et al., 2010). First, an image is locally modeled using a spatial autoregressive model for the image intensity. Then the residual autoregressive image is computed. This resulting image possesses interesting texture features. The borders and edges

© 2012 Vallejos and Ojeda; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

are highlighted, suggesting that the algorithm can be used for border detection. Experimental results with real images clarify how the algorithm works in practice. A robust version of the algorithm was also proposed, to be used when the original image is contaminated with additive outliers. Applications in the context of image inpainting were also offered.


Another concern that has been pointed out in the context of spatial statistics is the development of coefficients to compare two spatial processes. Coefficients that take into account the spatial association between two processes have been proposed in the literature. Tjostheim (1978) suggested a nonparametric coefficient to assess the spatial association between two spatial variables. Later on, Clifford et al. (1989) proposed a hypothesis testing procedure to study the spatial dependence between two spatial sequences. Rukhin & Vallejos (2008) studied asymptotic properties of the codispersion coefficient first introduced by Matheron (1965). The performance and impact of this coefficient in quantifying the spatial association between two images is currently under study (Ojeda et al., 2012). An adaptation of this coefficient to time series analysis was studied in Vallejos (2008).

In the context of clustering time series, Chouakria & Nagabhushan (2007) proposed a distance measure that is a function of the codispersion coefficient. This measure accounts for both the correlation behavior and the proximity of two time series. They proposed combining these two quantities multiplicatively, introducing a tuning constant that controls the weight of each quantity in the final product. This makes the measure flexible enough to model sequences with different behaviors, comparing them in terms of both correlation and dissimilarity between the values of the series.
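The multiplicative combination described above can be made concrete. In Chouakria & Nagabhushan (2007), the behavior term is the first-order temporal correlation coefficient (CORT), and one proposed weighting is the exponential sigmoid f(x) = 2/(1 + e^{kx}) with tuning constant k ≥ 0. The sketch below follows that formulation; the function names are ours:

```python
import numpy as np

def cort(u, v):
    """First-order temporal correlation between two series (behavior proximity)."""
    du, dv = np.diff(u), np.diff(v)
    return np.sum(du * dv) / (np.sqrt(np.sum(du ** 2)) * np.sqrt(np.sum(dv ** 2)))

def adaptive_dissimilarity(u, v, k=2.0):
    """Multiplicative combination of behavior (CORT) and value proximity.

    The tuning constant k controls the weight of the correlation term;
    k = 0 reduces the measure to the plain Euclidean distance.
    """
    u, v = np.asarray(u, float), np.asarray(v, float)
    delta = np.linalg.norm(u - v)                  # value-based (Euclidean) distance
    weight = 2.0 / (1.0 + np.exp(k * cort(u, v)))  # behavior-based weight in (0, 2)
    return weight * delta

t = np.linspace(0, 2 * np.pi, 50)
s1, s2, s3 = np.sin(t), np.sin(t) + 0.3, -np.sin(t)  # s2 shares s1's shape; s3 opposes it
print(adaptive_dissimilarity(s1, s2) < adaptive_dissimilarity(s1, s3))  # True
```

Increasing k penalizes series whose increments move in opposite directions, even when their values are close.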

The structure of this chapter consists of two parts. In the first part, we review some theoretical aspects of spatial ARMA processes. Then the algorithm suggested by Ojeda et al. (2010), together with its limitations and advantages, is briefly described. In order to propose a more efficient algorithm, new variants are suggested, especially to address the problem of determining the most convenient (in terms of the quality of the segmentation) prediction window for unilateral AR-2D processes. The distance between the filtered images and the original one is computed using the codispersion coefficient and other image quality measures (Wang & Bovik, 2002). Examples with real images highlight the features of the modified algorithm. In the second part, the codispersion coefficient previously used to measure the closeness between images is employed in a distance measure to perform cluster analysis of time series. The distance measure introduced in Chouakria & Nagabhushan (2007) is generalized in the sense that it considers an arbitrary lag *h*, which allows us to capture higher-order serial correlation between two temporal or spatial sequences. Examples and numerical studies are presented to explore our proposal in several different scenarios. We explore, via Monte Carlo simulation, the performance of hierarchical methods for classifying correlated sequences when the proposed proximity measure is used. An application is discussed for time series measuring the Normalized Difference Vegetation Index (NDVI) at four locations in Argentina. The clusters formed using hierarchical classification techniques with the proposed distance measure preserve the geographical location where the series were obtained, providing information that is unavailable when hierarchical methods are used with conventional distance measures.

## **2. Image Segmentation Through Estimation of Spatial ARMA Processes**

## **2.1. The Spatial ARMA Processes**


Spatial ARMA processes have been studied in the context of random fields indexed over $\mathbb{Z}^d$, $d \ge 2$, where $\mathbb{Z}^d$ is endowed with the usual partial order. That is, for $s = (s_1, s_2, \ldots, s_d)$ and $u = (u_1, u_2, \ldots, u_d)$ in $\mathbb{Z}^d$, $s \le u$ if $s_i \le u_i$ for $i = 1, 2, \ldots, d$. For $a, b \in \mathbb{Z}^d$ such that $a \le b$ and $a \ne b$, we define $S_{a,b} = \{x \in \mathbb{Z}^d : a \le x \le b\}$ and $S'_{a,b} = S_{a,b} \setminus \{a\}$.

A random field $(X_s)_{s \in \mathbb{Z}^d}$ is said to be a spatial ARMA($p$, $q$) with parameters $p, q \in \mathbb{Z}^d$ if it is weakly stationary and satisfies the equation

$$X_s - \sum_{j \in S'_{0,p}} \phi_j X_{s-j} = \varepsilon_s + \sum_{k \in S'_{0,q}} \theta_k \varepsilon_{s-k}, \tag{1}$$

where $(\phi_j)_{j \in S'_{0,p}}$ and $(\theta_k)_{k \in S'_{0,q}}$, respectively, denote the autoregressive and moving average parameters, with $\phi_0 = \theta_0 = 1$, and $(\varepsilon_s)_{s \in \mathbb{Z}^d}$ denotes a sequence of independent and identically distributed centered random variables with variance $\sigma^2$. Notice that if $p = 0$, the sum over $S'_{0,p}$ is taken to be zero, and the process is called a spatial moving average MA($q$) random field. Similarly, if $q = 0$, the process is called a spatial autoregressive AR($p$) random field. The ARMA random field is called causal if it has the unilateral representation

$$X_s = \sum_{j \in S_{0,\infty}} \psi_j \varepsilon_{s-j},$$

with $\sum_j |\psi_j| < \infty$. As in the time series case, there are conditions on the AR and MA polynomials that ensure stationarity and invertibility, respectively. Let $\Phi(z) = 1 - \sum_{j \in S'_{0,p}} \phi_j z^j$ and $\Theta(z) = 1 + \sum_{j \in S'_{0,q}} \theta_j z^j$, where $z = (z_1, z_2, \ldots, z_d)$ and $z^j = z_1^{j_1} z_2^{j_2} \cdots z_d^{j_d}$. A sufficient condition for the random field to be causal is that the AR polynomial $\Phi(z)$ have no zeros in the closure of the open unit polydisc $D^d$ in $\mathbb{C}^d$. For example, if $d = 2$, the process is causal if $\Phi(z_1, z_2)$ is not zero for any $z_1$ and $z_2$ that simultaneously satisfy $|z_1| \le 1$ and $|z_2| \le 1$ (Jain et al., 1999).

Applications of spatial ARMA processes have been developed, including the analysis of yield trials in the context of incomplete block designs (Cullis & Gleeson 1991, Grondona et al. 1996) and the study of the spatial unilateral first-order ARMA model (Basu & Reinsel, 1993). Other theoretical extensions of time series and spatial ARMA models can be found in (Baran et al. 2004, Bustos et al. 2009b, Gaetan & Guyon 2010, Choi 2000, Genton & Koul 2008, Guo 1998, Vallejos & García-Donato 2006).
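To make the unilateral representation concrete, a first-order causal AR-2D process with the two-element prediction window {(1, 0), (0, 1)} can be simulated directly from its recursion. The sketch below (the function name and burn-in choice are ours) assumes i.i.d. Gaussian innovations and $|\phi_1| + |\phi_2| < 1$, so that $\Phi(z_1, z_2) = 1 - \phi_1 z_1 - \phi_2 z_2$ has no zeros in the closed unit polydisc:

```python
import numpy as np

def simulate_ar2d(phi1, phi2, shape=(128, 128), sigma=1.0, burn=20, seed=0):
    """Simulate X[r, s] = phi1 * X[r-1, s] + phi2 * X[r, s-1] + eps[r, s],
    a causal first-order AR-2D process; requires |phi1| + |phi2| < 1."""
    rng = np.random.default_rng(seed)
    m, n = shape[0] + burn, shape[1] + burn
    eps = rng.normal(0.0, sigma, size=(m, n))
    x = np.zeros((m, n))
    for r in range(1, m):
        for s in range(1, n):
            x[r, s] = phi1 * x[r - 1, s] + phi2 * x[r, s - 1] + eps[r, s]
    return x[burn:, burn:]  # discard burn-in rows and columns near the zero boundary

field = simulate_ar2d(0.45, 0.45)
print(field.shape)  # (128, 128)
```

The nested loop follows the causal ordering of the recursion; vectorizing it is possible but obscures the unilateral structure.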

## **2.2. An Image Segmentation Algorithm**

In this section, we describe an image segmentation algorithm that is based on a previous fitting of spatial autoregressive models to an image. This fitted image is constructed by dividing the original image into square sub-images (e.g., 8×8) and then fitting a spatial autoregressive model to each sub-image (i.e., block). Then, we generate a sub-image from each local fitted model, preserving intensities on the boundary to smooth the edges between blocks. The final fitted image is obtained by putting together all generated sub-images.

Let $Z = \{Z_{m,n} : 0 \le m \le M-1,\ 0 \le n \le N-1\}$ be the original image, and let $X = \{X_{m,n} : 0 \le m \le M-1,\ 0 \le n \le N-1\}$, where $X_{m,n} = Z_{m,n} - \bar{Z}$ and $\bar{Z}$ is the mean of $Z$. Let $4 \le k \le \min(M, N)$ and consider the rearranged images $Z = \{Z_{m,n}\}$, $X = \{X_{m,n}\}$, where $0 \le m \le M'-1$, $0 \le n \le N'-1$, $M' = \lfloor (M-1)/(k-1) \rfloor (k-1) + 1$, and $N' = \lfloor (N-1)/(k-1) \rfloor (k-1) + 1$. For all $i_b = 1, \ldots, \lfloor (M-1)/(k-1) \rfloor$ and all $j_b = 1, \ldots, \lfloor (N-1)/(k-1) \rfloor$, the $(k-1) \times (k-1)$ block $(i_b, j_b)$ of the image $X$ is defined as $B_X(i_b, j_b) = \{X_{r,s}\}$, where $(k-1)(i_b-1) + 1 \le r \le (k-1) i_b$ and $(k-1)(j_b-1) + 1 \le s \le (k-1) j_b$. Then, the approximated image $\hat{X}$ of $X$ is provided by Algorithm 1.

Algorithm 1. For each block $B_X(i_b, j_b)$:

1. Compute estimators $\hat{\phi}_1(i_b, j_b)$ and $\hat{\phi}_2(i_b, j_b)$ of $\phi_1$ and $\phi_2$ corresponding to the block $B_X(i_b, j_b)$ extended to $B'_X(i_b, j_b) = \{X_{r,s}\}$, where $(k-1)(i_b-1) \le r \le (k-1) i_b$ and $(k-1)(j_b-1) \le s \le (k-1) j_b$.

2. Let $\hat{X}$ be defined on the block $B_X(i_b, j_b)$ by
$$\hat{X}_{r,s} = \hat{\phi}_1(i_b, j_b)\, X_{r-1,s} + \hat{\phi}_2(i_b, j_b)\, X_{r,s-1},$$
where $(k-1)(i_b-1) + 1 \le r \le (k-1) i_b$ and $(k-1)(j_b-1) + 1 \le s \le (k-1) j_b$.

Then the approximated image $\hat{Z}$ of the original image $Z$ is
$$\hat{Z}_{m,n} = \hat{X}_{m,n} + \bar{Z}, \quad 0 \le m \le M'-1,\ 0 \le n \le N'-1.$$

The image segmentation algorithm we describe below is supported by a widely known notion in regression analysis. If a fitted image represents the patterns of the original image very well, then the residual image (i.e., the fitted image minus the observed image) will not contain useful information about the original patterns, because the model already explains the features present in the original image. On the contrary, if the model does not represent the patterns present in the original image well, then the residual image will contain useful information that has not been explained by the model. Thus, to implement an algorithm based on these notions, we must characterize which patterns are present in the residual image when the fitted image is not a good representation of the original one, and we must develop a technique that produces a fitting that is satisfactory in terms of segmentation but not a very good estimation, so that the residual image still contains valuable information. Ojeda et al. (2010) investigated these concerns and, based on several numerical experiments with images, determined that the residual image associated with a good local fitting is in fact poor in terms of structure (i.e., it is very similar to white noise). However, when the fitted image is poor in terms of estimation, the residual image is useful for highlighting the boundaries and edges of the original image. Moreover, a bad fitting is related to the size of the block (or window) used in Algorithm 1. The best performance is attained for the maximum block size, which is the size of the original image. The image segmentation algorithm introduced by Ojeda et al. (2010) can be summarized as follows.

Algorithm 2.

1. Use Algorithm 1 to generate an approximated image $\hat{Z}$ of $Z$.
2. Compute the residual autoregressive image given by $Z - \hat{Z}$.


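Algorithms 1 and 2 can be sketched in a few lines. The following is a simplified illustration (helper names are ours): it uses plain least squares on each block, square k × k blocks rather than the (k−1) × (k−1) indexing above, and no smoothing across block boundaries:

```python
import numpy as np

def fit_block(x):
    """Least-squares estimates of (phi1, phi2) in X[r,s] ~ phi1*X[r-1,s] + phi2*X[r,s-1]."""
    y = x[1:, 1:].ravel()
    A = np.column_stack([x[:-1, 1:].ravel(),   # upper neighbor X[r-1, s]
                         x[1:, :-1].ravel()])  # left neighbor  X[r, s-1]
    phi, *_ = np.linalg.lstsq(A, y, rcond=None)
    return phi

def residual_image(z, k=8):
    """Algorithm 1 (block-wise AR-2D fit) followed by Algorithm 2 (residual Z - Zhat)."""
    zbar = z.mean()
    x = z - zbar
    zhat = np.zeros_like(x)
    m, n = x.shape
    for i in range(0, m - m % k, k):
        for j in range(0, n - n % k, k):
            ext = x[max(i - 1, 0):i + k, max(j - 1, 0):j + k]  # block extended by one row/column
            phi1, phi2 = fit_block(ext)
            blk = x[i:i + k, j:j + k]
            pred = np.zeros_like(blk)
            pred[1:, :] += phi1 * blk[:-1, :]   # contribution of the upper neighbor
            pred[:, 1:] += phi2 * blk[:, :-1]   # contribution of the left neighbor
            zhat[i:i + k, j:j + k] = pred
    return z - (zhat + zbar)  # residual autoregressive image

rng = np.random.default_rng(1)
img = rng.normal(size=(64, 64)).cumsum(axis=0)  # toy image with vertical structure
res = residual_image(img)
print(res.shape)  # (64, 64)
```

On real images the residual highlights edges because the two-parameter AR-2D fit cannot follow abrupt intensity changes.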
Example 1. We present examples with real images to illustrate the performance of Algo‐ rithms 1 and 2. These images were taken from the database http://sipi.usc.edu/database. Fig‐ ure 1(a) shows an original image of size512×512 (aerial), and Figure 1(b) shows the image generated by Algorithm 1 when a moving window of size 512×512is used to define an AR-2D process on the plane. It is not possible to visualize the differences between the origi‐ nal and fitted images. However, the residual image (Fig 1(c)) shows patterns that the model is not able to capture. Basically, the AR-2D model does not capture the changes in the tex‐ ture produced by lines, borders and object boundaries. These features are contained in the residual image produced by Algorithm 2 such that the good performance of Algorithm 2 is associated with a moderate fitting of the AR-2D model. Another image (peepers) was proc‐ essed by Algorithm 2 to show the effect of the size of the moving window. Figure 2(b) shows the segmentation produced by Algorithm 2 using a moving window of size 128×128. Another segmentation with a moving of size 512×512is shown in Figure 2. In both cases, the segmentations highlight the borders and boundaries present in the original image.

## **2.3. Improving the Segmentation Algorithm**

In all experiments carried out in (Ojeda et al., 2010) and (Quintana et al., 2011), Algorithm 1 was implemented using the same prediction window for the AR-2D process, which contains only two elements belonging to a strongly causal region on the plane. Here, we consider other prediction windows to observe their effect on the performance of Algorithm 2. A description of the most commonly used prediction windows in statistical image processing is given in Bustos et al. (2009a). A brief description of the strongly causal prediction windows is given below.

**Figure 1.** Images generated by Algorithms 1 and 2.

**Figure 2.** (b)-(c) Images generated by Algorithm 2 with moving windows of 128×128 and 512×512, respectively.

For all (*m*, *n*) ∈ ℤ<sup>2</sup>, a strongly causal region at (*m*, *n*) is defined as

$$S(m, n) = \left\{ (k, l) \in \mathbb{Z}^2 : k \le m, \ l \le n \right\} - \left\{ (m, n) \right\} \tag{2}$$

For a given *M* ∈ℕ, a strongly causal prediction window is

$$W = \left\{ (k, l) \in S(m, n) : m - M \le k \le m, \ n - M \le l \le n \right\}. \tag{3}$$

In particular, if *M* = 1, then a strongly causal prediction window containing three elements is

$$W_1 = \left\{ (k, l) \in S(m, n) : m - 1 \le k \le m, \ n - 1 \le l \le n \right\} \tag{4}$$

The set *W*1 is shown in Figure 3(b). Similarly, strongly causal prediction windows can be defined by considering the other quadrants of the plane ℤ<sup>2</sup>, not only the top-left one. The definition of such sets generates the prediction windows *W*2, *W*3, and *W*4, as shown in Figure 3(b). Algorithms 1 and 2 were implemented using the prediction windows *W*1, *W*2, *W*3, and *W*4, with two elements each (Figure 3(a)).

**Figure 3.** Strongly causal prediction windows.
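The sets defined in (2)-(4) are straightforward to enumerate. A minimal sketch follows; the function name is ours, not the authors':

```python
def strongly_causal_window(m, n, M=1):
    """W = {(k, l) in S(m, n) : m - M <= k <= m, n - M <= l <= n},
    where S(m, n) excludes the site (m, n) itself (Eqs. (2)-(3))."""
    return [(k, l)
            for k in range(m - M, m + 1)
            for l in range(n - M, n + 1)
            if (k, l) != (m, n)]

# For M = 1 this is the window W1 of Eq. (4): the three neighbours
# above and to the left of (m, n).
print(strongly_causal_window(5, 5))  # [(4, 4), (4, 5), (5, 4)]
```

The windows *W*2-*W*4 are obtained analogously by reflecting the inequalities toward the other three quadrants.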


30 Advances in Image Segmentation


Visually, the best segmentation for the aerial image is yielded by the prediction window *W*2. The lines and edges are better highlighted in this segmentation (Figure 4(b)) than in the other segmentations. The dark regions are also stressed, which provides a more intense and brighter partition of the original features.

To gain insight via image quality measures, the fitted images produced by Algorithm 1 associated with the images shown in Figure 4(a)-(d) were compared with the original (aerial) image using three coefficients described in (Ojeda et al., 2012). These coefficients are briefly described below.

Consider two weakly stationary processes, (*Xs*)<sub>*s*∈*D*</sub> and (*Ys*)<sub>*s*∈*D*</sub>, *D* ⊂ ℤ<sup>*d*</sup>. For a given *h* ∈ *D*, the codispersion coefficient is defined as

$$\rho(h) = \frac{\gamma(h)}{\sqrt{V_X(h)\, V_Y(h)}}, \tag{5}$$

where *s*, *s* + *h* ∈ *D*, $\gamma(h) = \mathrm{E}\left[(X(s+h) - X(s))(Y(s+h) - Y(s))\right]$, $V_X(h) = \mathrm{E}\left[(X(s+h) - X(s))^2\right]$ (and similarly for $V_Y(h)$).

For *d* =2, the sample codispersion coefficient is defined by

$$\hat{\rho}(h) = \frac{\sum_{s,\, s+h \in D'} a_s b_s}{\sqrt{\hat{V}_X(h)\, \hat{V}_Y(h)}} \tag{6}$$

with *s* = (*s*<sub>1</sub>, *s*<sub>2</sub>), *h* = (*h*<sub>1</sub>, *h*<sub>2</sub>), *D*′ ⊆ *D*, #(*D*′) < ∞, $a_s = X(s_1 + h_1, s_2 + h_2) - X(s_1, s_2)$, $b_s = Y(s_1 + h_1, s_2 + h_2) - Y(s_1, s_2)$, $\hat{V}_X(h) = \sum_{s,\, s+h \in D'} a_s^2$, and $\hat{V}_Y(h) = \sum_{s,\, s+h \in D'} b_s^2$.
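The sample codispersion (6) reduces to sums over the two increment images. A minimal NumPy sketch, assuming non-negative lag components (our own illustrative implementation):

```python
import numpy as np

def codispersion_2d(X, Y, h1, h2):
    """Sample codispersion rho_hat(h), Eq. (6), at spatial lag h = (h1, h2),
    h1, h2 >= 0. D' is the largest grid on which both s and s + h lie
    inside the images."""
    n, m = X.shape
    a = X[h1:n, h2:m] - X[0:n - h1, 0:m - h2]   # increments a_s
    b = Y[h1:n, h2:m] - Y[0:n - h1, 0:m - h2]   # increments b_s
    return np.sum(a * b) / np.sqrt(np.sum(a**2) * np.sum(b**2))

X = np.random.default_rng(0).random((64, 64))
print(round(codispersion_2d(X, X, 0, 1), 6))   # identical images -> 1.0
```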


The index *Q* (Wang and Bovik, 2002) is

$$Q = \frac{4 S_{XY}\,\bar{X}\,\bar{Y}}{(S_X^2 + S_Y^2)(\bar{X}^2 + \bar{Y}^2)} = \frac{S_{XY}}{S_X S_Y} \cdot \frac{2\,\bar{X}\,\bar{Y}}{\bar{X}^2 + \bar{Y}^2} \cdot \frac{2 S_X S_Y}{S_X^2 + S_Y^2} = C \cdot M \cdot V, \tag{7}$$

where $\bar{X}$ is the mean of (*Xs*)<sub>*s*∈*D*</sub>, $S_X$ is the standard deviation of (*Xs*)<sub>*s*∈*D*</sub>, and $S_{XY}$ is the covariance between (*Xs*)<sub>*s*∈*D*</sub> and (*Ys*)<sub>*s*∈*D*</sub> (and similarly for $\bar{Y}$ and $S_Y$). The quantity $C = S_{XY}/(S_X S_Y)$ models the linear correlation between (*Xs*)<sub>*s*∈*D*</sub> and (*Ys*)<sub>*s*∈*D*</sub>, $M = 2\bar{X}\bar{Y}/(\bar{X}^2 + \bar{Y}^2)$ measures the similarity between the sample means (luminance) of the two processes, and $V = 2 S_X S_Y/(S_X^2 + S_Y^2)$ measures the similarity related to the contrast between the images. The coefficient *Q* is defined as a function of the correlation coefficient; hence, it is able to capture only the linear association between (*Xs*)<sub>*s*∈*D*</sub> and (*Ys*)<sub>*s*∈*D*</sub>. It is unable to account for other types of relationships between these sequences, including the spatial association in a specific direction *h*. Ojeda et al. (2012) therefore suggested the *CQ* index, which is defined as


$$CQ(h) = \hat{\rho}(h) \cdot M \cdot V, \tag{8}$$

where *M* and *V* are defined as in (7).

**Figure 4.** (a)-(d) Images generated by Algorithm 2 with prediction windows *W*1-*W*4, respectively.
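Both indices follow directly from sample moments. The sketch below is our own illustrative implementation of (6)-(8), not the authors' code:

```python
import numpy as np

def q_index(X, Y):
    """Wang-Bovik index Q = C * M * V of Eq. (7)."""
    xb, yb = X.mean(), Y.mean()
    sxy = np.mean((X - xb) * (Y - yb))           # sample covariance
    return 4 * sxy * xb * yb / ((X.var() + Y.var()) * (xb**2 + yb**2))

def cq_index(X, Y, h1, h2):
    """CQ(h) of Eq. (8): Q with the correlation factor replaced by the
    codispersion at lag h = (h1, h2), h1, h2 >= 0."""
    n, m = X.shape
    a = X[h1:n, h2:m] - X[0:n - h1, 0:m - h2]
    b = Y[h1:n, h2:m] - Y[0:n - h1, 0:m - h2]
    rho = np.sum(a * b) / np.sqrt(np.sum(a**2) * np.sum(b**2))
    xb, yb, sx, sy = X.mean(), Y.mean(), X.std(), Y.std()
    M = 2 * xb * yb / (xb**2 + yb**2)            # luminance term
    V = 2 * sx * sy / (sx**2 + sy**2)            # contrast term
    return rho * M * V
```

Both indices equal 1 when the fitted image reproduces the original exactly.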


The correlation coefficient and the coefficients defined in (6), (7) and (8) were computed to compare the fitted images, generated with a two-element prediction window and associated with the images shown in Figure 4(a)-(d), with the original images. The results are shown in Table 1. In all cases, the highest values of the image quality measures are attained for the image fitted using the prediction window *W*2. This means that the residual image shown in Figure 4(b) is the best segmentation yielded by Algorithm 2.

**Table 1.** Image quality measures between the fitted and original (aerial) images related to the residual images shown in Figure 4.

The same experiment was carried out for the image shown in Figure 2(a). Table 2 summarizes the values of the image quality coefficients for the fitted images generated by Algorithm 2 with prediction windows *W*1, *W*2, *W*3, and *W*4. In this case, the best performance is for the fitted image generated with prediction window *W*4.

**Table 2.** Image quality measures between the fitted and original (peppers) images.

In general, the performance of Algorithm 2 depends on the choice of the prediction window. One way to choose the prediction window that yields the best segmentation is to maximize the association between the fitted and original images. Indeed, if we denote the original image by *Z* and the fitted image generated by Algorithm 1 with the prediction window $W_i$ by $\hat{Z}_{W_i}$, then the prediction window that produces the best segmentation can be obtained by finding the maximum value of one of the quality measures (6), (7) or (8) between *Z* and $\hat{Z}_{W_i}$. This criterion is summarized in the following algorithm.


Algorithm 3.

1. Use Algorithm 1 to generate the approximated images $\hat{Z}_{W_i}$ of *Z*, *i* = 1, 2, 3, 4.

2. Compute an image quality index between *Z* and $\hat{Z}_{W_i}$ for all *i* = 1, 2, 3, 4. Suppose that the maximum value for the image quality index is attained for $\hat{Z}_{W_j}$, 1 ≤ *j* ≤ 4. Then, the best fitted image is $\hat{Z}_{W_j}$.

3. Compute the residual autoregressive image $Z - \hat{Z}_{W_j}$.
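A rough sketch of Algorithm 3 follows. It fits a two-element least-squares AR-2D predictor on each of the four quadrant orientations (obtained here by flipping the image, an implementation shortcut), scores each fit with the Q index of (7), and returns the residual of the best fit. All names are illustrative, and the authors' estimation details are not reproduced:

```python
import numpy as np

def q_index(X, Y):
    """Wang-Bovik Q of Eq. (7)."""
    xb, yb = X.mean(), Y.mean()
    sxy = np.mean((X - xb) * (Y - yb))
    return 4 * sxy * xb * yb / ((X.var() + Y.var()) * (xb**2 + yb**2))

def fit_ar2d(Z, flip_r, flip_c):
    """Least-squares fit of Z[i, j] ~ phi1*Z[i-1, j] + phi2*Z[i, j-1] after
    optional flips; flipping reorients the strongly causal window (W1-W4)."""
    A = Z[::-1] if flip_r else Z
    A = A[:, ::-1] if flip_c else A
    y = A[1:, 1:].ravel()
    P = np.column_stack([A[:-1, 1:].ravel(), A[1:, :-1].ravel()])
    phi, *_ = np.linalg.lstsq(P, y, rcond=None)
    F = A.copy()                        # keep borders, predict the interior
    F[1:, 1:] = (P @ phi).reshape(A.shape[0] - 1, A.shape[1] - 1)
    F = F[:, ::-1] if flip_c else F     # undo the flips
    return F[::-1] if flip_r else F

def best_residual(Z):
    """Algorithm 3: residual of the fit maximizing Q over the four windows."""
    fits = [fit_ar2d(Z, fr, fc) for fr in (False, True) for fc in (False, True)]
    return Z - max(fits, key=lambda F: q_index(Z, F))
```

On a real image, `best_residual(Z)` would play the role of the segmented (residual) image shown in Figure 4.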

## **3. Clustering Time Series**

#### **3.1. Measuring Closeness and Association Between Time Series**

Let *x* = (*x*<sub>1</sub>, *x*<sub>2</sub>, …, *x<sub>p</sub>*) and *y* = (*y*<sub>1</sub>, *y*<sub>2</sub>, …, *y<sub>q</sub>*) be two time series. There are several conventional distance measures between time series. For example, if *p* = *q* = *n*, then the Euclidean distance between *x* and *y* is defined as $d_E(x, y) = \left(\sum_{i=1}^{n} (x_i - y_i)^2\right)^{1/2}$. As is evident, $d_E$ ignores information about the dependence between *x* and *y*. The Minkowski distance is a generalization of the Euclidean distance, which is defined as

$$d_M(x, y) = \left(\sum_{i=1}^{n} |x_i - y_i|^q\right)^{1/q}, \tag{9}$$

where *q* is a positive integer. The Fréchet distance was introduced to measure the proximity between continuous curves. Let *m* be a natural number such that *m* ≤ min(*p*, *q*). Let *M* be the set of all mappings *r* between *x* and *y* such that *r* is a sequence of *m* pairs preserving the order

$$r = \left((x_{a_1}, y_{b_1}), (x_{a_2}, y_{b_2}), \ldots, (x_{a_m}, y_{b_m})\right),$$

where $a_i \in \{1, 2, \ldots, p\}$, $b_i \in \{1, 2, \ldots, q\}$ with $a_1 = 1$, $b_1 = 1$, $a_m = p$, $b_m = q$, and for $i \in \{1, 2, \ldots, m-1\}$, $a_{i+1} = (a_i \text{ or } a_i + 1)$ and $b_{i+1} = (b_i \text{ or } b_i + 1)$. Note that $|r| = \max_{i=1,2,\ldots,m} |x_{a_i} - y_{b_i}|$ is the mapping length representing the maximum span between two coupled observations. Thus, the Fréchet distance between the series *x* and *y* is given by

$$d_F(x, y) = \min_{r \in M} |r| = \min_{r \in M} \left\{ \max_{i=1,2,\ldots,m} |x_{a_i} - y_{b_i}| \right\}. \tag{10}$$

Dynamic time warping (DTW) is a variant of the Fréchet distance that considers mapping length as the sum of the spans of all coupled observations. That is,

$$|r| = \sum_{i=1,2,\ldots,m} |x_{a_i} - y_{b_i}|.$$


Dynamic time warping is then defined as

$$d_{DTW}(x, y) = \min_{r \in M} |r| = \min_{r \in M} \sum_{i=1,2,\ldots,m} |x_{a_i} - y_{b_i}|. \tag{11}$$

The distances defined above are based on the proximity of the values $|x_{a_i} - y_{b_i}|$. However, these distances disregard both the temporal dependence between the sequences *x* and *y* and the correlation structure of each sequence.
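Distances (10) and (11) share one dynamic program over the couplings, differing only in how per-pair spans are aggregated. A minimal sketch (our own implementation of the standard recursion):

```python
from math import inf

def _grid(x, y, combine):
    """DP over order-preserving couplings of x and y; `combine` aggregates
    the per-pair spans |x_a - y_b| along a coupling."""
    p, q = len(x), len(y)
    D = [[inf] * (q + 1) for _ in range(p + 1)]
    D[0][0] = 0.0
    for i in range(1, p + 1):
        for j in range(1, q + 1):
            step = min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
            D[i][j] = combine(abs(x[i - 1] - y[j - 1]), step)
    return D[p][q]

def frechet(x, y):   # Eq. (10): minimize the maximum span
    return _grid(x, y, max)

def dtw(x, y):       # Eq. (11): minimize the summed spans
    return _grid(x, y, lambda c, s: c + s)

print(dtw([0, 0], [1, 1]), frechet([0, 0], [1, 1]))  # 2.0 1.0
```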

Several distance measures that are functions of the correlation between two sequences, *Cor*(*x*, *y*), have been suggested. For example, Golay et al. (1998) proposed

$$d_{cc}(x, y) = \left(\frac{1 - Cor(x, y)}{1 + Cor(x, y)}\right)^{\beta} \quad \text{and} \quad d_{cc}^{2}(x, y) = 2\left(1 - Cor(x, y)\right),$$

where *β* is a parameter related to the fuzzy *c*-means classification algorithm (Macqueen, 1967). However, none of these measures takes into account the serial association between the sequences because the correlation coefficient is a crude measure of association. This approach requires the study of coefficients that are capable of capturing the spatial or serial correlation between two sequences.

#### **3.2. The Codispersion Coefficient for Time Series**

Consider two weakly stationary processes, $X = \{X_s : s \in D \subset \mathbb{Z}\}$ and $Y = \{Y_s : s \in D \subset \mathbb{Z}\}$, and let *x* and *y* be realizations of these processes as in Section 3.1. For *d* = 1, the estimator (6) becomes

$$\hat{\rho}(h) = \frac{\sum_{t \in N(h)} (x_{t+h} - x_t)(y_{t+h} - y_t)}{\sqrt{\sum_{t \in N(h)} (x_{t+h} - x_t)^2 \sum_{t \in N(h)} (y_{t+h} - y_t)^2}} \tag{12}$$

where $N(h) = \{t : t + h \in D\}$, $N = |N(h)|$ is the cardinality of $N(h)$, and the sequences *x* and *y* are realizations of the processes *X* and *Y*, respectively. The coefficient $\hat{\rho}(h)$ is called the comovement coefficient when *h* = 1. Although $\hat{\rho}(h)$ is not the correlation coefficient, the codispersion coefficient shares a number of its standard properties. For example, $\hat{\rho}(h)$ is translation invariant, positive homogeneous, symmetric in its arguments, positive definite for a sequence and lagged versions of itself, and interpretable as the cosine of the angle between the vectors formed by the first differences of the sampled series. As in the case of classic correlation, a codispersion coefficient of +1 indicates that the compared functions or processes are rescaled or translated versions of one another. Similarly, a profile matched with its reflection across the time axis yields a codispersion of −1. The value $\hat{\rho}(h) = 0$ expresses that there is no monotonicity between *x* and *y* and that their growth rates are stochastically linearly independent. More details can be found in (Rukhin & Vallejos, 2008, Vallejos, 2008).
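For *d* = 1, the estimator (12) is a one-line computation on first differences. The sketch below (an illustrative implementation of ours) also checks the ±1 properties stated above:

```python
import numpy as np

def codispersion_ts(x, y, h=1):
    """Sample codispersion rho_hat(h), Eq. (12), for equally spaced series."""
    dx = x[h:] - x[:-h]                 # increments x_{t+h} - x_t
    dy = y[h:] - y[:-h]
    return np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

x = np.random.default_rng(0).standard_normal(200)
print(round(codispersion_ts(x, 3 * x + 2), 6))  # rescaled + shifted -> 1.0
print(round(codispersion_ts(x, -x), 6))         # reflected -> -1.0
```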

#### **3.3. Dissimilarity Index for Time Series**

This index involves a distance measure and a correlation-type measure, addressing both the correlation behavior and the proximity of two time series. The dissimilarity index depends on similarity behaviors, which should be specified in advance. The suggested dissimilarity index *D*(*x*, *y*, *h*) for the realizations *x* and *y* and *d* = 1 is given by

$$D(x, y, h) = f(\hat{\rho}(h)) \cdot d(x, y), \tag{13}$$


where *f* is an adaptive tuning function, and *d*(*x*, *y*) is one of the conventional distances described in Section 3.1 that summarizes the closeness of the sequences *x* and *y*. There are many possible ways to choose a function *f*. Here, we follow the guidelines given in (Chouakria & Nagabhushan, 2007), according to which *f* is an exponential adaptive tuning function given by

$$f_k(t) = \frac{2}{1 + \exp(kt)}, \tag{14}$$

where *k* ≥ 0 modulates the contributions of the proximity with respect to values and behavior. For example, when $|\hat{\rho}(1)|$ is large and *k* = 2, the proximity with respect to behavior contributes 76.2% to *D*. The flexibility of *D* allows us to choose *k* such that for highly dependent sequences, the correlation structure can have a large weight in (13).

Note that (13) is a generalization of the dissimilarity index introduced in Chouakria & Nagabhushan (2007). The dissimilarity index (13) can capture high-order serial correlations between the sequences because the distance lag *h* is arbitrary, while Chouakria's index only captures the first-order correlation.
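The contribution percentages quoted here and in Section 3.4 follow from (14), reading the contribution of behavior as $1 - f_k(1)$ when $\hat{\rho}(1) \approx 1$. A quick numerical check:

```python
from math import exp

def f_k(t, k):
    """Exponential adaptive tuning function of Eq. (14)."""
    return 2.0 / (1.0 + exp(k * t))

# Behavior contribution 1 - f_k(1) for the values of k used in the text:
for k in (2, 3):
    print(k, round(1 - f_k(1, k), 3))   # k=2 -> 0.762, k=3 -> 0.905
```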

The dependence of (13) on *h* is crucial, and in some specific cases, *h* can be chosen using an optimality criterion. For example, for two AR(1) processes with parameters $\phi_1$ and $\phi_2$, respectively, and a correlation structure between the errors (Rukhin & Vallejos, 2008), it is possible to find $\phi_1$, $\phi_2$ such that $Var(\hat{\rho}(1)) > Var(\hat{\rho}(2))$. In other words, for those processes in which the asymptotic variance of the codispersion coefficient is known, we suggest setting the value of $\hat{h}$ to produce the minimum variance. That is,

$$\hat{h} = \operatorname{argmin}_{h} \left\{ Var(\hat{\rho}(h)) \right\}. \tag{15}$$

When the variance of the codispersion coefficient is difficult to compute, resampling methods can be used to estimate the variance of the sample codispersion coefficient (Politis & Romano, 1994, Vallejos, 2008).

In the next section, we present two simulation examples to illustrate the capabilities of the hierarchical methods using the distance measure (13) under the tuning function given by (14). All else being constant, the clusters produced using traditional distances are usually different from those yielded using the distance measure (13).

#### **3.4. Simulations**


In this example, we simulate observations from six first-order autoregressive models to illustrate the clustering produced by hierarchical methods when the sequences exhibit serial correlation. To generate the series, we consider the following models:

$$X_t^{\,i} = \phi_i X_{t-1}^{\,i} + \varepsilon_t^{\,i}, \qquad i = 1, 2, \dots, 6,$$

where, for all *i* = 1, 2, ..., 6, *X*<sup>*i*</sup> = {*X*<sub>*t*</sub><sup>*i*</sup>}<sub>*t*∈ℤ</sub> defines the *i*-th model, and the sequence *ε*<sup>*i*</sup> = {*ε*<sub>*t*</sub><sup>*i*</sup>}<sub>*t*∈ℤ</sub> is zero-mean white noise. Note that the sequences *ε*<sup>1</sup> and *ε*<sup>2</sup> have the covariance structure

$$\operatorname{Cov}(\varepsilon_t^{\,1}, \varepsilon_s^{\,2}) = \begin{cases} \rho\sigma\tau, & s = t, \\ 0, & \text{otherwise}, \end{cases}$$

with *σ*<sup>2</sup> = *τ*<sup>2</sup> = 1 and *ρ* = 0.9. The same covariance structure is assumed for *ε*<sub>*t*</sub><sup>1</sup> and *ε*<sub>*t*</sub><sup>3</sup>, with *σ*<sup>2</sup> = *τ*<sup>2</sup> = 1 and *ρ* = 0.7. The sequences *ε*<sup>*i*</sup>, *i* = 4, 5, 6, are assumed to be white noise processes with variance 1 and are uncorrelated with all other error sequences. If *i*, *j* ≤ 3, the correlation structure between the processes *X*<sup>*i*</sup> and *X*<sup>*j*</sup>, *i* ≠ *j*, is not null due to the correlation structure between *ε*<sup>*i*</sup> and *ε*<sup>*j*</sup>. In contrast, if *i*, *j* ≥ 4, *i* ≠ *j*, the correlation structure between the processes *X*<sup>*i*</sup> and *X*<sup>*j*</sup> vanishes.

Two hundred observations were generated from each model for *ϕ*<sub>1</sub> = −0.5, *ϕ*<sub>2</sub> = −0.3, *ϕ*<sub>3</sub> = −0.7, *ϕ*<sub>4</sub> = 0.1, *ϕ*<sub>5</sub> = −0.9, and *ϕ*<sub>6</sub> = −0.2. The goal was to perform time series clustering with the Euclidean distance and with (13). Hierarchical methods with complete linkage using both measures were implemented to evaluate whether the methods are capable of capturing the correlation structure between the sequences described above. We used the distance measure (13) under the tuning function (14) with *h* = 1 and *k* = 3; that is, the correlation structure contributes 90.5% to *D*, whereas the proximity with respect to values contributes 9.5%.
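The simulation just described can be reproduced along the following lines. The covariance between *ε*<sup>2</sup> and *ε*<sup>3</sup> is not specified in the text, so the value 0.63 below is an assumption chosen to keep the error covariance matrix positive definite; the dissimilarity helpers repeat (13)-(14) with a Euclidean base distance:

```python
import numpy as np

rng = np.random.default_rng(42)
n, phis = 200, [-0.5, -0.3, -0.7, 0.1, -0.9, -0.2]

# Error covariance for (eps1, eps2, eps3): Cov(eps1, eps2) = 0.9 and
# Cov(eps1, eps3) = 0.7 as in the text; the eps2-eps3 covariance is not
# given, so 0.63 is assumed here for positive definiteness.
S = np.array([[1.0, 0.9, 0.7],
              [0.9, 1.0, 0.63],
              [0.7, 0.63, 1.0]])
eps = np.empty((6, n))
eps[:3] = rng.multivariate_normal(np.zeros(3), S, size=n).T
eps[3:] = rng.standard_normal((3, n))        # independent white noise

# Generate the six AR(1) series X_t^i = phi_i * X_{t-1}^i + eps_t^i.
X = np.zeros((6, n))
for i, phi in enumerate(phis):
    for t in range(1, n):
        X[i, t] = phi * X[i, t - 1] + eps[i, t]

def codispersion(x, y, h=1):
    dx, dy = x[h:] - x[:-h], y[h:] - y[:-h]
    return np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

def D(x, y, h=1, k=3.0):
    """Dissimilarity (13) with tuning (14) and Euclidean base distance."""
    return 2.0 / (1.0 + np.exp(k * codispersion(x, y, h))) * np.linalg.norm(x - y)

# Pairwise dissimilarity matrix for the six simulated series.
dist = np.array([[0.0 if i == j else D(X[i], X[j]) for j in range(6)]
                 for i in range(6)])
```

The resulting matrix `dist` can then be fed to a hierarchical routine with complete linkage (e.g., SciPy's `linkage`) to reproduce a dendrogram comparison like the one in Figure 5.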

**Figure 5.** (a) Time series clustering using the Euclidean distance; (b) time series clustering using *D*.

In Figure 5, we see that the dendrogram obtained using hierarchical methods with the Euclidean distance does not recognize the correlation structure between *X*<sup>1</sup> and *X*<sup>3</sup>. In this case, sequences *X*<sup>1</sup>, *X*<sup>2</sup>, *X*<sup>4</sup>, and *X*<sup>5</sup> are pulled together before sequence *X*<sup>3</sup>. However, hierarchical methods using (13) yield the expected results, combining sequences *X*<sup>1</sup>, *X*<sup>2</sup>, and *X*<sup>3</sup> before the rest of the series.

To obtain better insight into the classification process using the proposed distance measure (13), we carried out a second simulation study that involves clustering measures based on other distances (but using the same setup). Observations from models 1-6 were generated using Gaussian white noise sequences for the errors, thereby preserving the same correlation structure used in the first study. The goal was to explore the ability of the distance measure (13) to group strongly correlated series first. A total of 1000 runs were considered for this experiment, and 200 observations were generated in each run. We used measure (13) under the tuning function (14) for *h* = 1, 2 and *k* = 1, 2, 3, 4. We counted the number of times that the hierarchical algorithm with complete linkage was able to pull together series 1, 2 and 3 before connecting them with other sequences. The traditional distances described in Section 2 were also implemented. After the 1000 simulation runs were finished, the percentage of times that the algorithm was able to recognize the correlated series was recorded. The results of the experiment are summarized in Tables 3 and 4.

**Table 3.** Percentage of correct clustering of the correlated series 1, 2, and 3.

**Table 4.** Percentage of correct clustering of the correlated series (1, 2, and 3). *D*<sub>*DTW*</sub>(*h*, *k*) is distance measure (13) with the DTW distance. *D*<sub>*M*</sub>(*h*, *k*) and *D*<sub>*F*</sub>(*h*, *k*) denote distance measure (13) with the Minkowski and Fréchet distances, respectively.

Note from Table 3 that the traditional distance measures failed to group the correlated sequences, with the exception of the Minkowski distance, which correctly grouped the correlated series 99% of the time. The hierarchical algorithm that uses the distance measure (13) has a higher percentage of well-clustered correlated sequences than the same algorithm using the traditional distance measures described in Section 2 (see Table 4). The percentage of correct clusters increased in all cases with the distance measure (13), suggesting that hierarchical algorithms can be improved by including coefficients of association that consider high-order cross-correlation.

#### **3.5. The NDVI Data Set**



In this section, we consider time series from four different locations in Argentina. The data set consists of 15 monthly NDVI series measured during a period of 19 years (i.e., January 1982-December 2000). The observed values correspond to a transformation to the interval

[0, 255] of the original NDVI series, which commonly reside in the interval [−1, 1]. The data were collected by a NOAA sensor at 1 km resolution and provided by the Comisión Nacional de Actividades Espaciales (CONAE) in Córdoba, Argentina. Fifteen time series were collected from the following: the Amazon region in the northeast of Argentina (1, 2, 3), the Patagonia region in the south of Argentina (4, 5, 6, 7), the Pampean region in the center of Argentina (8, 9, 10, 11) and the Pre-Andean zone of Argentina (12, 13, 14, 15). The time series are shown in Figure 6.

**Figure 6.** Fifteen NDVI series collected from four different regions in South America.

We can observe a variety of different patterns in Figure 6. In particular, the data collected during the period 1994-1995 show irregular behavior. Additionally, the original data lack some information (less than one percent) for all series over the period 1999-2000. An imputation technique based on moving averages, which takes into account past and future values of the series, was used to replace the missing values. The series were grouped by geographical region and then plotted (Figure 7). Similar patterns are observed for the series across each group.
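The moving-average imputation can be sketched as follows. The chapter does not specify the window size or the exact averaging rule, so both are assumptions here, and `impute_moving_average` is a hypothetical helper name:

```python
import numpy as np

def impute_moving_average(series, window=3):
    """Replace NaNs by the mean of the available values within +/- window
    positions, so both past and future observations contribute, as in the
    chapter. Window size and averaging rule are assumptions."""
    x = np.asarray(series, dtype=float).copy()
    for t in np.flatnonzero(np.isnan(x)):
        lo, hi = max(0, t - window), min(len(x), t + window + 1)
        neighbors = x[lo:hi][~np.isnan(x[lo:hi])]
        if neighbors.size:           # leave the gap if no neighbor exists
            x[t] = neighbors.mean()
    return x
```

Note that gaps are filled left to right, so an imputed value may itself feed into a later gap's window; for the short, sparse gaps described here (under one percent of each series) this has little effect.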

**Figure 7.** The fifteen NDVI series grouped by area.

An exploratory data analysis was carried out for each of the 15 series. There exists significant autocorrelation of order at least one in all series. Seasonal components are present in most of the partial autocorrelations. Because there is no large departure from the weak stationarity assumptions (i.e., constant means and variances), all series can be modeled using the Box-Jenkins approach. Specifically, seasonal ARIMA models can be fitted to each single series with a small number of parameters (i.e., *p* ≤ 5, *q* ≤ 5 and *d* ≤ 2).

#### **3.6. Clustering**


Using the NDVI data set described in Section 3.5, the distance measure *D* in (13) was computed for all possible pairs. Then, dendrogram plots were constructed using a hierarchical procedure (i.e., complete linkage) to compare the mergers and clusters produced using *D* with those produced using the conventional distances described in Section 2.2 and (13). In Figure 8, we observe the agglomeration produced by using the Euclidean distance (top left); the other five plots show the results produced by distance (13) for different values of *k* and *h*. The agglomeration algorithm using the Euclidean distance merges series 5 together with series 12-15 and thus does not preserve location when grouping series. However, with *k* = 1 and *h* = 1 in (13), the location dependence of the 15 series is captured. Higher values of *k* and *h* do not modify the original clusters formed using *D*. In Figure 9, we see the clusters and merges yielded by using the Minkowski distance in (13). Note that series 4 is classified together with series 8, 9, 10 and 11 in the top left plot, but with *k* = 1 and *h* = 1, the algorithm handles the series differently (shown in the top right plot) by merging together those series that are in the same location. The effects of higher values of *k* and *h* are again negligible. The Fréchet distance produced unsatisfactory results. In this case, the hierarchical algorithm does not take into account the geographical location when using both the conventional distance measures and (13). For example, series 1, 2 and 3 were merged in different stages. Nevertheless, when *k* and *h* are increased, the algorithm using (13) still clusters the series by geographical location. Indeed, if our goal is to produce four clusters as before, the hierarchical algorithm with *h* = 2 and *k* = 3 produces a geographically consistent agglomeration (dendrograms not shown here).
The same analysis was performed using the hierarchical algorithm with the dynamic time warping distance measure. In this case, this distance measure produced an agglomeration by geographical location and thus did not need to be modified to capture the serial correlation. The result yielded with (13) produced the same outcome for all values of *k* and *h*.
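For reference, the dynamic time warping distance used in this comparison can be computed with the standard dynamic-programming recursion. This is a minimal O(nm) sketch with absolute-difference local cost; the chapter does not state which DTW variant was actually used:

```python
import numpy as np

def dtw(x, y):
    """Dynamic time warping distance between 1-D sequences x and y,
    using the standard O(n*m) recursion with |x_i - y_j| local cost."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]
```

Because DTW aligns series nonlinearly in time, it can group series with similar shapes even when they are shifted, which is consistent with its good behavior on the NDVI data.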


**Figure 8.** Clusters produced by using a hierarchical method with the Minkowski distance and (13).


**Figure 9.** Clusters produced by using a hierarchical method with the Fréchet distance and (13).

## **4. Concluding Remarks and Future Work**


This chapter described two problems. The first problem involved image segmentation, while the second involved clustering time series. For the first problem, a new algorithm was proposed that enhances the segmentation yielded by a previous algorithm (Ojeda et al., 2010). Identifying the best prediction window improves segmentation based on the estimation of AR-2D processes and generalizes the previous algorithm to different prediction windows associated with unilateral processes on the plane. An analysis of the association between the original and fitted images relies on the selection of a suitable image quality measure. Using three image quality coefficients that are commonly used in image segmentation, we carried out experiments that support our algorithm. Specifically, a set of images belonging to the image database (http://sipi.usc.edu/database/) were processed and provided satisfactory results (not shown here) in terms of image segmentation.

This chapter also proposed an extension of the dissimilarity measure first introduced in Chouakria & Nagabhushan (2007). The simulation experiments performed and the data analysis carried out for relevant ecological series show that the distance lag *h* plays an important role in capturing the higher-order correlation of each series. Cluster analysis performed using the proposed distance measure produced different merges and dendrograms. Furthermore, the percentage of times that the hierarchical algorithms correctly classified the highly correlated sequences increased in all cases in which the distance measure (13) was used. For the NDVI series discussed in Section 3.5, the distance measure *D* improved the performance of the Euclidean, Fréchet and Minkowski distances in the presence of high-order autocorrelation in the series. The dynamic time warping distance measure showed the best performance in capturing the serial correlation between the NDVI series, and thus, it was not necessary to introduce modified distance measures such as (13) to ensure agglomeration by geographical location.

Now, further research for the topics presented in this chapter is outlined.

Following the notation used in Algorithm 3, consider the residual image $R_{W_i} = Z - \hat{Z}_{W_i}$.

One interesting open problem involves the characterization of the types of images and distributions associated with the segmentation produced by Algorithms 2 and 3. In addition, the definition and study of linear combinations of residual images produced by distinct prediction windows is also of interest. For example,

$$I = \sum_{j=1}^{4} a_j R_{W_j},$$

where *a*<sub>*j*</sub> is a weight associated with the residual image $R_{W_j}$.

Regarding the clustering problem, the distribution of *D* can be studied from a parametric point of view. This is an open problem that we expect to address in future research.

## **Acknowledgements**

The first author was partially supported by Fondecyt grant 1120048, UTFSM under grant 12.12.05, and Proyecto Basal CMM, Universidad de Chile. The second author was supported in part by CIEM-FAMAF, UNC, Argentina.

## **Author details**

Ronny Vallejos1\* and Silvia Ojeda2

\*Address all correspondence to: ronny.vallejos@usm.cl

1 Department of Mathematics, Universidad Técnica Federico Santa María, Chile

2 FAMAF, Universidad Nacional de Córdoba, Argentina

## **References**

[1] Allende, H., Galbiati, J., & Vallejos, R. (2001). Robust image modeling on image processing. *Pattern Recognition Letters*, 22(11), 1219-1231.

[2] Baran, S., Pap, G., & Zuijlen, M. C. A. (2004). Asymptotic inference for a nearly unstable sequence of stationary spatial AR models. *Statistics & Probability Letters*, 69(1), 53-61.

[3] Basu, S., & Reinsel, G. (1993). Properties of the spatial unilateral first-order ARMA model. *Advances in Applied Probability*, 25(3), 631-648.

[4] Bustos, O., Ojeda, S., & Vallejos, R. (2009a). Spatial ARMA models and its applications to image filtering. *Brazilian Journal of Probability and Statistics*, 23(2), 141-165.

[5] Bustos, O., Ojeda, S., Ruiz, M., Vallejos, R., & Frery, A. (2009b). Asymptotic behavior of RA-estimates in autoregressive 2D Gaussian processes. *Journal of Statistical Planning and Inference*, 139(10), 3649-3664.

[6] Cariou, C., & Chehdi, K. (2008). Unsupervised texture segmentation/classification using 2-D autoregressive modeling and the stochastic expectation-maximization algorithm. *Pattern Recognition Letters*, 29(7), 905-917.

[7] Choi, B. (2000). On the asymptotic distribution of mean, autocovariance, autocorrelation, crosscovariance and impulse response estimators of a stationary multidimensional random field. *Communications in Statistics-Theory and Methods*, 29(8), 1703-1724.

[8] Chouakria, A. D., & Nagabhushan, P. N. (2007). Adaptive dissimilarity for measuring time series proximity. *Advances in Data Analysis and Classification*, 1(1), 5-21.

[9] Clifford, P., Richardson, S., & Hémon, D. (1989). Assessing the significance of the correlation between two spatial processes. *Biometrics*, 45(1), 123-134.

[10] Cullis, B. R., & Gleeson, A. C. (1991). Spatial analysis of field experiments-an extension to two dimensions. *Biometrics*, 47(4), 1449-1460.

[11] Francos, J., & Friedlander, B. (1998). Parameter estimation of two-dimensional moving average random fields. *IEEE Transactions on Signal Processing*, 46(8), 2157-2165.

[12] Gaetan, C., & Guyon, X. (2010). *Spatial Statistics and Modelling*. Springer, New York.

[13] Genton, M. G., & Koul, H. L. (2008). Minimum distance inference in unilateral autoregressive lattice processes. *Statistica Sinica*, 18, 617-631.

[14] Golay, X., Kollias, S., Stoll, G., Meier, D., & Valavanis, A. (1998). A new correlation-based fuzzy logic clustering algorithm for fMRI. *Magnetic Resonance in Medicine*, 40(2), 249-260.

[15] Grondona, M. R., Crossa, J., Fox, P. N., & Pfeiffer, W. H. (1996). Analysis of variety yield trials using two-dimensional separable ARIMA processes. *Biometrics*, 52(2), 763-770.

[16] Guo, J., & Billard, L. (1998). Some inference results for causal autoregressive processes on a plane. *Journal of Time Series Analysis*, 19(6), 681-691.

[17] Guyon, X. (1982). Parameter estimation for a stationary process on a d-dimensional lattice. *Biometrika*, 69(1), 95-105.

[18] Ho, P. G. P. (2011). Image segmentation by autoregressive time series model. In *Image Segmentation*, edited by Pei-Gee Ho. InTech.

[19] Illig, A., & Truong-Van, B. (2006). Asymptotic results for spatial ARMA models. *Communications in Statistics-Theory and Methods*, 35(4), 671-688.

[20] Jain, A. K., Murty, M. N., & Flynn, P. J. (1999). Data clustering: A review. *ACM Computing Surveys*, 31(3), 264-323.

[21] Kashyap, R., & Eom, K. (1988). Robust image modeling techniques with an image restoration application. *IEEE Trans. Acoust. Speech Signal Process.*, 36(8), 1313-1325.

[22] MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. *Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability*, Berkeley, University of California Press, 1, 281-297.

[23] Martin, R. J. (1996). Some results on unilateral ARMA lattice processes. *Journal of Statistical Planning and Inference*, 50(3), 395-411.

[24] Matheron, G. (1965). *Les Variables Régionalisées et leur Estimation*. Masson, Paris.

[25] Politis, D. N., & Romano, J. P. (1994). The stationary bootstrap. *Journal of the American Statistical Association*, 89(428), 1303-1313.

[26] Ojeda, S. M., Vallejos, R. O., & Lucini, M. (2002). Performance of RA estimator for bidimensional autoregressive models. *Journal of Statistical Computation and Simulation*, 72(1), 47-62.

[27] Ojeda, S. M., Vallejos, R., & Bustos, O. (2010). A new image segmentation algorithm with applications to image inpainting. *Computational Statistics & Data Analysis*, 54(9), 2082-2093.

[28] Ojeda, S. M., Vallejos, R., & Lamberti, W. P. (2012). A measure of similarity between images. *Journal of Electronic Imaging*, in press.

[29] Quintana, C., Ojeda, S., Tirao, G., & Valente, M. (2011). Mammography image detection processing for automatic micro-calcification recognition. *Chilean Journal of Statistics*, 2(2), 69-79.

[30] Rukhin, A., & Vallejos, R. (2008). Codispersion coefficient for spatial and temporal series. *Statistics and Probability Letters*, 78(11), 1290-1300.

[31] Vallejos, R., & Mardesic, T. (2004). A recursive algorithm to restore images based on robust estimation of NSHP autoregressive models. *Journal of Computational and Graphical Statistics*, 13(3), 674-682.

[32] Vallejos, R., & Garcia-Donato, G. (2006). Bayesian analysis of contaminated quarter plane moving average models. *Journal of Statistical Computation and Simulation*, 76(2), 131-147.

[33] Vallejos, R. (2008). Assessing the association between two spatial or temporal sequences. *Journal of Applied Statistics*, 35(12), 1323-1343.

[34] Tjostheim, D. (1978). A measure of association for spatial variables. *Biometrika*, 65(1), 109-114.

[35] Wang, Z., & Bovik, A. (2002). A universal image quality index. *IEEE Signal Processing Letters*, 9(3), 81-84.


**Chapter 3**

**Image Segmentation Through an Iterative Algorithm of the Mean Shift**

Roberto Rodríguez Morales, Didier Domínguez, Esley Torres and Juan H. Sossa

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/50474

> © 2012 Rodríguez Morales et al.; licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## **1. Introduction**

Image analysis is a scientific discipline providing theoretical foundations and methods for solving problems appearing in a range of areas as diverse as biology, medicine, physics, astronomy, geography, chemistry, meteorology, robotics and industrial manufacturing, among others.

Inside any image analysis system, an aspect of vital importance for pattern recognition and image interpretation that has to be taken into account is segmentation and contour extraction. Both problems can be really difficult to face due to the variability in the form of the objects and the variation in image quality. An example can be found in the case of biomedical images, which are frequently affected by noise and sampling, which can cause considerable difficulties when rigid segmentation methods are applied [Chin-Hsing et al., 1998; Kenong & Levine, 1995; Koss et al., 1999; Rodríguez et al., 2002].

Many segmentation techniques are available in the literature, and some of them have been widely used in different application problems. Most of these segmentation techniques were motivated by specific application purposes. There are many different approaches to image segmentation, which mainly differ in the criterion used to measure the similarity of two regions and in the strategy applied to guide the segmentation process. The definition of suitable similarity and homogeneity measures is a fundamental task in many important applications, ranging from remote sensing to similarity-based retrieval in large image databases.

Segmentation is an important part of any computer vision and image analysis system, wherein regions of interest are identified and extracted for future processing. Of the quality
