28 Time Series Analysis and Applications

calculate the distance between multidimensional arrays and vectors. Dynamic time warping (DTW) is a very efficient distance function for comparing time series [22]. Its main objective is to keep close together time series that behave similarly but are delayed or distorted along the time axis. The technique handles such warping well because the comparison between corresponding points is not rigid. DTW is a tool for dealing with two of the main issues raised by high-temporal-resolution satellite image time series, namely, the irregular sampling in the temporal dimension and the need to compare pairs of time series having different numbers of samples [23].

We show next the three clustering analyses performed:

First: k-Means used with Euclidean distance, considering only monthly NDVI values. These values were extracted for sugarcane fields using geographical coordinates (latitude and longitude) provided by the Canasat/INPE Project (www.canasat.inpe.br). In this approach, each element of the dataset is a single NDVI value for a given month and location (pixel), which allows a monthly analysis of the region of interest. Elements were assigned to different clusters according to the similarity among their NDVI values. Five clusters were generated for each month of the 2004–2005 crop season, making it possible to follow the development stage of the crop month by month, for example, whether the crop is in the maturing phase, whether it has already been harvested, and whether there is no spectral mixing with other crops or vegetation;

Second: k-Means used with DTW distance function, for which we generated series of NDVI values corresponding to one or more sugarcane crop series. Five clusters were determined for each crop season (2001–2010) for annual crop monitoring according to the type of planting in each crop season, for example, sugarcane ratoon, sugarcane expansion, sugarcane renewed, sugarcane under renewing and not defined [13, 24].

Third: k-Means used with DTW distance function on a three-dimensional (multivariate) time series database, extracted from 324 monthly images of NDVI, albedo and surface temperature. Since DTW calculates the distance between pairs of data points using Euclidean distance, the DTW method can be applied to multivariate time series. The whole dataset had 220,238 data series, each observation being a triplet of NDVI, albedo and surface temperature values of the study area in a given month, with 108 values per time series [25].

## **3. Results and discussion**

In this section, we present and discuss the results of the three analyses described above.

### **3.1. k-Means used with Euclidean distance**

In this section, we show how the results of applying k-means clustering with the Euclidean distance function to monthly NDVI values extracted from the study area can assist the monitoring of sugarcane fields.

The months from December to May correspond to the period of maximum vegetative growth of sugarcane. In **Figure 3J**, **L** and **B**, pixels shown in yellow and red correspond to the maximum NDVI values and belong to clusters 3 and 4, respectively. On the other hand, the months of August, September and October correspond to the harvest season. In these months (**Figure 3F**), pixels in magenta and blue, with minimum NDVI values, correspond to clusters 0 and 1, respectively. Cluster 2 (green) corresponds to an intermediate stage of sugarcane growth.

**Figure 3.** Monthly MVC NDVI images and clustering of NDVI (five clusters) of the sugarcane planting area in the state of São Paulo for the months from April 2004 to March 2005. (A) NDVI/NOAA 2004 April; (B) Clustering 2004 April; (C) NDVI/NOAA 2004 July; (D) Clustering 2004 July; (E) NDVI/NOAA 2004 September; (F) Clustering 2004 September; (G) NDVI/NOAA 2004 November; (H) Clustering 2004 November; (I) NDVI/NOAA 2005 January; (J) Clustering 2005 January; (K) NDVI/NOAA 2005 March; (L) Clustering 2005 March.
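As an illustration of the monthly clustering shown in Figure 3, the sketch below runs k-means with Euclidean distance on scalar NDVI values and numbers the clusters by increasing centroid, so that cluster 0 holds the lowest and cluster 4 the highest NDVI values, in line with the cluster descriptions above. The function name `kmeans_1d`, the quantile-based initialisation and the sample NDVI values are assumptions made for this example; they are not the implementation or the data used in the study.

```python
def kmeans_1d(values, k=5, iterations=100):
    """Lloyd's k-means on scalar values (Euclidean distance in one dimension).

    Returns (centroids, labels) with centroids sorted in ascending order, so
    label 0 is the lowest-valued cluster and label k-1 the highest.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Assumed deterministic initialisation: k evenly spaced quantiles.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:  # converged
            break
        centroids = updated
    # Renumber clusters by increasing centroid value.
    order = sorted(range(k), key=lambda c: centroids[c])
    rank = {old: new for new, old in enumerate(order)}
    return [centroids[c] for c in order], [rank[lab] for lab in labels]


# Hypothetical NDVI values for one month (one value per pixel).
ndvi = [0.18, 0.20, 0.22, 0.35, 0.37, 0.50, 0.52, 0.65, 0.66, 0.78, 0.80]
centroids, labels = kmeans_1d(ndvi, k=5)
print(labels)
```

With this numbering, pixels labeled 0 or 1 would be read as harvested or bare-soil areas and pixels labeled 3 or 4 as areas of maximum vegetative growth.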

These clusters can be validated against the MVC NDVI images. The black squares over the satellite images on the left mark the main sugarcane planting areas. The MVC NDVI images of the northeastern region of São Paulo show the evolution of the sugarcane vegetative growth cycle (**Figure 3**). Planting begins in August, represented in the images by pixels in shades of green and blue located in the northeastern region of the state. These colors represent low NDVI values (around 0.2), which characterize areas with exposed soil and sparse vegetation. A similar pattern occurs in the months from September to November. From December, when sugarcane begins to grow and acquire more biomass, these regions appear in shades of yellow, orange and red. The months from January to May show shades of dark red, when sugarcane reaches its highest stage of growth with maximum NDVI values (between 0.7 and 0.8). The dark areas in the images represent pixels covered by clouds and water.

There is no predominance of one or two clusters in all producing regions if we consider all months of the crop season. As we can observe, both plant and ratoon sugarcane are grown throughout the state, and the five clusters appear in all months. There is a higher percentage of pixels in the clusters with higher NDVI during some months. However, in other months, the largest number of pixels is included in clusters with lower NDVI (**Figure 3**).

**Figure 4** shows the temporal profile of the clusters, which reflects the dynamics of crop planting and harvesting throughout the growing season. In the months from December to May, the NDVI values are higher and a larger percentage of pixels falls in clusters 2, 3 and 4 (from 20 to 40% of the pixels). In the months from August to November, the NDVI values are lower, and clusters 0 and 1 hold higher percentages (around 30% of the pixels). In each month, the sugarcane planting area is at a certain stage of growth, appearing in clusters 0 or 1 (harvested or bare soil) and in clusters 2, 3 and 4 (in growth or ready to be harvested) (**Figure 3**).
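The percentages discussed above can be obtained by counting, for each month, how many pixels fall in each cluster. A minimal sketch follows; the function name `cluster_shares`, the month keys and the label lists are hypothetical and serve only to illustrate the calculation, not to reproduce the values of the study.

```python
from collections import Counter


def cluster_shares(labels, k=5):
    """Percentage of pixels assigned to each of the k clusters."""
    counts = Counter(labels)
    total = len(labels)
    return [100.0 * counts.get(c, 0) / total for c in range(k)]


# Hypothetical cluster labels for two months (10 pixels each).
monthly_labels = {
    "2004-09": [0, 0, 0, 1, 1, 1, 2, 2, 3, 4],
    "2005-01": [0, 1, 2, 2, 3, 3, 3, 4, 4, 4],
}

# One row per month: the share of pixels in clusters 0 to 4.
profile = {month: cluster_shares(labels) for month, labels in monthly_labels.items()}
for month, shares in profile.items():
    print(month, shares)
```

Plotting one line per cluster across the months of such a table gives a temporal profile of the kind shown in Figure 4.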

**Figure 4.** Temporal profile of five NDVI clusters of sugarcane fields for the months from April 2004 to March 2005.

Although the k-means method is simpler and more widely used, its application to satellite image time series of low spatial resolution allows regional crop studies, despite the difficulty that possible spectral mixing within pixels adds to the analysis.

### **3.2. k-Means was used with DTW distance**


Results of the MVC NDVI image time series analysis for the period 2001–2010 in the state of São Paulo are presented hereafter. The maps and temporal profiles show the clusters obtained by k-means with the DTW distance function, which group pixels according to their NDVI values from year to year. In general, clusters identified as sugarcane may be (i) related to the type of planting carried out each year, for example, sugarcane ratoon (sugarcane available for harvest after one or more cuts), sugarcane expansion (sugarcane planted in new areas that will be harvested for the first time), sugarcane renewed (the year-and-a-half sugarcane plant that underwent renovation during the previous crop year and will be available for harvest in the current crop year), sugarcane under renewing (sugarcane area that is not harvested because of renovation and is not available in that specific crop year) and not defined area, and (ii) related to the quantity produced. The clusters determined by the clustering analysis do not remain constant from year to year, as sugarcane planting is dynamic along the time series.
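The DTW distance function used in this analysis tolerates shifts along the time axis and series with different numbers of samples. A minimal sketch of the classic dynamic-programming formulation is given below. The function names, the absence of any warping-window constraint and the sample NDVI profiles are assumptions made for illustration, since the text does not specify the implementation that was used.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two numeric sequences.

    The local cost is the absolute difference between two samples, which is
    the Euclidean distance in one dimension. No warping window is applied.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def euclidean_distance(a, b):
    """Rigid point-by-point Euclidean distance (equal lengths required)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical NDVI profiles: `late` is `early` delayed by one time step.
early = [0.2, 0.5, 0.8, 0.5, 0.2, 0.2]
late = [0.2, 0.2, 0.5, 0.8, 0.5, 0.2]
shorter = [0.2, 0.5, 0.8, 0.2]  # fewer samples than `late`

print(dtw_distance(early, late))        # the delay is absorbed by the warping
print(euclidean_distance(early, late))  # the rigid comparison penalizes it
print(dtw_distance(late, shorter))      # lengths may differ
```

In this example the two shifted profiles are identical under DTW, while the rigid Euclidean comparison treats them as different, which is the behavior that motivates the use of DTW for crop seasons that start at slightly different times.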

Thus, by applying the k-means clustering analysis, we can identify the sugarcane planting type in each of the years analyzed. Cluster 4 (red) indicates the maximum NDVI values in the month, corresponding to areas with higher biomass. Cluster 0 (magenta) shows the lowest NDVI values, corresponding to bare soil. The k-means method showed more homogeneous temporal profiles (**Figure 5**). The dips in the NDVI profiles during December and January (**Figure 5**) correspond to NDVI values affected by clouds, because this period is the rainy season in the state.

Analyzing each year, we found that the clusters correspond to different types of sugarcane planting (**Table 1**). For example, in crop seasons 2001–2002, 2003–2004, 2006–2007 and 2008–2009, cluster 2 (green; **Figure 6A**, **C**, **F** and **H**) corresponds to sugarcane ratoon, and this cluster (29–47% of the pixels) is correlated (between R = 0.74 and R = 0.87) with crop production (**Figure 7**). In crop seasons 2002–2003 and 2009–2010 (**Figure 6B** and **I**), sugarcane ratoon corresponds to cluster 1 (blue), with correlations of R = 0.84 and R = 0.73 with production and 36 and 33% of the sugarcane pixels (**Figure 7**). In crop season 2004–2005 (**Figure 6D**), sugarcane ratoon corresponds to cluster 3 (yellow), with a correlation index of R = 0.81 and 32% of the sugarcane pixels (**Figure 7**). In most crop seasons, sugarcane ratoon is strongly correlated with sugarcane production. Only in crop seasons 2005–2006 and 2007–2008 (**Figure 6E** and **G**) is sugarcane expansion correlated with crop production.

**Figure 5.** Temporal profiles of each cluster for each crop season in the period 2001–2002 to 2009–2010.

**Figure 6.** k-Means clustering with DTW distance function for each crop season in the period 2001–2002 to 2009–2010. (A) Clustering 2001–2002; (B) Clustering 2002–2003; (C) Clustering 2003–2004; (D) Clustering 2004–2005; (E) Clustering 2005–2006; (F) Clustering 2006–2007; (G) Clustering 2007–2008; (H) Clustering 2008–2009; (I) Clustering 2009–2010.

**Table 1.** Type of sugarcane planting in each crop season and percentage of pixels in each cluster by k-means with DTW. [Table body: rows are clusters 0–4 and columns are the crop seasons 2001–2002 to 2009–2010; each cell gives a planting type (ratoon, expansion, renewed, under renewing or not defined) and its percentage of pixels. The cell layout is not recoverable from the extracted text.]

Agricultural Monitoring in Regional Scale Using Clustering on Satellite Image Time Series http://dx.doi.org/10.5772/intechopen.71148 33

**Figure 7.** Percentage of pixels in each cluster for each crop season in the period 2001–2002 to 2009–2010, with the correlation values between the clusters and sugarcane production.

### **3.3. k-Means was used with DTW distance function of three dimensional (multivariate) time series database**

A dataset with more than 220,000 series for the state of São Paulo was clustered into five clusters (0–4) by the k-means method with the DTW distance function. Each cluster was formed according to the characteristics of NDVI, surface temperature and albedo extracted from AVHRR/NOAA images in the period 2001–2010. The identified areas were cluster 0 (magenta), which corresponds to water; cluster 1 (blue), which corresponds to urban areas and areas with exposed soil, low vegetation or pasture; cluster 2 (green), which represents areas of agricultural crops; cluster 3 (yellow), which corresponds to sugarcane; and cluster 4 (red), which represents forest areas (**Figure 8A** and **B**).
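To illustrate how DTW extends to the multivariate case, the sketch below uses the Euclidean distance between triplets (NDVI, albedo, surface temperature) as the local cost and assigns a series to its nearest reference series, which corresponds to the assignment step of k-means. The function names, the rescaling of all three variables to a common 0–1 range and the reference and pixel values are assumptions made for this example. The centroid update step under DTW is not described in the text and is therefore left out.

```python
import math


def dtw_multivariate(a, b):
    """DTW distance between two sequences of equal-dimension tuples.

    The local cost is the Euclidean distance between two observations,
    here triplets of (NDVI, albedo, surface temperature).
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def assign_to_nearest(series, references):
    """Index of the reference series closest to `series` under DTW."""
    return min(range(len(references)),
               key=lambda c: dtw_multivariate(series, references[c]))


# Hypothetical reference series; every variable is assumed rescaled to 0-1.
water_like = [(0.05, 0.05, 0.40)] * 4
forest_like = [(0.80, 0.15, 0.30)] * 4
references = [water_like, forest_like]

# Hypothetical pixel series with fewer samples than the references.
pixel = [(0.78, 0.14, 0.32), (0.82, 0.16, 0.31), (0.79, 0.15, 0.29)]
assigned = assign_to_nearest(pixel, references)
print(assigned)
```

Rescaling matters in practice because surface temperature would otherwise dominate the Euclidean local cost over NDVI and albedo, which vary within a much narrower numeric range.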

NDVI was useful to separate vegetation areas from other targets. For example, forests present high NDVI values during the whole season (they have a high concentration of vegetation and biomass), and these areas are normally shown by red-colored representative time series in the profile visualization (**Figure 9A**). On the other hand, the albedo variable was useful to separate water areas from other targets, but was not enough to distinguish areas having different levels of vegetation cover (**Figure 9B**). The water represented by cluster 0 was well clustered, since its NDVI values and especially its albedo values were different from those of the other clusters, as shown in the temporal profiles of NDVI (**Figure 9A**) and albedo (**Figure 9B**). The albedo and NDVI values of water are low (less than 0.1), since it has little or no vegetation.

**Figure 8.** Spatial distribution of the clustering results for 2001–2002 (A) and 2009–2010 (B); yellow represents sugarcane.

Clustering results for agricultural crops and grassland were less accurate, probably because different crops present similar NDVI values in some phenological phases of the vegetative cycle, but they are useful to separate agricultural from nonagricultural areas, such as water, urban areas and forest. Clustering of these areas was defined mainly by surface temperature, which is higher for targets with lower canopy, such as urban areas and exposed soil, and lower for woodland (**Figure 9A** and **C**). For example, the forest areas represented by cluster 4 in **Figure 8A** and **B** have high NDVI values (**Figure 9A**) and lower surface temperature values (**Figure 9C**), as they are heavily shaded areas with dense vegetation cover.

**Figure 9.** Profile visualization (2001–2010) of NDVI (A), albedo (B) and surface temperature (C) of clustering results.

However, sugarcane fields were well clustered over the crop seasons because sugarcane has a distinctive behavior (a longer seasonal cycle) compared with other crops. **Figure 8A** and **B** show the dynamics of this crop, represented by cluster 3 (yellow), throughout the decade. In the 2001–2002 crop year, the acreage was low, with higher production, and was planted in the northeast of the state; by the 2009–2010 crop year, the planted area had increased significantly toward the west of the state. Clustering the three-dimensional (multivariate) time series database was efficient for the temporal analysis of land use, indicating that this methodology can be used to identify and analyze the dynamics of land use and cover.
