**3. Data set description for CANDELA**

Our main data sets are images of the Earth's surface acquired by different instruments of the European Copernicus Programme (e.g., Sentinel-1 and Sentinel-2). Sentinel-1 is a twin-satellite synthetic-aperture radar (SAR) configuration; Sentinel-2 is likewise a twin-satellite configuration, with each satellite carrying a multispectral imager [22, 23].

There are three reasons for selecting Sentinel-1 and Sentinel-2 images. Firstly, overlapping radar and optical images acquired in rapid succession complement each other, so that different details of a target area can be recognized. Secondly, individually selectable Sentinel-1 and Sentinel-2 images can be rectified and co-aligned with publicly available toolbox routines offered by ESA, which allows straightforward image comparison or image fusion. Thirdly, the data of all Sentinel instruments are openly available to the EO community. Many publications (including dedicated conferences [1, 24–26]) already describe newly discovered characteristics of the Earth's surface derived from the individual instruments.

Furthermore, the long-term operations of the Sentinel satellites allow the interpretation of image time series or even the combination of time series data with external supplementary data via additional data mining and data fusion tools [1, 25, 26].

Besides these data sets, we include other third-party EO mission data sets as specified by CANDELA users (e.g., TerraSAR-X and WorldView).

**3.1 Sentinel-1 data**

The Sentinel-1 mission comprises a constellation of two satellites (launched on April 3, 2014, and on April 25, 2016), operating in C-band for synthetic-aperture radar (SAR) imaging. SAR has the advantage of operating at wavelengths that are not impeded by thin cloud cover or a lack of solar illumination, so it can acquire data over a selected area by day or by night in nearly all weather conditions. The repeat period of each satellite is 12 days; that means every 6 days there is an acquisition by one of the two satellites.
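The revisit arithmetic above can be illustrated with a minimal Python sketch. It assumes that the satellites of a constellation share one orbit and are evenly phased, so that the effective revisit interval is the single-satellite repeat period divided by the number of satellites. The function names and the start date are hypothetical and serve only as an illustration.

```python
from datetime import date, timedelta


def constellation_revisit_days(repeat_period_days: float, n_satellites: int) -> float:
    """Effective revisit interval of a constellation.

    Assumption (not stated in the text): the satellites are evenly phased
    in the same orbit, so the acquisitions interleave at equal intervals.
    """
    if n_satellites < 1:
        raise ValueError("n_satellites must be at least 1")
    return repeat_period_days / n_satellites


def acquisition_dates(first: date, repeat_period_days: float,
                      n_satellites: int, count: int) -> list:
    """List the first `count` acquisition dates over one target area."""
    step = timedelta(days=constellation_revisit_days(repeat_period_days, n_satellites))
    return [first + i * step for i in range(count)]


# Sentinel-1 values from the text: 12-day repeat period, two satellites.
sentinel1_revisit = constellation_revisit_days(12, 2)  # 6.0 days

# Hypothetical start date, chosen only for illustration.
sentinel1_dates = acquisition_dates(date(2020, 1, 1), 12, 2, 3)
```

With the Sentinel-1 values this gives the 6-day interval quoted in the text. Real acquisition plans also depend on the observation scenario and on swath overlap at higher latitudes, which this sketch ignores.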

The Sentinel-1 characteristics are presented in detail in [22]. From the multitude of parameters and configurations that exist for Sentinel-1, we have selected the following configuration as an example, based on data availability, the CANDELA use cases, and our previous experiments: level-1 Ground Range Detected (GRD) products with high resolution (HR) taken routinely in Interferometric Wide (IW) swath mode. These products are produced (prior to geo-coding) with a pixel spacing of 10 × 10 m and correspond to about five looks and a resolution (range × azimuth) of 20 × 22 m. They have a nearly uniform signal-to-noise ratio (SNR) and also a stable distributed target ambiguity ratio (DTAR). For these products, the data are provided in dual polarization: VV and VH for land, and HH and HV for polar target areas.

*Artificial Intelligence Data Science Methodology for Earth Observation*

*DOI: http://dx.doi.org/10.5772/intechopen.86886*

**3.2 Sentinel-2 data**

The Sentinel-2 mission (like Sentinel-1) comprises a constellation of two satellites (launched on June 23, 2015, and on March 7, 2017). It collects multispectral data, and its acquisitions are affected by the weather conditions (e.g., cloud cover). The repeat period of each satellite is 10 days; that means every 5 days there is an acquisition by one of the two satellites, thus providing a high revisit frequency.

Each Sentinel-2 satellite carries a multispectral instrument with 13 spectral channels (in the visible/near-infrared and shortwave infrared spectral range) and a swath width of 290 km. The Sentinel-2 characteristics are presented in detail in [23]. This also applies to the level-1 data; the level-1C products are radiometrically and geometrically corrected images, with orthorectification and spatial registration on a global reference system with sub-pixel accuracy. Since the product size is very large, each image is divided into several quadrants in UTM WGS84 projection. The average size of a quadrant is 10,980 × 10,980 pixels (rows × columns). For visualization, the RGB bands (B04, B03, and B02) were used to generate a quicklook quadrant image. For feature extraction, the user can choose different band combinations.

**3.3 Third-party mission data**

From the available third-party mission data sets, we selected for demonstration four pairs of multi-sensor images of TerraSAR-X and WorldView-2 [27].

TerraSAR-X is a German radar satellite launched in June 2007, followed by its TanDEM-X twin in 2010. Both operate in X-band and are side-looking SAR instruments that offer a wide selection of operating modes and product generation options [7]. TerraSAR-X has a revisit cycle of 11 days at the Earth's equator. We selected high-resolution spotlight mode images because they provide the highest-resolution data of the target areas. As for the product generation options, we took enhanced ellipsoid corrected (EEC) and radiometrically enhanced (RE) data. Finally, we took horizontally polarized (HH) or vertically polarized (VV) images, as this option is most frequently used. The images have a pixel spacing of 1.25 m and a resolution of 2.9 m with WGS-84 map projection. The average size of the images is 8000 rows × 9600 columns.

In contrast, WorldView-2 provides a single panchromatic band and eight multispectral bands. It was launched in October 2009 as a DigitalGlobe satellite. The revisit period of the satellite is about 3 days at the Earth's equator [28]. The resolution is 0.46 m for the panchromatic band and 1.87 m for the multispectral bands. The map projection of WorldView-2 is, again, WGS-84, and the average size of these images is 47,000 × 37,000 pixels (rows × columns) for panchromatic images and 11,000 × 9000 pixels (rows × columns) for multispectral images.

**4. Typical CANDELA examples**

**4.1 Data mining by machine learning**

In EO data mining, a number of researchers have already developed technologies for semantic image understanding [29, 30]. The available web engines are

