**3. Experiment setup**

and addresses of the folders in which the input information to the processors is located and the folders to which the outputs of the processors have to be sent. They also include the format in which the processors generate their output.

○ To find data in the ground stations (pooling) to be ingested in a shared storage unit in the cloud for its distribution to the processing chain.

○ To control the processing chain by communicating with the product processors.

○ To manage the archive and catalogue.

• **process4EO node:** Constituted of different software modules, which are in charge of processing the raw data and the products of previous levels to produce image products. **Figure 2** depicts the image processing pipeline. The four most important operations are the following:

○ Calibration (L0 and L0R processing levels): to convert the pixel elements from instrument digital counts into radiance units (see the sketch after **Figure 2**).

○ Geometric correction (L1A processing level): to eliminate distortions due to misalignments of the sensors in the focal plane geometry.

○ Geolocation (L1BR processing level): to compute the geodetic coordinates of the input pixels.

○ Orthorectification (L1C processing level): to produce orthophotos with vertical projection, free of distortions.

**Figure 2.** EOD's pipeline.
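To make the calibration step concrete, the sketch below illustrates the common linear conversion from digital counts to radiance; the function, the gain and offset values and the pixel block are hypothetical illustrations, not the EOD processor's actual model or coefficients.

```python
import numpy as np

def calibrate(dn, gain, offset):
    """Convert instrument digital counts (DN) into radiance units.

    Assumes a simple linear radiometric model L = gain * DN + offset;
    the real EOD calibration model and coefficients are instrument-specific.
    """
    return gain * dn.astype(np.float64) + offset

# Hypothetical L0 pixel block and calibration coefficients.
raw_counts = np.array([[512, 1023], [256, 768]])
radiance = calibrate(raw_counts, gain=0.045, offset=-1.2)
```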

#### **3.1. Testing infrastructure**

The testing infrastructure used in the experiment is formed by hardware deployed in three different locations and managed in a federated manner: the DMU infrastructure (at Deimos UK in the United Kingdom), the DMS infrastructure (at Deimos Space in Spain) and the DME infrastructure (at Deimos Engenharia in Portugal). The hardware resources deployed in each location are described in **Table 1**.


**Table 1.** Hardware resources in the testing infrastructure.

The ENTICE middleware was installed in the DMU infrastructure, which acts as the master; it also contains an object store with an interface to Amazon Simple Storage Service (Amazon S3) for cloud bursting. The DMS and DME infrastructures are slaves of the DMU infrastructure and also contain object stores with interfaces to Amazon S3. A block diagram describing the interrelations of the testing infrastructure is depicted in **Figure 3**.

**Figure 3.** Block diagram of the testing infrastructure.

The virtualization of the infrastructure was done with OpenNebula, with Kernel-based Virtual Machine (KVM) as hypervisor. The creation of the virtual machines was done with Packer, whereas their automatic deployment was done with Ansible. **Figure 4** shows a diagram describing the logic of the automatic generation of the virtual machines that constitute the EOD software.

**Figure 4.** Diagram of the automatic generation of the EOD virtual machines.

The image building process takes advantage of the functionalities provided by Packer and Ansible to build KVM images. The virtual images are based on the CentOS 6 Linux distribution and are stored in qcow2 format. This automation step comprises several files:

• Packer template: a JSON file that provides all the information to create the virtual machine in Packer. It contains the format, the instructions and the parameters on how to build a VMI using KVM. The provisioners define the scripts or recipes in Ansible for configuring the machine and installing the applications (see the sketch after this list).


• Ansible playbook: these files are "recipes" to install the EOD software in the virtual machines. A playbook is a YAML file with the commands expressed in a simplified language, describing a configuration or a process. It contains the information to configure the system, install the EOD software and add the functionalities needed to work in the cloud environment (contextualization). A sketch is given after this list.


• Execution script: this script, developed in Python, launches the creation of the machine image with Packer. It receives a JSON file with all the variables that will be used in the building process, e.g. the user configuration, software repositories, Kickstart file and Ansible playbook, and configures all the required fields in the Kickstart file. It can build all the types of VMIs required to deploy the EOD software: archive4EO, monitor4EO and process4EO. The type of virtual machine to generate is specified in the configuration file (a sketch is given below, after the description of the build flow).
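As an illustration of the Packer template described in the first item, the following is a minimal sketch of such a JSON file for a KVM (QEMU) build with an Ansible provisioner; all file names, URLs and parameter values are hypothetical and do not come from the actual EOD template:

```json
{
  "variables": {
    "root_password": "changeme"
  },
  "builders": [
    {
      "type": "qemu",
      "accelerator": "kvm",
      "vm_name": "process4EO.qcow2",
      "format": "qcow2",
      "disk_size": 10240,
      "iso_url": "http://repo.example.org/CentOS-6-x86_64-minimal.iso",
      "iso_checksum_type": "none",
      "http_directory": "http",
      "boot_command": [
        "<tab> text ks=http://{{ .HTTPIP }}:{{ .HTTPPort }}/ks.cfg<enter>"
      ],
      "ssh_username": "root",
      "ssh_password": "{{user `root_password`}}",
      "shutdown_command": "shutdown -h now"
    }
  ],
  "provisioners": [
    {
      "type": "ansible",
      "playbook_file": "playbooks/process4eo.yml"
    }
  ]
}
```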
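In the same spirit, a minimal sketch of an Ansible playbook of the kind described above; the repository URL, package names and context package name are placeholders rather than the real EOD recipe:

```yaml
# Hypothetical playbook sketch: configure the VM, install the EOD
# software and add the OpenNebula contextualization package.
- hosts: all
  become: true
  vars:
    eod_repo: "http://repo.example.org/eod"   # placeholder repository
  tasks:
    - name: Configure the EOD software repository
      yum_repository:
        name: eod
        description: EOD packages
        baseurl: "{{ eod_repo }}"
        gpgcheck: false

    - name: Install dependencies and the EOD processing modules
      yum:
        name:
          - python
          - eod-process4eo        # placeholder package name
        state: present

    - name: Install the OpenNebula context package (contextualization)
      yum:
        name: one-context
        state: present
```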



The Python script receives the configuration file and launches the Packer command after configuring some parameters in the Kickstart file. The Packer command takes the template and runs all the builds within it in order to generate a set of artefacts and build the image in KVM. Once the image is built, Packer launches all the provisioners (Ansible) contained in the template. Ansible carries out several steps: it configures all the repositories, installs all the dependencies and software packages of the EOD modules, configures the EOD software and installs a context package to deploy the VMI in OpenNebula.
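The flow just described can be pictured with a short sketch of such an execution script; the JSON keys, file paths and the Kickstart placeholder are hypothetical, not the actual EOD code:

```python
#!/usr/bin/env python
"""Hypothetical sketch of the EOD execution script (not the actual code).

It reads the JSON configuration, fills in the required fields of the
Kickstart file and launches `packer build`; Packer then runs the Ansible
provisioners defined in the template.
"""
import json
import subprocess
import sys


def build_vmi(config_path):
    with open(config_path) as f:
        cfg = json.load(f)  # user configuration, repositories, playbook, ...

    # The type of VMI to build: archive4EO, monitor4EO or process4EO.
    vm_type = cfg["vm_type"]

    # Configure the required fields in the Kickstart file (placeholder key).
    with open(cfg["kickstart_template"]) as f:
        kickstart = f.read().replace("@ROOT_PASSWORD@", cfg["root_password"])
    with open("http/ks.cfg", "w") as f:
        f.write(kickstart)

    # Launch the image creation with Packer.
    subprocess.check_call([
        "packer", "build",
        "-var", "vm_type={0}".format(vm_type),
        cfg["packer_template"],
    ])


if __name__ == "__main__":
    build_vmi(sys.argv[1])
```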

**4. Experiment results**

First, the virtual machine images of the EOD pilot were created, delivered and deployed in the cloud. Then, the virtual machine of the process4EO was optimized and its VMI was again created, delivered and deployed. The time spent in every step is depicted in **Table 2**.

In these results, one can see the increase in the performance of the system before runtime, i.e. up to the deployment of the system: a reduction of 30% in VMI size, a reduction of 37.31% in the VMI creation time, a reduction of 34.53% in the VMI delivery time and a reduction of 54.05% in the VMI deployment time.

Next, the raw data recorded with the satellite were ingested in both the original EOD pilot and the optimized EOD pilot. The response of both the optimized and nonoptimized systems was measured at runtime. The processing time of the satellite imagery in the original EOD pilot and in the EOD pilot with the optimized processing chain is shown in **Figures 5** and **6**, respectively. It can be noticed that the processing time of the different levels is similar in both experiments, as is the time to process the raw data up to the orthorectification level (L1CR): 33.95 and 35.75 s in the nonoptimized and optimized systems, respectively. This difference is not substantial and can be produced by some OpenNebula processes, or the cloud has used

**Figure 5.** Processing time of the satellite imagery with nonoptimized EOD system.

| | VMI size (GB) | VMI creation time (hh:mm:ss) | VMI delivery time (hh:mm:ss) | VMI deployment time (hh:mm:ss) |
|---|---|---|---|---|
| **Nonoptimized VM** | 2 | 00:19:42 | 00:20:25 | 0:06:47 |
| **Optimized VM** | 1.4 | 00:12:21 | 00:13:22 | 0:03:07 |
| **Reduction (%)** | 30 | 37.31 | 34.53 | 54.05 |

**Table 2.** Metrics of the optimized and nonoptimized EOD pilot.
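The reduction percentages in **Table 2** follow directly from the raw figures; a quick check:

```python
# Verify the reduction percentages reported in Table 2.
def reduction(before_s, after_s):
    return 100.0 * (before_s - after_s) / before_s

print(round(reduction(19 * 60 + 42, 12 * 60 + 21), 2))  # creation time  -> 37.31
print(round(reduction(20 * 60 + 25, 13 * 60 + 22), 2))  # delivery time  -> 34.53
print(round(reduction(6 * 60 + 47, 3 * 60 + 7), 2))     # deployment     -> 54.05
```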

The recording of the experiment data was done with Jmeter™ [15] and Nagios® [16]. Jmeter™ is installed in the Node, and Nagios® in a virtual machine inside the federated cloud; they are used for monitoring the cloud resources and status and for extracting the experimental data.
