**2. Mapping environmental diseases in the twenty-first century**

Most epidemiological studies that have mapped vector-borne diseases in relation to associated environmental factors [18–20] based their assumptions on established scientific evidence that those factors are associated with the disease outcome of interest. An increasing number of disease mapping and epidemiology studies continue to use environmental and climatic data to map and predict disease distribution in defined geographic areas. Such studies are often used to guide and target the deployment of health interventions to areas identified as having a high burden of the mapped disease. As the spatial and temporal resolutions of remote sensing sensors have improved since the first generation of this technology, interest and confidence in their data products have increased among the scientific community. Early studies were limited not only by computer processing power and storage but also by the poor spectral bands of sensors, which could not faithfully delineate and demarcate features. Spectral bands are the wavelength ranges of the electromagnetic spectrum recorded by a sensor during image acquisition. Sensor spectral bands have since improved and can now resolve features and support multicolor image display during feature analysis and identification. These advances in image processing, visualization and display have also supported the uptake and appreciation of research findings from mapping efforts, as the resulting maps became more *beautiful* in addition to providing more information. **Figure 1** is an example of high-resolution satellite imagery that was used to develop a land cover classification map for part of Eswatini, as shown in **Figure 2**.

#### **Figure 1.**

*An example of a high-resolution true color image covering part of Eswatini, used to develop a land cover classification.*

In disease mapping, these advances in remote sensing and sensor technology meant that spatial heterogeneities could be identified even at small geographic or local scales. These advances were pivotal for disease mapping, as epidemiologists could identify important drivers of disease risk and thus guide control programs more efficiently and with evidence-based decisions. The costs of remotely sensed data products have also been significantly reduced as more sensors have been launched, thus stabilizing the demand for data products and reducing reliance on only a few remote sensing agencies. More and more countries and private companies have launched satellites into space in the twenty-first century, and the resulting imagery data products have been made available to the research community [24]. In addition, archived remotely sensed data products have often been offered to researchers free of charge, and this has enabled spatial analysts to apply various analysis techniques such as time-series analysis, data mining and other data learning techniques. RS data and other end products have also been customized in terms of the derivation and calculation of vegetation indices used in mapping studies. This customization has enabled such indices to be incorporated directly into models, because researchers who are not experts in remote sensing can still interpret them. Common remotely sensed variables that have been widely adopted into disease mapping models because of their ease of interpretation include the Normalized Difference Vegetation Index (NDVI), temperature and rainfall. These variables have often been supplied or archived in fully processed (derived) and customized form, so researchers can easily access them from the hosting agencies and websites and incorporate them directly into their mapping models, as they have become interoperable.
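As an illustration of how a vegetation index such as NDVI is derived before it reaches a disease mapping model, the sketch below applies the standard definition, NDVI = (NIR − Red) / (NIR + Red), to single pixels. The reflectance values and the function name `ndvi` are hypothetical and are not taken from the studies cited in this chapter.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        return None  # undefined when both bands are zero (e.g. a no-data pixel)
    return (nir - red) / total


# Hypothetical surface reflectance values on a 0-1 scale, for illustration only.
pixels = {
    "dense vegetation": (0.50, 0.08),
    "bare soil": (0.30, 0.25),
    "open water": (0.02, 0.05),
}

for label, (nir, red) in pixels.items():
    print(label, round(ndvi(nir, red), 3))
```

Values close to 1 indicate dense green vegetation, values near 0 indicate sparse cover, and negative values are typical of water, which is one reason the index is easy to interpret for non-specialists.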

#### **Figure 2.**

*An example of high-resolution land cover classification in Eswatini.*

Advances in mapping software, particularly geographic information system (GIS) software, have stimulated interest among disease researchers and epidemiologists. Whereas earlier software was geared mostly toward remote sensing experts, the availability of customized mapping platforms for spatial epidemiology meant that these software programs could be used even by non-remote sensing experts. For instance, the public health community has facilitated the development and customization of disease mapping software programs such as Health Mapper and Epi-Info. Other GIS software programs such as ESRI ArcGIS have over the years added more customized mapping tools and extensions meant to support disease mapping and epidemiology efforts. The high costs often associated with some commercial software have also been reduced as more open source software has become available. For example, GIS software programs such as QGIS have been released as open source and can be downloaded and installed directly on any GIS-capable computer. True color, web-based visualization software such as Google Earth, which has given even mapping novices some confidence because of its simplicity, has also contributed to the hype about disease mapping, rapid risk assessment, and prediction among epidemiologists. Non-mapping experts can use such web-based imagery software to identify, analyze, and interpret spatial phenomena explicitly as they appear on the zoomed imagery on a computer screen. In this way, disease experts have been able to explain some of the identified trends and patterns and to directly answer some of the pertinent questions in disease epidemiology, such as clustering, severity variation and disease presence or absence, *inter alia*.

#### *Remote Sensing Applications in Disease Mapping DOI: http://dx.doi.org/10.5772/intechopen.93652*

A number of statistical software programs have also added mapping extensions, which have enabled environmental and climatic data to be analyzed within such software. This has contributed to the growth of geostatistics, a field that combines geography and statistics in spatial analysis. For example, statistical software programs such as STATA, R, WINBUGS, and others have been used to process and analyze climatic data derived from remote sensing. In disease risk mapping, space and time analyses have often been conducted using these statistical software programs, and they have been widely used as research methods in epidemiology and disease risk prediction, alongside mathematical models that attempt to explain the underlying factors and quantities in disease risk modeling. Geostatistics has therefore been pivotal in the application and incorporation of remotely sensed data products into disease mapping and epidemiology. The capability of geostatistics to incorporate algorithms that forecast disease burden in space and time has also contributed to the wide adoption of such approaches, as it means that control programs can be informed *a priori* about disease risk and thus be better prepared to deal with disease outbreaks.

As already mentioned, advances in computer processing power have enabled the integration of computing methods based, for instance, on Bayesian inference, which had previously been limited by poor computer performance. Computational methods such as Markov chain Monte Carlo (MCMC) and the integrated nested Laplace approximation (INLA) have been widely used to estimate posterior distributions for geographic data in space and time, with results obtainable within reasonable time frames compared with earlier computation efforts on similar data. In statistics, MCMC methods are a class of algorithms that sample from a probability distribution. MCMC uses simulation to explore a posterior distribution and draw samples from it. INLA, on the other hand, combines analytical approximations with efficient numerical integration schemes to achieve highly accurate deterministic approximations of posterior quantities of interest [25, 26]. As a result of these statistical and computational advances, the integration of environmental and climatic data derived from remote sensing into disease mapping models has markedly increased over the years.
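To make the idea of sampling from a posterior distribution concrete, the following minimal sketch uses a random-walk Metropolis sampler, one member of the MCMC class of algorithms, to estimate disease prevalence at a single location. It assumes a binomial likelihood and a uniform prior; the survey counts (12 positives out of 100 tested), the step size and the function names are hypothetical and chosen only for illustration. Real spatial models involve many more parameters and are usually fitted with dedicated software.

```python
import math
import random


def log_posterior(p, positives, tested):
    """Unnormalised log posterior of prevalence p under a uniform prior."""
    if not 0.0 < p < 1.0:
        return float("-inf")  # zero posterior density outside (0, 1)
    return positives * math.log(p) + (tested - positives) * math.log(1.0 - p)


def metropolis(positives, tested, n_samples=20000, step=0.05, seed=1):
    """Random-walk Metropolis sampler for a single prevalence parameter."""
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, positives, tested)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, positives, tested)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples[n_samples // 5:]  # discard the first 20% as burn-in


# Hypothetical survey: 12 positive cases out of 100 people tested.
draws = metropolis(positives=12, tested=100)
posterior_mean = sum(draws) / len(draws)
print(round(posterior_mean, 3))
```

With a uniform prior the exact posterior is Beta(13, 89), whose mean is 13/102, so the sampled mean should land close to that value. This check is possible only because the toy model has a closed-form answer.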

Recently, the capability of big data, machine learning and other location intelligence methods to handle large arrays of data sets has raised awareness of the application of geographic data, as models using these methods are often run in geographic software. Big data refers to extremely large data sets that may be computationally analyzed to reveal patterns, trends, and associations relating to human behavior. Machine learning approaches focus on computer programs that can access data and use them to learn and improve from experience without being explicitly programmed [27]. These analysis methods are often run on GIS-capable computers, and they also rely on remote sensing products such as satellite imagery to analyze and reveal patterns, trends and associations in the data. In disease mapping and epidemiology, such approaches are important for understanding the risk of disease spread arising from human behavior and its interaction with the environment. As a result, most current disease mapping efforts have, either practically or theoretically, ended up incorporating remotely sensed data or associated geographic data products into spatial analysis models, as the resulting modeling outputs are often displayed in a mapping environment.

## **2.1 Remote sensing data application in vector-borne disease surveillance**

Mapping of vector-borne diseases began around 1950 with the use of aerial photography and cartographic techniques. Early studies focused on eradicating malaria, dengue fever and yellow fever, using climatic factors to identify areas at risk of higher transmission. The Malaria Atlas Project, founded in 2006, took over from previous mapping efforts and demonstrated the application of geography-based variables to map and disseminate accurate information on malaria endemicity. Vector habitats were identified and mapped using climatic suitability to guide surveillance and control efforts. Different approaches, such as high-altitude color-infrared photography and the incorporation of high-resolution images, were used to improve visualization and to produce detailed maps [28]. Vegetation types associated with some vector breeding habitats have been mapped since 1973. The techniques used in such analyses have been very important for surveillance support and for the identification of vector oviposition habitats. The visualization and interpretation techniques were based on tone and texture and were used to identify habitats associated with tick-borne disease in some areas, following the concepts of landscape epidemiology [29].

In the early 1970s, multispectral scanner data were first used to monitor and map environmental parameters required for the breeding of disease vectors. A combination of remote sensing data acquired from satellite and aircraft platforms was used for this task. Later, in 1976, some studies demonstrated that computer processing techniques could be used to classify airborne multispectral scanner data for mapping and identifying vegetation types associated with certain disease-causing mosquitoes. Around 1984, remote sensing techniques were applied to describe and map geographic characteristics associated with schistosomiasis [30]. Temperature and precipitation data obtained from remote sensing were also used to estimate the probability of disease occurrence at unsampled locations. These data were also used to identify and map mosquito larval habitats and their association with certain environmental variables in space and time. A study by [31] identified tick habitats on the island of Guadeloupe using derived vegetation and moisture indices.

Most of the studies discussed above focused primarily on applying remote sensing to identify and map potential vector habitats and breeding sites based on vegetation, water, and soil. Identifying existing or potential habitats and breeding sites is not enough to adequately guide surveillance and control efforts unless all possibly affected areas are identified and mapped. Most studies have therefore also incorporated predictive techniques to support surveillance and control efforts. Consequently, studies have had to go beyond merely mapping habitats and potential breeding sites and predict vector distributions in space and time, often for the entire geographic area of interest. This often includes identifying and mapping areas where vector production and disease transmission risk would be greatest within defined time and space thresholds. **Figure 3** is an example of the spatial distribution of malaria vector breeding sites and their distance to subsistence farming in Eswatini.

## **2.2 Predicting diseases using remotely sensed data variables**

**Figure 3.** *Example of mapped breeding sites and their distance to subsistence farming.*

An important aspect of the application of RS data in disease mapping and epidemiology is their use as predictor variables in modeling. Disease mapping studies have often used environmental and climatic proxies derived from remote sensing in statistical regression models that aim to predict disease risk in both its spatial and temporal dynamics. These studies predict disease distributions, vector populations and disease transmission risk within the affected populations in specific areas. Disease predictive modeling studies often combine remote sensing measurements of vegetation, precipitation, and temperature to identify when and where conditions would be favorable for disease propagation. Other studies have attempted to use remote sensing to predict the temporal as well as spatial patterns of habitat development, vector populations, and disease transmission risk [32, 33].
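The use of remotely sensed variables as predictors can be sketched with a small logistic regression. The studies cited above do not prescribe a single model form, so logistic regression fitted by plain gradient descent is used here only as a simple, common instance of a statistical regression model. The covariates (NDVI and rainfall rescaled to 0-1), the presence/absence labels and all function names are hypothetical toy values, not data from any study.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, x):
    """Predicted probability of vector presence; weights[0] is the intercept."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return sigmoid(z)


def fit_logistic(rows, labels, lr=0.5, epochs=5000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            error = predict(weights, x) - y
            grads[0] += error
            for j, value in enumerate(x):
                grads[j + 1] += error * value
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grads)]
    return weights


# Hypothetical training data: (NDVI, rainfall scaled to 0-1) per surveyed site.
rows = [
    (0.80, 0.90), (0.70, 0.80), (0.60, 0.70), (0.30, 0.60),
    (0.65, 0.85), (0.50, 0.50), (0.45, 0.40), (0.20, 0.20),
    (0.10, 0.30), (0.40, 0.10), (0.25, 0.15), (0.55, 0.65),
]
# 1 = vector present, 0 = vector absent (toy labels, deliberately noisy).
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

weights = fit_logistic(rows, labels)
for site in [(0.80, 0.90), (0.20, 0.20)]:
    print(site, round(predict(weights, site), 2))
```

In practice the fitted model would be applied to every pixel of the covariate rasters to produce a risk surface, and spatial or temporal correlation would need to be modeled explicitly, which this sketch does not attempt.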

Malaria studies are among the most common disease mapping studies in which predictions have been used. These studies also use remote sensing to predict which populations or villages are at risk of transmission. In these cases, risk is often defined by the proximity of a village to areas of heightened transmission as well as to the breeding, feeding, and resting habitats required by the malaria vector, the *Anopheles* mosquito. In addition, recent studies have focused on assessing predictive modeling and disease transmission risk based on the application of remote sensing and GIS technologies. To date, there is no alternative to products derived from satellite imagery and remote sensing for mapping and predicting disease in space and time and for identifying, characterizing, and mapping the patterns of vector habitats. For instance, areas where vector survival rates are highest have been identified and located using vegetation indices derived from remote sensing. **Figure 4** is an example of predicted potential malaria vector breeding sites in the northeastern part of Eswatini.
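Defining risk by the proximity of a village to vector habitats can be sketched as a nearest-distance calculation. The example below uses the haversine great-circle formula; the village and breeding-site coordinates, the 2 km radius and the function names are hypothetical assumptions chosen for illustration, and an appropriate radius would depend on the vector species and local conditions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_site_km(village, sites):
    """Distance from a village to its closest mapped breeding site."""
    return min(haversine_km(village[0], village[1], s[0], s[1]) for s in sites)


# Hypothetical coordinates (latitude, longitude), for illustration only.
villages = {"Village A": (-26.01, 31.80), "Village B": (-26.10, 31.80)}
breeding_sites = [(-26.00, 31.80), (-26.30, 31.95)]

RISK_RADIUS_KM = 2.0  # assumed threshold, not taken from the cited studies

at_risk = [name for name, loc in villages.items()
           if nearest_site_km(loc, breeding_sites) <= RISK_RADIUS_KM]
print(at_risk)
```

A GIS would normally perform this step with buffer or distance-raster tools on projected coordinates; the haversine form is used here only because it needs no external library.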

**Figure 4.** *An example of identified potential malaria vector breeding sites in Eswatini.*

Remote sensing data and GIS techniques have also been used to identify and map landscape features associated with disease transmission risk. Landscape features such as brush, woodland and grassland, and areas cleared for housing, roads, or trails and other similar locations identified through remote sensing, have often been used to detect areas where human hosts, vectors and parasites intersect. Other landscape features such as coniferous forest, deciduous forest, mixed forest, water bodies, glades, and housing developments have also been identified via remote sensing. Some studies have assessed these landscapes for their association with the presence of certain disease vectors and have used the proximity of housing as a measure of human exposure to those diseases. Such landscape epidemiology has therefore been used to identify areas where transmission risk is greatest by characterizing the mixture of deciduous forest and residential developments that bring diseases, vectors, and humans into contact.

Field studies have often been used to train spatial models that use remote sensing to map the distribution of disease and predict transmission risk areas. These studies also use GIS to capture ground-truthed coordinates, or Global Positioning System (GPS) points, which are then used to assess the accuracy of predictive models. For instance, GIS and remote sensing have been used to investigate the adjacency of certain landscape features and residential properties with dense vegetation as a potential measure of human-vector contact. In this case, regression models are used to assess general correlations between landscape and disease transmission risk. Remote sensing and GIS have thus been combined to study, for instance, the structure and composition of a landscape as it relates to the epidemiology of a disease. Some studies have also combined remote sensing and GIS analysis techniques to assess various associations between diseases, vectors and human contacts.

Models based on remote sensing data and GIS techniques have also been used to study the population dynamics of certain disease vectors. Such studies use satellite imagery and GIS modeling techniques to distinguish between areas with high or low numbers of disease-causing vectors. In these cases, ground data on vector populations are used jointly with remote sensing data, combined in either statistical or GIS software. Ground data on vector presence or absence are analyzed in relation to the remotely sensed spectral data captured via satellite imagery. The ground-truthed field measurements, recorded using GPS, are then used to evaluate the accuracy of the remote sensing based predictive models.
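Evaluating a predictive map against ground-truthed points can be sketched as a comparison of observed and predicted presence/absence at the same GPS locations. The text above does not say which accuracy measures were used, so the sensitivity, specificity, overall accuracy and Cohen's kappa computed below are assumptions, and the ten observations are hypothetical.

```python
def confusion_counts(observed, predicted):
    """Count true/false positives and negatives for binary presence (1) / absence (0)."""
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if obs == 1 and pred == 1:
            tp += 1
        elif obs == 0 and pred == 1:
            fp += 1
        elif obs == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy_metrics(observed, predicted):
    """Return common accuracy measures; assumes both classes occur in `observed`."""
    tp, fp, tn, fn = confusion_counts(observed, predicted)
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1.0 - expected),
    }


# Hypothetical ground-truthed observations and model predictions at ten GPS points.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

for name, value in accuracy_metrics(observed, predicted).items():
    print(name, round(value, 3))
```

Kappa is included because overall accuracy alone can look high when absences dominate the sample; it reports agreement beyond what chance would produce.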
