**1. Introduction**

Vector-borne diseases are infections caused by viruses, bacteria, or parasites that are transmitted to humans through the bite of infected arthropods. They include diseases transmitted by mosquitoes (dengue fever, West Nile fever, chikungunya, malaria, Zika, etc.), by sandflies (leishmaniasis), by ticks (encephalitis, Lyme borreliosis, Crimean-Congo hemorrhagic fever, human granulocytic anaplasmosis), and by triatomines (Chagas disease), among others. These diseases account for more than 17% of all infectious diseases and cause more than 700,000 deaths per year [1, 2].

Vectors are living organisms that can transmit infectious pathogens between humans or from animals to humans. Many of these vectors are insects that ingest disease-causing microorganisms during a blood meal from an infected host and then transmit them to a new host after the pathogen has replicated. Arthropod vectors are also cold-blooded (ectothermic) and therefore very sensitive to climatic factors, although climate is only one of many factors that influence vector distribution, alongside geographic and sociodemographic factors [1].

Statistical models are used to describe the behavior of vector-borne diseases as accurately and simply as possible. A statistical model is a simplified representation of a phenomenon of interest [3, 4]. Such models make it possible to describe, predict, and make inferences about natural phenomena, biological systems, epidemiological processes, and others [5]. Among the most widely used statistical models are linear regression models, which predict a continuous target based on linear relationships between the target and one or more predictors. Another type of model extends the general linear model so that the dependent variable is related to the factors and covariates through a specified link function; this is known as a generalized linear model [6].

Generalized Linear Models (GLMs) provide a collection of regression models for responses whose distribution belongs to the exponential family, such as the Binomial and Poisson distributions, which are used for binary and count data. GLMs were introduced by Nelder and Wedderburn in 1972 [7], were studied in greater depth by McCullagh and Nelder in 1989 [8], and have since been developed further by other authors [9–13].

A GLM has three components: a response variable distribution, a linear predictor, and a link function. The response vector $\mathbf{Y} = (Y_1, Y_2, \ldots, Y_n)$ is assumed to have components $Y_1, Y_2, \ldots, Y_n$ that are independent of each other. Its expected value is related to the linear predictor by $E[Y] = g^{-1}(\mathbf{d}'\boldsymbol{\beta})$, where $\boldsymbol{\beta} \in \mathbb{R}^p$ is a vector of regression parameters, $\mathbf{d}$ is a vector of known explanatory variables, and $g$ is a known function, called the link function, that defines the relationship between the systematic and random components [14].
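The three components can be illustrated with a minimal sketch, assuming a Poisson response and a log link, so that the inverse link is the exponential function. The data are simulated, and the function name `fit_poisson_glm`, the design matrix `D`, and the values in `beta_true` are hypothetical choices made for illustration only. The fit uses iteratively reweighted least squares, one common way to estimate the regression parameters; it is not necessarily the procedure used later in this chapter.

```python
import numpy as np


def fit_poisson_glm(D, y, n_iter=25, tol=1e-8):
    """Fit a Poisson GLM with log link by iteratively reweighted least squares.

    D : (n, p) array whose rows are the explanatory variables d_i
    y : (n,) array of observed counts
    """
    beta = np.zeros(D.shape[1])
    for _ in range(n_iter):
        eta = D @ beta                # linear predictor d' beta
        mu = np.exp(eta)              # inverse link: E[Y] = g^{-1}(d' beta)
        z = eta + (y - mu) / mu       # working response
        w = mu                        # working weights for Poisson with log link
        beta_new = np.linalg.solve(D.T @ (w[:, None] * D), D.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Simulated data (illustrative values only, not taken from any study).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
D = np.column_stack([np.ones(n), x])      # intercept plus one covariate
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(D @ beta_true))    # random component: Poisson counts

beta_hat = fit_poisson_glm(D, y)
print(beta_hat)
```

With a different response distribution, only the inverse link, the working response, and the working weights would change; the linear predictor keeps the same form.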

GLMs can help in numerous areas such as epidemiology, mining engineering, Earth and environmental sciences, ecology, biology, geography, economics, agronomy, forestry, image processing, and more [15, 16]. In epidemiology in particular, which seeks to understand diseases that affect a population, the response is most often a binary variable representing the presence or absence of a disease, or a count of disease events in given areas.

Such is the case of the 2012 study by Hashizume et al. [17] in Bangladesh. They used a generalized linear Poisson regression model to examine weekly dengue hospitalizations from 2005 to 2009 in relation to river levels, together with the climatic variables daily precipitation and average temperature. The models were adjusted for seasonal variation and temperature. They found evidence of a 6.9% increase in dengue with high river levels, but a 29.6% increase in disease when rivers were very low.
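As a brief aside, percentage changes of this kind are usually read from the coefficients of a Poisson regression. The following is a generic sketch assuming a log link; the symbols $\mu$, $x$, $\beta_0$, and $\beta_1$ are introduced here for illustration and are not taken from [17]:

$$
\log \mu(x) = \beta_0 + \beta_1 x
\quad\Longrightarrow\quad
\frac{\mu(x+1)}{\mu(x)} = e^{\beta_1},
\qquad
\text{percent change} = 100\,\bigl(e^{\beta_1} - 1\bigr)\,\%.
$$

Under this reading, a 6.9% increase corresponds to a rate ratio of 1.069, and a 29.6% increase to a rate ratio of 1.296. The exposure unit or threshold to which each figure refers is defined in the original study.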

An important extension of GLMs is the class of Generalized Linear Mixed Models (GLMMs) [18]. GLMMs provide a range of analyses for data that are correlated in space and whose distribution belongs to the exponential family (Gamma, Poisson, Binomial, among others) [19]. Generalized Linear Spatial Models (GLSMs) are essentially GLMMs in which the latent variables are derived from a spatial process. In recent years, there has been growing interest in the analysis of spatial data in epidemiology, with the aim of predicting the incidence of vector-borne diseases.
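The idea of latent variables derived from a spatial process can be sketched by simulating data from a Poisson model with a spatially correlated random effect. This is a minimal illustration under stated assumptions: the latent process is taken to be a zero-mean Gaussian process with an exponential covariance function, the link is the log, and all coordinates, covariates, and parameter values (`sigma2`, `phi`, `beta`) are hypothetical and not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical site locations in a 10 x 10 study region.
n_sites = 50
coords = rng.uniform(0.0, 10.0, size=(n_sites, 2))

# Pairwise Euclidean distances between sites.
diff = coords[:, None, :] - coords[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Assumed exponential covariance: Cov(S_i, S_j) = sigma2 * exp(-dist_ij / phi).
sigma2, phi = 0.5, 2.0
cov = sigma2 * np.exp(-dist / phi)

# Draw the latent spatial process S via a Cholesky factor of the covariance
# (a small jitter on the diagonal keeps the factorization numerically stable).
chol = np.linalg.cholesky(cov + 1e-10 * np.eye(n_sites))
S = chol @ rng.standard_normal(n_sites)

# Conditional on S, counts are independent Poisson with a log link:
# log E[Y_i | S_i] = beta_0 + beta_1 * x_i + S_i.
x = rng.normal(size=n_sites)      # one hypothetical covariate
beta = np.array([1.0, 0.4])
eta = beta[0] + beta[1] * x + S
counts = rng.poisson(np.exp(eta))

print(counts[:10])
```

Nearby sites share similar values of `S`, so their counts are correlated even after accounting for the covariate; setting `sigma2` to zero would reduce the simulation to an ordinary Poisson GLM.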

#### *Spatial Statistics in Vector-Borne Diseases DOI: http://dx.doi.org/10.5772/intechopen.104953*

Several authors, such as those mentioned below, have demonstrated the potential of remote sensing, Geographic Information Systems (GIS), and spatial analysis of epidemiological data, using techniques available to epidemiologists and other health professionals. However, there are still few studies that adequately demonstrate the potential of these tools, since they are only beginning to be exploited in the fight against disease [20].

For instance, in a Colombian study published in 2012, Sanchez et al. [21] fitted Generalized Linear Spatial Regression Models with a Poisson response to explain the behavior of malaria and dengue in different years. Health determinants associated with the occurrence of these diseases were identified and risk maps were obtained. The study demonstrated the need to include both spatial effects and the explanatory variables considered in the models in order to explain the number of reported cases of the disease in the years analyzed.

Another example is the 2021 work of Estallo et al. [22], which evaluated the phlebotomine sandfly species (*Phlebotominae*) responsible for the transmission of leishmaniasis during the period 2012–2014 in northern Argentina. Generalized Linear Mixed Models were used to evaluate the role of the vectors in disease transmission, with meteorological and remotely sensed environmental factors as predictors. The species *Lutzomyia longipalpis* was found to be the most abundant in urban areas. The findings allowed the detection of high-risk areas and the development of predictive models to optimize resources and prevent leishmaniasis transmission in the area.

As can be seen, spatial analysis is a powerful tool for the analysis of georeferenced data, as it can give health research a broader perspective of the occurrence of health events and diseases. Spatial statistical models are useful because they estimate the spatial variance inherent in the data, and can also be used to perform statistical inference throughout the study area. Spatial prediction can be made based entirely on a stochastic model or in combination with a deterministic trend [20, 23].

The aim of this chapter is to show an example of the application of spatial statistics by implementing a Generalized Linear Spatial Model for the prediction of dengue in the state of Chiapas. The covariates considered are patient age and the following information for each municipality: garbage disposal service, maximum environmental temperature, average monthly rainfall, and altitude. To study the disease in the 118 municipalities of Chiapas, the cases observed in 36 municipalities and the values of the aforementioned explanatory variables were used.
