identification system uses two computational algorithms to complete the recognition and classification of bee species. The first algorithm, named kNN (k-Nearest Neighbour), is used to select and extract morphometric features related to the distances between the landmarks plotted at the wing vein intersections in the pictures. The second algorithm, named FkNN (Fuzzy kNN), implements a variation of Fuzzy Logic for species classification. For an optimized result, feature selection involves a statistical analysis that evaluates which landmarks are best for the classification process, and only the most informative are used in the species characterization in the Fuzzy Logic (Buani, 2010).

The morphometric analysis of the forewing is a very powerful tool to describe species variation and also to identify species based on landmark and outline morphometric methods. In addition, morphometric analysis is a fast, inexpensive and informative method for characterizing species and their variation.

**5. Modeling species distribution**

Innovative techniques are urgently needed to understand the geographic distribution of species, especially considering the impact of global changes. A computational technique was developed to model species distribution. This tool has received different names, which are treated here as synonyms, e.g., species distribution modeling, ecological niche modeling and, more recently, habitat suitability modeling.

Species distribution modeling (SDM) combines georeferenced occurrence points (latitude and longitude) with data sets that characterize the environment where the focal species occurs. These data are analyzed together to build a representation of the ecological requirements of the focal species or, in other words, of its ecological niche. The final result can be projected onto geographic space, indicating the areas that are suitable for the focal species and that it can potentially occupy.

Usually, these data sets comprise abiotic features, such as temperature, precipitation and altitude, which describe the environment where the species occurs. SDM can also include occurrence data for interacting species (biotic features) that help shape geographical distribution, such as species linked to the focal species through mutualism, competition or parasitism. Finally, it can include data on the dispersal capacity of the species, in order to estimate its ability to occupy new environments. These features are the basis of modeling, and the conceptual framework can be found mainly in Soberon (2010) (but see also Elith and Leathwick, 2009). Nowadays, abiotic features, biotic features and dispersal capacity can be integrated in BIOMOD (Thuiller, 2003), a computational system developed for R (The R Foundation for Statistical Computing) that can be used to model species distribution.

Considering the relationship between bees and plants, interactions are key aspects to include in SDM and have been considered its main challenge (Elith and Leathwick, 2009). Interactions are also of special concern under scenarios of global change (Schweiger et al., 2010). Mismatches between the geographic areas of obligate interacting species due to climate change have already been suggested (Stralberg et al., 2009). In addition, the relationship between pollinators and their host plants includes a temporal correspondence, in which plants synchronize their flowering period with their pollinators' activity, and this correspondence can also be altered by climate change (Hulme, 2011).

Interacting species are closely related to their geographic areas of distribution (Giannini et al., 2010; Giannini et al., 2011). For example, when characterizing the flower visitors' composition across the whole range of their host plant distribution, Espíndola et al. (2011) found geographically structured variability of the prevailing visitor. They suggested that climate is driving the specificity of this interaction, by potentially affecting the phenology of one or both interacting species, providing an example of the direct effect that the abiotic environment can have on the plant–insect interaction.

This is in accordance with Thompson (2005), who proposed a geographic mosaic theory of coevolution stating that interspecific interactions commonly exhibit geographic selection mosaics and trait remixing among populations. From this view, the form and trajectory of coevolutionary selection vary across landscapes. In addition, gene flow and metapopulation dynamics continually shift traits among populations, thereby continually altering the structure of local selection.

Laine (2009) reviewed 29 studies that support this theory, concluding that natural coevolutionary selection produces genetic differentiation among populations and may be an important mechanism promoting diversity in nature, given how different types of interactions show divergence and how variable the causes promoting such divergence are. One of the remarkable results of this review is the spatial scale over which divergent coevolutionary trajectories can be found. Variation was detected in populations separated by some hundreds of kilometers, highlighting the potential for the environment to create geographically variable selection trajectories. For example, analyzing a rare and endangered solitary bee (*Colletes floralis*), Davis et al. (2010) found extremely high genetic differentiation among populations at the extreme edges of the species range. Pellissier et al. (2010) analyzed how the traits of different pollination syndromes influence the distributions of plant species in interaction with pollinators. They used a combination of environmental descriptors and found a potential effect of the pollinator on the spatial distribution of plant species. Likewise, analyses of a system involving the Japanese camellia and its obligate seed predator found that the size of the plant defensive apparatus (pericarp thickness) and that of the weevil offensive apparatus (rostrum length) were clearly correlated across geographically structured populations (Toju and Sota, 2006).

Therefore, intermingled with environmental (abiotic) and interaction (biotic) features, geographical distribution is also related to species evolutionary trends, determining patterns of genetic diversity and trait variation across space. New approaches are necessary to analyze the importance of these complex features. Recently, Pavoine et al. (2011) suggested a framework based on a mathematical method of ordination to analyze phylogeny, traits, abiotic variables and space in a plant community. Another example can be found in Diniz Filho et al. (2009), who proposed an integrated framework to study spatial patterns in genetic diversity within local populations, coupling genetic data, SDM and landscape genetics. Also, Kuparinen and Schurr (2007) developed a framework to link the spatio-temporal dynamics of plant populations and genotypes, and a similar approach was suggested for modeling the variation in the geographical distribution of animals under a climate change scenario (Kearney and Porter, 2009).

Analyzing the multiple drivers that shape the geographical distribution of species is a challenge that will be met by integrating different fields of research. To predict the impacts of global changes on species distribution, it is necessary to consider that species are genetically heterogeneous entities and that protecting them requires protecting their genetic diversity. As species diversity might act as insurance against environmental changes, genetic diversity should also have the potential to protect communities from environmental variability (Lavergne et al., 2010). Taubmann et al. (2011) analyzed the genetic population structure of the endangered mayfly (*Ameletus inopinatus*) in its European range, genotyping hundreds of individuals from different populations. They found variation in genetic diversity and also projected the distribution of the species through SDM for the year 2080, finding some areas of regional habitat loss. By relating these range shifts to the population genetic results, they were able to identify conservation units that, if preserved, would maintain high levels of the present-day genetic diversity and continue to provide long-term suitable habitat under future climate change scenarios.

Most ecological forecasting of future species ranges is based on models that generally ignore evolution and assume that the mechanistic relationship between species abundance and environmental characteristics is unchanged at the timescale of the projection (Lavergne et al., 2010). However, there is accumulating evidence that evolution can proceed rapidly (Hairston et al., 2005) and that genetic variation for adaptation - and more generally for traits defining species ecological niches - is common both between and within populations, suggesting a high level of local adaptation to climate at a fine scale (Pearman et al., 2008). Adaptation and dispersal are often presented as alternative mechanisms whereby a population can respond to changing environmental conditions, playing a crucial role in tracking favorable environmental conditions through space (Pease et al., 1989). Thus, migration of different genotypes could have important consequences for the evolution of geographical distribution limits (Davis et al., 2005).

Addressing the main aspects discussed here about the distribution of species, it has been suggested that the new trends in SDM, regarding the impacts of global changes on species diversity, are niche evolution and phylogeographic and phylogenetic research (Zimmermann et al., 2010). As pointed out by Gilman et al. (2010), the key question is not the effect of global change on individual species, but the stability of the system as a whole. Integrating fields of research will allow novel analyses of both historical and contemporary drivers of species ranges, and will likely provide new possibilities to understand present-day species distributions and project them into the future.

Species distribution modeling involves several steps and requires expert knowledge about the focal species as well as ecology, geography and climate. The following summarized steps are suggested.

Fund (WWF) (Terrestrial Ecoregion - Olson et al., 2001) and Global Land Cover database (Bicheron et al., 2008).

**5.3 Algorithm**

Algorithms are finite sequences of instructions for calculating a function. Many algorithms are available for SDM. Two important ones are Maxent (Maximum Entropy - Phillips et al., 2006) and the Genetic Algorithm for Rule set Production (GARP - Stockwell and Peters, 1999), which have been successfully applied to small data sets with presence-only occurrence points (Wisz et al., 2008). GARP and other algorithms can be found in openModeller (Santana et al., 2008), a computational system for performing SDM. Another system available for SDM is BIOMOD (Biodiversity Modelling – Thuiller, 2003), developed for R (The R Foundation for Statistical Computing), which offers nine algorithms.

**5.4 Model evaluation**

Generating an independent data set is necessary to evaluate model accuracy. This data set can be obtained through new surveys in supplementary areas or by dividing the original data set in two. One of the data sets is used to generate the models (training data) and the other to test them (test data). Usually, it is suggested to divide the original data set randomly and without replacement, using 70% of the data to generate the model and 30% to test it (Fielding and Bell, 1997; Hirzel and Guisan, 2002). It is also possible to divide the data according to their spatial pattern (Peterson et al., 2008). The area under the receiver operating characteristic curve (AUC) is usually used to evaluate the models. Values of AUC typically range from 0.5 for models with no better than random predictive ability to 1.0 for models giving perfect predictions (Swets, 1988). Some authors have questioned the use of AUC to evaluate model accuracy (Peterson et al., 2008), but it is currently the most important method for this purpose (but see other alternatives in Thuiller, 2003). It is also possible to check model accuracy by conducting new surveys in the areas suggested as potential areas of occurrence by modelling.

**6. Example of tools' integration using pollinators**

Preliminary analyses of some Brazilian pollinators using species distribution modeling were recently carried out. We chose some species of *Melipona* and *Centris* bees (Apidae; Hymenoptera) to forecast the impact of future climate changes (Saraiva et al., in press; Giannini et al., in press).

The genus *Melipona* comprises eusocial bees that are mainly associated with the Atlantic Forest, an endangered moist forest of Brazil. This genus was suggested as an important pollinator in this ecosystem by Ramalho (2004). On the other hand, *Centris* comprises solitary species that are especially important to plants that produce floral oil, on which they depend to build their nests and feed their offspring (Simpson, 1989).

The analyzed species of *Melipona* have also been reported as pollinators of some fruit crops, such as açai (*Euterpe oleracea*), avocado (*Persea americana*) and guava (*Psidium guajava*) (Castro, 2002; Venturieri *et al.,* 2008). The *Centris* species have been reported as pollinators of acerola (*Malpighia punicifolia*), murici (*Byrsonima crassifolia*), cashew (*Anacardium occidentale*) and tamarind (*Tamarindus indica*) (Castro, 2002; Freitas *et al.,* 2002; Vilhena and Augusto, 2007; Rego *et al.,* 2006; Ribeiro *et al.,* 2008; Siqueira, 2010). Recently, Nunes-Silva et al. (2010) also demonstrated the importance of *M. fasciculata* in performing buzz pollination on tomatoes.
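As a rough illustration of the evaluation procedure described under Model evaluation (a random 70/30 split without replacement, followed by AUC computed on the held-out part), the following Python sketch may help. It is not the procedure implemented in BIOMOD or openModeller: the temperature values are synthetic, the "model" is a deliberately trivial envelope around the mean training temperature, and the function names `split_train_test`, `auc` and `fit_envelope` are hypothetical names chosen for this example. It also assumes that background or absence sites are available, since AUC needs two classes to compare.

```python
import random


def split_train_test(records, train_fraction=0.7, seed=0):
    """Randomly split records, without replacement, into train and test parts."""
    rng = random.Random(seed)
    shuffled = records[:]  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def auc(scores_presence, scores_absence):
    """AUC as the probability that a presence outranks an absence (ties count 0.5)."""
    wins = 0.0
    for p in scores_presence:
        for a in scores_absence:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_presence) * len(scores_absence))


def fit_envelope(train_values):
    """Toy model (assumption): suitability decreases with distance from the
    mean temperature of the training presences."""
    mean_t = sum(train_values) / len(train_values)
    return lambda t: -abs(t - mean_t)


# Synthetic data (assumption): temperature in degrees C at presence sites
# and at background/absence sites.
rng = random.Random(42)
presences = [rng.gauss(24.0, 1.5) for _ in range(100)]
absences = [rng.uniform(5.0, 40.0) for _ in range(100)]

train, test = split_train_test(presences)   # 70% train, 30% test
model = fit_envelope(train)                 # fitted on training data only
test_auc = auc([model(t) for t in test], [model(t) for t in absences])
print(f"train={len(train)} test={len(test)} AUC={test_auc:.2f}")
```

The pairwise definition of AUC used here is quadratic in the number of points, which is acceptable for the small presence-only data sets discussed in the text; a rank-based computation would be preferable for large ones.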
