**2. Materials and methods**

#### **2.1 Prediction of land cover changes**

Land cover change has received considerable attention in recent years. It results from the complex interaction of several factors, including management, economy, culture, human behavior, and environment [20, 21]. Understanding how land cover changes is important because these changes strongly affect the environment, altering hydrological cycles [22], the size and arrangement of natural habitats such as forest areas [23], and species diversity [24], and they can also affect the region's economy [25].

Modeling the spatial pattern of land cover changes provides valuable information for understanding the change process, identifying the driving factors, and predicting the areas subject to change. Spatial models of land cover change can be divided into three main groups: empirical estimation models, dynamic simulation models, and rule-based simulation models [26]. Empirical estimation methods use statistical techniques to model the relationship between land use change and the factors affecting it.

The processes driving change can be identified by interpreting the output of the statistical models. Empirical estimation methods are among the most widely used approaches for simulating the spatial pattern of land cover and its changes over time because of their simple structure and their ability to analyze multiple variables [27].

#### **2.2 Land change modeler**

The Land Change Modeler (LCM) is software for ecologically sustainable development, designed to understand and identify land cover changes and the environmental and conservation requirements that these changes create. It exists as a vertical application in the IDRISI software system [28] and is also available as an extension for ArcGIS. In LCM, tools for the assessment and prediction of land cover change and its implications are organized around major task areas: change analysis, change prediction, and planning interventions. Land change prediction in LCM is an empirically driven process that moves in a stepwise fashion from 1) change analysis, through 2) transition potential modeling, to 3) change prediction. It uses the historical change between the time 1 and time 2 land cover maps to project future scenarios [29].

#### **2.3 Sensitivity analysis of artificial neural network**

The artificial neural network option can model multiple transitions at one time. Initially, the dialog for the multilayer perceptron (MLP) neural network may seem daunting, but most of the parameters presented do not need to be modified (or in fact understood) to make productive use of this very powerful technique.

As launched by LCM, the multilayer perceptron starts training on the samples it has been provided of pixels that have and have not experienced the transitions being modeled. At this point, the MLP is operating in automatic mode, whereby it makes its own decisions about the parameters to be used and how they should be changed to better model the data. Automatic mode monitors and modifies the start and end learning rates of a dynamic learning procedure. This procedure starts with an initial learning rate and reduces it progressively over the iterations, so that the end learning rate is reached at the maximum number of iterations [29].
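The dynamic learning procedure can be illustrated with a minimal sketch. The source states only that the rate is reduced progressively from a start value to an end value, so the geometric decay curve, the function name `learning_rate_schedule`, and the numeric values below are assumptions chosen for illustration, not the settings used by LCM.

```python
def learning_rate_schedule(start_rate, end_rate, max_iterations):
    """Return one learning rate per iteration, from start_rate down to end_rate.

    Assumption: geometric (constant-ratio) decay. The source does not say
    which decay curve the automatic mode actually applies.
    """
    if max_iterations < 2:
        return [start_rate]
    # Ratio chosen so that the last iteration lands exactly on end_rate.
    ratio = (end_rate / start_rate) ** (1.0 / (max_iterations - 1))
    return [start_rate * ratio ** i for i in range(max_iterations)]


# Hypothetical values, for illustration only.
rates = learning_rate_schedule(start_rate=0.01, end_rate=0.001, max_iterations=5)
print([round(r, 5) for r in rates])
```

The first entry equals the start learning rate and the last entry equals the end learning rate, which is the behavior described above.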

**Figure 1** illustrates a single hidden layer MLP feed-forward neural network. The back-propagation algorithm [30] is the most widely used learning algorithm for an MLP neural network. The learning process consists of two parts: a feed-forward pass and a backward pass. The outputs of the artificial neural network (ANN) are calculated in the feed-forward pass, and the output errors are then propagated backward to adjust the weights and biases of the ANN.
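The two passes can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the LCM implementation: the sigmoid activation, the squared-error loss, the per-sample weight updates, the fixed learning rate, and the four-sample data set (two hypothetical driver variables, target 1 for "changed" and 0 for "unchanged") are all chosen here for brevity.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_mlp(samples, targets, hidden=3, rate=0.5, iterations=2000, seed=0):
    """Train a single-hidden-layer MLP with one output by back-propagation."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        # Feed-forward pass: inputs -> hidden layer -> output.
        h = [sigmoid(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
             for j in range(hidden)]
        out = sigmoid(sum(w2[j] * h[j] for j in range(hidden)) + b2)
        return h, out

    losses = []
    for _ in range(iterations):
        total = 0.0
        for x, y in zip(samples, targets):
            h, out = forward(x)
            err = out - y
            total += err * err
            # Backward pass: propagate the output error to both layers.
            d_out = err * out * (1.0 - out)
            d_hid = [d_out * w2[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            for j in range(hidden):
                w2[j] -= rate * d_out * h[j]
                b1[j] -= rate * d_hid[j]
                for i in range(n_in):
                    w1[j][i] -= rate * d_hid[j] * x[i]
            b2 -= rate * d_out
        losses.append(total / len(samples))  # mean squared error per pass
    return forward, losses


# Hypothetical toy data: the pixel "changes" if either driver is present.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
targets = [0.0, 1.0, 1.0, 1.0]
forward, losses = train_mlp(samples, targets)
print(losses[0], losses[-1])
```

The mean squared error recorded in `losses` falls as training proceeds, which is the effect of repeatedly adjusting the weights and biases against the propagated error.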

In the present research, the Skill Measure [28] and the Accuracy Rate were used to analyze the sensitivity of the multilayer perceptron artificial neural network model. The Skill Measure evaluates, on the validation data, how well a model fitted to the training data can predict future changes. Specifically, it compares the accuracy measured on the validation data with the accuracy that would be expected by chance.

This statistic considers the ability of the model for each land cover transition separately and shows the role of the variables in the accuracy of the predicted change. It ranges from −1 to +1: values close to 1 indicate high accuracy in predicting changes, and a value of zero indicates performance no better than chance. The expected accuracy and the Skill Measure are calculated from Eqs. (1) and (2) [29].

$$E(A) = \frac{1}{T + P} \tag{1}$$

E(A) is the expected accuracy rate, T is the number of transitions considered in the transition potential model, and P is the number of persistence classes, that is, classes that remain unchanged.

$$S\,(\text{Skill Measure}) = \frac{A - E(A)}{1 - E(A)} \tag{2}$$

#### **Figure 1.**

*Example of a single hidden layer MLP feed-forward neural network.* x<sub>1</sub>, x<sub>2</sub>, … , x<sub>R</sub> *are input parameters,* y<sub>1</sub>, … , y<sub>T</sub> *are output targets,* R *is the number of input parameters, and* T *is the number of output targets.* f<sub>1</sub>(x) *is the activation function of the hidden layer, and* f<sub>2</sub>(x) *is the activation function of the output layer.*

*A Study of the Comparison between Artificial Neural Networks, Logistic Regression and Similarity… DOI: http://dx.doi.org/10.5772/intechopen.111615*

A is the measured accuracy provided by the model, and E(A) is the expected accuracy.
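As a worked illustration of the Skill Measure, the sketch below assumes that the expected accuracy is E(A) = 1/(T + P), that is, the accuracy of a random choice among the T + P possible outcomes. The function names and the example values (two transitions, two persistence classes, a measured accuracy of 0.70) are hypothetical and are not results from this study.

```python
def expected_accuracy(transitions, persistences):
    """Accuracy expected by chance, assuming E(A) = 1 / (T + P)."""
    return 1.0 / (transitions + persistences)


def skill_measure(accuracy, transitions, persistences):
    """Skill Measure S = (A - E(A)) / (1 - E(A)).

    Undefined when T + P == 1, because E(A) is then 1.
    """
    e_a = expected_accuracy(transitions, persistences)
    return (accuracy - e_a) / (1.0 - e_a)


# Hypothetical example: E(A) = 1 / 4 = 0.25, so S = 0.45 / 0.75.
skill = skill_measure(accuracy=0.70, transitions=2, persistences=2)
print(round(skill, 3))
```

A measured accuracy equal to E(A) gives a skill of 0, and a perfect accuracy of 1 gives a skill of 1, matching the interpretation given above.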

#### **2.4 Sensitivity analysis of logistic regression**

Logistic regression is one of the empirical models that has been used in many studies of forest area change, urban growth, and agricultural land modeling, with very good results [14]. It is a probabilistic model fitted between land use change (the dependent variable) and the factors affecting it (the independent variables). With this model, the relationship between the variables can be explained, the relative importance of the explanatory variables can be estimated, and a land cover change probability map can be derived [31].

In the logistic function, the probability of land use change is defined as a function of the explanatory variables and follows a smooth curve bounded between 0 and 1 [14]. One of the statistical characteristics examined in the logistic regression model is the goodness of fit, which is determined from the difference between the observed and predicted values of the dependent variable. The goodness of fit is calculated with Eq. (3) [32]:

$$\text{Goodness of fit} = \sum_{i=1}^{N} (y_i - u_i) / u_i \times (1 - u_i) \tag{3}$$

u<sub>i</sub> are the observed values of the dependent variable, and y<sub>i</sub> are the predicted values. The goodness of fit in logistic regression is based on the likelihood ratio, which is determined from two statistics, −2log(L<sub>0</sub>) and −2log(Likelihood), where L<sub>0</sub> is the value of the likelihood function when all coefficients except the intercept are 0, and Likelihood is the value of the likelihood function for the fitted model. Two further statistics are derived from these (Eqs. (4) and (5)) [33]:

$$\text{Pseudo R-square} = 1 - \frac{\log(\text{Likelihood})}{\log(L_0)} \tag{4}$$

A Pseudo R-square of 1 indicates a perfect fit, and a value of 0 means that there is no relationship between the variables; a value greater than 0.2 indicates a relatively good fit. The chi-square statistic is given by Eq. (5):

$$\text{ChiSquare}(K) = -2\left(\log(L_0) - \log(\text{Likelihood})\right) \tag{5}$$

The chi-square statistic is the likelihood ratio statistic, which follows a chi-square distribution when the null hypothesis is true. It tests the null hypothesis that all coefficients except the intercept are zero. The degrees of freedom (K) equal the number of independent variables used in land change modeling.

The ROC (Relative Operating Characteristic) is a very suitable statistic for measuring the goodness of fit of the logistic regression model. Its value lies between 0 and 1, where 1 indicates a perfect fit and 0.5 indicates a fit no better than random [14]. Finally, the logistic regression equation is defined as follows (Eq. (6)):
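The two likelihood-based statistics can be computed directly from the log-likelihoods of the intercept-only model and the fitted model. The sketch below is a minimal illustration: the function names and the example log-likelihood values are hypothetical, and the chi-square is written as −2(log L<sub>0</sub> − log Likelihood) so that it is non-negative whenever the fitted model is at least as likely as the intercept-only model.

```python
def pseudo_r_square(log_likelihood, log_l0):
    """Pseudo R-square = 1 - log(Likelihood) / log(L0)."""
    return 1.0 - log_likelihood / log_l0


def chi_square(log_likelihood, log_l0):
    """Likelihood ratio chi-square = -2 * (log(L0) - log(Likelihood))."""
    return -2.0 * (log_l0 - log_likelihood)


# Hypothetical log-likelihoods, for illustration only.
log_l0 = -100.0   # intercept-only model
log_lik = -70.0   # fitted model

print(round(pseudo_r_square(log_lik, log_l0), 3))
print(chi_square(log_lik, log_l0))
```

With these illustrative values the Pseudo R-square exceeds the 0.2 threshold mentioned above; the chi-square would then be compared with a chi-square distribution with K degrees of freedom.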

$$\log_e\left(\frac{P}{1 - P}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_n X_n + \text{error term} \tag{6}$$

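where P is the probability of land use change, β<sub>0</sub> is the intercept, β<sub>1</sub>, ..., β<sub>n</sub> are the regression coefficients, and X<sub>1</sub>, X<sub>2</sub>, ..., X<sub>n</sub> are the explanatory variables.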
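To show how fitted coefficients translate into a change probability for one pixel, the sketch below assumes the standard logit link, in which the linear predictor equals log<sub>e</sub>(P/(1 − P)), and inverts it with the logistic function. The function name, the coefficients, and the variable values are hypothetical and are not estimates from this study; the error term is omitted because the sketch returns the fitted probability.

```python
import math


def change_probability(intercept, coefficients, values):
    """Invert the logit: P = 1 / (1 + exp(-(b0 + b1*X1 + ... + bn*Xn)))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and explanatory variable values for one pixel.
p = change_probability(intercept=-1.0, coefficients=[0.8, -0.5], values=[2.0, 1.0])
print(round(p, 3))
```

Applying the same function to every pixel yields a land cover change probability map bounded between 0 and 1.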

#### **2.5 Sensitivity analysis of SimWeight**

The similarity-weighted sample-based learning method (SimWeight) is a learning algorithm based on the K-nearest neighbor (KNN) algorithm. It identifies the relationship between the driver variables and the transition potential for areas that show cases of change, using weighted distances in variable space to known examples of each land use class. To create a transition potential in land change modeling, SimWeight requires two classes for each transition: fixed (persistence) pixels and variable (change) pixels.

In this method, the k nearest neighbors (fixed or variable pixels) are extracted for each pixel, and the distance in variable space from each unknown location to these neighbors is calculated (**Figure 2**). This distance is used in an exponential weighting function (e<sup>1/d<sub>i</sub></sup>) to calculate a continuous level of membership in the existing land use classes for each pixel, using Eq. (7) [34]:

$$\text{Membership of class} = \frac{\sum_{i=1}^{c} \left( 1.0 - \frac{1}{1 + e^{d_i}} \right)}{k} \quad (c \le k) \tag{7}$$

where k is the total number of nearest neighbors (variable and fixed pixels), c is the number of variable pixels among the k nearest neighbors, and d<sub>i</sub> is the linear distance to variable pixel i.
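The membership calculation can be sketched as a nearest-neighbor search followed by a weighted sum over the variable pixels. The sketch below is illustrative only: the per-neighbor weight follows Eq. (7) as printed in this chapter (it may differ from the weighting used in the SimWeight software), Euclidean distance in variable space is assumed, and the function names and sample points are hypothetical.

```python
import math


def eq7_weight(d):
    """Per-neighbor weight as printed in Eq. (7): 1 - 1 / (1 + e^d).

    Written in the algebraically equal form 1 / (1 + e^(-d)) so that large
    distances do not overflow.
    """
    return 1.0 / (1.0 + math.exp(-d))


def class_membership(target, samples, k, weight=eq7_weight):
    """Membership of the change class for one unknown pixel.

    target  : coordinates of the unknown pixel in variable space
    samples : list of (coordinates, is_variable) pairs for known pixels
    k       : number of nearest neighbors considered
    """
    # Distance in variable space from the target to every known pixel.
    nearest = sorted(
        (math.dist(target, coords), is_variable) for coords, is_variable in samples
    )[:k]
    # Only the c variable pixels among the k neighbors contribute to the sum.
    return sum(weight(d) for d, is_variable in nearest if is_variable) / k


# Hypothetical known pixels in a two-variable space.
samples = [
    ((0.0, 0.0), True),    # variable pixel, distance 0 from the target
    ((3.0, 4.0), False),   # fixed pixel, distance 5
    ((10.0, 0.0), True),   # variable pixel, outside the k = 2 neighborhood
]
membership = class_membership(target=(0.0, 0.0), samples=samples, k=2)
print(membership)
```

Passing a different `weight` function makes it easy to compare alternative distance weightings without changing the neighbor search.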

#### **Figure 2.**

*A representation of the variable space created by two hypothetical variables. The triangles represent the variable pixels, and the diamonds represent the fixed pixels. The black square represents the pixel whose unknown transition potential is being evaluated. The dotted circle represents the range of k (k = 9), which here contains 6 variable pixels and 3 fixed pixels, and the lines show the linear distances in variable space from the pixel being evaluated [34].*

This algorithm, like the K-nearest neighbor method, may be influenced by irrelevant variables [35]. Different driver variables may differ in their importance for determining the transition potential. This importance is expressed as a relative weight by which each variable is multiplied, reflecting its ability to discriminate among land use classes. The weight of each variable is determined by comparing the standard deviation of the variable within the changed areas with its standard deviation over the whole study area (Eq. (8)) [34]:

$$\text{Correlation weight of each variable} = \frac{\text{standard deviation of the variable pixels in the change area}}{\text{standard deviation of the variable in the study area}} \tag{8}$$
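Eq. (8) reduces to a ratio of two standard deviations for each variable. The sketch below is a minimal illustration that assumes the population standard deviation (the source does not say whether the population or sample form is used); the function name and the sample values are hypothetical.

```python
import statistics


def variable_weight(values_in_change_area, values_in_study_area):
    """Ratio of the standard deviation in the change area to that in the study area.

    Assumption: population standard deviation (statistics.pstdev).
    """
    return (statistics.pstdev(values_in_change_area)
            / statistics.pstdev(values_in_study_area))


# Hypothetical values of one driver variable, for illustration only.
change_area = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]      # std = 2
study_area = [4.0, 8.0, 8.0, 8.0, 10.0, 10.0, 14.0, 18.0]   # std = 4
print(variable_weight(change_area, study_area))
```

A ratio well below 1 means that the variable takes a narrow range of values where change occurred compared with the study area as a whole.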

#### **2.6 The location of the study**

The Gorganroud river watershed, with an area of about 9350 square kilometers, lies in Golestan province between longitudes 54°2′ and 56°22′ East and latitudes 36°22′ and 37°47′ North (**Figure 3**). The river originates in the Golidagh Heights in Golestan National Park and, after passing through Gonbadekavous and Aghqala, flows into the Caspian Sea to the west of Khajenafas. It lies to the southeast of the Caspian Sea. Most of its tributaries rise in the Alborz mountains and flow from south to north; they include the Madersu, Zaringol, Tilabad, and Chehelchai rivers. The Gorganroud river is about 300 km long and flows from east to west [36].

#### **2.7 Data used**

In this study, the forest cover maps of 1984 and 2012 were prepared from the land use maps produced in the Golestan province survey plan. To evaluate the accuracy and determine the forest extent in 2015, previous studies [37] based on the TM sensor of the Landsat satellite were used, together with a visual classification of the area from Google Earth images.

**Figure 3.** *Location of the study area.*
