the interpolation accuracy with sparsely distributed sample points, various improved kriging methods have been developed in Ref. [13]. However, these methods still require substantial numbers of field samples to define the spatial autocorrelation, and the precision of the resulting maps still depends on the density and distribution of the original data points in Ref. [14]. Because soil characteristics vary strongly in space, large numbers of sampling points are required to generate an accurate high-resolution soil map. Although the accuracy of a soil map may increase with the number of data points, intensive field surveys are expensive and time-consuming. Furthermore, the accuracy is affected by the quality of the data, which depends to a great extent on the field experience of the soil surveyors in Ref. [15]. As an alternative, various models have been developed to produce soil property maps.

Statistical models with predictive power could potentially overcome the limitations of interpolation methods in Ref. [16]. Bell et al., for example, used discriminant function analysis to relate soil drainage class to parent material, terrain, and surface drainage in Pennsylvania, USA in Ref. [17]. The soil drainage probability maps predicted by this method agreed well with published soil drainage maps. Campling et al. applied a logistic model to successfully predict the probability of drainage classes in a tropical area using terrain properties (elevation, slope, and distance to the river channel) and vegetation indices from a Landsat TM image in Ref. [18]. By applying discriminant function analysis and a co-kriging method, Kravchenko et al. created soil drainage maps from topographical data (slope, curvature, and flow accumulation) and soil electrical conductivity data in central Illinois, USA in Ref. [19]. However, empirical models derived with traditional statistical methods may obscure the real relationships between soil properties and independent data because the relationships are rarely linear in nature.

54 Advanced Applications for Artificial Neural Networks

### 1.3. Artificial neural networks

In recent years, artificial neural networks (ANNs) have been increasingly used to solve non-linear problems. The ANN is a form of artificial intelligence inspired by studies of the human neuron and has been used to analyze biophysical data in Ref. [20]. ANNs can learn the relationships among multi-source inputs (including combinations of qualitative and quantitative data) automatically and produce results without a prior hypothesis. ANNs have been successfully used to map soil properties in Ref. [21]. For example, Licznar and Nearing used an ANN to quantitatively predict soil loss from natural runoff plots in Ref. [22]; the correlation coefficients between predicted and measured soil loss ranged from 0.7 to 0.9. Ramadan et al. applied two multivariate calibration methods (PCA and back-propagation ANN) to predict soil properties (sand, silt, clay, etc.) using DNA data from the microbial community in Ref. [23].

### 1.4. Objectives

In this chapter, we describe a general approach for using ANNs to produce high-resolution soil property maps, from preparing data, building the ANN structure, training ANNs, and optimizing networks to simulating ANNs.

### 2.1. Input data

Input data consisted of candidate variables that describe or determine the soil properties to be predicted. They included DEM-generated topo-hydrological variables, namely slope steepness, soil terrain factor (STF), sediment delivery ratio (SDR), vertical slope position (VSP), topographic wetness index (TWI), and potential solar radiation (PSR) (Figure 3), and existing coarse-resolution maps, such as soil property, geology, surficial parent material, and hydrologic maps. These variables were chosen because (1) at local levels, soil properties are assumed to have been modified by hydrological processes that are associated with topography and can be modeled with a DEM in Ref. [24]; and (2) at landscape levels, average soil properties are related to geological formations and soil parent materials. These landscape features are assumed to have been captured by existing coarse-resolution soil maps.

Figure 3. Images of slope (A), soil terrain factor (B), sediment delivery ratio (C), vertical slope position (D), topographic wetness index (E), and potential solar radiation (F) in the Black Brook Watershed, New Brunswick, Canada.

Soil terrain factor is a modified version of the hydrological similarity index in Ref. [25]. It accounts for total drainage area and slope as well as the clay content in the rooting zone. The STF was calculated using Eq. (1):

$$STF = \ln \frac{(A+1)P_{clay}}{(s+k)^2} \tag{1}$$

where A is the flow accumulation (m<sup>2</sup>); P<sub>clay</sub> is the clay content (wt. %) from the coarse-resolution soil data; k is a parameter (= 1); and s is the slope steepness (m m<sup>−1</sup>).
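As a minimal sketch of Eq. (1), the function below evaluates STF for a single grid cell in plain Python. The function name `soil_terrain_factor` and its argument names are illustrative assumptions; they are not part of the original workflow, which was carried out in GIS software.

```python
import math


def soil_terrain_factor(flow_acc, clay_pct, slope, k=1.0):
    """Evaluate Eq. (1) for one cell (hypothetical helper).

    flow_acc : flow accumulation A (m^2)
    clay_pct : clay content P_clay (wt. %) from the coarse-resolution soil data
    slope    : slope steepness s (m/m)
    k        : parameter of Eq. (1), equal to 1 in the text
    """
    numerator = (flow_acc + 1.0) * clay_pct
    if numerator <= 0.0:
        # The logarithm is only defined for a positive argument.
        raise ValueError("(A + 1) * P_clay must be positive")
    return math.log(numerator / (slope + k) ** 2)


# Example cell: A = 99 m^2, 25 wt. % clay, flat terrain (s = 0).
stf_flat = soil_terrain_factor(99.0, 25.0, 0.0)
# Same cell on a steeper slope gives a smaller STF.
stf_steep = soil_terrain_factor(99.0, 25.0, 0.5)
```

Because k = 1 keeps the denominator positive, the expression stays defined on flat cells where s = 0.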

Sediment delivery ratio is the proportion of the total soil eroded in a watershed that is delivered to surface waters. The ratio, calculated by Eq. (2), indicates the efficiency of sediment transport in the watershed and is largely influenced by topography and the flow distance to streams in Ref. [26].

$$\text{SDR}_{i} = \exp\left(-\beta t_{i}\right) \tag{2}$$

where t<sub>i</sub> is the travel time from cell i to the nearest channel (s) and β is a watershed-specific constant.

Travel time, t<sub>i</sub>, is defined by Eq. (3):

$$t_i = \sum_{j=1}^{N_p} \frac{l_j}{v_j} \tag{3}$$

where N<sub>p</sub> is the total number of cells along the flow path from cell i to the nearest channel; l<sub>j</sub> is the length of the flow-path segment in cell j (m); and v<sub>j</sub> is the flow velocity in cell j (m s<sup>−1</sup>).

Flow velocity, v, is obtained from Eq. (4) in Ref. [27]:

$$v = d s^{1/2} \tag{4}$$

where s is the slope steepness (m m<sup>−1</sup>) and d is a coefficient that depends on surface roughness characteristics (m s<sup>−1</sup>) for cell i.

Using the HYDRO-tools extension in ArcView, travel time, t<sub>i</sub>, was calculated as a flow length weighted by an inverse velocity grid in Ref. [28].
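Eqs. (2) to (4) can be chained for one flow path as in the following sketch. It is not the HYDRO-tools procedure: the function names and the representation of a flow path as a list of (l, d, s) tuples are assumptions made for illustration, and the value of β in the example is arbitrary.

```python
import math


def flow_velocity(d, s):
    """Eq. (4): v = d * s**0.5, with d in m/s and s in m/m."""
    return d * math.sqrt(s)


def travel_time(segments):
    """Eq. (3): sum of l_j / v_j over the cells on a flow path.

    segments : list of (l_j, d_j, s_j) tuples, one per cell between
               cell i and the nearest channel (hypothetical layout).
    """
    total = 0.0
    for length, d, s in segments:
        v = flow_velocity(d, s)
        if v <= 0.0:
            # Flat cells give zero velocity and need separate handling.
            raise ValueError("flow velocity must be positive")
        total += length / v
    return total


def sediment_delivery_ratio(beta, t):
    """Eq. (2): SDR_i = exp(-beta * t_i)."""
    return math.exp(-beta * t)


# Two-cell path: lengths 10 m and 20 m, d = 2 m/s, s = 0.25 m/m in both.
path = [(10.0, 2.0, 0.25), (20.0, 2.0, 0.25)]
t_i = travel_time(path)                      # travel time in seconds
sdr_i = sediment_delivery_ratio(0.01, t_i)   # beta = 0.01 is arbitrary
```

Dividing each segment length by its velocity is the same operation as weighting flow length by an inverse velocity grid, which is why a weighted flow-length tool can return travel time directly.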

The watershed parameter, β, was estimated by numerically solving Eq. (5):

$$SDR_w = \frac{\sum_{i=1}^{N} \exp\left(-\beta t_i\right) l_i^{0.5} s_i^2 a_i}{\sum_{i=1}^{N} l_i^{0.5} s_i^2 a_i} \tag{5}$$

where SDR<sub>w</sub> is the watershed-average SDR, which was calculated with an empirical formula similar to SDR<sub>w</sub> = pA<sub>T</sub><sup>c</sup> in Ref. [29]. Parameters p and c were set to 0.42 and −0.125, respectively, because these values provide a good general approximation of SDR<sub>w</sub> in Ref. [30].


In Eq. (5), N is the total number of cells in the watershed; a<sub>i</sub> is the area of cell i (m<sup>2</sup>); l<sub>i</sub> is the length of cell i along the flow path (m); and A<sub>T</sub> is the area of the watershed (km<sup>2</sup>).
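One way to solve Eq. (5) for β is bisection, since the right-hand side equals 1 at β = 0 and decreases monotonically as β grows when travel times are positive. The sketch below assumes this approach; the source does not state which numerical method was used, and all function names and the tuple layout (t, l, s, a) are hypothetical.

```python
import math


def empirical_watershed_sdr(area_km2, p=0.42, c=-0.125):
    """Empirical form SDR_w = p * A_T**c with the p and c quoted in the text."""
    return p * area_km2 ** c


def watershed_sdr(beta, cells):
    """Right-hand side of Eq. (5); each cell is a (t_i, l_i, s_i, a_i) tuple."""
    num = 0.0
    den = 0.0
    for t, l, s, a in cells:
        weight = math.sqrt(l) * s ** 2 * a
        num += math.exp(-beta * t) * weight
        den += weight
    return num / den


def solve_beta(target_sdr, cells, tol=1e-12, max_iter=200):
    """Find beta such that watershed_sdr(beta, cells) matches target_sdr."""
    if not 0.0 < target_sdr < 1.0:
        raise ValueError("target SDR_w must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    # Expand the upper bound until it brackets the root.
    for _ in range(200):
        if watershed_sdr(hi, cells) <= target_sdr:
            break
        hi *= 2.0
    else:
        raise ValueError("could not bracket beta; check travel times")
    # Standard bisection on the bracket [lo, hi].
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if watershed_sdr(mid, cells) > target_sdr:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Illustrative cells only; the values are not taken from the study area.
cells = [(50.0, 10.0, 0.1, 100.0), (200.0, 10.0, 0.2, 100.0)]
beta = solve_beta(0.3, cells)
```

In practice the target would come from `empirical_watershed_sdr` evaluated at the watershed area, and `cells` would be read from the travel time, slope, and cell geometry grids.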

Vertical slope position (m) is defined as the elevation difference between a land cell and the nearest water surface. It is calculated by accumulating the elevation difference of each cell along the path to the nearest water body using Eq. (6) (Figure 4):

$$VSP = \min \sum (d \cdot s) \tag{6}$$

where d is the distance between two adjacent cells (m) and s is the slope steepness (m m<sup>−1</sup>).
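Eq. (6) can be read as follows: accumulate the elevation drop d · s over each step of a path to water, then take the smallest total over the candidate paths. The sketch below follows that reading, which is an interpretation of the min operator rather than something the source spells out; the function names and the list-of-(d, s) path representation are assumptions.

```python
def path_elevation_drop(steps):
    """Sum d * s over one path; steps is a list of (d, s) pairs.

    d : distance between two adjacent cells (m)
    s : slope steepness between them (m/m)
    """
    return sum(d * s for d, s in steps)


def vertical_slope_position(candidate_paths):
    """Eq. (6): smallest accumulated drop over the candidate paths to water."""
    if not candidate_paths:
        raise ValueError("at least one path to a water body is required")
    return min(path_elevation_drop(path) for path in candidate_paths)


# Two hypothetical paths from the same cell to different water bodies.
paths = [
    [(10.0, 0.1), (10.0, 0.2)],  # two 10 m steps
    [(30.0, 0.05)],              # one 30 m step
]
vsp = vertical_slope_position(paths)
```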

Topographic wetness index is a steady-state wetness index that reflects soil moisture and drainage conditions. It is defined as the natural logarithm of the ratio of the local upslope contributing area to the slope in Ref. [32], as given by Eq. (7):

$$TWI = \ln\left(\frac{A}{s}\right) \tag{7}$$

where A is the flow accumulation (m<sup>2</sup>) and s is the slope steepness (m m<sup>−1</sup>).
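Unlike Eq. (1), Eq. (7) has no constant in the denominator, so it is undefined on flat cells where s = 0. The sketch below evaluates Eq. (7) and applies a small lower bound to the slope; that floor (`min_slope`) is an assumption added here for illustration and is not described in the source.

```python
import math


def topographic_wetness_index(flow_acc, slope, min_slope=1e-3):
    """Eq. (7): TWI = ln(A / s) for one cell (hypothetical helper).

    flow_acc  : flow accumulation A (m^2), must be positive
    slope     : slope steepness s (m/m)
    min_slope : assumed floor that avoids division by zero on flat cells
    """
    if flow_acc <= 0.0:
        raise ValueError("flow accumulation must be positive")
    s = max(slope, min_slope)
    return math.log(flow_acc / s)


# Same contributing area on a gentle and on a steep slope.
twi_gentle = topographic_wetness_index(100.0, 0.01)
twi_steep = topographic_wetness_index(100.0, 0.1)
```

For a fixed contributing area, the gentler slope yields the larger index, consistent with wetter conditions on flatter ground.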

Potential solar radiation (MJ m<sup>−2</sup>) is the total annual potential solar radiation. PSR reflects how the potential light distribution changes with topography: the higher the value, the stronger the radiation. It was calculated with an ArcView extension in Ref. [33] that takes into account the central latitude, the day of the year (1 to 365), and the hour of the day (1 to 24).

Coarse-resolution soil maps are widely available. These maps usually reflect average soil properties over a large area (Figure 5). Research has indicated that coarse-resolution soil data have a significant influence on the distribution of high-resolution soil property maps, especially around boundaries in Ref. [34].

Figure 4. Vertical slope position of a slope profile in Ref. [31].

Figure 5. Comparison of a coarse-resolution soil map (A) and a high-resolution soil map (B).

### 2.2. Target data

Target data, used as reference data in training ANNs, consisted of field-collected soil samples with measured soil property data (Figure 6). The representativeness and density of the target data directly affect the performance of ANNs.
