#### *3.1.1 Contamination sources of surface water*

TEs migrate from rock and coal into water through water-rock interaction, which may contaminate both surface water and groundwater.

In Turkey, the sources of TEs in a large reservoir were identified. PCA showed that PC1, PC2, and PC3 comprised Co/Cr/Fe, Cu/Pb/Zn, and As/Cd, respectively. Combined with correlation analysis, the three PCs were attributed to a natural source, bedrock weathering, and bedrock weathering, respectively [25]. Another study used PCA to show that mineral pollution, nutrient pollution, and organic pollution are the major latent factors influencing the water quality of the Asi River [88].
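
The way element groups such as Co/Cr/Fe end up on one component can be illustrated with a minimal sketch. It assumes a correlation-matrix PCA and uses synthetic concentrations driven by two hypothetical latent sources; the data, the function name `pca_loadings`, and the choice of five elements are illustrative assumptions, not the procedure or results of the cited studies.

```python
import numpy as np

def pca_loadings(X, n_components):
    """PCA on the correlation matrix; returns (loadings, explained variance ratio)."""
    R = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]              # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # A loading is the eigenvector scaled by sqrt(eigenvalue),
    # i.e. the correlation between a variable and the component.
    loadings = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    explained = eigvals[:n_components] / eigvals.sum()
    return loadings, explained

# Synthetic data: two hypothetical latent sources (assumption for illustration)
rng = np.random.default_rng(0)
n = 300
source_a = rng.normal(size=n)   # drives Co, Cr, Fe
source_b = rng.normal(size=n)   # drives As, Cd
elements = ["Co", "Cr", "Fe", "As", "Cd"]
X = np.column_stack(
    [source_a + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [source_b + 0.3 * rng.normal(size=n) for _ in range(2)]
)

loadings, explained = pca_loadings(X, n_components=2)
dominant_pc = np.argmax(np.abs(loadings), axis=1)  # component with largest |loading|
for name, row, pc in zip(elements, loadings, dominant_pc):
    print(f"{name}: loadings={np.round(row, 2)}, dominant=PC{pc + 1}")
```

Elements that share a source load on the same component; assigning that component to a natural or anthropogenic origin still requires independent evidence such as correlation analysis or site knowledge.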

Pipeline vandalism can contaminate soil and water. In one such study, PCA showed that the first source, composed of Cd, Cr, Pb, and Mn, was associated with anthropogenic inputs such as vehicular emissions. The second source, comprising Cu and Zn, was related to natural geological origin, while Ni and V were released from natural sources together with petroleum contamination [12].

In Ethiopia, water samples were divided into four categories by cluster analysis: a natural cluster, a mixed cluster, an agriculture cluster, and an urban cluster. In the agriculture cluster, VF1 had strong loadings on TN, NO3, salinity, Fe, NH3, hardness, and Mn, which originate from cultivation; VF2 was associated with turbidity, Chl-a, and Cu, which may come from farming and from excavation at quarrying sites. Mg and K loaded mainly on VF3 and VF4, respectively. K is mainly released when potash fertilizer is applied [24].

In the USA, discriminant analysis, a supervised ML technique, was applied together with cluster analysis to classify surface water samples and characterize the spatiotemporal distribution of trace elements. Sources of salt ions (magnesium, chloride, and sodium) ranged from natural sources (oceans, atmospheric deposition, weathering of common rocks, minerals and soils, and salt deposits and brines) to anthropogenic sources (landfills, wastewater and water treatment, agriculture, and application of deicing salts) [27].

A Bayesian isotope mixing model was used to estimate the proportional contributions of multiple nitrate sources to surface water in Belgium. The results showed that "manure and sewage" contributed the most; "soil N", "NO3 fertilizer", and "NH4+ fertilizer and rain" made intermediate contributions; and "NO3 in precipitation" contributed the least [26].
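
The mass balance underlying an isotope mixing model can be sketched in its simplest deterministic form: three sources, two isotopes, and fractions that sum to one. This is not the Bayesian model of the cited study, which also handles more sources and the uncertainty of the end-members. The end-member values and the function name `mixing_fractions` below are hypothetical and chosen only for illustration.

```python
import numpy as np

def mixing_fractions(sources, mixture):
    """Solve sum(f) = 1 and sum(f_i * delta_i) = delta_mix for each isotope.

    sources: array (n_sources, n_isotopes) with n_sources == n_isotopes + 1
    mixture: array (n_isotopes,) measured in the water sample
    """
    sources = np.asarray(sources, dtype=float)
    mixture = np.asarray(mixture, dtype=float)
    A = np.vstack([sources.T, np.ones(len(sources))])  # isotope rows + closure row
    b = np.append(mixture, 1.0)
    return np.linalg.solve(A, b)

# Hypothetical end-member signatures (d15N, d18O) in per mil; not measured values
end_members = {
    "manure and sewage": (12.0, 5.0),
    "soil N": (5.0, 3.0),
    "NO3 in precipitation": (0.0, 60.0),
}
S = np.array(list(end_members.values()))

# Build a synthetic sample from known fractions, then recover them
true_fractions = np.array([0.6, 0.3, 0.1])
sample = true_fractions @ S
fractions = mixing_fractions(S, sample)
for name, f in zip(end_members, fractions):
    print(f"{name}: {f:.2f}")
```

With more sources than isotopes plus one, this linear system is underdetermined, which is one reason a probabilistic formulation is used in practice.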

#### *3.1.2 Contamination sources of ground water*

In southern India, potential TE sources in groundwater were analyzed. It was concluded that Fe and Mn were of natural origin, whereas Cr, Cu, Pb, and Ni may come from mixed sources: natural inputs and flow contaminated with fertilizers and pesticides. In another study in northern India, the TE sources in groundwater were identified as anthropogenic inputs via agrochemical and industrial wastes (As, Cd, Co, Pb, and V), parent material from an adjacent area (U and Sr), lithogenic origin (Fe, Mn, and Zn), and background-level elements (Mo and Se) [36].

In Greece, Matiatos et al. [28–30] investigated surface water and groundwater by combining geochemical, isotopic, and multivariate statistical methods, such as PCA and a Bayesian isotope mixing model. According to the PCA results, EC, Na, K, Cl, and Mg originated from seawater intrusion; Fe/Zn and Ca/hardness came from water-silicate rock interaction and dissolution of limestone, respectively; NO3 represented nitrogen pollution; and SO4<sup>2−</sup> and Mn came from the dedolomitization process and increased agricultural input.

*Data Mining for Source Apportionment of Trace Elements in Water and Solid Matrix*

*DOI: http://dx.doi.org/10.5772/intechopen.88818*

A semi-supervised ML technique was used to trace contaminant sources in the USA. Vesselinov et al. [37] proposed a contaminant source identification approach that decomposes the observed mixtures using non-negative matrix factorization (NMF) for blind source separation (BSS), coupled with a custom semi-supervised clustering algorithm. As a result, the mixing coefficients of all groundwater types (contaminant sources) were obtained for each observation well (sample).
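
The decomposition step can be sketched with a plain NMF using multiplicative updates. This is a generic textbook variant, not the algorithm of Vesselinov et al.; their custom semi-supervised clustering is not reproduced. The two source signatures, the number of wells, and the function name `nmf` are hypothetical.

```python
import numpy as np

def nmf(X, k, n_iter=2000, seed=0, eps=1e-12):
    """Factor X (wells x species) into W (wells x k) @ H (k x species), all >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + 0.1
    H = rng.random((k, X.shape[1])) + 0.1
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic observations: 30 wells mixing two hypothetical source signatures
rng = np.random.default_rng(1)
true_H = np.array([[10.0, 1.0, 0.5, 8.0, 0.2],
                   [1.0, 12.0, 6.0, 0.5, 4.0]])
true_W = rng.dirichlet([1.0, 1.0], size=30)
X = true_W @ true_H

W, H = nmf(X, k=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Remove the scale ambiguity: normalize each signature to unit sum,
# then express each well as shares of the recovered sources.
scale = H.sum(axis=1)
W_scaled = W * scale
mixing = W_scaled / W_scaled.sum(axis=1, keepdims=True)
print(f"relative reconstruction error: {rel_error:.4f}")
```

NMF solutions are generally not unique, so the recovered shares depend on initialization and on the assumed number of sources `k`.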

As a supervised ML technique, a decision tree was combined with isotope methods in one study to determine nitrogen sources in groundwater. The decision tree achieved a 97.5% success rate in the water quality analysis. However, concentration data alone could not identify the dominant NO3 sources of groundwater contamination. The authors suggested an integrated approach that combines the N and O isotopes of NO3 with land use and physical-chemical properties, especially in areas with specific activities [38].

#### *3.1.3 Contamination by coal mine water*

One of the most studied issues in surface water and groundwater contamination is acid mine drainage (AMD). AMD forms when pyrite and other sulfide minerals are oxidized and dissolved during coal and metal mining, highway construction, and other large-scale excavation [13]. In an anaerobic environment, sulfide minerals are stable; when exposed to water and oxygen, and with accelerating factors such as bacteria, they are oxidized to form sulfuric acid, releasing trace elements to surrounding water bodies [89]. Some coal mine drainage is alkaline, in which case the leaching behavior and the TE composition of the leachate are different [14].

The mobility of TEs in AMD depends on several conditions: first, the occurrence and abundance of TEs in the potential AMD source; second, where and how adsorption-desorption and dissolution-precipitation take place during water-rock interaction; and third, the main flow path and the geochemistry of the rivers and lakes in which the TEs may be adsorbed or released again.

If the flow path is known, the sources and reaction rates of specific trace elements can be estimated by mass balance calculation. The post-dissolution behavior of TEs is controlled by solution composition, the pH and Eh of the water, temperature, and contact time with mineral surfaces. For example, metal elements show little attenuation in the solid phases and a high potential for mobility into water, whereas the opposite behavior can be observed for the metalloid elements. As the water geochemistry, pH, and Eh change along the flow path, TEs may undergo very complex reactions, leading to their redistribution among surface water, groundwater, and sediment. Source identification of TEs is therefore important and challenging work.
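
A mass balance along a known flow path can be sketched as a comparison of element loads (flow times concentration) entering and leaving a stream reach. The flows, concentrations, and function names below are hypothetical; a real study would also need synoptic sampling and uncertainty estimates.

```python
import math

def load_mg_per_s(flow_l_per_s, conc_ug_per_l):
    """Element load in mg/s from flow (L/s) and concentration (ug/L)."""
    return flow_l_per_s * conc_ug_per_l / 1000.0

def reach_balance(upstream, inflows, downstream):
    """Each site is a (flow L/s, concentration ug/L) pair; inflows is a list of sites.

    Returns (input load, output load, net change) in mg/s. A negative net
    change means the element was lost within the reach, for example by
    adsorption or precipitation; a positive value points to an unsampled source.
    """
    load_in = load_mg_per_s(*upstream) + sum(load_mg_per_s(*s) for s in inflows)
    load_out = load_mg_per_s(*downstream)
    return load_in, load_out, load_out - load_in

# Hypothetical reach: clean upstream water, one mine seep, one downstream station
load_in, load_out, net = reach_balance(
    upstream=(100.0, 5.0),
    inflows=[(20.0, 200.0)],
    downstream=(120.0, 30.0),
)
attenuated_fraction = -net / load_in
print(f"in={load_in:.2f} mg/s, out={load_out:.2f} mg/s, "
      f"attenuated={attenuated_fraction:.0%}")
```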

The pH of AMD ranges from 2 to 8. In an acidic environment, metal elements such as Pb, Cd, Cu, and Ni have high mobility, while some metalloid elements, such as As and Se, tend to migrate in an alkaline environment. Damaging effects of AMD have been reported in Asia [34, 90–93], North America [72, 94–96], Europe [83, 97], and South America [47, 98, 99]. When AMD enters surface water bodies, the effects include biotic impacts on stream and lake organisms through direct toxicity, habitat alteration by metal precipitates, visual changes from orange or yellow staining of stream sediments, nutrient cycle disruptions, or other mechanisms, and the water often becomes unsuitable for domestic, agricultural, and industrial uses. Gammons et al. used isotope analysis to show that abandoned coal mines contaminate groundwater [71, 72].

TEs in coal not only migrate during mining, gangue dumping, and dust deposition [100], but are also spread by smoke, fly ash, and bottom ash during combustion [101, 102]. The trace elements As, Cu, and Se are concentrated in fly ash, which indicates an impact on water and soil quality [103, 104].

#### *3.1.4 Source apportionment of water inrush in coal mines*

TE source apportionment techniques are used in coal mines to determine the source of water inrush [32]. In coal mines, water inrush constantly threatens production and human health and causes financial losses. Water inrushes are categorized into four sources: the Quaternary sand-gravel pore aquifer, the Dyas (Permian) sandstone aquifer, the Ordovician and Carboniferous limestone aquifers, and abandoned coal mine districts. Different sources show different features and need different treatment strategies, so the main purpose of water inrush analysis is to identify the category of the source aquifer. Huang et al. [32] proposed a Piper-PCA-Bayes-LOOCV discrimination model to determine water inrush types in coal mines. The Piper diagram is a geochemical technique for showing water characteristics, and it was used in that study to screen out abnormal samples. PCA was used to reduce the dimension of the sample matrix, so that fewer variables represent all the original variables. A supervised ML model, Bayes discriminant analysis (DA), was then trained to discriminate the water source. Leave-one-out cross-validation (LOOCV) was used to validate and improve the quality of the model. Wang et al. also used discriminant analysis to determine the source of water bursting in coal mines [33].
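
The PCA, Bayes discriminant, and LOOCV stages of such a workflow can be sketched as follows. This is a simplified illustration and not the model of Huang et al.: the Piper-diagram screening is omitted, the discriminant is a Gaussian rule with a pooled covariance and class priors, and the aquifer centroids (six major-ion concentrations in mg/L) are hypothetical.

```python
import numpy as np

def fit_pca(X, k):
    """Standardize X and return (mean, std, first k principal directions)."""
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd((X - mu) / sd, full_matrices=False)
    return mu, sd, Vt[:k].T

def transform(X, mu, sd, V):
    return ((X - mu) / sd) @ V

def fit_bayes_da(T, y):
    """Gaussian discriminant with pooled covariance and empirical priors."""
    classes = np.unique(y)
    means = np.array([T[y == c].mean(axis=0) for c in classes])
    resid = T - means[np.searchsorted(classes, y)]
    cov = resid.T @ resid / (len(T) - len(classes))
    log_priors = np.log([(y == c).mean() for c in classes])
    return classes, means, np.linalg.inv(cov), log_priors

def predict(t, model):
    classes, means, cov_inv, log_priors = model
    d = t - means                                   # (n_classes, k)
    scores = -0.5 * np.einsum("ij,jk,ik->i", d, cov_inv, d) + log_priors
    return classes[np.argmax(scores)]               # highest posterior score

def loocv_accuracy(X, y, k):
    """Refit PCA and the classifier with each sample held out in turn."""
    hits = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        mu, sd, V = fit_pca(X[keep], k)
        model = fit_bayes_da(transform(X[keep], mu, sd, V), y[keep])
        hits += int(predict(transform(X[i], mu, sd, V), model) == y[i])
    return hits / len(X)

# Hypothetical centroids for four aquifer types (columns: six major ions, mg/L)
centroids = np.array([
    [30, 80, 20, 25, 60, 300],      # Quaternary pore aquifer
    [300, 20, 8, 150, 100, 500],    # sandstone aquifer
    [40, 120, 40, 30, 250, 280],    # limestone aquifer
    [150, 300, 90, 80, 1200, 200],  # abandoned mine water
], dtype=float)
rng = np.random.default_rng(42)
y = np.repeat(np.arange(4), 15)
X = centroids[y] * rng.lognormal(mean=0.0, sigma=0.1, size=(60, 6))

accuracy = loocv_accuracy(X, y, k=3)
print(f"LOOCV accuracy: {accuracy:.2f}")
```

Refitting the PCA inside each fold keeps the held-out sample from influencing the projection, which would otherwise make the accuracy estimate optimistic.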

#### *3.1.5 TE occurrence and reaction pathway*

PCA has also been used to investigate trace element occurrence in rock and coal and the associated reaction pathways, since these may be the source of TEs with the potential to contaminate surrounding water bodies. Shan et al. [34] found that in the coal host rock, Se/Cd/Hg/As occurred in sulfide minerals, Be and V occurred in carbonate minerals, and Cr and Pb occurred in clay minerals, while in the coal seam, Se/Cr/Pb occurred in clay minerals and As and Hg occurred in sulfide minerals. In the coal host rock, Se, As, and Hg migrated through dissolution of sulfide minerals, and Cr migrated through transformation of clay minerals. In the coal seam, Se, Pb, and Cr migrated through transformation of clay minerals, and As and Hg migrated through dissolution of sulfide minerals. Pumure et al. [105] investigated the occurrence of selenium and arsenic in coal using a two-step PCA, finding that ultrasound-leachable selenium concentrations were associated with 14 Å d-spacing phyllosilicate clays (chlorite, montmorillonite, and vermiculite, all 2:1 layered clays), whilst ultrasound-leachable arsenic concentrations were closely related to the concentration of illite, another 2:1 phyllosilicate clay.

#### **3.2 TE apportionment in sediment**

Surface water and sediment form a reaction system: trace elements in water may be adsorbed by sediment while trace elements in sediment are released. The sediment may therefore be a sink or a source of trace elements. Because of their complex reaction pathways, environmental persistence, and biological accumulation, trace elements in aquatic environments have drawn special attention [106].

*Trace Metals in the Environment - New Approaches and Recent Advances*

In southwest China, lake sediment was analyzed [42]. PCA results showed that Cd/Hg/Pb/Zn and As (PC2 and PC3, respectively) came mainly from non-point anthropogenic sources, especially atmospheric emissions from non-ferrous metal smelting and coal consumption [107].

In Jiangxi, China, river sediment was investigated. Because metal mines operate in the study area, metal contamination was found. The PCA results associated PC1, PC2, and PC3 with probable origins in coal and gold mining, copper mining and refining, and Zn/Pb deposits and agricultural activities, respectively. PC1 was highly loaded with Ni, Hg, and Cr; PC2 with Cu; and PC3 with Cd, Pb, Zn, and As [39]. A study of lake sediment in Jiangxi, China, showed that Cr, Pb, and Zn may derive mainly from both lithogenic and human activities, such as atmospheric and river inflow transport, whereas Cu and Cd may come mainly from anthropogenic sources, such as mining and fertilizer application [43]. In northern China, Cd and Zn were found to originate from agricultural sources, and Cu, Cr, and Ni were of natural origin [40]. In northwest China, Zn, Cu, Ni, and As were highly loaded on PC1 and were of natural origin; Cr and Cd/Pb/Hg were highly loaded on PC2 and PC3, which were attributed to township/silicon chemical factories and agriculture/urban construction, respectively [41].

#### **3.3 TE apportionment in soil**

Research has focused on distinguishing natural from anthropogenic [108, 109] TE sources in soil. The major natural contribution of heavy metals comes from the parent materials from which the soils developed. Anthropogenic sources of heavy metals in soils include acid mine drainage [110], agricultural and industrial waste discharges [111], atmospheric deposition [112], and fertilizers and pesticides [113], which contribute significantly to heavy element levels in soils. PCA is now a popular technique for tracing the sources of TEs in soil, and enrichment factors are usually used afterwards to verify the sources. To investigate both the sources and the spatial distribution of TEs, geochemical, multivariate, and geostatistical analyses are combined. GIS and multivariate analysis of soil contamination have been reviewed in detail [114]. Understanding the sources of heavy metals in surface soils is imperative for deciding on strategies to protect food safety, human health, and ecosystem sustainability.
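
The enrichment factor check can be sketched using its common definition: the element-to-reference ratio in the sample divided by the same ratio in the background material. The concentrations, the choice of Fe as reference element, and the function name are hypothetical; the chapter does not specify which reference element or background values the cited studies used.

```python
import math

def enrichment_factor(c_sample, ref_sample, c_background, ref_background):
    """EF = (C / C_ref) in the sample divided by (C / C_ref) in the background.

    Values near 1 are consistent with a crustal (natural) origin; values
    well above 1 suggest an additional, possibly anthropogenic, input.
    """
    return (c_sample / ref_sample) / (c_background / ref_background)

# Hypothetical concentrations in mg/kg, with Fe as the reference element
ef_pb = enrichment_factor(
    c_sample=60.0, ref_sample=30000.0,
    c_background=20.0, ref_background=35000.0,
)
print(f"EF(Pb) = {ef_pb:.2f}")
```

The threshold used to call an element "enriched" varies between studies, so it is left out of the sketch.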

#### *3.3.1 TE apportionment in agricultural soil*

In Greece, two main sources explained 74.8% of the total variance in an analysis of agricultural soil contamination. Cu, Pb, Zn, As, Cd, P, and K were attributed to anthropogenic influence, and Ni, Co, Fe, and Cr to natural sources [55].

In soil samples from hills in India, four principal components were determined by PCA, with high loadings of Mn/Zn, Cr, Ni, and Co, respectively. PC1 and PC2 were inferred to be natural sources, PC3 represented fossil fuel burning, which contributed most of the Ni in soil, and PC4 represented irrigation sources [50].

In Shanxi Province, China, soil samples were collected over an area of 25,000 km<sup>2</sup>. The PCA results showed that Co, Cr, Cu, Mn, Ni, Se, V, and Zn originated mainly from natural sources, while Cd and Pb were heavily affected by anthropogenic pollution. Combined with the spatial data, Pb was strongly associated with road traffic, and Cd was linked to industrial activities. An ANN model was applied to predict Pb and Cd concentrations in the following years [49].

In Beijing, China, the TEs could be represented by two PCs. Co, Ni, Cr, and V were probably released from the soil parent material, while Cd, Cu, and Zn came primarily from agricultural cultivation. Hg may have originated from coal combustion or mineral fertilizers [45].

In Jijin, China, Al, Fe, Mn, Zn, Cr, Ni, As, Cu, and Pb had high loadings on PC1, which accounted for 55.16% of the total variance and was identified as a natural source. N, OC, P, Cd, and Hg had high loadings on PC2, which accounted for 16.75% of the total variance. PC2 also appeared to be a natural source; the strong relationship between Hg and Cd was explained by their high organic affinity [46].

In Iran, a semi-supervised ML method was applied. The study area was located around a Cu-Au porphyry deposit, so the soil may be affected by it. Initially, eleven soil geochemical variables were selected using hierarchical cluster analysis and expert knowledge. The semi-supervised fuzzy c-means clustering method (ssFCM) was then used to separate multivariate soil geochemical anomalies from background to guide further drilling [58].

#### *3.3.2 TE apportionment in urban and industrial top soil*

The impact of an ore deposit on the surrounding soil was investigated in Beijing, China. Frequent mining activities produce dust and acidic drainage from the oxides and mill tailings. Cu, Co, Zn, Cd, and V were found to originate from mixed sources; Be, Pb, and As came from natural sources and were mainly affected by the weathering and erosion of parent rock material; Cr, Ni, and Ba were polluted by fine particles and by industrial and mining activities; transportation and soil minerals were the common sources of Cr and Ni; and Hg came from anthropogenic sources, mainly mining, beneficiation, smelting, and acid mine drainage waste [44].

Large urban and industrial areas along the coastline of Italy were investigated. Pb and Zn were attributed to heavy traffic and alloy production. Some Cr and Ni contamination was traced to releases from the tannery industry. Zn and Pb enrichment was mainly related to the large volcanic complexes, and Cr and Ni were enriched in the siliciclastic deposits [56]. Another large-scale investigation was carried out in the Yangtze River Delta, China, where industrialization leads to a high contamination potential for the environment. Four PCs were selected to represent the sources of trace elements. As, Hg, Cu, Cd, Mo, S, and Zn were attributed to traffic. Fe and Mn came from natural sources. PC3, comprising Cr and Ni, pointed to pyrometallurgical processes, especially non-ferrous metal industries. PC4, composed of Pb and Se, was inferred to originate from coal combustion [52]. Another study in this area showed that Cr, Ni, Co, Mn, Cu, and As came mainly from natural sources, while Cd/Hg and Pb/Zn originated from anthropogenic sources in two different groups [57].

In Shaanxi Province, China, roadway dust was analyzed. Zn, Mn, Ni, and As had the highest variance; Zn was released mainly from vehicle tire wear and corrosion of galvanized automobile parts. Cu, Pb, and Cr were inferred to originate from traffic. The third source was dominated by Co and Ni, which were released from a machine manufacturing plant [48]. In a study from Alaska, USA, soil contamination was found to be controlled mainly by distance to road, traffic category (highway or refuge road), land cover category (paved or not), traffic loading, and other parameters, in descending order [59].

In Pakistan, four factors were identified using the factor analysis to trace surface soil contamination in industrial cite. VF1 contains Ni, Cr, Zn, and Cu, which

In southwest China, lake sediment was analyzed [42]. PCA result showed that Cd/Hg/Pb/Zn, and As (as PC2 and PC3, respectively) were mainly from non-point anthropogenic sources, especially with the atmospheric emission from non-ferrous

*Trace Metals in the Environment - New Approaches and Recent Advances*

In Jiangxi, China, river sediment was investigated. Because metal mines are being excavated in the study area, metal contamination was found. The PCA results suggested that PC1, PC2, and PC3 were probably associated with coal and gold mining, copper mining and refining, and Zn/Pb deposits and agricultural activities, respectively. PC1 was highly loaded with Ni, Hg, and Cr; PC2 with Cu; and PC3 with Cd, Pb, Zn, and As [39]. A study of lake sediment in Jiangxi, China, showed that Cr, Pb, and Zn may derive from both lithogenic sources and human activities, such as atmospheric and river inflow transportation, whereas Cu and Cd may come mainly from anthropogenic sources, such as mining activities and fertilizer application [43]. In northern China, Cd and Zn were found to originate from agricultural sources, whereas Cu, Cr, and Ni were of natural origin [40]. In northwest China, Zn, Cu, Ni, and As were highly loaded on PC1 and were of natural origin; Cr and Cd/Pb/Hg were highly loaded on PC2 and PC3, which were attributed to township/silicon chemical factories and agriculture/irrigation sources, respectively [50].

**3.3 TE apportionment in soil**

Research has focused on distinguishing natural from anthropogenic [108, 109] sources of TE contaminants in soil. The major natural contribution of heavy metals comes from the parent materials from which the soils developed. Anthropogenic sources of heavy metals in soils include acid mine drainage [110], agricultural and industrial waste discharges [111], atmospheric deposition [112], and fertilizers and pesticides [113], which contribute significantly to the levels of heavy elements in soils. PCA is now a popular technique for tracing the sources of TEs in soil, and enrichment factors are then usually used to verify the sources. To investigate the sources and spatial distribution of TEs, geochemical, multivariate, and geostatistical analyses are used in combination. GIS and multivariate analysis of soil contamination have been reviewed in detail [114]. Understanding the sources of heavy metals in surface soils is imperative for deciding on strategies to protect food safety, human health, and ecosystem sustainability.

*3.3.1 TE apportionment in agricultural soil*

In Greece, two main sources explained 74.8% of the total variance in an analysis of agricultural soil contamination. Cu, Pb, Zn, As, Cd, P, and K were attributed to anthropogenic influence, and Ni, Co, Fe, and Cr were recognized to be of natural origin [55].

In soil samples from hills in India, four principal components were determined using PCA, with high loadings of Mn/Zn, Cr, Ni, and Co, respectively. PC1 and PC2 were inferred to be natural sources, whereas PC3 and PC4 represented fossil fuel burning, which contributes most of the Ni in soil, and urban construction, respectively [41].

In Shanxi province, China, soil samples were collected over an area of 25,000 km<sup>2</sup>. The PCA results showed that Co, Cr, Cu, Mn, Ni, Se, V, and Zn originated mainly from natural sources, and that Cd and Pb were heavily affected by anthropogenic pollution. Combined with the spatial data, Pb was strongly associated with metal smelting and coal consumption [107].


originate from vehicular emission and industrial activities. VF2, composed of Pb, Cd, and Co, originated from anthropogenic activities such as automobiles. Fe and Mn, representing VF3 and VF4, were of natural origin [54]. In Armenia, Ti, V, Mn, Fe, and Co were identified as being of natural origin. PC2 included two distinct groups with negative loadings, As/Hg and Pb/Zn. PC3 was composed mainly of Cu and Mo and was recognized as anthropogenic in origin [51]. In Spain, the first source, including Pb, Tl, As, Sb, and Cd, pointed to coal combustion. The second source was traffic-related air pollution, which released Cr, Ni, Be, V, and Co. The third and fourth factors explained a very low proportion of the variance and were considered secondary. These factors included Cu, Zn, and Sn, which showed mixed behavior with regard to the first two factors [53].

*Data Mining for Source Apportionment of Trace Elements in Water and Solid Matrix*

*DOI: http://dx.doi.org/10.5772/intechopen.88818*

In Nigeria, pipeline vandalism may contaminate the soil and water. The first three PCs gave a cumulative contribution of 78.68% of the variance. The PCA results for soil were similar to those for water [12].

*3.3.3 TE apportionment in soil to recall human activities*

In Spain, a core was taken from a peat bog to evaluate trace element distribution and the impact of human activity over the past 8000 years. Al, Ba, Cr, Ga, K, Na, Sr, Ti, V, Y, and Zr were found to be lithogenic and supplied by atmospheric soil dust, whereas Cd, Pb, P, and Zn were recognized as anthropogenic, especially from ore exploration. The depth of the samples recorded the degree of human influence through time [47]. The enrichment factor (EF) profile showed that atmospheric Pb, Zn, and Hg peaked in the Roman age and in the nineteenth to twentieth centuries.
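An EF of this kind is commonly computed by normalizing an element to a conservative lithogenic reference element in both the sample and a background material. The sketch below assumes Ti as the reference element and uses made-up concentrations; the function name, the reference element, and the values are illustrative assumptions and are not taken from the cited study [47].

```python
def enrichment_factor(sample, background, element, reference="Ti"):
    """EF = (C_element / C_reference)_sample / (C_element / C_reference)_background."""
    sample_ratio = sample[element] / sample[reference]
    background_ratio = background[element] / background[reference]
    return sample_ratio / background_ratio


# Hypothetical concentrations (mg/kg); not measured data.
background = {"Pb": 17.0, "Zn": 67.0, "Ti": 3800.0}
peat_layer = {"Pb": 85.0, "Zn": 67.0, "Ti": 1900.0}

ef_pb = enrichment_factor(peat_layer, background, "Pb")
ef_zn = enrichment_factor(peat_layer, background, "Zn")
print(f"EF(Pb) = {ef_pb:.1f}, EF(Zn) = {ef_zn:.1f}")
```

With these illustrative numbers, Pb has an EF of 10 and Zn an EF of 2. EF values close to 1 are usually read as a mainly crustal contribution, and larger values as enrichment from other sources.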

Recent studies lead to the conclusion that Pb is an important element of anthropogenic origin. Some reports argue that vehicle emissions, brake linings, coal burning, plastics and rubber production, and car barriers are potential sources of Pb. Cu might come from vehicle brake linings and Zn from vehicle tires [51]. In agricultural soil, Cu usually accumulates through the application of commercial fertilizers and Cu-based pesticides and fungicides [115]. Cd has been related to the use of phosphate fertilizers [116]. Mineral fertilizers and animal manure may raise Zn and Cu levels in soil.

**3.4 TE apportionment in air and particles**

TEs usually spread through the air as particles. Particles smaller than 10 μm are called PM 10, and PM 2.5 stands for particles smaller than 2.5 μm. The haze-day rate has clearly increased over the past decade, and several researchers have reported the characteristics, composition, and sources of PM 10 and PM 2.5 in some Chinese cities [17, 64, 117]. PM 10 and PM 2.5 in megacities around the world have also been investigated [118–121]. TEs associated with PM 2.5 and PM 10, such as Cu, Zn, Pb, Cd, and Cr, have deleterious effects on human health. According to epidemiological and toxicological studies [122, 123], the TEs in ambient PM 2.5 influence the severity of allergic respiratory disease and pose a high cancer risk to exposed populations [81, 82].

In source apportionment analysis, six main types of source of ambient particulate matter are commonly found: natural sources (including soil dust and sea salt), domestic fuel burning, industry, traffic, and unspecified sources of human origin. Soil dust refers to particles lifted from bare soils by local wind. Sea salt particles can be found close to the coast. Domestic fuel burning includes coal, gas fuel, and wood used for cooking and heating. Traffic is a complex source of PM and TEs: the burning of fuel, including diesel, and the wear of brake linings, clutches, and tires are all sources of TEs [124].

The "unspecified sources of human origin" category mainly includes secondary particles formed from unspecified pollution sources of human origin. The reasons for the second outbreak of PM 2.5 are complex and include some chemical reactions. The reasons given for fog and haze are: (1) the accumulation of fog and haze, namely the results of combustion, automobile exhaust, and dust; (2) the upward momentum of the fog and haze particles, attributed to rising hot air and to electromagnetic waves from wireless communication networks; and (3) the absence of sustained wind. All three conditions are indispensable for persistent fog and haze weather, and the second outbreak of PM 2.5 results from the three conditions together. In a review of source apportionment studies, 87% of the records identified traffic as a source, 66% industry, 45% domestic fuel burning, 100% unspecified sources of human origin, and 89% natural sources [125].

In southern China, TE sources in PM 2.5 were identified using PCA. Three sampling sites were analyzed separately. At the YL site, PC1, with Zn and Pb, was identified as the traffic source, and Cu and Cd, highly loaded on PC2, originated from coal and other fossil fuels. At the KF site, Zn, Cd, and Pb came from vehicle emissions and the abrasion of automobile tires, and Cu, highly loaded on PC2, was a tracer of fossil and other fuel combustion. At the YH site, Cu, Cd, and Pb were associated with domestic fossil fuel burning, and Zn represented brake and tire wear and other transportation processes [17]. During the Chinese Spring Festival, haze may occur more frequently and the PM 2.5 level can be elevated. In Henan province, China, sources of PM 2.5 were identified using PCA, and a multivariate linear regression model was set up to predict PM 2.5 concentrations. The most important source was burning, including coal combustion, fireworks, firecrackers, and biomass burning, which contributed 61% of the PM 2.5. The second, third, and fourth sources were vehicle emissions (27%), soil (8%), and road dust (3.28%), respectively [63].
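Percentage contributions of this kind can be obtained by regressing the measured PM 2.5 mass on per-sample source strengths and comparing the fitted terms. The following is a minimal sketch of that idea using ordinary least squares. It is not the model of the cited study [63]; the two source-strength indices, the data, and all function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_mlr(scores, y):
    """Least-squares fit y = b0 + b1*s1 + ... via the normal equations."""
    design = [[1.0] + list(row) for row in scores]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def mean_contributions(scores, coef):
    """Percent share of each source in the mean explained mass (intercept excluded)."""
    n = len(scores)
    means = [sum(row[k] for row in scores) / n for k in range(len(scores[0]))]
    parts = [coef[k + 1] * means[k] for k in range(len(means))]
    total = sum(parts)
    return [100.0 * part / total for part in parts]


# Hypothetical non-negative source-strength indices for five days
# (columns: "burning", "vehicle") and PM 2.5 mass built as 5 + 3*s1 + 1*s2.
scores = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
pm25 = [10.0, 12.0, 18.0, 20.0, 25.0]

coef = fit_mlr(scores, pm25)
shares = mean_contributions(scores, coef)
print([round(c, 3) for c in coef], [round(s, 1) for s in shares])
```

The intercept is left out of the shares here, which is one possible convention; studies differ in how they treat the unexplained mass.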

In Costa Rica, eight important sources of PM 2.5, PM 10, and TEs were identified using PMF. The first three sources were vehicle exhaust, containing EC, OC, SO4<sup>2−</sup>, and a certain amount of Fe; residual oil combustion, contributing Ni and V; and fresh sea salt, including Cl, Na, and Mg. The others were crustal or dust aerosols, organic carbon and sulfate, secondary sulfate, secondary nitrate, and heavy fuels [66].
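PMF decomposes a samples-by-species concentration matrix into non-negative source contributions and source profiles. The sketch below illustrates only that non-negative factorization step, using plain multiplicative updates on an unweighted least-squares objective. It is a simplified stand-in: PMF as usually applied also weights each residual by its measurement uncertainty, which is omitted here. The species, profiles, and data are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(x, g, f):
    """Sum of squared residuals between x and the product g f."""
    gf = matmul(g, f)
    return sum((x[i][j] - gf[i][j]) ** 2 for i in range(len(x)) for j in range(len(x[0])))


def nmf(x, n_sources, n_iter=500, seed=0, eps=1e-12):
    """Factorize x ~ g f with g, f >= 0 using multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    g = [[rng.random() + 0.1 for _ in range(n_sources)] for _ in range(n)]
    f = [[rng.random() + 0.1 for _ in range(m)] for _ in range(n_sources)]
    for _ in range(n_iter):
        gt = transpose(g)
        num, den = matmul(gt, x), matmul(matmul(gt, g), f)
        f = [[f[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(n_sources)]
        ft = transpose(f)
        num, den = matmul(x, ft), matmul(g, matmul(f, ft))
        g = [[g[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_sources)]
             for i in range(n)]
    return g, f


# Hypothetical source profiles over the species (Ni, V, Na, Cl).
true_profiles = [[1.0, 2.0, 0.1, 0.1],   # "residual oil"-like profile
                 [0.1, 0.1, 3.0, 4.0]]   # "sea salt"-like profile
true_contrib = [[1, 0.5], [0.5, 1], [2, 1], [1, 3], [3, 2]]
x = matmul(true_contrib, true_profiles)   # synthetic samples-by-species matrix

g0, f0 = nmf(x, 2, n_iter=0)              # random starting point
g, f = nmf(x, 2)                          # after 500 updates
print(frobenius_error(x, g0, f0), frobenius_error(x, g, f))
```

The rows of `f` play the role of source profiles and the columns of `g` the per-sample contributions. The number of sources has to be chosen by the analyst, as in PMF.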

In the USA, sources of PM 2.5 were determined: meat, secondary aerosols, motor oil/brake dust/other outdoor sources, dust, cigarettes, gasoline, biomass burning, and retene explained 23, 14, 10, 9, 6, 6, 5, and 5% of the variance, respectively. In terms of chemical species, meat released cholesterol, alkanoic acids, OC, and light n-alkanes. Ammonium, sulfur, and nitrate were mainly released by secondary aerosol. PC3 included cholestanes, hopanes, Ba, and nitrate and was related to motor oil/brake dust/other outdoor sources [65].

A 6-year investigation of PM 2.5 levels, sources, and potential risk to humans was carried out in Canada. Secondary organic aerosol, secondary nitrate, secondary sulfate, transportation, and biomass burning, in descending order of importance, together contributed more than 85% of the PM 2.5 [61].

In Nigeria, the sources of PM 2.5 were identified as soil (44%), savannah burning (26%), scrap processing (18%), and vehicular emissions (12%), and those of PM 10 as soil plus biomass burning (71%), sea salt (22%), scrap processing (5%), and vehicle emissions (tire wear) (2%). The elements Al, Si, Ca, Ti, Mg, Fe, and Na were carried by fine particles, whereas the crustal elements Al, Si, Ca, Ti, Mn, and Fe, the anthropogenic elements Cl, K, V, Cr, Ni, Br, and Pb, and black carbon were carried by coarse particles. Savannah burning released Br, BC, and Pb in fine particles. Vehicles emitted Na, S, Zn, As, Br, and Pb, which spread in fine particles [67].

*Trace Metals in the Environment - New Approaches and Recent Advances*

The impact of coal mining on air pollution, including suspended particles, was investigated in India. The PCA and CA results suggested that PC1 represented PM 10, SO2, PM 2.5, PM 1.0, Ni, and Cu, which originated from coal burning and active mine fires. PC2 was highly loaded with NO2, Pb, Cd, and Cr and originated from crude oil combustion and vehicular emissions. PC3, including Fe and Mn, was contributed mainly by the earth's crust, wind-blown soil, and coal fly ash [60].
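The way variables group onto principal components through their loadings can be illustrated with a small synthetic example. The sketch below standardizes the data, builds the correlation matrix, and extracts the leading components by power iteration with deflation. The variable names and values are hypothetical and are constructed so that three variables co-vary as one group and two as another; they are not data from the cited study [60].

```python
import math


def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1)) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def correlation_matrix(z):
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)] for i in range(p)]


def leading_eigenpair(c, n_iter=500):
    """Largest eigenvalue and its eigenvector by power iteration."""
    p = len(c)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(c[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(c[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigenvalue, v


def pca_loadings(rows, n_components):
    """Return (eigenvalue, loadings) per component; loadings = sqrt(eigenvalue) * eigenvector."""
    c = correlation_matrix(standardize(rows))
    p = len(c)
    results = []
    for _ in range(n_components):
        lam, v = leading_eigenpair(c)
        results.append((lam, [math.sqrt(lam) * x for x in v]))
        # Deflate: remove the component just found before extracting the next one.
        c = [[c[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return results


# Synthetic data: two uncorrelated patterns drive two groups of variables.
pattern_a = [1, 2, 3, 4, 5, 6]
pattern_b = [1, -2, 1, 1, -2, 1]
names = ["Ni", "Cu", "SO2", "Fe", "Mn"]   # hypothetical labels
rows = [[10 + 2 * a, 5 + a, 20 + 3 * a, 100 + 10 * b, 50 + 3 * b]
        for a, b in zip(pattern_a, pattern_b)]

components = pca_loadings(rows, 2)
for k, (lam, loadings) in enumerate(components, start=1):
    share = 100.0 * lam / len(names)
    print(f"PC{k}: {share:.0f}% of variance",
          {n: round(l, 2) for n, l in zip(names, loadings)})
```

In this constructed case PC1 carries the first three variables and PC2 the last two. With real measurements the loadings are less clean, and attributing a component to a source still rests on geochemical knowledge of the tracer elements.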

In the USA, brake wear, tire wear, fertilized soil, and resuspended soil were found, using positive matrix factorization, to be important sources of copper, zinc, phosphorus, and silicon, respectively. Zn was strongly related to tire wear but also contributed to the Pb-rich feature and to soil. The Pb-rich contributions were highly correlated with tire wear, and elevated P contributions appeared in the fertilized soil as well as in the Pb-rich feature [68].

Brinkman et al. compared the performance of PCA and PMF in the source apportionment of particulate matter. Most of the PCA factors were easily distinguished from one another by sharp differences in the factor loadings, and for many individual compounds the variance was explained primarily by a single factor. In contrast, the factors obtained with PMF were more difficult to distinguish because the anticipated tracer compounds for certain sources appeared in multiple PMF factors [65].

**3.5 Summary of methods used to identify sources of contaminants**

Applications of multivariate analysis and data mining, combined with geochemical methods, to the source apportionment of trace element contaminants in environmental media are increasing with the development of big data techniques, machine learning, and computer software. Four environmental media are discussed: water, sediment, soil, and particles.

Four types of application can be identified for water contamination: tracing the sources of TEs; evaluating the quality of surface water and ground water; identifying water intrusion in coal mines and other scenarios; and finding and quantifying relationships between different water bodies, such as the surface-ground water relationship. Sediment and water form a reaction system: the sediment can be a source or a sink of trace elements in water, or a sink at first and then a source again. The system should therefore be analyzed as a whole. There are fewer studies on sediment than on water, and most articles on this topic are from China.

The most used method for the source apportionment of TEs in water and sediment is principal component analysis (PCA), probably because it is easy to use and interpret. With the development of data mining algorithms and computing software, PCA has become easier and more efficient to apply. A similar method, factor analysis (FA), is also used. PCA and FA are both unsupervised ML methods. Although less accurate than supervised methods, they are suitable for this topic.

Supervised ML methods are also used in this area, though much less often than unsupervised ones, and their scope of application is different. For example, decision trees are used to classify sample types [38]. Discriminant analysis is also a supervised method; it has been applied in particular to identifying the source of water inrush in coal mines, where labeled data can be obtained [32, 33]. In this sense, other supervised machine learning methods, such as ANN, support vector machine (SVM), and decision tree, can also be used to identify water inrush sources. ANN usually needs more data than SVM and decision tree to improve prediction quality.

To combine the advantages of unsupervised and supervised machine learning methods, semi-supervised methods have been introduced and implemented on this topic [52]. Related studies are rare at present, but promising reports are expected.

The reviewed reports lead to the conclusion that surface water is contaminated more by major elements and by nitrogen, which may indicate organic contamination. Ground water is contaminated more by trace elements such as As, Cr, Cd, Pb, Hg, and Se. Surface water may be affected by civil and industrial activities, and ground water by water-rock interaction in the rock seam. The most important anthropogenic sources of trace elements in ground water are coal and metal mines. These mines contain high levels of toxic trace elements, which are stable in an anaerobic environment. Once the rock and coal are excavated, the trace elements are released. That the investigated surface water is less contaminated by trace elements does not prove that it is safe. Studies of river and lake sediment have found high contents of trace elements of anthropogenic origin, including As, Cr, Cd, Pb, Hg, Se, Cu, Zn, and Ni. Sediment and water in rivers and lakes form a reactive system in which the sediment is both a sink and a source of trace elements. The sources and reaction pathways in this system therefore need thorough research and regulation.

Studies on soil can be roughly divided into two large groups: agricultural soil and urban/industrial soil. Unsurprisingly, the primary TE source in agricultural soil is natural, and the primary TE source in urban/industrial soil is anthropogenic. With the impact of industrial development on the environment, studies on urban/industrial soil are increasing and are being carried out on a wider scale. Studies on particles have become popular because the air is easily affected by human activities. In some countries, haze has become an important problem. As its main component, suspended particles in air, especially PM 2.5, are important media for transporting and spreading contaminants. Studies on PM 2.5 are being carried out all around the world, in both developed and developing countries.

The most popular methods are PCA, FA, and positive matrix factorization (PMF). PMF is frequently used in particle studies but less in water and soil studies. In studies of soil and particles, semi-supervised ML techniques have also been implemented [38]. Some studies combine a machine learning method with a geochemical method, or two or more machine learning methods. For example, Petrik et al. [56] combined factor analysis and multivariate linear regression. ANN is a tool for predicting air quality from historical data; related studies are abundant, and Mclean et al. have thoroughly reviewed this topic [87]. However, very little work has been carried out to identify TE sources using ANN.

According to the reviewed reports, anthropogenic sources of trace elements in soil and particles mainly contribute the metals Zn, Mn, Ni, and Cu and some other toxic elements, such as As, Cd, Cr, Hg, and Pb. Soil and particles have similar TE compositions. More metal TEs are found in soil and particles than in ground water.

**4. Conclusions**

Data mining techniques are widely used to trace the sources of TEs in water and solid matrices.

In the water environment, ground water and surface water are connected through the flow network. Human activities, especially mining, change the natural reaction environment and release trace elements into ground water and surface water. The sediment in rivers and lakes may then be contaminated and become a source that releases trace elements to the water again. Soil, dust, and air particles may be influenced by various human activities, especially in urban and industrial areas. The TE composition differs depending on the type of environmental medium, human activities, land use type, etc. However, some elements of environmental concern, As,
