only observe the "added" or measured response shown by the solid line in **Figure 2**. The individual responses obtained by the simulations (shown using the dashed and the dotted lines in **Figure 2**) will not be observed; they have been added to facilitate discussion.

In this example, the intensity of the combined spectral response for As and Cd (i.e., the response obtained by adding the individual responses for As and Cd from a sample containing both, shown by the solid line in **Figure 2**, or the one obtained by experimentally measuring the combined response) is slightly more than 1.2 (indicated by the horizontal dotted line). Clearly, if the Cd interference on As is left uncorrected, the concentration of As will be reported to be higher than it actually is, in this example by about 20%. Errors as large as this are unacceptable in analytical determinations, for potential legal or regulatory-compliance reasons or because of health implications (e.g., in clinical analysis for medical diagnostics). In analytical practice (e.g., in commercial chemical analysis laboratories), spectral interference is routinely addressed using a number of methods, as briefly outlined below.

Several approaches have been used to correct the adverse effects of spectral interference. These traditional methods can be roughly divided into three categories: **chemistry-based approaches** (i.e., via a chemical separation of an analyte from its matrix), **physics-based approaches** (e.g., through use of high-resolution spectrometry), and **mathematics-based (or statistics-based) approaches** (e.g., via use of inter-element correction factors or through use of chemometrics). Briefly:

• *Chemistry-based* approaches involve removal of either the analyte or the interferent using, for instance, some form of sample processing (e.g., chemical separation). In general, approaches that require additional sample processing and manipulation steps are time-consuming and labor-intensive, thus adding to the overall cost per analysis [1–4].

• *Physics-based approaches,* such as use of high-resolution spectrometry:

○ Use of high-resolution spectrometry [1] to resolve spectral interferences is not feasible in routine analytical laboratories, because spectrometers with high resolution (e.g., those with a long focal length) are expensive and they drift.

○ Use of a non-interfered spectral line. This is not always possible, especially if an analyte has few spectral lines useful for analytical determinations and the interferent has many (per the Zn and Fe example discussed in the introduction).

• *Mathematics-based (or statistics-based) approaches* include:

○ Use of inter-element correction (IEC) factors [5], in which the intensity of a spectral line of the interferent is measured at a different wavelength and a correction factor is applied. Depending on the type of spectrometer used, this is not always possible.

○ Use of *chemometrics* methods [6–15], defined as those involving the *"application of mathematical or statistical methods for the treatment of chemical data"*. Among others, examples of chemometrics approaches include adaptive filtering, factor analysis, orthogonal polynomials, and curve-fitting techniques.

**2. Brief background on ANNs**

ANNs fall under the umbrella of artificial intelligence (AI), a general term used to describe *machines displaying human-like intelligence* (at present, only in specific domains). The synonyms used in this field reflect the preferences of individual research groups; examples include cognitive computing, computational intelligence, and machine learning, in which a machine is trained on data (e.g., spectra, images, text, or speech) so that it can learn from examples how to perform a task. These AI methods (no matter what they are called) stand in contrast to conventional programming paradigms, in which explicit program instructions tell a machine how to perform a task.

ANNs were inspired by biological neurons, and in many respects they mimic them. The ideas behind artificial neurons and their networks date back to the late 1940s and early 1950s, followed by the landmark description of the perceptron [20] (a linear classifier, developed in the late 1950s). The influential paper by Hopfield [21] in 1982 addressed the limitations of perceptrons identified by Minsky and Papert, and it opened the field of ANNs to application in a diversity of disciplines [20]. To provide a few examples, ANNs have been applied in areas ranging from finance to engineering, physics, chemistry, geology, medicine, and pharmacy. As already mentioned, a unique advantage of neural networks is their ability to *"learn by example"*.

In chemical analysis, interest in ANNs began around the mid-1980s. Since then, ANNs have been applied to many chemistry-related areas; a limited set of examples includes IR and UV spectra, classification, calibration, nuclear magnetic resonance (NMR), and ion mobility spectrometry (IMS). In my lab, we have been applying ANNs to spectral interference correction in analytical atomic spectrometry [22–27].

A comprehensive description of the theory and practice of ANNs is beyond the scope of this chapter. Briefly [28–32], an ANN is formed from many individual artificial neurons. An example of an artificial neuron is shown in **Figure 3a**. Neurons are typically organized in layers. Each weight (i.e., the strength of the connection between neurons) is adjustable, and each neuron has its own weighted inputs and its own transfer function. The transfer function maps the input of a neuron (or of a layer) to its output; examples include a linear transfer function (**Figure 3b**), a log-sigmoid function, and a hard-limit function.
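As a concrete illustration, the weighted-inputs-plus-bias neuron of **Figure 3a** can be sketched in a few lines of Python (the function and variable names below are illustrative only, not taken from any particular library):

```python
import math

def neuron(inputs, weights, bias, transfer):
    """One artificial neuron: a weighted sum of the inputs plus an
    adjustable bias b, passed through a transfer function F."""
    n = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(n)

# The three transfer functions mentioned in the text.
linear = lambda n: n                                   # cf. Figure 3b
log_sigmoid = lambda n: 1.0 / (1.0 + math.exp(-n))
hard_limit = lambda n: 1.0 if n >= 0.0 else 0.0

out = neuron([0.5, 1.0], [0.2, -0.4], bias=0.1, transfer=linear)
# 0.2*0.5 + (-0.4)*1.0 + 0.1 = -0.2
```

During learning, only `weights` and `bias` change; the transfer function is fixed by the network design.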

**Figure 3.** (a) Simplified illustration of a neuron and (b) of a linear transfer function. As shown above, the input to a neuron incorporates an adjustable bias, *b*. The input *x* is multiplied by the weight (*w*), and the weight is adjusted during learning. The product *w*\**x* is passed through a transfer function (F).

In ANNs, learning can be **supervised or unsupervised**. In supervised learning (used in this work), the network adjusts its internal parameters (e.g., weights) so that a user-specified (i.e., expected) target value at the output is reached. At this stage, the error (defined as the difference between the output provided by the network and the target output provided by the user) is computed. Using a **"learning rule"**, the weights are adjusted until this error is minimized. The learning rate, which determines the rate (or speed) at which the weights change, is a key parameter: if the learning rate is too high (and, as a consequence, the learning step is too large), the network may become unstable; conversely, if the learning rate is too low (i.e., a very small learning step is used), the network may take too long to converge, or it may get trapped at a local minimum. The challenge is to seek a balance between learning rate and convergence to the lowest possible minimum. To find the minimum, a **gradient descent** algorithm (i.e., one following the derivative of the error, typically computed using the chain rule) with **momentum** (m) was used. Specifically, a fraction m of the previous weight update is carried over into the current update. This approach smooths over small ridges in the error surface, thus reducing the possibility of being trapped at a local minimum. Momentum values anywhere between 0 and 1 can be used. A **backpropagation** of errors algorithm is often used together with gradient descent, and it was employed here. In general, backpropagation uses gradient descent to find a set of weights that minimizes the error between the output of the network and the target output.
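The gradient-descent-with-momentum update described above can be sketched as follows. This is the standard textbook formulation, shown here under the assumption that the chapter's implementation follows it; the learning rate and momentum values are arbitrary illustrative choices:

```python
def sgd_momentum_step(w, grad, velocity, lr=0.05, m=0.9):
    """One gradient-descent step with momentum m (0 <= m <= 1):
    a fraction m of the previous update (velocity) is carried over,
    which smooths out small ridges in the error surface."""
    velocity = m * velocity - lr * grad
    return w + velocity, velocity

# Toy error surface E(w) = (w - 3)^2, so dE/dw = 2*(w - 3).
w, v = 0.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, 2.0 * (w - 3.0), v)
# w converges toward the minimum at w = 3
```

With m = 0 this reduces to plain gradient descent; with m close to 1 the update retains a long "memory" of previous steps, which is what lets it coast over shallow local ridges.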

The predictive ability of an ANN depends on the transfer function (**Figure 3b**), on the learning rule applied, and on the network's architecture. The **architecture of the network** is defined by the number of neurons in each layer, the number of layers involved, the transfer function, and how the layers are connected to each other (and to the network's inputs). A sequential, **feed-forward architecture** is the most widely used [28–32]. In this architecture, each neuron receives inputs from the neurons of the previous layer, and its output becomes an input to the neurons of the next layer. An example of spectral scans used for training a feed-forward network is shown in **Figure 4**.
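A forward pass through such a sequential, feed-forward network can be sketched as below; the layer sizes and weight values are made-up placeholders (the actual networks in this work take whole spectral scans as input), but the layer-to-layer flow is the one described above:

```python
import math

def forward(layer_weights, layer_biases, x, transfer):
    """Forward pass through a feed-forward network: each layer's
    outputs become the inputs of the next layer."""
    for W, b in zip(layer_weights, layer_biases):
        x = [transfer(sum(w * xi for w, xi in zip(row, x)) + bi)
             for row, bi in zip(W, b)]
    return x

log_sigmoid = lambda n: 1.0 / (1.0 + math.exp(-n))

# Hypothetical 3-input -> 2-hidden -> 2-output network; two outputs
# could represent the analyte (A) and interferent (I) concentrations.
W1 = [[0.1, -0.2, 0.3], [0.4, 0.0, -0.1]]; b1 = [0.0, 0.1]
W2 = [[0.5, -0.5], [0.2, 0.3]];            b2 = [0.1, -0.1]
y = forward([W1, W2], [b1, b2], [1.0, 2.0, 3.0], log_sigmoid)
```

Training (not shown) would use backpropagation to adjust `W1`, `W2`, `b1`, and `b2` until the two outputs match the target concentrations.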

To validate network performance (i.e., the ability to predict the **A** (analyte) and **I** (interferent) concentrations) when given an "unknown" scan (i.e., one for which the network is asked to predict the correct or expected value), one spectral scan at a time is fed into the network (bottom frame of **Figure 4**) and the network returns a "predicted" concentration for the analyte (**A**) and for the interferent (**I**). In addition (as is typical in this field of research), the performance of the ANNs was compared with that of a typical chemometrics method, partial least squares (PLS).
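This scan-by-scan validation loop can be sketched as follows; `network` stands in for any trained model (ANN or PLS) callable as scan → (A, I), and the mock model and numbers are purely illustrative:

```python
def validate(network, scans, expected):
    """Feed one 'unknown' scan at a time to the network and collect
    the relative prediction error for analyte (A) and interferent (I)."""
    errors = []
    for scan, (a_true, i_true) in zip(scans, expected):
        a_pred, i_pred = network(scan)
        errors.append((abs(a_pred - a_true) / a_true,
                       abs(i_pred - i_true) / i_true))
    return errors

# A dummy 'model' that over-reports A by 20% (cf. the uncorrected
# Cd-on-As interference example earlier in the chapter).
mock = lambda scan: (1.2 * scan[0], scan[1])
errs = validate(mock, [(1.0, 2.0), (2.0, 4.0)], [(1.0, 2.0), (2.0, 4.0)])
# relative error on A is 0.2 (20%) for each scan, 0.0 on I
```

Summarizing such per-scan errors (e.g., as an average relative error) is one simple way to compare the ANN and PLS results on the same validation set.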

Artificial Neural Networks (ANNs) for Spectral Interference Correction Using a Large-Size… http://dx.doi.org/10.5772/intechopen.71039

**Figure 4.** ANNs, top frame: example spectral scans used as a training set. The intensity differs between scans because the concentrations (and hence the intensities) of the analyte (**A**) and the interferent (**I**) differ. Multiple scans are shown in the top frame; each scan (with data points taken typically every 0.080 nm) was fed individually into the network (bottom frame).
