170 Multivariate Analysis in Management, Engineering and the Sciences

The principle of iPLS is to optimize the predictive capability of PLS regression models and to aid in their interpretation. The algorithm develops local PLS models on equidistant subintervals of the full-spectrum region. Its major objective is to provide an overall perspective of the significant information in different spectral subdivisions, thereby focusing on important spectral regions and removing interferences from other regions. The sensitivity of the PLS algorithm to noisy variables is highlighted by the informative iPLS plots [32]. Synergy interval PLS (siPLS) follows the same basic principle as iPLS: first, the data set is split into a number of intervals (variable-wise); next, PLS regression models are developed for all possible combinations of two, three or four intervals. The root mean square error of cross-validation (RMSECV) is then calculated for every combination of intervals, and the combination with the lowest RMSECV is selected.

Finally, cluster analysis is the name given to a set of techniques that seek to determine the structural characteristics of multivariate data sets by dividing the data into groups, clusters, or hierarchies. For cluster analysis, each sample is treated as a point in an n-dimensional measurement space. The coordinate axes of this space are defined by the measurements used to characterize the samples. Cluster analysis assesses the similarity between samples by measuring the distances between the points in the measurement space. Samples that are similar lie close to one another, whereas dissimilar samples are distant from each other.

In this chapter, remote Raman detection experiments were performed to quantify HEM such as PETN present in mixtures with non-HEM. The remote measurements were carried out at 10 m, employing a frequency-doubled 532 nm Nd:YAG pulsed laser as the excitation source. The quantification study used PLS, iPLS and siPLS as chemometrics tools to achieve the best correlation between the remote Raman signal and the concentration (%) of PETN explosive in a mixture with a pharmaceutical compound. The chemical warfare agent simulant (CWAS) TEP, concealed within commercial beverage bottles, was discriminated using optical fiber coupled Raman spectroscopy together with chemometrics techniques such as PLS and PLS-DA. Finally, chemometric analysis of infrared spectroscopic information was designed and implemented for the detection of the HEM 2,4-DNT, TATP, PETN and RDX present at trace levels on surfaces and in air.

**4. Multivariate Detection and Quantification from Vibrational Spectra**

#### **4.1. Remote Raman experiments**

Different preprocessing methods, such as vector normalization (VN), mean centering (MC), auto scaling (AS), multiplicative scatter correction (MSC), standard normal variate (SNV) and first and second derivatives, have been developed to improve multivariate quantification. The 56 remote Raman spectra of PETN in mixtures with APAP were randomly split into two groups: a training set with 70% of the data for calibration and cross-validation, and a test set with the remaining 30% of the data for external validation. The quantitative models were built using chemometrics tools such as PLS, iPLS and siPLS. The PLS program used was from PLS-Toolbox™ (Eigenvector Research Inc.) for use with MATLAB™. The iPLS and siPLS algorithms were run with iToolbox™ (downloaded from http://www.models.kvl.dk). The performance of the final PLS, iPLS and siPLS models was evaluated according to the root mean square error of cross-validation (RMSECV), obtained with a leave-one-sample-out cross-validation method, and the predictive ability of the models was assessed by the root mean square error of prediction (RMSEP) and the correlation coefficient (R). For all the PLS models, RMSECV was calculated as follows:

$$\text{RMSECV} = \sqrt{\frac{\sum_{i=1}^{n_{cal}} (c_p - c_i)^2}{n_{cal}}} \tag{3}$$

where $c_i$ and $c_p$ are the experimental and predicted concentrations, respectively, of the $i$th calibration sample when it lies in the left-out segment, and $n_{cal}$ is the number of calibration samples in the training set. The number of PLS components included in the model is selected according to the lowest RMSECV. This procedure is repeated for each of the preprocessed spectra. For the test set, the root mean square error of prediction (RMSEP) is calculated as follows:

$$\text{RMSEP} = \sqrt{\frac{\sum_{i=1}^{n_{test}} (c_i - c_p)^2}{n_{test}}} \tag{4}$$

The model with the overall lowest RMSECV is selected as the final model. Correlation coefficients between the predicted and the true concentrations are calculated for both the calibration and the test sets from Equation 5, where $\bar{c}$ is the mean of the experimental measurement results for all samples in the training and test sets.

$$R = \sqrt{1 - \frac{\sum_{i=1}^{n} (c_p - c_i)^2}{\sum_{i=1}^{n} (c_i - \bar{c})^2}} \tag{5}$$
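Equations 3 to 5 share the same ingredients, so a short sketch may help make them concrete. The following is a minimal illustration, assuming the predicted concentrations have already been obtained (from the left-out segments for RMSECV, or from the test set for RMSEP). The function names `rmse` and `correlation_coefficient` and the concentration values are illustrative assumptions, not part of the toolboxes or data used in this work.

```python
import math


def rmse(measured, predicted):
    """Root mean square error: the common form of Eq. 3 (RMSECV) and Eq. 4 (RMSEP)."""
    if not measured or len(measured) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - c) ** 2 for c, p in zip(measured, predicted))
    return math.sqrt(squared_error / len(measured))


def correlation_coefficient(measured, predicted):
    """R as written in Eq. 5: sqrt(1 - SSE / SST)."""
    mean_c = sum(measured) / len(measured)
    sse = sum((p - c) ** 2 for c, p in zip(measured, predicted))
    sst = sum((c - mean_c) ** 2 for c in measured)
    # Clamp at zero so that a very poor model does not produce a negative radicand.
    return math.sqrt(max(0.0, 1.0 - sse / sst))


# Hypothetical PETN concentrations (%), not data from this study.
true_conc = [1.0, 10.0, 20.0, 34.0]
pred_conc = [2.0, 9.0, 21.0, 33.0]

print(rmse(true_conc, pred_conc))                      # every error is 1 %, so this is 1.0
print(correlation_coefficient(true_conc, pred_conc))   # close to, but below, 1
```

Passing cross-validated predictions gives RMSECV and $R_{CV}$; passing test-set predictions gives RMSEP and $R_P$.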

The implementation of new methodologies for enhanced detection of hazardous compounds such as explosives is attractive to many countries, principally for defense and security applications. Terrorists employ different means to pose threats and commit illegal acts against military personnel and civilians. For this reason, our study focuses on the detection of explosives present in mixtures intentionally prepared with a pharmaceutical product, employing remote Raman detection and chemometrics tools. Remote Raman spectra of PETN, APAP and their mixtures are illustrated in Figure 8. The results show that mean centering (MC) was the most successful preprocessing method for correcting the background; it was selected for the construction of further models because it gave a small improvement in RMSEC.
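For readers less familiar with the preprocessing options compared in this section, three of them can be sketched in a few lines. This is a minimal sketch with plain Python lists, not the PLS-Toolbox implementation. The function names are illustrative, and vector normalization is taken here as scaling to unit Euclidean length (some packages centre each spectrum first, which is not done in this sketch).

```python
import math


def mean_center(spectra):
    """Subtract the mean spectrum (per-variable mean over all samples) from each spectrum."""
    n_samples = len(spectra)
    means = [sum(column) / n_samples for column in zip(*spectra)]
    return [[x - m for x, m in zip(row, means)] for row in spectra]


def vector_normalize(spectrum):
    """Scale one spectrum to unit Euclidean length (assumed definition of VN)."""
    norm = math.sqrt(sum(x * x for x in spectrum))
    return [x / norm for x in spectrum]


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / std for x in spectrum]


# Two hypothetical three-point "spectra" that differ only by a constant offset.
demo = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
print(mean_center(demo))
print(vector_normalize([3.0, 4.0]))
print([snv(row) for row in demo])
```

Mean centering acts across samples (one mean per variable), whereas VN and SNV act on each spectrum separately. In this toy example SNV maps both spectra to the same values, which is how it removes offset differences between spectra.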

The full spectrum was split into 20 independent intervals, and the RMSECV values for the PLS models constructed with the different intervals are shown in Figure 9. No single-interval model was better than the PLS model with all variables (dotted line); intervals 6 (1185.2–1328.9 cm⁻¹), 9 (1619.8–1755.4 cm⁻¹) and 19 (2878–2988.4 cm⁻¹) presented the lowest RMSECV values, in regions where more variability exists. These values are shown in Table 1. The numbers shown inside the rectangles are the numbers of latent variables required for the models obtained using the different intervals.

Multivariate Analysis in Vibrational Spectroscopy of Highly Energetic Materials and Chemical Warfare Agents Simulants 173

**Figure 8.** Remote Raman spectra of PETN, APAP and their mixture, collected at a target-to-collector distance of 10 m employing a 532 nm laser with 100 pulses of 25 mJ/pulse. Axes: intensity vs. Raman shift (cm⁻¹).

**Figure 9.** RMSECV (%) values for PLS models obtained for the 20 different intervals (bars) used in iPLS models. The horizontal dotted line represents the RMSECV value for the PLS model with all variables. Numbers inside the rectangles are the optimal number of latent variables; for intervals 1 to 20 they are 2, 3, 3, 3, 2, 3, 3, 2, 3, 2, 1, 1, 1, 1, 2, 1, 1, 2, 3 and 4.

**Table 1.** Full-cross-validated PLS, iPLS, and siPLS models for prediction of PETN in the range 1.0–34.0% from remote Raman spectra. All models are based on MC data.

| Models | LVᵃ | Intervals | NVᵇ | RMSECV (%) | RMSEP (%) | $R_{CV}$ | $R_P$ |
|---|---|---|---|---|---|---|---|
| PLS | 6 | All | 730 | 1.8 | 2.2 | 0.978 | 0.979 |
| iPLS | 3 | 6 | 37 | 2.0 | 2.8 | 0.986 | 0.976 |
| iPLS | 3 | 9 | 37 | 2.5 | 3.1 | 0.976 | 0.969 |
| iPLS | 3 | 19 | 36 | 2.7 | 2.4 | 0.972 | 0.988 |
| siPLS | 7 | 3, 9, 19 | 110 | 1.4 | 1.8 | 0.993 | 0.992 |

ᵃ Latent variables. ᵇ Total number of variables.

In synergy interval PLS (siPLS) model calibration, the number of intervals was also optimized according to the RMSECV values. Table 1 shows the results of siPLS model calibration when the spectra were split into different numbers of intervals. The optimum siPLS model was obtained with the combination of three intervals (3, 9 and 19) and 7 PLS components. Its RMSECV of 1.4% was the lowest, compared with the RMSECV values obtained for the PLS model with all variables and for the iPLS models. iPLS and siPLS models with four or more intervals (data not shown), including intervals 10–18, were also explored. These intervals correspond to noisy areas, which were not eliminated so that the models could choose the spectral regions of larger variability. The prediction capability of the siPLS model was better than that of the other models. As shown in the correlation plot in Figure 10, there is a good relationship between the true and predicted concentration (%) of PETN, with an $R_{CV}$ value of 0.993. This is also reflected in the good prediction of the test set, with an RMSEP of 1.8%. The final model separated the vibrational spectra into 20 intervals, used 7 latent variables and combined intervals 3, 9 and 19. The selected intervals cover the regions 724.2–876.7 cm⁻¹, 1619.8–1755.4 cm⁻¹ and 2878–2988.4 cm⁻¹. The first Raman shift region corresponds to the NO2 scissoring mode and the O–N stretching band; the second region is relevant for the NO2 asymmetric stretching mode and the C=O stretching band; the third region represents the C–H stretching mode [37-39].

**Figure 10.** Predicted vs. true PETN concentration (%) for the siPLS model using intervals 3, 9 and 19 and 7 latent variables. The training set (with its linear fit) and the test set are shown.
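The interval bookkeeping behind iPLS and siPLS can be sketched as follows. This is an illustration under stated assumptions, not the iToolbox code: `split_intervals`, `best_interval_combination` and the scoring callback `rmsecv_of` are hypothetical names, and giving the extra variables to the first intervals is an assumption (one that is consistent with the NV values of 37, 36 and 110 reported in Table 1 for 730 variables and 20 intervals). In practice `rmsecv_of` would fit a cross-validated PLS model on the selected columns; here a toy stand-in is used.

```python
from itertools import combinations


def split_intervals(n_variables, n_intervals):
    """Split variable indices 0..n_variables-1 into contiguous, near-equal intervals."""
    base, extra = divmod(n_variables, n_intervals)
    intervals, start = [], 0
    for k in range(n_intervals):
        size = base + (1 if k < extra else 0)  # the first `extra` intervals get one more variable
        intervals.append(range(start, start + size))
        start += size
    return intervals


def best_interval_combination(intervals, rmsecv_of, max_size=4):
    """Score every combination of 2..max_size intervals and return (score, 1-based interval numbers)."""
    best = None
    for size in range(2, max_size + 1):
        for combo in combinations(range(len(intervals)), size):
            columns = [j for k in combo for j in intervals[k]]
            score = rmsecv_of(columns)
            if best is None or score < best[0]:
                best = (score, tuple(k + 1 for k in combo))
    return best


# 730 variables in 20 intervals, as in Table 1.
intervals = split_intervals(730, 20)
print(sum(len(intervals[k - 1]) for k in (3, 9, 19)))  # 110 variables in intervals 3, 9 and 19

# Toy stand-in for a cross-validated PLS score: penalizes missing or extra columns
# relative to a hypothetical set of informative variables.
informative = {2, 3, 8}
toy_intervals = split_intervals(10, 5)
print(best_interval_combination(toy_intervals, lambda cols: len(set(cols) ^ informative)))
```

With 20 intervals and combinations of two to four intervals, 6175 candidate models are scored, which is why siPLS is considerably more expensive than iPLS.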

#### **4.2. Optical fiber probe Raman spectroscopy experiments**

In the optical fiber coupled Raman spectrum of TEP, shown in Figure 11, the CWAS has characteristic peaks at 733 cm⁻¹ (PO3 symmetric stretching mode), 813 cm⁻¹ (PO3 asymmetric stretch), 1032 and 1098 cm⁻¹ (C–O stretch), 1162 cm⁻¹ (CH3 rocking) and 1279 cm⁻¹ (P–O symmetric stretch) [6]. Mixtures of TEP with commercial liquids were measured in their corresponding commercial bottles. The TEP concentration varied from 0 to 100% (v/v). Figure 12 shows TEP Raman spectra for different bottle materials. At all concentrations, the characteristic TEP peaks could be distinguished through the different container materials, with the exception of brown glass and white plastic. These two bottle materials had lower transmittance in the 200 to 1400 cm⁻¹ region and TEP characteristic peaks in the 2700 to 3200 cm⁻¹ region. UV-VIS spectra (data not shown) show increased absorbance for bottle materials such as white plastic and amber glass (Malta™). This explains the low intensity of the Raman peaks in the 200–1400 cm⁻¹ region shown in Figure 12. When light is scattered by turbid materials, such as amber glass or white plastic, the material absorbs or blocks more of the light than clear glass or clear plastic does. The thickness and coloration of the bottle material also play a role in absorbance and transmission. The high-intensity peak at 2300 cm⁻¹ corresponds to the background light (mercury vapor from fluorescent lamps). This peak appears with higher intensity in the Raman spectra of brown glass and white plastic than in the other spectra because of the longer integration times used for these two bottle materials. All bottle materials were subjected to background light in order to simulate the real conditions found in military, airport and other environments where a light source is present. This analysis is based on the increased absorbance shown in the UV-VIS spectra of the different bottle materials (data not shown).

**Figure 11.** Raman vibrational spectrum of TEP excited at 488 nm.

Calibration models were built with PLS regression to distinguish between samples containing TEP in aqueous solution and solutions of TEP in the commercial product. In Figure 13, eight PLS regression models are shown to illustrate the marked difference between the best and the worst regression models, each with and without preprocessing steps. Since integration times were normalized for each bottle, the limits of detection were similar, with the exception of amber/brown glass (Malta™). The aqueous solutions show better R² values than the mixtures with the commercial product. For clear glass (Snapple™), the R² value is 0.9925 for aqueous solutions compared to 0.9747 for the mixtures with the commercial product. The R² value for Malta™ in aqueous solution increased significantly with optimization (0.4193 without preprocessing and 0.9508 with optimization). However, for the mixtures with the Malta™ commercial product, optimization gives a lower R² value: 0.7646 compared to 0.8047 without preprocessing.

**Figure 12.** Raman spectra of Triethyl Phosphate in various types of bottle materials

It is clear that the R² values are higher in the PLS regression models for aqueous solutions, since water does not present strong signatures in Raman spectroscopy. The other PLS regression models (green plastic, green glass, clear plastic, clear glass and white plastic), in both aqueous and beverage solutions, presented nearly similar limits of detection. Each of these limits improved with its respective preprocessing step (vector normalization, standard normal variate or mean centering). With the help of the integration time for each bottle material, normalization of the limits of detection and of the root mean square error of cross-validation (RMSECV) was achieved. These values were found to be acceptable, averaging approximately 2.5% across the best models. The limits of detection for the PLS methods were estimated using Equation 6 [40]:

$$\text{LOD} = \Delta(\alpha, \beta, \nu) \times \text{RMSEC} \times \sqrt{1 + h_0} \tag{6}$$

The root mean squared error of calibration (RMSEC) was obtained from the squared fit errors as $[\sum (c_{predicted} - c_{true})^2/\nu]^{1/2}$, where the sum extends over all samples of the calibration set. The degrees of freedom were calculated as $\nu = n - F - 1$, where $F$ is the number of latent variables and $n$ is the number of samples in the set. The leverage $h_0$ is the distance of the predicted sample at zero concentration from the mean of the calibration set. Finally, $\Delta(\alpha, \beta, \nu)$ is a statistical parameter that accounts for the $\alpha$ and $\beta$ probabilities of falsely stating the presence or absence of the chemical warfare agent simulant. Since $\nu \geq 25$, we used $\Delta = 3.4$ for the LOD. LOQ values, as per Eq. 7, were determined as the concentration with a relative standard deviation (RSD) of 15%, as stated by Felipe-Sotelo *et al.* [40]:

$$\text{LOQ} = \frac{100 \times \text{RMSEC} \times (1 + h_0)^{0.5}}{\text{RSD}(\%)} \tag{7}$$

**Figure 13.** A) PLS models of TEP in aqueous solution in a Snapple™ container (clear glass) with (vector normalization) and without preprocessing. B) PLS models of TEP in aqueous solution inside a Malta™ container (amber glass) with (mean centering, standard normal variate) and without preprocessing. C) PLS models of TEP mixtures with the commercial product Snapple™ (clear glass) with (vector normalization) and without preprocessing. D) PLS models of TEP mixtures with the commercial product Malta™ (amber glass) with (mean centering) and without preprocessing.

Comparing limits of detection (Figure 13), the same integration times were used for aqueous and commercial beverage bottle solutions. Figures 13A and 13B show Snapple™ and Malta™ containers with aqueous solutions of TEP. Figures 13C and 13D show mixtures of TEP with the commercial products, with fewer data points (5, 30, 70 and 0 %v/v) owing to limited time. Snapple™ has lower limits of detection, which is favorable for the detection of chemical warfare agent simulants in commercial bottles made of various materials. When limits of detection for aqueous solutions are compared with those for solutions of the commercial beverage product inside commercial bottles, limits of detection are considerably lower. R² prediction values were higher in aqueous solutions, since water does not present a significant Raman signal. Limits of detection as low as 1% were found for white plastic. Optimization also improves (lowers) the limits of detection, as shown in Figure 13.

Table 2 shows the limits of detection and quantification (LOD and LOQ, respectively) of the best models for TEP in various commercial beverage bottle solutions. Preprocessing options include vector normalization (V.N.), no preprocessing (N/A), mean centering (M.C.), constant offset normalization (C.O.N.), first derivative (F.D.) and multiplicative scatter correction (M.S.C.). The higher limits of detection and quantification for amber glass and clear plastic are attributed to the dark coloration of the bottle material (amber) and of the commercial beverage product (Pepsi and Malta). An unexpectedly low limit of detection and quantification was observed for white plastic. This may be due to the lower number of trials (5 instead of 10 for 5, 30, 50 and 70 %v/v of TEP, as was done with the other bottle materials), a consequence of the high integration times for this material. Even though TEP, a surfactant agent, did not form a homogeneous solution with milk, integration times were normalized in order to obtain a better model, giving a clear linear regression with an R² value of 0.9987 and an excellent limit of detection of 0.01 (1%).

**Table 2.** Limits of Detection and Quantification for the PLS models of TEP in commercial beverage bottles and aqueous mixtures along with their respective preprocessing methods.

| | Green Glass | White Plastic | Amber Glass | Clear Glass | Clear Plastic | Green Plastic |
|---|---|---|---|---|---|---|
| **Commercial beverage mixtures** | | | | | | |
| Preprocess | V.N. | V.N. | N/A | M.C. + V.N. | C.O.N. | M.S.C. + M.C. |
| LOD (%) | 3 | 1 | 26 | 4 | 22 | 3 |
| LOQ (%) | 8 | 3 | 77 | 11 | 66 | 9 |
| **Aqueous mixtures** | | | | | | |
| LOD (%) | 11 | 7 | 16 | 8 | 8 | 4 |
| LOQ (%) | 33 | 21 | 48 | 22 | 25 | 12 |
| Preprocess | V.N. | F.D. + V.N. | F.D. + V.N. | V.N. | V.N. | N/A |
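The LOD and LOQ estimates of Equations 6 and 7 reduce to a few arithmetic steps, sketched below. This is a hedged illustration rather than the procedure of reference [40]: the function names are assumptions, the default $\Delta = 3.4$ and RSD = 15% are the values quoted in the text, and Eq. 7 is read here as the concentration at which the relative prediction error equals the target RSD (that is, with RSD(%) in the denominator). The input concentrations and the leverage value are hypothetical.

```python
import math


def rmsec(c_true, c_pred, n_latent):
    """RMSEC with nu = n - F - 1 degrees of freedom; returns (rmsec, nu)."""
    nu = len(c_true) - n_latent - 1
    if nu <= 0:
        raise ValueError("not enough samples for this number of latent variables")
    sse = sum((p - t) ** 2 for t, p in zip(c_true, c_pred))
    return math.sqrt(sse / nu), nu


def lod(rmsec_value, h0, delta=3.4):
    """Eq. 6: LOD = Delta(alpha, beta, nu) * RMSEC * sqrt(1 + h0)."""
    return delta * rmsec_value * math.sqrt(1.0 + h0)


def loq(rmsec_value, h0, rsd_percent=15.0):
    """Eq. 7, read as LOQ = 100 * RMSEC * (1 + h0)^0.5 / RSD(%)."""
    return 100.0 * rmsec_value * math.sqrt(1.0 + h0) / rsd_percent


# Hypothetical calibration results (% v/v), one latent variable, leverage h0 = 0.05.
c_true = [0.0, 10.0, 20.0, 30.0]
c_pred = [1.0, 9.0, 21.0, 29.0]
error, nu = rmsec(c_true, c_pred, n_latent=1)
print(nu, error)
print(lod(error, h0=0.05), loq(error, h0=0.05))
```

Both limits scale linearly with RMSEC, so any preprocessing step that lowers the calibration error lowers LOD and LOQ by the same factor.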

#### **4.3. Gas phase infrared spectroscopy experiments**

Multivariate calibration methods such as partial least squares (PLS) can be formulated as a regression equation [41, 42]. In matrix form the equation is **Y = XB**, where **B** is computed as $\mathbf{B} = \mathbf{W}(\mathbf{P}^T\mathbf{W})^{-1}\mathbf{Q}^T$, **W** is the matrix of weights of **X**, **Q** is the loadings matrix of **Y**, and **P** is the loadings matrix of **X**. In this study, the **Y** matrix represents the dependent variables, but these are changed from continuous to discrete values that carry information about different classes of objects [43-45]. It is a simple two-state function: 1 represents the presence of explosive in the sample and 0 stands for the absence of explosive in the sample analyzed. By this means it is possible to decide whether an explosive substance is present in a sample. The values originating from the analysis (wavenumber ranges or parts of spectra) are the independent variables (**X** matrix). In this study, the loading vectors or number of components (**B** matrix) were used as independent variables in the discriminant analysis (DA).
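For a single latent variable, the regression vector described above can be written out directly. The sketch below is a minimal PLS1 illustration with plain Python lists, not the software used in this study. The name `pls1_regression_vector`, the two-variable toy "spectra" and the 0.5 cutoff on the predicted value are all assumptions made for the example.

```python
def pls1_regression_vector(X, y):
    """One-latent-variable PLS1 on mean-centred data.

    Returns b = w (p.w)^-1 q, the single-component case of B = W (P^T W)^-1 Q^T.
    """
    n_vars = len(X[0])
    # Weight vector w: X^T y scaled to unit length.
    w = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(n_vars)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores t = X w, X loadings p = X^T t / (t^T t), y loading q = y^T t / (t^T t).
    t = [sum(xj * wj for xj, wj in zip(row, w)) for row in X]
    tt = sum(v * v for v in t)
    p = [sum(row[j] * ti for row, ti in zip(X, t)) / tt for j in range(n_vars)]
    q = sum(yi * ti for yi, ti in zip(y, t)) / tt
    pw = sum(pj * wj for pj, wj in zip(p, w))
    return [wj * q / pw for wj in w]


# Hypothetical two-variable "spectra"; 1 = explosive present, 0 = absent.
X_raw = [[0.0, 1.0], [0.2, 1.1], [1.0, 0.9], [1.2, 1.0]]
y_raw = [0, 0, 1, 1]

# Mean-centre X and y before fitting.
x_mean = [sum(column) / len(column) for column in zip(*X_raw)]
y_mean = sum(y_raw) / len(y_raw)
X = [[x - m for x, m in zip(row, x_mean)] for row in X_raw]
y = [yi - y_mean for yi in y_raw]

b = pls1_regression_vector(X, y)
y_hat = [y_mean + sum(x * bj for x, bj in zip(row, b)) for row in X]
classes = [1 if value > 0.5 else 0 for value in y_hat]
print(classes)  # [0, 0, 1, 1]
```

The predicted values fall near 0 and 1 rather than exactly on them, which is why the discrimination results are naturally displayed as histograms of the discriminant function.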


TATP and 2,4-DNT in air were detected using FTIR spectroscopy. At trace levels, the vibrational signatures are not easily perceptible, and the vibrational signatures of an explosive can be confused with vibrations arising from the background air components. Thus, the first task was to determine the possible interference between the two spectra. Figures 14a and 14b show the spectra of flowing gas that contains TATP and 2,4-DNT traces, respectively. The characteristic infrared signals of TATP at 1200 cm⁻¹ and of 2,4-DNT at 1550 cm⁻¹ can be observed in Figure 14, which confirms the presence of these compounds in air. Linear combination analysis in the form of partial least squares (PLS) was calculated for all FTIR spectra (7500-600 cm⁻¹). Two and four vectors were required to find the perturbation produced by TATP and 2,4-DNT, respectively, on the normal flowing air IR spectrum. The discriminating function used was a two-position, switch-type function: On/Off (Yes/No). In the DA, samples were classified as "Disc-1 = TATP present" or "Disc-0 = TATP not present" in air, for TATP, and as "Disc-1 = 2,4-DNT present" or "Disc-0 = 2,4-DNT not present" in air, for 2,4-DNT. The results are presented as histograms, where the y-axis is the frequency and the x-axis is the discrimination function. The predictions for new samples are presented in the same form (Figures 15 and 16). These graphs show how the models improve as vectors are added successively.

**Figure 14.** FTIR vibrational spectra of gas-phase explosives in air: (a) TATP and (b) 2,4-DNT traces. Axes: absorbance vs. wavenumber (cm⁻¹).

The best discriminant function was selected based on statistical significance (p) and the percentage of cases correctly classified (PCCC) [46]. Validation was done by internal jackknife validation and by external validation. In the internal validation, each spectrum was successively removed from the data set and then classified by a new model built from the remaining spectra. This procedure was repeated for every spectrum in the data set, and the predicted discriminations were then compared with the experimental observations. The resulting percentage of correctly classified cases is called the cross-percentage of cases correctly classified (PCCCC). For the external validation, 100 spectra of air, 100 spectra of air with TATP and 50 spectra of air with 2,4-DNT were taken randomly from the data set before building the model. These spectra were then analyzed with the validation model.
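The two validation schemes can be sketched as follows. This is a minimal illustration, not the authors' procedure: the classifier is a plain least-squares fit to 0/1 labels standing in for the PLS-based discriminant model, the data are synthetic, and the names `fit_ls_discriminant`, `classify`, `pccc` and `jackknife_pcccc` are hypothetical.

```python
import numpy as np

def fit_ls_discriminant(X, y):
    """Stand-in classifier (NOT PLS-DA): least-squares fit to 0/1 labels."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def classify(coef, X):
    """Assign class 1 when the fitted value exceeds 0.5 (assumed threshold)."""
    A = np.column_stack([np.ones(len(X)), X])
    return (A @ coef > 0.5).astype(int)

def pccc(y_true, y_pred):
    """Percentage of cases correctly classified."""
    return 100.0 * float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def jackknife_pcccc(X, y, fit, predict):
    """Leave each spectrum out once, refit, and classify the left-out spectrum."""
    predicted = np.empty(len(y), dtype=int)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        predicted[i] = predict(fit(X[keep], y[keep]), X[i:i + 1])[0]
    return pccc(y, predicted)

# Synthetic demo: two well-separated classes described by three variables.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (40, 3)), rng.normal(1.0, 0.1, (40, 3))])
y = np.repeat([0, 1], 40)

# External validation: set spectra aside at random before building the model.
order = rng.permutation(len(y))
external, training = order[:20], order[20:]

internal_pcccc = jackknife_pcccc(X[training], y[training],
                                 fit_ls_discriminant, classify)
model = fit_ls_discriminant(X[training], y[training])
external_pccc = pccc(y[external], classify(model, X[external]))
```

Passing the fit and predict functions as arguments keeps the jackknife loop independent of the model, so the same loop could wrap any discriminant model.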


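**Figure 15.** Histogram for discrimination models of TATP and external validation. Panels: model with 1 vector and model with 2 vectors, each paired with the prediction of new samples; x-axis: discrimination function (groups Disc-1 and Disc-0); y-axis: frequency.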

For the PCA of TATP in air, spectra were recorded using the EM-27 and LaserScope™ instruments. With the EM-27, a total of 60 spectra were recorded from clean air and 120 spectra from air with TATP present; with the LaserScope™, 35 spectra were recorded from clean air and 30 spectra from air with TATP present. All PCA runs, including any preprocessing of the spectral data, were carried out using PLS-Toolbox software. PCA runs were made with the raw data and with different preprocessing treatments: auto scaling, smoothing, standard normal variate (SNV), mean centering, auto scaling + 1st derivative, auto scaling + 2nd derivative, mean centering + 1st derivative, mean centering + 2nd derivative, and multiplicative scatter correction (MSC). Smoothing and derivatives were computed with the Savitzky-Golay algorithm (windows of 11, 17, 21 and 31 points). During the PCA runs it was not necessary to eliminate spectral data.
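The preprocessing treatments listed above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the PLS-Toolbox implementation: spectra are assumed to be stored one per row, the Savitzky-Golay polynomial order of 2 is an assumption (the text gives only the window sizes), and the function names are hypothetical.

```python
import numpy as np
from math import factorial

def mean_center(X):
    """Subtract the mean spectrum (column-wise mean)."""
    return X - X.mean(axis=0)

def autoscale(X):
    """Mean-center each variable and scale it to unit standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def snv(X):
    """Standard normal variate: center and scale each spectrum (row)."""
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)

def msc(X, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    ref = X.mean(axis=0) if reference is None else reference
    out = np.empty(X.shape, dtype=float)
    for i, row in enumerate(X):
        slope, intercept = np.polyfit(ref, row, 1)   # row ~ intercept + slope * ref
        out[i] = (row - intercept) / slope
    return out

def savgol(X, window, polyorder=2, deriv=0, delta=1.0):
    """Savitzky-Golay smoothing (deriv=0) or derivative; returns interior points only."""
    half = window // 2                                # window must be odd
    z = np.arange(-half, half + 1, dtype=float)
    A = np.vander(z, polyorder + 1, increasing=True)  # columns z^0, z^1, ...
    coeffs = np.linalg.pinv(A)[deriv] * factorial(deriv) / delta ** deriv
    return np.array([np.correlate(row, coeffs, mode="valid") for row in X])

# Synthetic demo: one band with different scale and offset in each spectrum.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 200)
spectra = np.array([a * np.exp(-0.5 * ((x - 0.5) / 0.05) ** 2) + b
                    for a, b in [(1.0, 0.0), (1.5, 0.2), (0.8, -0.1)]])
spectra += 0.001 * rng.standard_normal(spectra.shape)

# First derivative after SNV for each window size mentioned in the text.
derivatives = {w: savgol(snv(spectra), w, polyorder=2, deriv=1)
               for w in (11, 17, 21, 31)}
```

Because `savgol` uses `mode="valid"`, each output row is shorter than the input by `window - 1` points; padding the edges is a design choice left out of this sketch.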

The infrared data from clean air and from air with TATP were run together in the PCA model for each instrument used. Figure 17 shows the score plots obtained from the PCA. Figure 17a shows the first two principal components for spectral data acquired using the EM-27 FTIR spectrometer with a Globar source. Figure 17b shows the first two principal components for spectral data of TATP detection acquired using the LaserScope™ spectrometer with a quantum cascade laser (QCL) source. The best results for both PCA models (illustrated in Figures 17a and 17b) were achieved using raw data. Both results allowed gas-phase TATP to be distinguished from clean air. In Figure 17 it can be seen that PC1 tends to capture the differences between the two IR data sets.


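**Figure 16.** Histogram for discrimination models of 2,4-DNT and external validation. Panels: models with 1, 2, 3 and 4 vectors, each paired with the prediction of new samples; x-axis: discrimination function (groups Disc-1 and Disc-0); y-axis: frequency.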

**Figure 17.** Score plots for the PCA, presented as (a) PC2 vs. PC1 for TATP detection from EM-27 FTIR spectrometer using Globar source and (b) PC2 vs. PC1 for TATP detection from LaserScope™ spectrometer using QCL source.

The loadings plots were also analyzed to support the PCA results, with the aim of identifying which spectral signals cause the differences between the data sets. Figure 18 shows the PC1 loading from Figure 17b; its spectral features match the infrared vibrational signals of reference TATP. Some of the recorded signals can be tentatively assigned, following B. Brauer and J. Oxley [47,48], as follows: 891.8 cm⁻¹ to O–C–O and Me–C–Me sym str and Me–C–O asym str; 946 cm⁻¹ to C–O str; 1197.6 cm⁻¹ to O–C–O and Me–C–Me asym str and Me–C–O sym str; 1205 cm⁻¹ to O–C–O and Me–C–Me sym str; and finally 1234 cm⁻¹ to C–C str.

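**Figure 18.** Loading plot for PC1 (47.59%) from PCA for TATP detection from LaserScope™ spectrometer using QCL source, shown together with the absorbance spectrum of reference TATP. Axes: loading and absorbance vs. wavenumber (cm⁻¹).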
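The score and loading plots discussed above can be reproduced in outline with a singular value decomposition. The sketch below is an assumption-laden illustration: the spectra are synthetic, the group sizes merely echo the LaserScope™ data set, mean centering is applied by default (the software's internal handling of "raw data" is not stated in the text), and the function name `pca` is hypothetical.

```python
import numpy as np

def pca(X, n_components=2, center=True):
    """PCA by SVD; returns scores, loadings (one row per PC) and % variance."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0) if center else X
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]
    explained = 100.0 * s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, explained

# Synthetic demo: 35 "clean air" and 30 "air with analyte" spectra (assumed data).
rng = np.random.default_rng(3)
channels = np.arange(100)
center_channel = 40                                   # hypothetical band position
band = np.exp(-0.5 * ((channels - center_channel) / 2.0) ** 2)
labels = np.concatenate([np.zeros(35, dtype=int), np.ones(30, dtype=int)])
X = 0.2 * np.outer(labels, band) + 0.002 * rng.standard_normal((65, 100))

scores, loadings, explained = pca(X, n_components=2)
pc1_clean, pc1_analyte = scores[labels == 0, 0], scores[labels == 1, 0]

# The sign of a principal component is arbitrary, so test both orderings.
separated = bool(pc1_clean.max() < pc1_analyte.min()
                 or pc1_analyte.max() < pc1_clean.min())
strongest_channel = int(np.argmax(np.abs(loadings[0])))
```

In this toy case the PC1 loading peaks at the channel of the added band, which is the same reasoning used in the text when the PC1 loading is compared with the reference TATP spectrum.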

#### **4.4. Quantum cascade laser based IR reflectance experiments**

OPUS 6.0 software (Bruker Optics, Billerica, MA, USA) was used to analyze the data obtained. Four spectra were obtained for each sample. The spectra were acquired using the substrate without explosive as background. PLS was applied to the data using different preprocessing treatments: raw data, auto scaling, mean centering, auto scaling + 1st derivative, auto scaling + 2nd derivative, mean centering + 1st derivative, and mean centering + 2nd derivative. The spectral range was 1000-1600 cm⁻¹. The PLS models shown below are those that gave the best results. Figure 19 shows the PLS plots of RDX deposited on travel baggage (TB). The best result was achieved using the spectral region of 1000-1600 cm⁻¹ and mean centering as preprocessing. A total of 10 latent variables (factors) were necessary to obtain an R² and RMSECV of 0.9915 and 2.32 g/cm², respectively.
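The way the number of latent variables relates to RMSECV can be sketched with leave-one-sample-out cross-validation around a small PLS1 routine. Everything in this block is an assumption for illustration: the spectra are synthetic (an analyte band overlapped by an interferent band), the NIPALS routine stands in for the commercial software, and the names `pls1_regression_vector` and `rmsecv` are hypothetical.

```python
import numpy as np

def pls1_regression_vector(X, y, n_lv):
    """NIPALS PLS1 with mean centering; returns (x_mean, y_mean, B)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = E.T @ f
        w /= np.linalg.norm(w)                 # X weights
        t = E @ w                              # scores
        p = E.T @ t / (t @ t)                  # X loadings
        c = (f @ t) / (t @ t)                  # y loading
        E, f = E - np.outer(t, p), f - c * t   # deflation
        W.append(w); P.append(p); q.append(c)
    W, P = np.column_stack(W), np.column_stack(P)
    return x_mean, y_mean, W @ np.linalg.solve(P.T @ W, np.array(q))

def rmsecv(X, y, n_lv):
    """Root mean square error of leave-one-sample-out cross-validation."""
    residuals = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        x_mean, y_mean, B = pls1_regression_vector(X[keep], y[keep], n_lv)
        residuals[i] = y[i] - ((X[i] - x_mean) @ B + y_mean)
    return float(np.sqrt(np.mean(residuals ** 2)))

# Synthetic demo: analyte band overlapped by an interferent band (assumed data).
rng = np.random.default_rng(4)
channels = np.arange(60)
analyte = np.exp(-0.5 * ((channels - 20) / 4.0) ** 2)
interferent = np.exp(-0.5 * ((channels - 26) / 4.0) ** 2)
loading = rng.uniform(0.0, 50.0, 30)           # hypothetical surface loadings
background = rng.uniform(0.0, 50.0, 30)        # hypothetical interferent amounts
X = 0.01 * (np.outer(loading, analyte) + np.outer(background, interferent))
X += 0.002 * rng.standard_normal(X.shape)

errors = {n_lv: rmsecv(X, loading, n_lv) for n_lv in range(1, 5)}
best_n_lv = min(errors, key=errors.get)
```

In this toy data set one latent variable cannot separate the analyte from the overlapping interferent, so the cross-validated error drops sharply when a second one is added. Real reflectance spectra on a complex substrate may need many more, as with the 10 latent variables reported here.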

**Figure 19.** PLS of RDX on travel baggage (TB) as substrate.

Figure 20 shows the PLS plots of PETN deposited on TB. The best result was achieved using the spectral region of 1000-1600 cm⁻¹ and mean centering as the preprocessing treatment. A total of 10 latent variables (factors) were necessary to obtain an R² and RMSECV of 0.9994 and 1.82 g/cm², respectively.

**Figure 20.** PLS of PETN on travel baggage (TB) as substrate.

**5. Conclusion**

Raman and infrared vibrational techniques were used for the detection of highly energetic materials and chemical warfare agent simulants (CWAS) in different matrices such as pharmaceutical mixtures, commercial bottles and travel baggage. The analysis of the spectral data highlights several results. Satisfactory results were found for the quantification of explosives, with good values of R²cv and RMSECV. Reliable predictions were obtained by remote sensing based on Raman spectroscopy at a distance of 10 m, employing a 532 nm laser as the excitation source. A remote Raman system using appropriate chemometrics tools such as PLS, iPLS and siPLS promises to be a reliable technique for finding highly energetic materials such as PETN deliberately hidden in matrices with similar chemical structures.

Partial least squares (PLS) calibration models gave very low limits of detection for commercial beverage solutions in white plastic bottles, which was the best model. Because of the bottle material and the coloration of the commercial beverage product, Malta was the worst model, with higher limits of detection. Limits of detection and quantification for commercial bottles were compared for aqueous solutions and for mixtures. The limits of detection were significantly lower for mixtures of TEP with the commercial product. Integration times were the same for both aqueous and commercial beverage bottle solutions (each normalized with respect to bottle material, color and thickness). Water does not contribute a significant Raman signal, which would make limits of detection lower for aqueous solutions. However, the commercial beverage mixtures showed lower limits of detection than the aqueous solutions because the beverage solutions inside each bottle showed a significant Raman signal, thereby increasing the CWAS presence in the spectra.

A PLS-DA model and discriminant analysis were used to detect TATP and 2,4-DNT traces in flowing air. The region of 600 to 7500 cm⁻¹ was highly significant in the discrimination, with p < 0.00001 and 100% discrimination using two vectors for TATP and four vectors for 2,4-DNT. These results show the ability of chemometrics methods to discriminate between vapor phase explosive (2,4-DNT) and air.

The results obtained from principal component analysis to determine the presence of peroxide explosives such as TATP in the gas phase mixed with air proved useful for distinguishing between TATP vapors and air. The PCA of the infrared spectral data needed few PCs to describe the variability of the spectral data, the first two PCs being the most important. The PC1 loadings confirm the PCA results because they contain features of the TATP spectrum. In addition, the PLS models were shown to be a chemometrics tool for quantifying explosives such as RDX and PETN on real-world substrates such as travel baggage.

In general, vibrational spectroscopy systems designed based on this work should be useful for National Defense and Security applications, for screening hazardous liquids in government installations, seaports and public installations to improve defense against terrorist attacks.
