to retain the most important spectral information from each sample, multiple scans were conducted at different points of the bacterial sample by moving the substrate on an X-Y stage. The Raman system was calibrated with a silicon semiconductor using the Raman peak at 520 cm-1, and further improved using samples of chloroform (CHCl3) and cyclohexane (C6H12). The excitation wavelength was 830 nm and the laser beam was focused on the surface of the sample with a 50X objective.

The laser power irradiation over the samples was 45 mW. Each spectrum was registered with an exposure of 30 seconds, two accumulations, and collected in the 1800-200 cm-1 region with a spectral resolution of 2 cm-1.

**3.2 Spectral data pre-treatment**

The Raman spectra analyzed were collected over dry solid bacterial samples before and after the interaction with different concentrations of metal ions. Therefore, it is highly probable that our measurements include some light scattering effects (background scattering). These effects are in general composed of multiplicative and additive effects (Martens & Naes, 1989).

Spectra collected and analyzed in this section were baseline corrected in order to subtract the fluorescence contribution. To perform this correction, a polynomial function was fitted to the spectrum baseline and then subtracted from the spectrum. The spectra were also smoothed using the Savitzky-Golay method. Light scattering effects were corrected using the multiplicative scatter correction (MSC) algorithm, and the spectra were then mean centred. Data pre-treatment and multivariate spectra analysis were carried out with Origin version 6.0 from Microcal Company, and The Unscrambler® software version 9.8 from CAMO company.

**3.3 Analysis and discussion of PCA results**

PCA was performed on the pre-treated Raman spectra of each bacteria/metal sample in order to correlate metal concentrations with the spectral information.

The criteria used for PCA-scores and loadings interpretation are described in the next subsection. Although the presented data set corresponds to the bacteria/Cd+2 interaction, the same methodology was employed for the analysis of the other bacteria/metal samples.

**3.3.1 Scores interpretation**

Fig. 6 depicts the PCA-scores plots obtained before and after outliers exclusion (panels a and b, respectively). Three main groups, labeled as I, II and III, can be observed in Fig. 6 (a). These groups can be represented by their PC-coordinates as follows: (*+i, +j*), (*-i, -j*) and (*-i, +j*), where *i* and *j* represent the score values on PC1 and PC2, respectively. These three groups or clusters are closely related to the three different concentrations of cadmium trapped on the bacterial surfaces. However, there are several potential outliers that should be removed to get a better description of the data structure. In this case, the outliers were selected on the basis of the dispersion among objects of the same group along PC1, the component explaining the largest percentage of variance of the data set.

After the removal of outliers, the three groups identified in Fig. 6 (a) became much better defined in the PC-space. However, a different cluster distribution in the PC-space was observed [Fig. 6 (b)]. Table 1 describes these changes in terms of the cluster PC-coordinates before and after the removal of outliers.
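The outlier selection described above relies on the dispersion of objects within the same group along PC1. The text does not state a numerical cut-off, so the following is only a minimal sketch of one possible screening rule: it assumes a robust z-score (median and median absolute deviation) computed per group, with a hypothetical threshold of 3. The function name `flag_pc1_outliers` and the score values are illustrative and are not taken from the measured data.

```python
import numpy as np

def flag_pc1_outliers(pc1_scores, groups, threshold=3.0):
    """Flag objects whose PC1 score lies far from the median of their own group.

    The robust z-score and the threshold are assumptions made for this sketch,
    not the criterion actually applied in the study.
    """
    pc1_scores = np.asarray(pc1_scores, dtype=float)
    groups = np.asarray(groups)
    flags = np.zeros(pc1_scores.shape, dtype=bool)
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        values = pc1_scores[idx]
        median = np.median(values)
        mad = np.median(np.abs(values - median))  # median absolute deviation
        if mad == 0:
            continue  # no dispersion in this group, nothing to flag
        robust_z = 0.6745 * (values - median) / mad
        flags[idx] = np.abs(robust_z) > threshold
    return flags

# Synthetic PC1 scores for two groups (illustrative values only).
pc1 = np.array([2.0, 2.1, 1.9, 2.2, 9.0, -1.0, -1.1, -0.9, -1.2, -1.05])
labels = np.array(["I"] * 5 + ["II"] * 5)
outliers = flag_pc1_outliers(pc1, labels)
print(outliers)
```

In this synthetic example only the fifth object, which lies far from the rest of group I along PC1, is flagged. Flagged objects would still need to be inspected before exclusion.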

Fig. 6. PCA-Scores plots obtained from pre-treated Raman spectra corresponding to three concentrations of bacteria/Cd+2 samples: (■) 0.059 mM, (●) 0.133 mM, (▲) 0.172 mM. (a) With outliers, (b) without outliers.

Additionally, the distribution of explained variance between the two PCs changed (PC1 from 77% to 59%, and PC2 from 19% to 38%). However, the total percentage of explained variance was similar before and after the removal of outliers (96% and 97%, respectively). This indicates that the removal of outliers did not reduce the information about the data structure provided by the two PCs.
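The percentages of explained variance quoted above can be illustrated with a short calculation. The sketch below assumes that PCA is computed by singular value decomposition of the mean-centred data matrix (rows are spectra, columns are Raman shifts); it is not a description of how The Unscrambler® computes them internally. The function name `explained_variance_percent` and the random matrix are hypothetical placeholders for the pre-treated spectra.

```python
import numpy as np

def explained_variance_percent(spectra):
    """Return the percentage of variance explained by each principal component."""
    X = np.asarray(spectra, dtype=float)
    X_centred = X - X.mean(axis=0)  # mean centring, column by column
    # Singular values are returned in descending order.
    singular_values = np.linalg.svd(X_centred, compute_uv=False)
    variances = singular_values ** 2
    return 100.0 * variances / variances.sum()

# Synthetic data standing in for the spectra (30 objects, 200 variables).
rng = np.random.default_rng(0)
demo = rng.normal(size=(30, 200))
pct = explained_variance_percent(demo)
two_pc_total = pct[:2].sum()  # analogue of the PC1 + PC2 total
print(pct[:2], two_pc_total)
```

Running the same calculation on the score data before and after outlier exclusion would give the individual and total percentages that are compared in the text.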


Table 1. Cluster coordinates in the PC-space before and after the removal of outliers.
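The sign notation used for the cluster coordinates, such as (*+i, +j*), can be obtained from the group centroids in the PC1-PC2 plane. The following is a minimal sketch under that assumption; the function name `cluster_sign_patterns` and the score values are hypothetical and were chosen only to reproduce the sign patterns reported for Fig. 6 (a).

```python
import numpy as np

def cluster_sign_patterns(scores, groups):
    """Map each group label to the signs of its centroid on PC1 and PC2."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    patterns = {}
    for g in np.unique(groups):
        centroid = scores[groups == g].mean(axis=0)
        patterns[str(g)] = tuple("+" if c >= 0 else "-" for c in centroid)
    return patterns

# Synthetic (PC1, PC2) scores, two objects per group (illustrative values only).
demo_scores = np.array([
    [3.0, 1.0], [2.5, 1.5],      # group I
    [-2.0, -1.0], [-2.2, -0.8],  # group II
    [-1.5, 2.0], [-1.8, 2.4],    # group III
])
demo_groups = np.array(["I", "I", "II", "II", "III", "III"])
patterns = cluster_sign_patterns(demo_scores, demo_groups)
print(patterns)
```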

According to Fig. 6 (b), a good discrimination between the lowest (group I) and the medium/highest cadmium concentrations (groups II and III) was observed along the PC1 axis.

In summary, it can be concluded that PC1 allows a gross discrimination (due to the large difference in concentration between the samples clustered in I and the samples clustered in II and III). Smaller differences in Cd+2 concentrations are modelled by PC2 (clusters II and III are well separated along this PC).
