### **3.5 Distinction between non-smoking and smoking participants using LF NMR-based MV metabolomics analysis**

<sup>1</sup> H NMR-based metabolomics analysis was utilized, for the first time, to determine the value of LF benchtop NMR analysis for discriminating between the salivary <sup>1</sup> H NMR profiles of the cigarette-smoking *versus* non-smoking sampling groups. The five validated, <sup>1</sup> H NMR-detectable biomolecule variables described above, i.e. acetate, propionate, formate, methanol and glycine concentrations, as determined on the LF 60 MHz spectrometer, were therefore employed to explore any MV differences between these groups. PCA, partial least squares-discriminatory analysis (PLS-DA), orthogonal partial least squares-discriminatory analysis (OPLS-DA), and agglomerative hierarchical clustering (AHC) techniques were used for this purpose, as was analysis using an RF model.

*Metabolomics Distinction of Cigarette Smokers from Non-Smokers Using Non-Stationary… DOI: http://dx.doi.org/10.5772/intechopen.101414*

**Figure 5(a)** and **(b)** show a three-dimensional (3D) PLS-DA scores plot and a two-dimensional (2D) OPLS-DA scores plot respectively, arising from these forms of MV analysis; both revealed that it was possible to achieve a satisfactory level of distinction between the two groups. Cross-validated Q<sup>2</sup> (R<sup>2</sup>Y) values for these analyses were high (0.721 (0.803) and 0.709 (0.786) respectively), and permutation tests performed for these models with 2000 permutations were very highly significant (*p* < 5.0 × 10<sup>−4</sup> in both cases). These analyses revealed that only methanol and its potential *in vivo* metabolite formate were important discriminatory variables for this comparison; both were significantly upregulated in the smoking group (PLS-DA variable importance parameter (VIP) values of 1.88 and 0.81 respectively). Corresponding values for acetate, propionate and glycine were only 0.06, 0.64 and 0.63 respectively, i.e. acetate offered no discriminatory potential whatsoever. These values were reflected by the PCA strategy applied, which had very strong positive PC1 loadings for methanol and formate (0.52 and 0.79 respectively), whereas those for acetate, propionate and glycine were negative, but only weakly so (−0.02, −0.08 and −0.32 respectively). These loadings vectors are fully consistent with PC1 being derived from a cigarette tobacco smoking source only: in addition to being an oral microbiome catabolite, formate is an important *in vivo* metabolite of the methanol present in cigarette smoke, the route proceeding through a toxic formaldehyde intermediate [33]. However, it also appears that two or more of the non-smoking participants classify or cluster as cigarette smokers. It therefore remains possible that self-reporting bias is involved in such cases, as further discussed in Section 4 below.
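The permutation-test threshold quoted above follows from simple arithmetic: with 2000 permutations, the smallest *p* value that can be resolved empirically is 1/2000 = 5.0 × 10<sup>−4</sup>, which is reported as an upper bound when no permuted model matches the observed one. The following minimal Python sketch illustrates this point only; the function name and the toy statistics are hypothetical, and it is not the software used for the analyses reported here.

```python
import random


def permutation_p_value(observed, permuted_stats):
    """Empirical p-value: fraction of permuted statistics >= the observed one.

    If no permuted statistic reaches the observed value, only an upper
    bound can be stated, i.e. p < 1 / (number of permutations).
    """
    n = len(permuted_stats)
    exceed = sum(1 for s in permuted_stats if s >= observed)
    if exceed == 0:
        return ("<", 1.0 / n)
    return ("=", exceed / n)


# Hypothetical illustration: 2000 permuted statistics drawn between 0 and 0.5,
# none of which reaches a hypothetical observed statistic of 0.72.
rng = random.Random(0)
permuted = [rng.uniform(0.0, 0.5) for _ in range(2000)]
relation, bound = permutation_p_value(0.72, permuted)
print("p", relation, bound)
```

Under these assumptions the bound equals 1/2000, so a reported *p* < 5.0 × 10<sup>−4</sup> is the strongest statement that a 2000-permutation test can support.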

Similarly, AHC analysis confirmed an at least partial distinction between these two HSS sample classifications (**Figure 5(c)**). However, despite a reasonable-to-good level of discrimination, two of the samples donated by non-smoking participants appeared within the smoking group cluster, and five of those donated by smokers clustered with the non-smoking cohort. The former is possibly explicable by self-reporting bias, whereas for the latter it is possible that the participants concerned smoked their last cigarette a considerable period of time prior to sample collection.

#### **Figure 5.**

*(a) 3D PLS-DA scores plot of PC3* versus *PC2* versus *PC1 showing evidence for distinctive clusterings of tobacco cigarette-smoking and non-smoking participant classifications (green- and red-coded respectively). For this model, PC1, PC2 and PC3 accounted for 42.5, 23.5 and 10.4% of the total model variance respectively. (b) OPLS-DA plot of orthogonal T score {1}* versus *T score {1} also revealing a high level of distinction between the tobacco-smoking and non-smoking groups. (c) AHC analysis dendrogram of this dataset, revealing an at least moderate level of differentiation between the smoking and non-smoking groups, with notable sub-clusterings within each classification. Two misclassified non-smoking participant sample donors may arise from a self-reporting smoking bias, whereas the five misclassified smoking participants may result from prolonged durations between sample collection and their last smoking episode. Eight potential outlier samples detected in a provisional PCA were removed from the dataset prior to analysis.*

Finally, an RF model correctly classified 88 and 95% of the non-smoking and smoking participant donor samples respectively (91% classification success rate overall). Four of the non-smoking participants were misclassified as smokers, whereas only one of the smokers was classified as a non-smoker with this analysis strategy. Hence, these MV comparisons demonstrate, for the first time, an important <sup>1</sup> H NMR-based metabolomics application which employs a non-stationary LF benchtop spectrometer.
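The RF classification rates reported above are simple functions of the confusion-matrix counts, as the short Python sketch below shows. The group sizes used (33 non-smokers and 20 smokers) are an assumption: they are not stated in this section and were chosen only because, together with the four and one misclassifications reported, they reproduce the quoted percentages. The function name is likewise hypothetical.

```python
def classification_rates(n_nonsmokers, n_smokers, nonsmokers_wrong, smokers_wrong):
    """Return (non-smoker %, smoker %, overall %) of correctly classified samples."""
    nonsmokers_ok = n_nonsmokers - nonsmokers_wrong
    smokers_ok = n_smokers - smokers_wrong
    total = n_nonsmokers + n_smokers
    return (
        100.0 * nonsmokers_ok / n_nonsmokers,
        100.0 * smokers_ok / n_smokers,
        100.0 * (nonsmokers_ok + smokers_ok) / total,
    )


# ASSUMPTION: 33 non-smokers and 20 smokers are hypothetical group sizes,
# not values taken from the study; only the misclassification counts
# (4 and 1) come from the text.
rates = classification_rates(33, 20, nonsmokers_wrong=4, smokers_wrong=1)
print([round(r) for r in rates])  # [88, 95, 91]
```

Reporting the per-class rates alongside the overall rate is useful here, since the two groups are misclassified at different frequencies.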

### **3.6 Potential clinical and diagnostic significance of salivary metabolite tracking with LF benchtop NMR devices**

In this study we have demonstrated the rapid, virtually non-invasive analysis of human saliva using a compact, LF 60 MHz benchtop NMR spectrometer. The major aim of the pilot investigations described here was to establish the abilities of LF NMR spectrometers to effectively perform the simultaneous quantitative analysis of a series of biomolecules in human saliva, and to consider their potential future value as trackable agents for the monitoring of selected oral diseases. We also critically examined limitations of the applications of this technique and their potential outcomes. Moreover, for the first time we have also applied 'state-of-the-art' NMR-linked metabolomics techniques to distinguish between saliva samples donated by non-smokers and tobacco cigarette-smoking participants in a case study.

Overall, we have shown that 60 MHz <sup>1</sup> H NMR measurements can be employed to reliably determine selected salivary metabolite concentrations, with potentially much scope for future diagnostic and prognostic applications to oral health conditions. The authors are fully aware of issues related to resonance overlap at low magnetic fields in view of the dependence of resonance frequencies on static magnetic field strength, in which <sup>1</sup> H NMR signals appear to be broader, and with a spectrally-wider chemical shift range for all multiplets, with decreasing spectrometer operating frequencies. Such studies are therefore highly challenging in view of these inherent analyte selectivity considerations, along with potential sensitivity issues expected at such lower magnetic field strengths. However, we found that complications arising from overlapping resonances in complex biofluid samples such as saliva were minimal, or were circumventable, for the most common prominent resonances, and also for those located within relatively interference-free spectral regions. The ability of this LF NMR technique to detect exogenous agents present in this biofluid should also be considered, for example the detection of drugs and other xenobiotics in saliva within specified time zones following their oral ingestion by humans.
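The field dependence of multiplet width described above can be illustrated numerically. Scalar coupling constants are fixed in Hz, whereas the ppm scale is proportional to the operating frequency, so a multiplet of a given width in Hz occupies a ten-fold wider ppm range at 60 MHz than at 600 MHz. The sketch below assumes an idealised first-order multiplet, and the coupling constant of 7.5 Hz is an illustrative value rather than one measured in this study; the function names are hypothetical.

```python
def hz_to_ppm(width_hz, spectrometer_mhz):
    """Convert a frequency width in Hz to ppm at a given 1H operating frequency (MHz)."""
    return width_hz / spectrometer_mhz


def first_order_multiplet_span_hz(n_lines, j_hz):
    """Outer-line separation (Hz) of a first-order multiplet with equally spaced lines."""
    return (n_lines - 1) * j_hz


# Illustrative values only: a triplet (3 lines) with an assumed J of 7.5 Hz.
span_hz = first_order_multiplet_span_hz(3, 7.5)
for mhz in (60, 400, 600):
    print(f"{mhz} MHz: {hz_to_ppm(span_hz, mhz):.4f} ppm")
```

For this assumed triplet, the 15 Hz span corresponds to 0.25 ppm at 60 MHz but only 0.025 ppm at 600 MHz, which is why neighbouring multiplets overlap far more readily at low field.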

Intriguingly, human saliva may afford a transference-dependent '*diluted picture*' of chemopathological changes occurring throughout the human body, in addition to more concentrated, localized metabolic features within the oral environment itself, since a large number of biomolecules, and disease biomarkers (of a range of specificities) have the ability to transfer to this biofluid from blood via intra-, extra-, trans- and pericellular pathways, which highlight active transport or passive diffusion within the gingival sulcus and salivary glands [25]. Indeed, researchers are now increasingly promoting the employment of saliva as a clinical diagnostic medium [26], and such applications have significant widespread potential. Correspondingly, salivary (particularly parotid salivary) metabolomic and proteomic modifications appear to mirror those observed in human blood [27–29].

In 2002, the study reported by Silwood et al. [19] was described in Ref. [9] as the very first untargeted metabolomics investigation of human saliva. This unique investigation successfully identified a total of 63 biomolecules therein using 600 MHz <sup>1</sup> H NMR analysis, and quantified 11 key microbial-derived catabolites, 9 of which displayed very highly significant 'between-participant' components of variance. These markers included acetate and lactate, with excessive levels of their corresponding acids being viewed as primary end-point biomarkers involved in the aetiology of dental caries [30]. However, formic and pyruvic acids (both present at millimolar or near-millimolar concentrations in human saliva, the former being higher in smokers as found here) are stronger acids than lactic acid, and therefore may also exert procariogenic activities. Hence, the LF 60 MHz NMR detection and quantification of salivary formate demonstrated in the current study may indeed offer some diagnostic and/or prognostic monitoring potential. However, propionate, along with *n*- and *iso*butyrates, are considered to be primary microbial catabolites involved in periodontal disease progression [31, 32]. Correspondingly, salivary short-chain organic acids/anions serve as biomarkers for the growth, preponderance and catabolic activities of micro-organisms, and hence species-dependent patterns of these agents may serve as biomarkers of pathologically-mediated alterations to the salivary microbiome [19, 33].

Furthermore, whilst modifications in salivary formate concentrations have been previously linked to between-gender differences (i.e. elevated concentrations in males) [28], which we did not find here, our data provide evidence that one potential source for it is the oxidative metabolism of methanol as an ingested and/or inhaled environmental toxin. Indeed, salivary methanol levels are markedly upregulated via tobacco smoke inhalation [23], and/or alternative exogenous sources such as dietary ones. In our study, although salivary formate levels were *ca.* 2-fold greater in males, which may arise from the impact of an increased smoking frequency for this gender in our smoking cohort, this difference was found not to be statistically significant (**Table 3**).

Interestingly, the study reported in [28] found that salivary citrate, lactate, pyruvate and sucrose levels were significantly upregulated in saliva samples collected from smokers over those of non-smokers, and that formate was downregulated therein. Moreover, although salivary methanol concentrations were *ca.* 3-fold greater in smokers than in non-smokers in that study, as might be expected from the current one, this difference was found not to be statistically significant. Similarly, cigarette humectant-derived propane-1,2-diol levels were higher in samples collected from smoking participants in Ref. [28], although again this difference was found not to be significant. Additionally, that investigation detected and determined glucose and sucrose in saliva samples. However, in HSS samples collected from participant cohorts following the rigorous overnight fasting protocol involved in the current and our other studies, little or none of these carbohydrates are <sup>1</sup> H NMR-detectable. Therefore, it appears that the quite limited pre-collection participant restrictions instigated in the study described in [28] were unsuccessful in completely precluding dietary-derived agents from the saliva samples analyzed. Indeed, participants were only required to not consume alcohol on the day of sample collection (and not also the evening before), and these samples were only collected at least 1 h following the last meal, which in our view is insufficient to remove interferences arising from dietary agents, along with those from alcoholic beverages consumed the previous day. Even with a protocol requesting that all participants refrain from the consumption of alcoholic drinks 24 h prior to the sample collection time-point (Section 2.1), traces of ethanol remain <sup>1</sup> H NMR-detectable in our saliva specimens when detected at operating frequencies of ≥400 MHz. Notwithstanding, ethanol consumed at time-points ≥24 h prior to collection was generally not found in 60 MHz salivary <sup>1</sup> H NMR profiles in view of the lowered sensitivity of this approach. However, our pilot studies have also shown that if participants drank alcoholic beverages such as a beer, their salivary ethanol levels were indeed detectable and quantifiable using a 60 MHz benchtop NMR facility for at least several hours thereafter. This observation clearly offers a high level of potential regarding future forensic investigations.

Likewise, citrate is only very rarely <sup>1</sup> H NMR-detectable in our HSS samples collected according to our rigorous overnight fasting protocol, and therefore its direct derivation from human diets (which serve as rich sources of this metabolite), and insufficient periods of fasting in Ref. [34], remains a strong possibility.

### **4. Limitations of the study**

One major limitation of the application of LF salivary <sup>1</sup> H NMR analysis is the inherent resonance overlap problems experienced at this operating frequency, which is much lower than those of more traditional MF or HF spectrometers coupled with restrictively-sized superconducting magnets (for example, those of 400–750 MHz operating frequencies). Hence, these resonance superimposition problems clearly give rise to major analytical limitations, notably in complex multianalyte biofluid spectra. Therefore, for future prospective studies involving the quantification of salivary biomolecules and/or xenobiotics, at least some level of caution should be applied when employing such devices. Unfortunately, these complications increase substantially when integrating resonances of higher first-order and more complex coupling patterns, which may markedly hinder such intensity determinations. Nevertheless, with the exception of the propionate-CH<sub>3</sub> resonance, these interference problems may be considered minimal for the determination of major, high concentration salivary metabolites, specifically those with prominent resonances in the LF spectra obtained. Indeed, these signals have only low or negligible levels of superimposition with lower intensity signals, or appear in relatively 'spectroscopically clean' regions of the spectra acquired, for example formate. Therefore, although the LF 60 MHz <sup>1</sup> H NMR profiles of HSSs are largely dominated by resonances of the highest intensity and with simple coupling patterns and orders, and/or those arising from metabolites of high concentrations, in principle this novel NMR strategy potentially offers valuable quantitative information for those detectable at lower levels, most notably with the advent of higher-field compact benchtop instruments which operate at frequencies of 80 or 100 MHz.

As observed and further explored in Ref. [10], an additional limitation of <sup>1</sup> H NMR-based metabolomics studies featuring LF NMR spectrometers is the intensity-diminishing effects of the H<sub>2</sub>O/HOD signal presaturation protocol, notably for signals located close to its chemical shift value (δ = 4.8 ppm). However, although such effects substantially influence the C1-H resonances of both the α- and β-glucose anomers (δ = 5.25 and 4.63 ppm respectively), such hurdles may be surmounted by the use of rigorous calibration processes with biomolecule standard solutions, and by the possible integration of alternative resonances derived from the agents affected, namely those with δ values sufficiently distant from the H<sub>2</sub>O/HOD secondary irradiation one. An additional limitation arises from some significant differences between intramolecular <sup>1</sup> H relaxation times for a number of salivary metabolites, and also some long-range coupling phenomena, results which will be reported in detail elsewhere.

Finally, for the *prima facie* metabolomics investigation conducted here, data available in **Figure 5** indicate that this study may involve a small but significant level of self-reporting bias, since two, or perhaps more, of the non-smoking participant samples appeared to co-cluster with the tobacco-smoking cohort in PCA, PLS-DA and OPLS-DA scores plots, and AHC dendrograms, as did those of five of the smoking cohort with the non-smoking group. These apparently erroneously-clustered non-smoking participants may represent those with only a limited or very limited smoking preference, but who preferred to report themselves as 'non-smokers' in this investigation in view of their low smoking incidences and/or smoking irregularities, for example those known as 'closet smokers'. Further investigations to explore this are currently in progress in our laboratories.

### **5. Conclusions**

In this study, we have evaluated the viability of low-field (LF) benchtop <sup>1</sup> H NMR analysis technologies for metabolomics investigations of human saliva. This novel, convenient and near-portable technique was used to detect and/or potentially quantify and hence monitor up to 15 potentially healthcare-impacting oral metabolites in healthy human saliva, a strategy prospectively offering much potential for the direct 'on-site' testing of biofluids from patients affected by oral health or related conditions at clinical locations.

We report the detection of typical salivary metabolites, including propionate, acetate, succinate, glycine, dimethylamine, trimethylamine, methanol, formate and aromatic amino acids, all at an operating frequency of only 60 MHz. However, quantification of the salivary levels of biomolecules was limited to only five of those with the most prominent <sup>1</sup> H NMR signals, although succinate (singlet signal, δ = 2.405 ppm) could also be considered if not significantly quantitatively impacted by salivary pyruvate (*s*, δ = 2.388 ppm) and/or glutamine (*m*, δ = 2.42 ppm) resonances. Indeed, since the singlet resonances of formate, methanol and glycine did not suffer from significant resonance overlap issues, and were therefore quite clearly resolved, the direct LF NMR determination of these biomolecules was possible. Excessive salivary levels of organic acid anion catabolites may serve as key biomarkers of the pathogenesis and development of dental caries [19, 20, 30] and periodontal diseases [19, 32], and herein it was found that both salivary acetate and propionate, which represent biomarkers of dental caries and periodontal diseases respectively, could be readily quantitated in this biofluid, despite some bioanalytical concentration limit/interference problems encountered with the latter.
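The potential interference with the succinate singlet noted above can be expressed in frequency units, since a chemical shift difference in ppm multiplied by the operating frequency in MHz gives the separation in Hz. The Python sketch below uses only the δ values quoted in this section; the function name is hypothetical, and the 600 MHz comparison is included purely for contrast.

```python
def separation_hz(delta_a_ppm, delta_b_ppm, spectrometer_mhz):
    """Separation in Hz between two chemical shifts at a given operating frequency (MHz)."""
    return abs(delta_a_ppm - delta_b_ppm) * spectrometer_mhz


# Chemical shift values (ppm) as quoted in the text.
succinate = 2.405
neighbours = {"pyruvate": 2.388, "glutamine": 2.42}

for name, delta in neighbours.items():
    for mhz in (60, 600):
        gap = separation_hz(succinate, delta, mhz)
        print(f"succinate vs {name} at {mhz} MHz: {gap:.2f} Hz")
```

At 60 MHz these resonances lie only about 1 Hz from the succinate singlet, compared with roughly 10 Hz at 600 MHz, which is consistent with the caution expressed above regarding succinate quantification at low field.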

This study also demonstrated for the first time that LF <sup>1</sup> H NMR-linked metabolomics analysis could be employed to discriminate between the salivary biomolecular profiles of tobacco-smoking and non-smoking participants. When detectable at sufficient salivary concentrations, these analyses may be transferable to the detection and perhaps quantification of exogenous agents such as ingested drugs in this biofluid. Moreover, the approach outlined here may indeed offer some forensic applications involving the identification of illicit drugs in human saliva samples collected at crime scenes, provided that such analytes have resonances present in spectrally-clear regions unaffected by overlapping signal interferences arising from endogenous species.

Notwithstanding, the authors currently recommend that future LF NMR investigations of human saliva and oral diseases should focus on the more prominent resonances present in the spectra acquired, most notably those with relatively simple first-order coupling patterns (i.e., singlets, doublets, triplets, etc.) for quantification purposes, with the exception of those in which the targeted analyte has resonance(s) located in 'spectroscopically-clear' regions. However, future developments in LF benchtop NMR technologies operating at higher frequencies may serve to provide effective solutions to these issues.

The authors also suggest that this technique may serve valuable diagnostic and/or prognostic tracking purposes in a clinical context for a range of human diseases. In principle, this non-stationary multi-analytical technique may be employed as a sensitive means of monitoring the salivary metabolic status of patients suffering from oral diseases, and potentially also physiologically-remote conditions, directly at dental surgery and primary healthcare sites, hospitals and hospital laboratories, and perhaps also community pharmacies. In this manner, such patient-contact sites may offer significant diagnostic and monitoring potential for oral health practitioners.
