### **2. Materials and methods**

#### **2.1 Saliva sample collection from human participants**

Whole mouth saliva samples (n = 61) were collected from healthy human participants (n = 42, age range 21–65 years, 14 male/28 female), of whom 31 were non-smokers and 11 were regular 'mild-to-heavy' smokers of tobacco cigarettes (an average of 3 to ≥20 cigarettes per day, with 44% of these smoking ≥20 per day). These non-smoking and tobacco-smoking participant groups were age-matched, with their mean ± SEM ages being 42.52 ± 2.25 and 41.91 ± 2.70 years respectively. All ethical considerations were in accord with those of the Declaration of Helsinki 1975 (7th amendment made in 2013). All samples were collected with informed consent, and the study was approved by the Faculty of Health and Life Sciences Research Ethics Committee, De Montfort University, Leicester, UK (reference no. 1082). Participants were fasted for a 12-h period prior to providing saliva specimens. All participants were requested to refrain from all oral activities, including eating, drinking, tooth-brushing and smoking, throughout this period, including the short, *ca*. 5 min duration between awakening and sample donation. They were also requested not to consume any alcoholic beverages for 24 h prior to the sample collection time-point. Between 1 and 3 samples were collected from each participant, and those donating >1 sample provided these on separate daily a.m. 'wake-up' episodes. All samples were collected in sterile plastic universal containers and were transferred to the laboratory on ice. These were then immediately centrifuged at 3500 rpm at 4°C for a period of 15 min, and following sample preparation as outlined below, the clear human salivary supernatants (HSSs) arising therefrom were then stored at −80°C for a maximal duration of 72 h until ready for NMR analysis.

#### **2.2 Sample preparation and <sup>1</sup>H NMR analysis**

All reagents and chemicals were purchased from Sigma-Aldrich (Gillingham, UK) unless otherwise stated. Aliquots (500 μL) of HSS samples were treated with 60 μL of pH 7.00 phosphate buffer (1.00 mol/L) containing 0.04% (w/v) sodium azide, and 50 μL of <sup>2</sup>H2O containing 0.05% (w/v) sodium 3-(trimethylsilyl)propionate-2,2,3,3-d4 (TSP) (final added HSS concentration 238 μmol/L). TSP served as an internal chemical shift reference and quantitative calibration standard; sodium azide acted as a microbicidal preservative in order to protect against the artefactual generation and/or consumption of microbial catabolites during the sample transport and preparation stages; phosphate buffer served to control sample pH values; and <sup>2</sup>H2O acted as a field frequency lock. Admixtures were then homogenized and transferred to 5-mm diameter NMR tubes (Norell, Morganton, NC, USA). LF <sup>1</sup>H NMR spectra were acquired on a 60 MHz Magritek Spinsolve Ultra Benchtop spectrometer (Magritek GmbH, Philipsstr. 8, 52068 Aachen, Nordrhein-Westfalen, Germany) with 64 and/or 384 scans, an acquisition time of 6.4 s, repetition times of 10 or 15 s, and a pulse angle of 90°; the H2O/HOD presaturation frequency was optimized at δ = 4.80 ppm using the programmed 1D PRESAT function. For the calibration and metabolomics studies described here, scan number and repetition time were standardized at 64 and 10 s respectively. With these standardized settings, total analysis times of <15 min per sample were possible. These samples also underwent medium-field (MF) <sup>1</sup>H NMR analysis on a Bruker Avance-I 400 spectrometer (Bruker AXS GmbH, Östliche Rheinbrückenstr. 49, 76187 Karlsruhe, Germany; Leicester School of Pharmacy facility, De Montfort University, Leicester, UK), operating at a frequency of 400.13 MHz and using the noesygppr1d pulse sequence for water suppression (H2O, δ = 4.80 ppm); 32 k data points were acquired in 128 scans, with 2 dummy scans, a sweep width of 4844 Hz, and an automatically-adjusted receiver gain.

<sup>1</sup>H NMR resonances present in each HSS spectrum acquired were routinely assigned by a consideration of chemical shift values, coupling patterns and coupling constants with reference to literature sources, and where required, two-dimensional <sup>1</sup>H-<sup>1</sup>H correlation and total correlation (COSY and TOCSY respectively) spectra were acquired to confirm these assignments. Median <sup>1</sup>H NMR signal-to-noise (STN) ratios were determined from the formula STN = 2.50A/Npp, where A represents resonance height, and Npp the highest peak-to-peak noise difference determined at each chemical shift region selected. Lower limits of detection and quantification (LLOD and LLOQ respectively) values were computed as 3- and 10-times these median STN values. HSS spectral resonances were manually bucketed, and their intensities determined using *ACD/Spectrus Processor 2019* software; the residual H2O/HOD signal was removed prior to performing univariate (UV) or multivariate (MV) statistical analysis. Salivary biomolecule levels were determined from calibration plots of ratios of their preselected resonance intensities to that of internal TSP against their known concentrations in a series of analytical calibration standard solutions.
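As a worked check of the quantities quoted in this section, the following minimal Python sketch reproduces the final TSP concentration from the stated volumes and evaluates the STN formula. The TSP molar mass (172.27 g/mol for the sodium salt of the d4 isotopologue), the function names and the example peak heights are assumptions introduced for illustration only.

```python
# Illustrative sketch only: function names and the TSP molar mass are assumptions.

TSP_MOLAR_MASS_G_PER_MOL = 172.27  # sodium TSP-d4 (assumed value, not from the text)


def tsp_final_conc_umol_per_l(stock_percent_w_v=0.05, stock_ul=50.0,
                              saliva_ul=500.0, buffer_ul=60.0):
    """Final TSP concentration (umol/L) in the prepared NMR sample."""
    stock_g_per_l = stock_percent_w_v * 10.0           # 1% (w/v) = 10 g/L
    stock_umol_per_l = stock_g_per_l / TSP_MOLAR_MASS_G_PER_MOL * 1e6
    total_ul = saliva_ul + buffer_ul + stock_ul        # 610 uL in total
    return stock_umol_per_l * stock_ul / total_ul


def signal_to_noise(peak_height, noise_peak_to_peak):
    """STN = 2.50 * A / Npp, following the formula given in the text."""
    if noise_peak_to_peak <= 0:
        raise ValueError("peak-to-peak noise must be positive")
    return 2.50 * peak_height / noise_peak_to_peak


if __name__ == "__main__":
    # Rounds to the 238 umol/L value quoted for the final TSP level.
    print(round(tsp_final_conc_umol_per_l()))
    # Hypothetical resonance height and noise, in arbitrary intensity units.
    print(signal_to_noise(40.0, 10.0))
```

Under these assumptions, the dilution of 50 μL of stock into a 610 μL final volume accounts for the 238 μmol/L figure.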

Calibration and Bland-Altman dominance plots of the <sup>1</sup>H NMR-determined concentrations of acetate, propionate, formate, glycine and methanol featured matched analysis sample datasets, with determinations made on these salivary metabolites at both 60 and 400 MHz operating frequencies. All determinations that were non-detectable (nd, specifically those with values <LLOD) at both operating frequencies were removed from the datasets. As recommended [13], corresponding <sup>1</sup>H NMR profiles of blank samples, which were prepared as outlined above, but with HPLC-grade water in place of HSSs, were acquired, and their 'noise' intensities at the appropriate δ values were included in these calibration plots. Spectra were acquired on replicate (n = 3) preparations of such blank samples for these purposes.
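For readers unfamiliar with Bland-Altman comparisons of matched determinations, a minimal sketch of the underlying agreement statistics is given here. It assumes the conventional formulation (mean difference as bias, with limits of agreement at ±1.96 standard deviations of the differences); the function name and the example concentrations are hypothetical, and this is not presented as the exact procedure applied in this study.

```python
import statistics


def bland_altman_limits(lf_conc, mf_conc, z=1.96):
    """Bias and limits of agreement for paired determinations.

    lf_conc, mf_conc: matched concentration lists (same samples, same order).
    Returns (bias, lower_limit, upper_limit) in the input units.
    """
    if len(lf_conc) != len(mf_conc) or len(lf_conc) < 2:
        raise ValueError("need at least two matched pairs")
    differences = [lf - mf for lf, mf in zip(lf_conc, mf_conc)]
    bias = statistics.mean(differences)
    spread = statistics.stdev(differences)  # sample standard deviation
    return bias, bias - z * spread, bias + z * spread


if __name__ == "__main__":
    # Hypothetical matched 60 and 400 MHz concentrations (mmol/L).
    lf = [1.0, 2.0, 3.0, 4.0]
    mf = [0.9, 2.1, 2.9, 4.1]
    print(bland_altman_limits(lf, mf))
```

A bias close to zero with narrow limits indicates that the two operating frequencies yield interchangeable estimates for that metabolite.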

#### **2.3 UV and MV statistical and metabolomics analyses**

#### *2.3.1 UV analysis*

A paired sample t-test was applied to test for any differences between LF 60 MHz spectra acquired with relaxation delays of 10 or 15 s, and an *XLSTAT2020* software module was employed for this purpose.
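The paired-sample comparison described above can be sketched as follows. This is an illustrative stdlib-only computation of the test statistic, not the XLSTAT implementation; the function name and example values are hypothetical, and the p value would be obtained from a t distribution with the returned degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Paired-sample t statistic and degrees of freedom.

    first, second: matched measurements (e.g. concentrations estimated from
    spectra acquired with 10 s and 15 s relaxation delays).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two matched pairs")
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    t_value = mean_diff / (sd_diff / math.sqrt(n))
    return t_value, n - 1


if __name__ == "__main__":
    # Hypothetical matched values, for illustration only.
    print(paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]))
```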

An analysis-of-covariance (ANCOVA) experimental design was employed to test the statistical significance of the 'between-smoking status' source of variation, along with the potential effects of essential demographic variables recorded on participant sample donors, on salivary acetate, formate, propionate, glycine and methanol concentrations (Eq. (1)). Overall, this model evaluated the influences of the 'between-participant' (random) effect P(*k*)*l*, and the 'between-participant ages' (A*i*), 'between-participant genders' (G*j*), and 'between-smoking status' (S*k*) sources of variation (all fixed) on these five sets of <sup>1</sup>H NMR-determined levels. Moreover, the statistical significance of the age × gender, age × smoking status, and gender × smoking status first-order interaction effects (AG*ij*, AS*ik* and GS*jk* respectively) was also assessed. ANCOVA was performed using *XLSTAT2014* and *2020* software modules.

$$\mathbf{y}_{ijklm} = \boldsymbol{\mu} + \mathbf{A}_{i} + \mathbf{G}_{j} + \mathbf{S}_{k} + \mathbf{P}_{(k)l} + \mathbf{AG}_{ij} + \mathbf{AS}_{ik} + \mathbf{GS}_{jk} + \mathbf{e}_{ijklm} \tag{1}$$

An additional ANCOVA model explored the significance of any differences between the two operating frequencies in the non-smoking group only, and in this experimental design, the 'between-ages', 'between-genders' and 'between-participants' sources of variation were also evaluated, as was the age × gender first-order interaction effect (Eq. (2)). In this design, O*k* represents the 'between-spectrometer operating frequencies' effect (fixed).

$$\mathbf{y}_{ijklm} = \boldsymbol{\mu} + \mathbf{A}_{i} + \mathbf{G}_{j} + \mathbf{O}_{k} + \mathbf{P}_{(k)l} + \mathbf{AG}_{ij} + \mathbf{e}_{ijklm} \tag{2}$$

#### *2.3.2 MV metabolomics analysis*

Principal component analysis (PCA) was first employed to identify any possible outlier samples present in the 60 MHz operating frequency <sup>1</sup>H NMR dataset; in total, 8 such samples were identified and removed prior to the performance of further MV analysis. PCA was then employed to determine the reproducibility of replicate salivary metabolite determinations made on the LF benchtop spectrometer. This check was performed with n = 9 duplicated salivary samples randomly selected from the smoking group of participants, with all five of the above metabolite variables included; non-smokers were not used for this purpose in view of the restricted availability of data on salivary methanol and, to a lesser extent, formate concentrations above their LLOQ indices. For this PCA analysis, salivary metabolite concentrations were neither constant sum-normalized (CSN) nor transformed, and were neither auto- nor Pareto-scaled.

For the major objective of this study, a LF benchtop <sup>1</sup>H NMR-based metabolomics investigation featured a comparison of saliva specimens collected from the non-smoking and tobacco-smoking participants, and for this purpose all 5 of the above potential predictor variables, determined via TSP-normalization as described above, were incorporated. The dataset was probabilistic quotient normalized (PQN), generalised log10 (glog)-transformed, and Pareto-scaled prior to MV analysis, which involved PCA, partial least squares-discriminant analysis (PLS-DA), orthogonal partial least squares-discriminant analysis (OPLS-DA), random forest (RF) and agglomerative hierarchical clustering (AHC) techniques (*MetaboAnalyst 5.0*, University of Alberta and National Research Council, National Institute for Nanotechnology (NINT), Edmonton, AB, Canada). Distinctions found between the two groups with the above PLS-DA and OPLS-DA strategies were cross-validated with determination of Q<sup>2</sup> statistics, and also by permutation tests with 2000 permutations. A Q<sup>2</sup> value of ≥0.50 was considered as a significant discriminatory cut-off threshold [14].
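To make the Q<sup>2</sup> threshold concrete, a minimal sketch of the statistic is given here, assuming its standard definition Q<sup>2</sup> = 1 − PRESS/TSS computed from cross-validated predictions of 0/1-coded class membership. How *MetaboAnalyst 5.0* computes it internally is not described in the text, so the function name, the coding of classes and the example predictions are all assumptions.

```python
def q2_statistic(observed, cv_predicted):
    """Q2 = 1 - PRESS / TSS for cross-validated predictions.

    observed: class labels coded numerically (e.g. 0 = non-smoker, 1 = smoker).
    cv_predicted: predictions for each sample made while it was held out.
    """
    if len(observed) != len(cv_predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    press = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, cv_predicted))
    total_ss = sum((y - mean_observed) ** 2 for y in observed)
    if total_ss == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - press / total_ss


if __name__ == "__main__":
    # Hypothetical held-out predictions for four samples.
    q2 = q2_statistic([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
    print(q2, q2 >= 0.50)  # second value applies the cut-off used in the text
```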

PQN scales <sup>1</sup>H NMR metabolomics profiles according to an overall estimate of the most probable 'dilution' influence [15]; for saliva, such influences include variations in salivary flow-rate (SFR), which has been reported to be significantly reduced in cigarette smokers [16]. This strategy usually involves dividing each sample profile by the median of the quotients of its bucket intensities relative to a mean or median reference spectral profile, which is derived from either all or a subset of study samples; for this investigation, the non-smoking control group was employed for this purpose.
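A minimal sketch of PQN in its commonly described quotient-based form is given below, using a median reference profile built from a control subset, as was done here with the non-smoking group. The function name and the example intensities are hypothetical, and the exact implementation used by the software employed in this study may differ in detail.

```python
import statistics


def pqn_normalize(samples, reference_samples):
    """Probabilistic quotient normalization of positive intensity profiles.

    samples, reference_samples: lists of rows; each row holds one sample's
    intensities for the same ordered set of variables.
    """
    n_vars = len(reference_samples[0])
    # Median reference profile across the reference (e.g. control) samples.
    reference = [statistics.median([row[j] for row in reference_samples])
                 for j in range(n_vars)]
    normalized = []
    for row in samples:
        # Quotient of each variable relative to the reference profile.
        quotients = [row[j] / reference[j]
                     for j in range(n_vars) if reference[j] > 0]
        dilution = statistics.median(quotients)  # most probable dilution factor
        normalized.append([value / dilution for value in row])
    return normalized


if __name__ == "__main__":
    controls = [[1.0, 2.0, 4.0], [1.0, 2.0, 4.0]]
    # Second test sample is two-fold concentrated with one genuine increase.
    print(pqn_normalize([[2.0, 4.0, 8.0], [2.0, 4.0, 16.0]], controls))
```

Because the median quotient is used, a genuine change in a minority of metabolites does not distort the estimated dilution factor.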

At an operating frequency of 60 MHz, some missing data in the MV datasets were unavoidable in view of resonance overlap complications. Therefore, for this metabolomics model, these randomly missing values, *ca.* 9% of the total available <sup>1</sup>H NMR-determined concentrations, were estimated and imputed using the non-linear iterative partial least squares (NIPALS) approach (*XLSTAT2020*) [17], since this method is considered appropriate for MV metabolomics datasets such as that analyzed in the current study. For UV data analysis, these imputations were accompanied by a corresponding decrease in degrees of freedom available for the parametric statistical evaluations conducted. Metabolite concentration values below the detection limit (i.e., zero analyte values or '*less-thans*') were replaced using the simple multiplicative replacement approach described in Ref. [18], specifically as 65% of their metabolite LLOD values.
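The substitution step for '*less-than*' values can be sketched as follows. This shows only the replacement of sub-LLOD values by 65% of the LLOD, as stated above; any further adjustment that the approach of Ref. [18] may involve is not reproduced, and the function name and example values are hypothetical.

```python
def replace_below_llod(concentrations, llod, fraction=0.65):
    """Replace values below the LLOD with a fixed fraction of the LLOD.

    concentrations: metabolite concentrations in the same units as llod.
    """
    if llod <= 0:
        raise ValueError("LLOD must be positive")
    substitute = fraction * llod
    return [substitute if value < llod else value for value in concentrations]


if __name__ == "__main__":
    # Hypothetical concentrations (umol/L) with an LLOD of 100 umol/L.
    print(replace_below_llod([0.0, 50.0, 120.0], 100.0))
```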

Multivariate ANOVA (MANOVA) was performed using *XLSTAT2014* software. The non-parametric RF analysis featured 1000 trees and 2 predictor variables selected at each node following tuning. The dataset was randomly split into training and test sets comprising *ca.* two-thirds and one-third of the samples respectively. The training set was used to construct the RF model and determine an out-of-bag (OOB) error value in order to assess classification performance.

*Metabolomics Distinction of Cigarette Smokers from Non-Smokers Using Non-Stationary… DOI: http://dx.doi.org/10.5772/intechopen.101414*

### **3. Results**

#### **3.1 Outline of <sup>1</sup>H NMR analysis results: detection and quantitative determination of salivary biomolecules at 60 MHz operating frequency**

**Figure 1** shows the 60 MHz <sup>1</sup>H NMR profile of a typical human salivary supernatant (HSS) sample obtained with 384 scans, an acquisition which took 60 min. This spectrum contains clear <sup>1</sup>H resonances assignable to acetate (signal 7) and methanol (signal 14), which are both ascribable to their -CH3 groups. Further prominent resonances in the spectra acquired were those of propionate (both -CH3 and -CH2 functions, signals 1 and 8 respectively) and glycine (α-CH2 protons, signal 15), together with that of the single <sup>1</sup>H NMR-detectable proton (H-CO2<sup>−</sup>) of formate (signal 20). Further, albeit weak, signals were those assignable to dimethyl- and trimethylamine, the terminal-CH2 groups of amine species such as lysine and 5-aminovalerate, and the >N-CH3 group of creatinine/creatine, together with two aromatic resonances (one a composite phenylalanine/tyrosine one), although all these biomolecules predominantly had salivary concentrations below their LLOQ values. A full list of all resonances identified in the 60 MHz <sup>1</sup>H NMR profiles of HSS samples analysed is provided in **Table 1**.

Acceptable benchtop 60 MHz spectra were also obtained with only 64 scans, which involved a 10 min acquisition time, although all <sup>1</sup>H NMR resonances therein had notably lower signal-to-noise (STN) ratios, as expected. All carboxylic acid anions detectable represent oral microbial catabolites, and the simultaneous <sup>1</sup>H NMR measurement of their salivary concentrations in this manner may offer significant potential for providing valuable diagnostic and prognostic screening information to dental surgeons and oral healthcare specialists alike, especially for conditions such as dental caries and periodontal diseases, as previously noted in studies performed with conventional specialist NMR laboratory-based medium- and high-resolution 400 and 600 MHz operating frequency spectrometers respectively [19, 20]. Moreover, detectable amino acids such as glycine are potentially derived from the actions of proteolytic bacteria on salivary proteins, although there are, of course, also host sources of this metabolite [21], as indeed there are for some of the organic acid anions, such as acetate.

#### **Figure 1.**

*Experimental 60 MHz <sup>1</sup>H NMR profile of a HSS sample acquired with 384 scans. Numerical assignment codes correspond to those in Table 1. TSP represents the 3-(trimethylsilyl)propionate-2,2,3,3-d4 chemical shift reference and internal quantitative <sup>1</sup>H NMR standard, and H2O/HOD the residual water signal.*


*Abbreviation: 5-AV, 5-aminovalerate.*

*\* Indicates resonances of molecules which may also arise from exogenous sources, e.g., propane-1,2-diol and methanol from tobacco smoking [23], and ethanol from alcoholic beverage ingestion [19].*

#### **Table 1.**

*Assignments for resonances present in the 60 MHz <sup>1</sup>H NMR spectra shown in Figure 1 (coupling patterns for these are also provided). Resonances highlighted in red are visible in both LF (60 MHz) and MF (400 MHz) <sup>1</sup>H NMR profiles, whereas those in blue are observable but not readily quantifiable at the lower operating frequency in view of overlap with or close localization to other biomolecule signals, or being below the lower limits of quantification (LLOQ) values for their assigned metabolites. The identities of selected signals were confirmed via reference to [22].*

Although the 60 MHz spectral profiles of HSS specimens investigated are largely dominated by the highest intensity resonances therein (i.e. those assigned to biomolecules of the highest salivary concentrations such as acetate and propionate), this technique also lent itself to the assignment of lower intensity signals, and the quantification of salivary metabolites present at significantly lower concentrations. Indeed, the singlet resonances of formate, methanol and glycine were clearly resolved from potential overlapping signals, and therefore appeared suitable for quantification purposes. As previously documented [23], one major source of salivary methanol in humans is the ingestion of cigarette smoke. Moreover, in some HSS specimens, the lactate-CH3 function doublet resonance at δ = 1.33 ppm was also clearly visible in 60 MHz spectra, most especially those with quite high millimolar salivary levels. Although highly variable, reported mean levels for salivary lactate in adults are 0.1–20.3 mmol/L [24]. However, our previously reported overall mean lactate concentration in saliva samples collected from a pre-fasting healthy control population was found to be 13.3 mmol/L, although those of replicate samples from n = 20 participants also varied substantially, i.e. from 0.08 to 100.9 mmol/L [18].

TSP was present in HSS analyte solutions at an added level of only 238 μmol/L (i.e. a 238/9 = 26.44 μmol/L single <sup>1</sup>H nucleus equivalent value), and because this was also one of the most intense signals present in the spectra acquired (*s*, δ = 0.00 ppm), concentrations of <60 μmol/L were also readily visible in spectra of chemical model systems containing this internal standard. However, since this intense resonance arises from a total of 9 protons (i.e. 3 equivalent Si-CH3 units), it is conceivable that many salivary and perhaps other biofluid metabolites containing only single, or one or more magnetically-distinguishable -CH3 functions with singlet resonances (for example, acetate) are clearly detectable and potentially also quantifiable at concentrations of ≤150 μmol/L in this complex, multicomponent biofluid matrix, although further investigations are required to explore this. Moreover, for such -CH3 function-containing analytes, a LLOD value of *ca.* 100 μmol/L was estimated for HSS samples (i.e. 3 × the mean spectral noise intensities at their specified chemical shift values in water blank solution spectra acquired under the same experimental conditions, or a STN value of 3). Such LLOD values will also be influenced by further factors, such as *T*<sub>2</sub>-dependent resonance line-widths (the potential influence of which will be much greater at LFs), the dependence of noise on chemical shift values, saturation effects, digital resolution and baseline slants. Notwithstanding, further key experiments are required to determine LLOQ values (with STN = 10) for key biomolecules present within LF salivary spectra, particularly oral disease-linked biomarkers of interest.

The relaxation delay employed for acquisition of LF 60 MHz <sup>1</sup>H NMR spectra, i.e. 10 or 15 s, did not give rise to any significant differences in the estimated salivary concentrations of acetate, propionate, formate, glycine and methanol (paired sample t-tests performed on untransformed datasets, n ≥ 10 matched spectra for each biomolecule). These data indicated that a relaxation delay of 10 s was sufficient for full *T*<sub>1</sub> relaxation of the <sup>1</sup>H environments involved.

#### **3.2 Comparative evaluations of the <sup>1</sup>H NMR profiles of human salivary supernatants at 60 and 400 MHz operating frequencies**

All resonances detectable in the above 60 MHz <sup>1</sup>H NMR profiles were, of course, also readily detectable in corresponding 400 MHz spectra, and as expected, median resonance STN values were much greater, and hence corresponding LLOQ and LLOD parameters were substantially lower with this MF spectrometer. Indeed, signals arising from metabolites with detectable but not quantifiable signals in the 60 MHz profiles acquired, for example the malodorous amines dimethylamine (DMA) and trimethylamine (TMA), and the amino acids phenylalanine and tyrosine, were readily quantifiable at the 400 MHz operating frequency. Additionally, in view of the much improved spectral resolution and quality, metabolites which were only detectable but unquantifiable, or completely undetectable at 60 MHz, were also detectable and predominantly quantifiable at MF, and these included leucine (-CH3 (*t*), δ = 0.96 ppm); valine (-CH3s (both *d*), δ = 0.98 and 1.03 ppm); alanine (-CH3 (*d*), δ = 1.48 ppm); glutamate (γ-CH2 (*m*), δ = 2.34 ppm); glutamine (γ-CH2 (*m*), δ = 2.44 ppm); taurine (-CH2NH3<sup>+</sup> and -CH2SO3<sup>−</sup> (both *t*), δ = 3.23 and 3.47 ppm respectively); *n*-butyrate (-CH3, -CH2 and -CH2CO2<sup>−</sup> (*t*, *m* and *t*), δ = 0.94, 1.55 and 2.15 ppm respectively); 2,2-dimethylsuccinate (-CH3s (*s*), δ = 1.22 ppm); 3-D-hydroxybutyrate (-CH3 (*d*), δ = 1.24 ppm); lactate (-CH3 and -CH (*d* and *q*), δ = 1.33 and 4.13 ppm respectively); 5-aminovalerate (β/γ-, α- and δ-CH2s (*m*, *t* and *t*), δ = 1.63, 2.23 and 3.05 ppm respectively); pyruvate (-CH3 (*s*), δ = 2.39 ppm); succinate (-CH2s (*s*), δ = 2.405 ppm); choline (-N(CH3)3<sup>+</sup> head group (*s*), δ = 3.21 ppm); ethanol (-CH3 and -CH2OH (*t* and *q*), δ = 1.18 and 3.66 ppm respectively); carbohydrates such as glucose and sucrose (C1H anomeric protons located at δ = 4.66 and 5.25 ppm for the former (both *d*), and the glycosidic proton at δ = 5.41 ppm (*d*) for the latter), where detectable; dihydroxyacetone (-CH2OH (*s*), δ = 4.46 ppm); N-acetylsugars and N-acetylamino acids, both high- and low-molecular mass (broad and narrow -NHCOCH3 signals respectively (*s*), δ = 2.01–2.08 ppm); aromatic amino acids (aromatic ring resonances of tyrosine (2 *d*, δ = 6.88 and 7.25 ppm), phenylalanine (3 *m*, δ = 7.32, 7.38 and 7.43 ppm) and histidine (2 *s*, δ = 7.07 and 7.81 ppm)); and those assignable to a number of pyrimidine, or nicotinate and nicotinamide pathway metabolite(s).

Plots of salivary acetate, glycine and methanol concentrations determined on the LF 60 MHz NMR facility *versus* those obtained on the MF 400 MHz instrument were all linear (r = 0.990, 0.987 and 0.995 respectively), and these results confirmed good-to-excellent correlations and agreements between these two methods of <sup>1</sup>H NMR analysis for these biomolecules. However, this correlation was found to be less strong for formate (r = 0.927). Moreover, for propionate, despite a strong linear correlation between these two forms of NMR analysis (r = 0.973), the 95% confidence interval (CI) for its regression coefficient (1.22–1.41) lay significantly above the 1.00 value expected for good agreement between these values. This observation is explicable by potential interferences arising from further <sup>1</sup>H NMR resonances located within the δ = 0.92–1.18 ppm chemical shift range spanned by the propionate-CH3 signal at 60 MHz (15.3 Hz in total: *J* = 7.67 Hz for this signal). These potential interfering signals are those assignable to the terminal-CH3 functions of both long- and alternative short-chain fatty acids (including that of *n*-butyrate at δ = 0.94 ppm), and those of branched-chain amino acids such as valine. Therefore, the apparent propionate concentration determined at 60 MHz was significantly inflated by a mean value of approximately 1.32-fold over those determined at the more conventional 400 MHz operating frequency.
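The reason why the propionate-CH3 triplet is so much more exposed to overlap at 60 MHz follows from the conversion between Hz and ppm, which can be sketched as below. The function name is hypothetical; the coupling constant is that quoted above, and the nominal operating frequencies of 60 and 400 MHz are used.

```python
def hz_to_ppm(width_hz, operating_frequency_mhz):
    """Convert a frequency width in Hz to ppm at a given 1H operating frequency."""
    if operating_frequency_mhz <= 0:
        raise ValueError("operating frequency must be positive")
    return width_hz / operating_frequency_mhz


if __name__ == "__main__":
    coupling_hz = 7.67                 # J for the propionate-CH3 triplet
    triplet_span_hz = 2 * coupling_hz  # outer-line separation of a triplet
    for frequency_mhz in (60.0, 400.0):
        span_ppm = hz_to_ppm(triplet_span_hz, frequency_mhz)
        print(f"{frequency_mhz:.0f} MHz: {span_ppm:.3f} ppm")
```

The same 15.3 Hz span therefore occupies roughly 0.26 ppm at 60 MHz, but only about 0.04 ppm at 400 MHz, so far fewer neighbouring resonances fall within it at the higher operating frequency.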

In order to explore the significant LF spectral overestimations of propionate further, the complete chemical shift span of its -CH3 function triplet at 60 MHz (0.92–1.18 ppm) was integrated in the corresponding 400 MHz spectra acquired, and apparent propionate concentrations obtained in this manner were then plotted against those determined at 60 MHz (**Figure 2**). As expected, this plot was found to display a much-improved agreement between the two estimated concentration values, with 95% CI values for the regression coefficient and y-intercept covering unity and 0.00 respectively (r = 0.985). However, a comparison of these bioanalytical calibration plots indicated only a relatively marginal interference from the above potentially overlapping, albeit minor, signals for the direct determinations of propionate at 60 MHz when its salivary level was *ca.* ≤1.0 mmol/L. Therefore, for the LF NMR determination of salivary propionate, one possible solution is that samples with concentrations greater than this value are first diluted to levels close to or below this limit in order to facilitate its quantification.


**Figure 2.**

*Comparison of a plot of the estimated salivary concentrations of propionate determined at 60 MHz with that from spectra acquired at 400 MHz for its -CH3 function resonance encompassing its chemical shift span, i.e. 2 × 7.67 = 15.34 Hz at both operating frequencies, equivalent to 0.256 and 0.038 ppm at 60 and 400 MHz respectively (red data-points and regression line), with that obtained from employment of the corresponding 60 MHz 0.256 ppm bucket span for integration purposes at 400 MHz, i.e. 0.92–1.18 ppm (blue data-points and regression line).*

Nevertheless, a novel statistical approach was employed for determining the maximal concentration determinable at this operating frequency. Firstly, an alternative ANCOVA model (model 2) was designed and employed for statistical analysis of the non-smoking participant group only (Eq. (2)), and this included a consideration of variance contributions from differential ages, genders, participants and spectrometer operating frequencies, plus the age × gender first-order interaction effect. Secondly, *p* values for the statistical significance of the fixed 'between-operating frequencies' (O*k*) effect were isolated for a series of these ANCOVA models arising from sequential removal of the highest or highest remaining estimated salivary propionate concentration, i.e. [propionate]*max* (at 60 MHz), starting from a prior sample size of n = 30 observations (i.e. with coupled 60 and 400 MHz <sup>1</sup>H NMR determinations for each of n = 15 participants), down to only n = 6 (with coupled measurements made for only n = 3 participants); as expected, these *p* values increased with decreasing sample size, i.e. the degree of statistical significance of the difference between the two operating frequencies tested decreased. Thirdly, −log10 transformations of these *p* values were then plotted as a function of [propionate]*max* value (**Figure 3**), and the point at which this curve crossed the ordinate value corresponding to *p* = 0.05 served to provide an estimate of the limit of the latter value for bioanalytical purposes, i.e. that at which the difference observed between salivary concentrations determined at 60 and 400 MHz operating frequencies became statistically insignificant at the 5% level. This limit was therefore estimated as 1.2 mmol/L for salivary propionate concentrations determined at 60 MHz, a value similar to the 1.0 mmol/L limit proposed above.

#### **Figure 3.**

*(a) Polynomial plot of the −log10* p *value obtained for the significance of the 'between-operating frequencies' mean square of the ANCOVA model of Eq. (2) as a function of decreasing maximal salivary propionate concentration ([propionate]*max*) with sample size, the latter diminishing via the sequential removal of the [propionate]*max *value (from n = 15 to n = 3 matched duplicate samples, one determination made at 60 MHz and one at 400 MHz <sup>1</sup>H NMR operating frequencies). The horizontal black line represents the −log10* p *index arising from a* p *value of 0.05, i.e. the minimal level required for statistical significance; its crossing with the blue polynomial plot yields a [propionate]*max *value of 1.2 mmol/L. The quadratic equation fitted to the experimental data was −log10* p *= −0.603 + 1.261[propionate]*max *+ 0.226[propionate]*max*<sup>2</sup> (R<sup>2</sup> = 0.9795), which was an improved fit over that obtained with a standard linear relationship.*

#### **3.3 PCA and MANOVA assessments of the bioanalytical precision of metabolite determinations at 60 MHz operating frequency**

Subsequently, principal component analysis (PCA) was utilized to monitor the precision of duplicate, 'between-assay' sample analyses with this facility in a model containing 5 LF NMR-detectable salivary metabolites with quantifiable concentrations (acetate, propionate, formate, glycine and methanol). For this purpose, n = 9 duplicate sets of samples were randomly drawn from the tobacco-smoking group, and analyzed in different assay batches conducted with LF 60 MHz spectral acquisitions made on separate work-days. With the exception of two sets of matched analyses, there was a good agreement of both PC1 and PC2 scores obtained for all duplicate samples analyzed (**Figure 4**). Moreover, MV analysis-of-variance (MANOVA) of these experimental data found that the 'between-replicates' and the participant × replicate interaction effects were both statistically insignificant (*p* = 0.80 and 0.97 respectively), although there was a very highly significant difference observed 'between-participants' (*p* = 5.50 × 10<sup>−4</sup>), as expected.


#### **Figure 4.**

*PCA scores plot of PC2* versus *PC1 for duplicate LF NMR determinations of 5 metabolites in saliva specimens collected from n = 9 tobacco smoking participants at an operating frequency of 60 MHz. This plot demonstrates that, with the exception of samples 1 and 4, there was a satisfactory agreement between the repeated determinations. Abbreviations: n\_1 and n\_2 refer to the first and second duplicate determinations made on sample n, and so on.*

#### **3.4 UV evaluations of metabolic differences between smoking and non-smoking participants**

Mean ± SEM salivary levels of acetate, propionate, formate, glycine and methanol determined using the LF 60 MHz benchtop spectrometer for both the non-smoking and smoking groups are provided in **Table 2**. Univariate statistical analyses of these data revealed that the salivary concentrations of methanol, glycine and acetate were all significantly higher in smokers, the substantial upregulation in methanol observed being concordant with results previously acquired by Percival et al. [23].
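For clarity, the mean ± SEM summary statistic reported for each group can be computed as in the following minimal sketch, which assumes the usual definition of the SEM as the sample standard deviation divided by the square root of the number of observations. The function name and example concentrations are hypothetical.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of observations."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    mean_value = statistics.mean(values)
    sem_value = statistics.stdev(values) / math.sqrt(len(values))
    return mean_value, sem_value


if __name__ == "__main__":
    # Hypothetical metabolite concentrations (mmol/L) for one group.
    mean_value, sem_value = mean_and_sem([1.0, 2.0, 3.0, 4.0])
    print(f"{mean_value:.2f} ± {sem_value:.2f}")
```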

The above model 1 experimental design featured the 'between-age' (A*i*), 'between-gender' (G*j*) and 'between-smoking status' (S*k*) main factors (fixed effects), the 'between-participants' random effect (P(*k*)*l*), and the AG*ij*, AS*ik* and GS*jk* first-order interaction effects. Results from this factorial analysis are shown in **Table 3**. Overall, the participant age factor was not significant for any metabolite; the gender predictor variable was only close to statistical significance for glycine, with males having higher concentrations than females; and cigarette smoking exerted highly significant effects on acetate, glycine and, of course, methanol. For the first-order interactions investigated, only the GS*jk* effect was statistically significant, and only for acetate and glycine, i.e., non-additive responses to the four differential gender-smoking status combinations were observed. The random s<sub>P</sub><sup>2</sup> effect confirmed very highly significant differences 'between-participants' for all metabolites investigated.


*Ϯ Propionate levels were determined at an operating frequency of 400 MHz in view of resonance overlap complications observed at 60 MHz.*

*ϮϮ At an operating frequency of 60 MHz, this agent was <sup>1</sup>H NMR-quantifiable in n = 3 samples only in the non-smoking participant cohort.*

*\** p *< 0.05 for differences between the mean values of non-smokers and smokers; \*\** p *< 10<sup>−3</sup> (test performed for determining the significance of the 'between-smoking status' mean square in the model 1 ANCOVA design delineated in Eq. (1)).*

#### **Table 2.**

*Mean ± SEM concentrations of salivary biomolecules determined by LF 60 MHz <sup>1</sup>H NMR analysis of HSS samples in non-smoking and smoking sampling groups (sample sizes were n = 33 and 19 for these groups respectively; ranges are provided in brackets).*


*Abbreviations: s<sub>P</sub><sup>2</sup>, 'between-participant' component of variance; ns, not significant.*

*\* As expected, this difference was statistically significant for only the smoking participants and not the non-smoking group, and this* p *value corresponds to that cohort only.*

#### **Table 3.**

*Statistical significance (*p *values) of all main sources of variation (both fixed and random), and first-order interaction effects, from ANCOVA analysis of the model 1 salivary metabolite dataset.*
