*Sino-Nasal and Olfactory System Disorders*

solid materials and some are toxic to fish and other forms of aquatic life. Very low concentrations of these molecules have unfavorable effects on the taste and odor of water and fish [3].

All phenolic compounds can be considered important parameters of the organoleptic (color, flavor, and aroma) and nutritional qualities of food products. The phenolic compounds that contribute to plant aromas are relatively simple volatile compounds whose odors can be pleasant or unpleasant. Vanilla, for example, is the most popular aroma in the world, and its production is estimated at 1500 tons per year [4]. Approximately 250 compounds are responsible for vanilla aroma, and among these are about 20 phenolic compounds, the most abundant of which are vanillin, p-hydroxybenzaldehyde, and vanillic acid [5]. The spices we use to enhance the taste and flavor of food contain volatile compounds characterized by the presence of a methoxyl group. 4-Vinyl guaiacol is responsible for the unpleasant odors that occur during the manufacture and storage of citrus juices (orange and grapefruit in particular). This compound is formed by the degradation of ferulic acid, and the quality of orange juice aroma is directly related to changes in the free ferulic acid and 4-vinyl guaiacol contents [6]. These two compounds are also produced during the thermal degradation of lignin. With their derivatives (4-methyl guaiacol, 4-ethyl guaiacol, vanillin, vanillic acid, etc.), they are at the origin of the aroma developed by the smoking techniques used in meat and fish preservation [7].

Some alkylated phenols represent another group of compounds with a consistently weak odor. In addition, some individual odorants in this group have been described in several studies as having various sensory properties. Because of their evidently high odor potency, the odor thresholds of the alkylated phenols have been extensively evaluated.

Multidimensional quantitative structure-activity/property relationship (multidimensional-QSAR/QSPR) analysis is a computational method used to predict the biological activities or chemical properties of existing or hypothetical chemical compounds. With continuous development, multidimensional-QSAR/QSPR analyses have achieved notable results in diverse fields, such as toxicology and medicinal chemistry [8, 9]. Thanks to rapid progress in computer science and theoretical studies, molecular information (chemical descriptors) of compounds can now be computed quickly and accurately. The chemical descriptors used in the construction of QSAR/QSPR models can increase their interpretability and can predict the activity/property of new molecules [10].

The release of odorant molecules from a solid or liquid medium and their passage into the vapor phase is the first step before any perception, which results from the activation of the olfactory receptors present in the nasal cavity followed by a series of complex neurophysiological reactions that encode a particular smell. For this reason, in this study a series of 29 volatile alkylated phenols, including monoalkylated phenols and di- and trimethylphenols, was subjected to quantitative structure-retention relationship (QSRR) studies. We developed two- and three-dimensional quantitative structure-retention relationships (2D- and 3D-QSRR) for this series of 29 phenol-based odorant molecules. The 2D-QSRR model was constructed using 28 descriptors. The 3D-QSAR/QSPR models were constructed using comparative molecular field analysis (CoMFA) [11], a tool that collects and interprets complex data from a series of bioactive molecules to construct computational models that correlate chemical properties with biological activity/property [12]. Through this approach, the molecular features responsible for the retention property of the investigated compounds (alkylated phenols) were identified using the CoMFA contour plots. Furthermore, the statistical consistency of the developed models was evaluated on the basis of their correlation ability for the training set, as well as their predictive ability for the test set.

### *2.1.1 Data set*

The reliability of a 2D-QSRR analysis depends on the available data set, the method of analysis, and the validation. In the present analysis, a series of 29 alkylated phenols with measured linear retention indices was taken from the literature [14]. As reported by Czerny et al. [14], the high-resolution GC/O (HRGC/O) analyses were performed with a type 5160 gas chromatograph (Carlo Erba) using a DB-1701 column. To carry out the 2D-QSRR analysis, 24 molecules were selected to build the quantitative model (training set), and 5 randomly selected compounds that were not used in the training set served to test the performance of the proposed model (test set). **Table 1** shows the studied compounds and their experimental linear retention index (LRI) values.
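The 24/5 partition described above can be sketched as follows. This is only an illustration: the function name and the seed are hypothetical, the compounds are simply numbered 1 to 29, and the resulting test set will not match the compounds actually assigned to the test set in Table 1.

```python
import random


def split_train_test(compound_ids, n_test, seed):
    """Randomly hold out n_test compounds; the rest form the training set."""
    rng = random.Random(seed)  # fixed seed (an assumption) makes the split reproducible
    test = sorted(rng.sample(compound_ids, n_test))
    held_out = set(test)
    train = [c for c in compound_ids if c not in held_out]
    return train, test


# Compounds numbered 1..29, as in a table of 29 alkylated phenols
compound_ids = list(range(1, 30))
train_ids, test_ids = split_train_test(compound_ids, n_test=5, seed=0)
print(len(train_ids), "training compounds;", len(test_ids), "test compounds")
```

Keeping the held-out compounds completely out of model fitting is what allows them to serve later as an external check of predictive performance.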


#### **Table 1.** *Alkylated phenols used in this study and their experimental linear retention indices.*

### *2.1.2 Molecular descriptors generation*

Twenty-eight molecular descriptors were calculated using the ACD/ChemSketch and ChemOffice programs [15, 16] in order to examine the correlation between these descriptors and the retention property of the studied compounds and to develop a linear model [17]. The descriptors used in this study are listed in **Table 2**.

### *2.1.3 Statistical analysis*

To explain the structure-property relationship, the 28 descriptors calculated for the 29 molecules with the ChemOffice and ChemSketch software were subjected to stepwise multiple linear regression (MLR) as implemented in the SPSS software [18]. The stepwise MLR was used to predict the retention property values, Log(LRI). The resulting equation was assessed by the correlation coefficient (r), the root mean square error (RMSE), Fisher's F-statistic (F), and the significance level (*P*-value) [19].
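A minimal sketch of the fit statistics named above, assuming ordinary least squares on an already-selected set of descriptors (the stepwise selection performed in SPSS is not reproduced, and the *P*-value is omitted). It also computes a leave-one-out q², in which each compound is predicted by a model fitted to all the others, as in the internal validation used in this chapter. The function names, the toy data, and the RMSE convention (sum of squared residuals divided by n) are illustrative assumptions, not taken from the study.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(X, y):
    """Least-squares fit via the normal equations; returns [intercept, b1, ..., bk]."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


def predict(coef, x):
    return coef[0] + sum(b * v for b, v in zip(coef[1:], x))


def fit_statistics(X, y):
    """Return coefficients, r, RMSE and F for a model with k descriptors."""
    n, k = len(y), len(X[0])
    coef = fit_mlr(X, y)
    mean_y = sum(y) / n
    ss_res = sum((t - predict(coef, x)) ** 2 for x, t in zip(X, y))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    r2 = 1.0 - ss_res / ss_tot
    return {
        "coef": coef,
        "r": math.sqrt(r2),
        "rmse": math.sqrt(ss_res / n),  # assumed convention: divide by n
        "F": (r2 / k) / ((1.0 - r2) / (n - k - 1)),
    }


def loo_q2(X, y):
    """Leave-one-out q2 = 1 - PRESS / SS_tot."""
    n = len(y)
    mean_y = sum(y) / n
    press = 0.0
    for i in range(n):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])  # refit without compound i
        press += (y[i] - predict(coef, X[i])) ** 2
    return 1.0 - press / sum((t - mean_y) ** 2 for t in y)


# Made-up data: two descriptors, six "compounds" (not values from the chapter)
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]]
noise = [0.10, -0.20, 0.05, 0.10, -0.10, 0.05]
y = [1.0 + 2.0 * a + 0.5 * b + e for (a, b), e in zip(X, noise)]

stats = fit_statistics(X, y)
q2 = loo_q2(X, y)
print(stats, q2)
```

Because every left-out compound is predicted by a model that never saw it, q² cannot exceed r², and a large gap between the two is a common warning sign of overfitting.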

The final stage of this 2D-QSRR analysis is statistical validation, which assesses the significance of the model and hence its ability to predict the property of other compounds. In this chapter, the model was validated internally by cross-validation, a family of statistical techniques in which different proportions of chemicals are iteratively held out from the training set used for model development. The leave-one-out procedure was used here: one compound at a time is removed from the training set of 24 compounds, a 2D-QSRR model is built on the remaining 23 molecules, and the removed molecule is predicted by that model. This process is repeated 24 times so that the retention property of every compound is predicted [20].

*2D- and 3D-QSRR Studies of Linear Retention Indices for Volatile Alkylated Phenols*

*DOI: http://dx.doi.org/10.5772/intechopen.89576*

## **2.2 3D-QSRR study**

### *2.2.1 Minimization and alignment*

Chemical structures of the studied compounds were sketched with the sketch module in SYBYL [21] and minimized using the Tripos force field [22] with Gasteiger-Hückel charges [23] and the conjugate gradient method, with a gradient convergence criterion of 0.01 kcal/mol. Simulated annealing on the energy-minimized structures was performed with 20 cycles.

Molecular alignment is one of the most sensitive parameters in 3D-QSRR methods. In this work, all studied compounds were aligned on the common core (compound no. 1) using the simple alignment method in SYBYL [24]. Compound no. 18, which was the most active compound (with the highest Log(LRI)), was used as the template (**Figure 1**).

**Figure 1.** *3D structure of the core (molecule no. 1) and the template (molecule no. 18).*

#### **Table 2.** *Descriptors selected and software packages used in the calculation of descriptors.*

### *2.2.2 CoMFA studies*

Based on the molecular alignment, CoMFA studies were performed to analyze the specific contributions of steric and electrostatic effects. These interactions were calculated using the Tripos force field with a distance-dependent dielectric constant at all intersections of a regularly spaced (2 Å) grid, taking an sp3 carbon atom as the steric probe and a +1 charge as the electrostatic probe. The cutoff was set to 30 kcal/mol [25]. With standard options for scaling of variables, the regression analysis was carried out using the fully cross-validated partial least squares (PLS) method (leave-one-out) [26]. The final non-cross-validated conventional analysis was developed with the optimum number of components to yield a non-cross-validated r² value.

### *2.2.3 Partial least squares (PLS) analysis and validation*

The 3D-QSRR models were generated using a training set of 24 molecules. The predictive power of the resulting models was evaluated using a test set of five molecules (**Table 1**), which were selected randomly. PLS analysis, used here to construct the 3D-QSRR models, is an extension of multiple regression analysis in which the initial variables are replaced by an optimum number of components formed from their linear combinations. The PLS method with the leave-one-out (LOO) cross-validation procedure was used in this work to determine the optimal number of components, considering the cross-validated coefficient rCV for the training set of 24 molecules. The external validation of the models was performed using the five compounds of the test set. The final (non-cross-validated) analysis was carried out using the optimum number of components obtained from the cross-validation analysis to get the correlation coefficient r² [27, 28].

# **3. Results and discussions**

## **3.1 2D-QSRR study**

### *3.1.1 Data set for analysis*

A 2D-QSRR study was carried out for a series of 29 alkylated phenols, as indicated above, to determine a quantitative relationship between the structure and the
