$$\text{PC1} = 0.20\,\text{QN1} + 0.06\,\text{Gap energy} + 0.71\,\text{Mor05u} - 0.68\,\text{MlogP} \qquad \text{(2)}$$

According to this equation, more active nitrofurans are generally obtained when lower values of QN1, gap energy, and Mor05u are combined with higher values of MlogP. Each of these changes lowers the PC1 score.
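As a rough illustration of how the PC1 score is evaluated, the sketch below auto-scales a descriptor column and combines four auto-scaled values with the PC1 loadings listed in **Table 3**. The names `autoscale` and `pc1_score` and the two input vectors are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which one was used.

```python
from statistics import mean, stdev

# PC1 loadings from Table 3, in the order QN1, gap energy, Mor05u, MlogP.
PC1_LOADINGS = (0.20, 0.06, 0.71, -0.68)


def autoscale(column):
    """Center a descriptor column to mean 0 and scale it to unit sample standard deviation."""
    m, s = mean(column), stdev(column)
    return [(x - m) / s for x in column]


def pc1_score(z):
    """PC1 score of one compound from its four auto-scaled descriptor values."""
    return sum(w * v for w, v in zip(PC1_LOADINGS, z))


# Hypothetical auto-scaled compounds: low Mor05u with high MlogP, and the reverse.
low_mor_high_logp = (0.0, 0.0, -1.0, 1.0)
high_mor_low_logp = (0.0, 0.0, 1.0, -1.0)
print(pc1_score(low_mor_high_logp))
print(pc1_score(high_mor_low_logp))
```

In this sketch the first hypothetical compound receives the lower PC1 score, which is the direction the equation associates with higher activity.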

#### *3.2.3.2 HCA model*

The results of the HCA model are displayed in the dendrogram in **Figure 9** and are similar to those of the PCA model. The nitrofurans are fairly well grouped according to their activity: the two clusters (+ and −) in this figure mirror the two classes displayed by the PCA model (**Figure 7**).
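To give a feel for how such a dendrogram groups compounds, here is a minimal sketch of agglomerative clustering on six rows of **Table 2**. It is not the authors' procedure: it uses only two descriptors (Mor05u and MlogP) without auto-scaling, and single linkage with Euclidean distance is an assumption, since the text does not state the linkage or metric. The name `single_linkage` is hypothetical.

```python
from itertools import combinations

# (Mor05u, MlogP) for six rows of Table 2; keys are row positions in the table.
POINTS = {
    1: (-3.966, 1.135), 3: (-2.723, 0.181), 8: (-4.854, 0.334),    # less active
    4: (-6.869, 1.980), 5: (-7.439, 3.155), 12: (-8.435, 3.307),   # more active
}


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def single_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [{name} for name in points]
    while len(clusters) > n_clusters:
        # Single linkage: cluster distance is the smallest pairwise point distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: min(
                sq_dist(points[i], points[j]) for i in pair[0] for j in pair[1]
            ),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a | b)
    return clusters


clusters = single_linkage(POINTS, 2)
print(sorted(sorted(c) for c in clusters))
```

For this small subset, cutting the merge sequence at two clusters separates the three less active rows from the three more active ones.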



#### *3.2.3.3 KNN model*

**Table 4** shows the results of the KNN models built with one (1NN) to four (4NN) nearest neighbors. For all models, the percentage of correct information was 100%. We adopted the 4NN model because the greater the number of nearest neighbors, the better the reliability of the KNN technique; the same model was used to validate the training set from **Figure 2**.
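The idea behind the 1NN to 4NN models can be sketched with a small majority-vote classifier. This is only an illustration under simplifying assumptions: it uses six rows of **Table 2** as a toy training set, only two descriptors (Mor05u and MlogP), raw rather than auto-scaled values, and Euclidean distance. The name `knn_predict` is hypothetical.

```python
from collections import Counter

# (Mor05u, MlogP, class) for six rows of Table 2 used as a toy training set.
TRAIN = [
    (-3.966, 1.135, "LA"), (-2.723, 0.181, "LA"), (-4.854, 0.334, "LA"),
    (-6.869, 1.980, "MA"), (-7.439, 3.155, "MA"), (-8.435, 3.307, "MA"),
]


def knn_predict(train, query, k):
    """Classify query by majority vote among its k nearest training points."""
    by_distance = sorted(
        train,
        key=lambda row: (row[0] - query[0]) ** 2 + (row[1] - query[1]) ** 2,
    )
    votes = Counter(label for _, _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Two further Table 2 rows treated as unknowns: row 23 (MA) and row 14 (LA).
for k in (1, 2, 3, 4):
    print(k, knn_predict(TRAIN, (-6.314, 3.014), k), knn_predict(TRAIN, (-2.872, 0.501), k))
```

With this subset, both held-out rows are assigned to their reported class for every k from 1 to 4.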

*Cheminformatics and Its Applications*

*Molecular Electrostatic Potential and Chemometric Techniques as Tools to Design Bioactive… DOI: http://dx.doi.org/10.5772/intechopen.89113*

#### **Figure 7.**

*Score plots of the first two PCs, PC1 and PC2, responsible for the separation of the 23 nitrofurans (training set) into two classes: (+) more active and (−) less active against T. cruzi.*

#### **Figure 8.**

*Loading vector plots of the first two PCs, PC1 and PC2, for the four variables responsible for the separation of the 23 nitrofurans (training set) into two classes: (+) more active and (−) less active against T. cruzi.*

#### **Figure 9.**

*Dendrogram obtained with the HCA technique for the separation of the nitrofurans into two classes: (+) more active and (−) less active against T. cruzi.*

#### **Table 2.**

*Values for the four most important descriptors which classify the studied nitrofuran compounds, in vitro T. cruzi growth inhibition (experimental data), activity, and correlation matrix.*

| Nitrofuran | QN1 | Gap energy (kcal/mol) | Mor05u | MlogP | % in vitro *T. cruzi* growth inhibition<sup>a,b</sup> | Activity<sup>c</sup> |
|---|---|---|---|---|---|---|
| 1− | 0.201 | 220.9 | −3.966 | 1.135 | 30 | LA |
| 2− | 0.201 | 220.9 | −2.938 | 1.708 | 20 | LA |
| 3− | 0.165 | 220.9 | −2.723 | 0.181 | 32 | LA |
| 4+ | 0.165 | 226.5 | −6.869 | 1.980 | 92.7 | MA |
| 5+ | 0.165 | 225.3 | −7.439 | 3.155 | 83.7 | MA |
| 6+ | 0.169 | 229.7 | −0.016 | 1.708 | 96.2 | MA |
| 7+ | 0.164 | 208.3 | −7.439 | 1.889 | 81.9 | MA |
| 8− | 0.164 | 205.2 | −4.854 | 0.334 | 26.7 | LA |
| 9− | 0.166 | 215.9 | −3.292 | 0.478 | 58 | LA |
| 10+ | 0.166 | 215.9 | −7.470 | 2.146 | 90 | MA |
| 11+ | 0.164 | 208.3 | −5.674 | 1.354 | 87.4 | MA |
| 12+ | 0.164 | 208.3 | −8.435 | 3.307 | 92.3 | MA |
| 13− | 0.167 | 195.2 | −4.338 | 0.751 | 12 | LA |
| 14− | 0.161 | 203.3 | −2.872 | 0.501 | 3 | LA |
| 15− | 0.167 | 208.3 | −4.217 | 0.411 | 30 | LA |
| 16− | 0.167 | 225.3 | −2.373 | 0.609 | 20 | LA |
| 17− | 0.167 | 225.9 | −4.054 | 1.063 | 6 | LA |
| 18+ | 0.167 | 225.3 | −6.339 | 2.001 | 75 | MA |
| 19− | 0.166 | 225.3 | −4.145 | 0.398 | 31 | LA |
| 20− | 0.167 | 226.5 | −4.786 | 0.667 | 35 | LA |
| 21− | 0.167 | 225.3 | −3.398 | 1.157 | 23 | LA |
| 22− | 0.166 | 218.4 | −3.876 | 0.802 | 14 | LA |
| 23+ | 0.166 | 224.6 | −6.314 | 3.014 | 90.5 | MA |

*<sup>a</sup>Inhibitor concentration of 5 μM. <sup>b</sup>Growth inhibition ≥ 75%, more active (MA)<sup>c</sup>, and growth inhibition < 75%, less active (LA)<sup>c</sup>.*

Correlation matrix:

| | QN1 | Gap energy | Mor05u |
|---|---|---|---|
| Gap energy | −0.171 | | |
| Mor05u | 0.27 | −0.006 | |
| MlogP | 0.026 | −0.184 | −0.785 |

#### *3.2.3.4 SDA model*

In the construction of the SDA model, the discrimination functions for the more active and less active groups, respectively, are given below.

Group MA (more active):

$$0.51\,\text{QN1} + 0.43\,\text{Gap energy} + 3.05\,\text{Mor05u} - 1.5\,\text{MlogP} - 0.62 \qquad \text{(3)}$$

Group LA (less active):

$$-0.80\,\text{QN1} - 0.67\,\text{Gap energy} - 4.75\,\text{Mor05u} + 2.34\,\text{MlogP} - 3.92 \qquad \text{(4)}$$

Using the discrimination functions, Eqs. (3) and (4), and the value of each descriptor for the nitrofurans, we obtained the classification matrix for all compounds in the training set (**Table 5**). The classification error was 0.00%, which indicates a satisfactory separation of more active and less active compounds. From the SDA model, an allocation rule was derived for investigating the activity of new nitrofurans against *T. cruzi*: (a) calculate, for the new compound, the values of the most important descriptors obtained in the construction of the SDA model; (b) substitute the auto-scaled values into the two discrimination functions obtained in this work; and (c) check which discrimination function, Eq. (3) or Eq. (4), gives the higher value. The new compound is classified as more active if the function of the more active group gives the higher value, and as less active otherwise.
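Steps (b) and (c) of the allocation rule can be sketched as follows. The code simply evaluates Eqs. (3) and (4) with the coefficients as printed and returns the group with the larger value; the names `discriminant` and `allocate` and the input vector `z_new` are hypothetical, and the input is assumed to be already auto-scaled with the training-set statistics.

```python
# Coefficients of Eqs. (3) and (4) as printed, in the order
# QN1, gap energy, Mor05u, MlogP, followed by the constant term.
MA_FUNCTION = ((0.51, 0.43, 3.05, -1.5), -0.62)     # Eq. (3)
LA_FUNCTION = ((-0.80, -0.67, -4.75, 2.34), -3.92)  # Eq. (4)


def discriminant(function, z):
    """Value of one discrimination function for auto-scaled descriptors z."""
    coefficients, constant = function
    return sum(c * v for c, v in zip(coefficients, z)) + constant


def allocate(z):
    """Return the group whose discrimination function is larger, plus both values."""
    ma, la = discriminant(MA_FUNCTION, z), discriminant(LA_FUNCTION, z)
    return ("MA" if ma > la else "LA"), ma, la


# Hypothetical auto-scaled descriptor vector, not taken from Table 2.
z_new = (0.5, -0.2, 0.1, 0.3)
print(allocate(z_new))
```

Returning both function values alongside the label makes it easy to see how close a compound lies to the boundary between the two groups.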

To check the reliability of the model, the leave-one-out technique was employed. One nitrofuran compound is excluded from the data set, and the remaining compounds are used to build the classification functions. The removed analogue is then classified according to the generated classification functions. In the next step, the omitted compound is returned to the data set and a different nitrofuran is removed; the procedure continues until every compound has been left out once. The results obtained with the cross-validation model are summarized in **Table 6**.
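The leave-one-out loop described above can be sketched generically. To keep the example short, a nearest-centroid classifier is used as a stand-in for the SDA step; it is not the authors' discriminant analysis. The names `nearest_centroid` and `leave_one_out_errors` are hypothetical, and so is the data.

```python
def nearest_centroid(train, query):
    """Stand-in classifier: assign query to the class with the closest mean vector."""
    best_label, best_dist = None, float("inf")
    for label in sorted({lab for _, lab in train}):
        members = [x for x, lab in train if lab == label]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(centroid, query))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_errors(data, classify):
    """Count compounds misclassified when each one is left out in turn."""
    errors = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]  # all compounds except the i-th
        if classify(train, x) != label:
            errors += 1
    return errors


# Hypothetical, well-separated two-descriptor data (not from Table 2).
DATA = [
    ((0.0, 0.0), "LA"), ((0.0, 1.0), "LA"), ((1.0, 0.0), "LA"),
    ((5.0, 5.0), "MA"), ((5.0, 6.0), "MA"), ((6.0, 5.0), "MA"),
]
print(leave_one_out_errors(DATA, nearest_centroid))
```

Because the classifier is passed in as a function, the same loop could wrap any model that is rebuilt from the reduced training set at each step.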

#### **Table 3.**

*Variables matrix for the first three principal components.*

| Variable | PC1 | PC2 | PC3 |
|---|---|---|---|
| QN1 | 0.20 | 0.66 | 0.69 |
| Gap energy | 0.06 | −0.70 | 0.70 |
| Mor05u | 0.71 | 0.11 | −0.10 |
| MlogP | −0.68 | 0.26 | 0.17 |

#### **Table 4.**

*Classification obtained with the KNN technique.*

| Category | Number of compounds | Incorrectly classified (1NN) | Incorrectly classified (2NN) | Incorrectly classified (3NN) | Incorrectly classified (4NN) |
|---|---|---|---|---|---|
| Class: more active | 9 | 0 | 0 | 0 | 0 |
| Class: less active | 14 | 0 | 0 | 0 | 0 |
| Total | 23 | 0 | 0 | 0 | 0 |
| % correct information | | 100 | 100 | 100 | 100 |

#### **Table 5.**

*Classification matrix obtained using the SDA technique.*

| Classification group or class | Number of compounds | True group: more active | True group: less active |
|---|---|---|---|
| Group (class): more active | 9 | 9 | 0 |
| Group (class): less active | 14 | 0 | 14 |
| Total | 23 | 9 | 14 |
| % correct information | — | 100 | 100 |

#### **Table 6.**

*Classification matrix obtained using the SDA technique with cross-validation.*

| Classification group or class | Number of compounds | True group: more active | True group: less active |
|---|---|---|---|
| Group (class): more active | 9 | 9 | 0 |
| Group (class): less active | 14 | 0 | 14 |
| Total | 23 | 9 | 14 |
| % correct information | — | 100 | 100 |

#### *3.2.3.5 SIMCA model*

The SIMCA model was built with the same descriptors as the PCA, HCA, KNN, and SDA models and used two PCs to model the two classes: more active nitrofurans (**4–7**, **10–12**, **18**, and **23**) and less active nitrofurans (**1–3**, **8**, **9**, **13–17**, and **19–22**). The results obtained for the SIMCA model are shown in **Table 7**. In this case, the percentage of correct information was also 100%. Taken together, the PCA, HCA, KNN, SDA, and SIMCA models indicate that the QN1, gap energy, Mor05u, and MlogP descriptors are key properties for explaining the anti-*T. cruzi* activity of the nitrofurans in the training set (**Figure 2**).

As QN1, gap energy, Mor05u, and MlogP were selected in the chemometric modeling as the most important properties for describing the antitrypanosomal activity, some considerations about them may help in understanding the behavior of the more active nitrofurans. According to classical chemical theory, chemical interactions can be classified into two categories: electrostatic (polar) or orbital (covalent). Electrical charges in the molecule are the driving force of electrostatic interactions. It has been demonstrated that local electron densities or charges are important in many chemical reactions, physicochemical properties, and ligand–receptor interactions [89, 90]. Thus, charge-based parameters have been widely employed as chemical reactivity indices or as measures of weak intermolecular interactions. Many quantum–chemical descriptors are derived from the partial charge distribution in a molecule or from the electron densities on particular atoms [91]. From **Table 2**, we can observe that, in general, QN1 takes lower values for the more active analogues than for the less active ones. This indicates that biological processes may occur through electrostatic interactions between the more active nitrofurans and a possible biological receptor.

The gap energy is an important stability index. A large gap energy implies high stability for the molecule, in the sense of lower reactivity in chemical reactions.
