*2.5.3 Molecular complexity and flexibility*

The structural descriptors used to quantify molecular complexity and flexibility, namely the fraction of sp3-hybridized carbons (Fsp3) [23, 58, 63, 70], fraction of chiral centers (CCF) [23, 59, 63, 70], fraction of aromatic atoms (Faro-atm), globularity [60], principal moments of inertia (PMI), normalized principal moments of inertia ratio (NRP) [61, 62], molecular complexity, the Kier shape index, and molecular flexibility, were calculated with the DataWarrior program [69] and MOE 2018.0101 [64]. **Figures 14**–**19** show the distributions of the descriptors used to evaluate molecular complexity and flexibility.
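As a rough illustration of the three fraction-type descriptors, the sketch below counts atoms in a hand-annotated molecule. The `Atom` record, the function names, and the choice of denominators (carbons for Fsp3, heavy atoms for CCF and Faro-atm) are assumptions made for this example; they are not the DataWarrior or MOE implementations, which may normalize differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical minimal atom record (not a DataWarrior or MOE type)."""
    element: str             # element symbol, e.g. "C", "N", "O"
    hybridization: str = ""  # "sp3", "sp2" or "sp"; only read for carbons
    aromatic: bool = False   # True if the atom belongs to an aromatic ring
    chiral: bool = False     # True if the atom is a stereocenter


def _ratio(count, total):
    """Return count/total, or 0.0 when the denominator is zero."""
    return count / total if total else 0.0


def fraction_descriptors(atoms):
    """Compute Fsp3, CCF and Faro-atm under the assumptions stated above."""
    heavy = [a for a in atoms if a.element != "H"]
    carbons = [a for a in heavy if a.element == "C"]
    return {
        # Assumption: sp3 carbons divided by all carbons
        "Fsp3": _ratio(sum(a.hybridization == "sp3" for a in carbons), len(carbons)),
        # Assumption: stereocenters divided by heavy atoms
        "CCF": _ratio(sum(a.chiral for a in heavy), len(heavy)),
        # Assumption: aromatic atoms divided by heavy atoms
        "Faro-atm": _ratio(sum(a.aromatic for a in heavy), len(heavy)),
    }


# 1-Phenylethanol, annotated by hand: six aromatic carbons, one chiral
# sp3 carbon, one methyl carbon and one oxygen (hydrogens omitted).
phenylethanol = (
    [Atom("C", "sp2", aromatic=True)] * 6
    + [Atom("C", "sp3", chiral=True), Atom("C", "sp3"), Atom("O")]
)
descriptors = fraction_descriptors(phenylethanol)
```

For this molecule the sketch gives Fsp3 = 2/8, CCF = 1/9, and Faro-atm = 6/9, which shows how an aromatic-rich, mostly flat compound scores low on the first two metrics and high on the third.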

**Tables 6**–**8** summarize the statistics of the distributions of Fsp3, CCF, and Faro-atm for the NPs and the reference data sets. These results indicate that the NP data set has the highest molecular complexity in terms of Fsp3 (0.63) and CCF (0.16) and a low distribution of Faro-atm (0.67–0.78). In contrast, the GNF, MMV, St. Jude, and GSK DBs are very similar in these three metrics, with values between 0.25 and 0.37, 0.27 and 0.37, and 0.014 and 0.025, respectively. Structural flexibility was evaluated with the shape index; all databases fall in the range of 0.41–0.58, indicating that many of the compounds show sphericity and intermediate molecular flexibility (data not presented).
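The per-database statistics reported in Tables 6–8 (minimum, quartiles, median, mean, maximum, standard deviation) can be reproduced for any list of descriptor values with a few lines of code. The sketch below is only an illustration: the function name, the sample values, and the quartile convention (`method="inclusive"`) are assumptions, since the text does not state which software or quantile definition produced the tables.

```python
import statistics


def summarize(values):
    """Summarize a descriptor distribution; needs at least two values."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles (linear interpolation between points)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(data),
        "q3": q3,
        "max": data[-1],
        "sd": statistics.stdev(data),  # sample standard deviation (n - 1)
    }


# Hypothetical Fsp3 values for five compounds (not taken from the data sets)
example = summarize([0.0, 0.25, 0.5, 0.75, 1.0])
```

Running `summarize` once per database and per descriptor yields one table row per call, which makes it easy to compare the NPs against the reference sets on the same footing.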


*Chemoinformatic Approach: The Case of Natural Products of Panama*

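The descriptors globularity, PMI, and NRP did not prove to be suitable metrics for measuring and differentiating molecular complexity in the data sets evaluated, because the values computed for all data sets were very low (close to zero) and did not differentiate the data sets (data not shown). The high molecular complexity measured for the NPs is in agreement with previous studies using similar metrics [23, 63, 71].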
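For readers unfamiliar with PMI-based shape descriptors, the sketch below shows one common way to obtain the three principal moments of inertia from 3D coordinates and to normalize the two smaller moments by the largest. It is a minimal example under stated assumptions: unit atomic masses unless others are passed, hypothetical function names, and a closed-form eigenvalue routine for symmetric 3×3 matrices. It is not the MOE implementation used in this work.

```python
import math


def inertia_tensor(coords, masses=None):
    """Inertia tensor about the center of mass (unit masses by default)."""
    masses = masses or [1.0] * len(coords)
    total = sum(masses)
    center = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    t = [[0.0] * 3 for _ in range(3)]
    for m, c in zip(masses, coords):
        x, y, z = (c[k] - center[k] for k in range(3))
        t[0][0] += m * (y * y + z * z)
        t[1][1] += m * (x * x + z * z)
        t[2][2] += m * (x * x + y * y)
        t[0][1] -= m * x * y
        t[0][2] -= m * x * z
        t[1][2] -= m * y * z
    t[1][0], t[2][0], t[2][1] = t[0][1], t[0][2], t[1][2]
    return t


def symmetric_eigenvalues(a):
    """Ascending eigenvalues of a symmetric 3x3 matrix (trigonometric form)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted(a[i][i] for i in range(3))
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return sorted([smallest, 3.0 * q - largest - smallest, largest])


def normalized_pmi_ratios(coords, masses=None):
    """Return (I1/I3, I2/I3) with I1 <= I2 <= I3; needs a non-degenerate shape."""
    i1, i2, i3 = symmetric_eigenvalues(inertia_tensor(coords, masses))
    return i1 / i3, i2 / i3


# Idealized test shapes (hypothetical point sets, not real molecules)
rod = normalized_pmi_ratios([(-1, -1, 0), (1, 1, 0)])
disc = normalized_pmi_ratios([(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)])
sphere = normalized_pmi_ratios(
    [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
)
```

The three idealized shapes land approximately at (0, 1), (0.5, 0.5), and (1, 1), the rod, disc, and sphere corners of the usual PMI triangle, so a descriptor of this kind separates data sets only when their compounds occupy different regions of that triangle.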


*DOI: http://dx.doi.org/10.5772/intechopen.87779*


*Cheminformatics and Its Applications*

**Similarity PubChem/Tanimoto coefficient**

| DBs | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|
| GSK | 0.08125 | 0.24500 | 0.37555 | 0.40263 | 0.54002 | 1.00000 |
| NPs | 0.03684 | 0.32298 | 0.43802 | 0.46184 | 0.58621 | 1.00000 |
| OSM | 0.03684 | 0.32340 | 0.43902 | 0.46253 | 0.58730 | 1.00000 |
| MMV | 0.03684 | 0.32444 | 0.44033 | 0.46321 | 0.58791 | 1.00000 |
| St. Jude | 0.03684 | 0.38224 | 0.47143 | 0.47624 | 0.56195 | 1.00000 |
| GNF | 0.00000 | 0.40598 | 0.48117 | 0.47800 | 0.55446 | 1.00000 |

**Table 4.** *The statistical values of the similarity of the Tanimoto coefficient with PubChem.*

| DBs | Number of compounds (M) | Unique chemotypes (N) | FN/M | NSING | FNSING/M | FNSING/N | AUC | F50 |
|---|---|---|---|---|---|---|---|---|
| NPs | 1298 | 629 | 0.4846 | 400 | 0.3082 | 0.6359 | 0.7125 | 0.1685 |
| DBK | 5 | 5 | 1.0000 | 5 | 1.0000 | 1.0000 | 0.4800 | 0.4000 |
| CHEMBL | 24 | 18 | 0.7500 | 16 | 0.6667 | 0.8889 | 0.6072 | 0.3333 |
| OSM | 89 | 39 | 0.4382 | 27 | 0.3034 | 0.6923 | 0.7453 | 0.1025 |
| MMV | 124 | 122 | 0.9839 | 120 | 0.9677 | 0.9836 | 0.5079 | 0.4918 |
| St. Jude | 915 | 479 | 0.5235 | 325 | 0.3552 | 0.6785 | 0.6551 | 0.2474 |
| GNF | 4860 | 3229 | 0.6644 | 2690 | 0.5535 | 0.8331 | 0.7054 | 0.1615 |
| GSK | 12,463 | 6703 | 0.5378 | 5009 | 0.4019 | 0.7473 | 0.6982 | 0.1837 |

**Table 5.** *Summary of the scaffold diversity of the eight databases analyzed in this work.*

*M = number of molecules in the DB, N = number of chemotypes or substructures, FN/M = chemotype diversity fraction, NSING = singleton number, FNSING/M = singleton fraction between total molecules, FNSING/N = fraction of singletons among total chemotypes, AUC = area under the curve, F50 = fraction of chemotypes required to recover 50% of the molecules.*
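The quantities defined in the footnote of Table 5 can be computed from a list that holds one scaffold label per molecule. The sketch below is illustrative only: the function name and the toy labels are hypothetical, and it assumes that the curve behind the AUC is the cumulative recovery curve (fraction of molecules recovered versus fraction of chemotypes, most frequent first) starting at the origin, and that F50 is the smallest chemotype fraction recovering at least 50% of the molecules. Other endpoint conventions exist, so the values need not match Table 5 exactly.

```python
from collections import Counter


def scaffold_diversity(scaffolds):
    """Scaffold diversity metrics from one scaffold label per molecule."""
    if not scaffolds:
        raise ValueError("at least one molecule is required")
    counts = sorted(Counter(scaffolds).values(), reverse=True)
    m = len(scaffolds)                    # M: number of molecules
    n = len(counts)                       # N: number of unique chemotypes
    nsing = sum(1 for c in counts if c == 1)  # chemotypes seen only once

    # Cumulative recovery curve, most populated chemotypes first.
    xs, ys = [0.0], [0.0]                 # assumption: curve starts at (0, 0)
    recovered = 0
    for rank, count in enumerate(counts, start=1):
        recovered += count
        xs.append(rank / n)
        ys.append(recovered / m)

    # Trapezoidal area under the recovery curve.
    auc = sum((xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2.0
              for i in range(1, len(xs)))
    # Assumption: first chemotype fraction that recovers >= 50% of molecules.
    f50 = next(x for x, y in zip(xs, ys) if y >= 0.5)

    return {
        "M": m, "N": n, "FN/M": n / m,
        "NSING": nsing, "FNSING/M": nsing / m, "FNSING/N": nsing / n,
        "AUC": auc, "F50": f50,
    }


# Toy data set: eight molecules spread over four hypothetical scaffolds
toy = scaffold_diversity(["A"] * 4 + ["B"] * 2 + ["C", "D"])
```

An AUC near 0.5 with a high F50 corresponds to molecules spread evenly over many chemotypes, whereas a high AUC with a low F50 indicates that a few scaffolds account for most of the compounds.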

**Figure 13.** *Scaled Shannon entropy of the most frequent scaffolds with values ranging from 10 to 40 in natural products.*

**Figure 14.** *Distribution of the fraction of sp3 hybridized carbons in different databases.*

**Figure 15.** *Distribution of the fraction of chiral centers in different databases.*


**Figure 16.** *Distribution of the fraction of aromatic atoms (Faro-atm) in different databases.*

**Figure 17.** *Shape index distribution of different databases.*

**Figure 18.** *Distribution of the molecular flexibility in different databases.*

**Fraction of sp3 hybridized atoms (Fsp3)**

| DBs | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Std. dev. |
|---|---|---|---|---|---|---|---|
| NPs | 0.000 | 0.481 | 0.636 | 0.656 | 0.833 | 2.000 | 0.254 |
| CHEMBL | 0.167 | 0.342 | 0.536 | 0.621 | 0.627 | 1.333 | 0.374 |
| MMV | 0.000 | 0.167 | 0.300 | 0.316 | 0.402 | 0.800 | 0.190 |
| OSM | 0.000 | 0.174 | 0.255 | 0.277 | 0.338 | 0.893 | 0.145 |
| DBK | 0.250 | 0.438 | 0.519 | 0.463 | 0.545 | 0.565 | 0.175 |
| GNF | 0.000 | 0.227 | 0.364 | 0.377 | 0.500 | 2.667 | 0.207 |
| St. Jude | 0.000 | 0.222 | 0.333 | 0.353 | 0.471 | 1.136 | 0.178 |
| GSK | 0.000 | 0.250 | 0.375 | 0.372 | 0.500 | 1.500 | 0.180 |

**Table 6.** *Distribution of Fsp3 in different databases.*

**Fraction of chiral centers (CCF)**

| DBs | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Std. dev. |
|---|---|---|---|---|---|---|---|
| NPs | 0.000 | 0.033 | 0.139 | 0.161 | 0.267 | 0.656 | 0.145 |
| CHEMBL | 0.000 | 0.000 | 0.036 | 0.128 | 0.141 | 0.533 | 0.192 |
| MMV | 0.000 | 0.000 | 0.000 | 0.014 | 0.000 | 0.111 | 0.028 |
| OSM | 0.000 | 0.000 | 0.000 | 0.008 | 0.000 | 0.286 | 0.035 |
| DBK | 0.000 | 0.000 | 0.019 | 0.020 | 0.040 | 0.043 | 0.024 |
| GNF | 0.000 | 0.000 | 0.000 | 0.025 | 0.040 | 0.556 | 0.053 |
| St. Jude | 0.000 | 0.000 | 0.000 | 0.024 | 0.045 | 0.217 | 0.037 |
| GSK | 0.000 | 0.000 | 0.000 | 0.017 | 0.034 | 0.500 | 0.033 |

**Table 7.** *Distribution of CCF in different databases.*

**Fraction of aromatic atoms (Faro-atm)**

| DBs | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Std. dev. |
|---|---|---|---|---|---|---|---|
| NPs | 0.000 | 0.000 | 0.324 | 0.341 | 0.600 | 1.133 | 0.294 |
| CHEMBL | 0.000 | 0.299 | 0.556 | 0.509 | 0.690 | 1.091 | 0.321 |
| MMV | 0.261 | 0.682 | 0.826 | 0.817 | 0.956 | 1.429 | 0.230 |
| OSM | 0.000 | 0.677 | 0.733 | 0.786 | 0.860 | 1.500 | 0.232 |
| DBK | 0.538 | 0.591 | 0.733 | 0.720 | 0.862 | 0.875 | 0.171 |
| GNF | 0.000 | 0.522 | 0.667 | 0.670 | 0.818 | 1.714 | 0.235 |
| St. Jude | 0.000 | 0.553 | 0.712 | 0.708 | 0.857 | 1.556 | 0.216 |
| GSK | 0.000 | 0.571 | 0.706 | 0.713 | 0.857 | 1.400 | 0.208 |

**Table 8.** *Distribution of fraction of aromatic atoms.*

**Figure 19.** *Distribution of the molecular complexity in different databases.*


