*Cheminformatics and Its Applications*

Computational exploration of NPs has increased in recent years, giving greater relevance to studies that include structural diversity metrics calculated with distance-based parameters such as Euclidean, Manhattan, and cosine distances. Other criteria are based on circular fingerprints (ECFP-4, ECFP-6) [22–24, 38–45] and substructure-based fingerprints (MACCS, PubChem) [22–24, 39–45]. Another metric used for NPs is similarity comparison with the Tanimoto index (Tanimoto coefficient) [22–24, 45–49].

In this study, the molecular scaffolds of natural products have been obtained using the Murcko method [22–24, 50–57]. Meanwhile, molecular complexity is frequently evaluated with 2D descriptors such as the fraction of sp3-hybridized carbons (Fsp3) [23], the fraction of chiral centers (FCC) [23], and globularity [22–24, 58–63].

An update of the Natural Products Database from the University of Panama (UPMA), containing 454 compounds (unpublished data), has been evaluated against different therapeutic targets: cytotoxicity bioassays in cell lines, in vitro antifungal assays, parasites of tropical diseases (*Leishmania* sp., *Plasmodium falciparum*, and *Trypanosoma cruzi*), and a bioassay against the HIV-1 virus, demonstrating an inhibitory effect on protease, reverse transcriptase, nuclear factor NF-κB, and Tat protein, which affects viral replication. These are the most significant biological targets in which the natural products from Panama present bioactivity. The values of their biological activities are represented as percentages in **Figure 1**.

**Figure 1.**

*Biological endpoints and targets in which natural products from Panama present bioactivity.*

## **2. Application of chemoinformatic antimalarial databases: case of natural products from Panama**

In this chapter, we present a chemoinformatic analysis of natural products with antimalarial activities (in vitro), expressed as pIC50 against sensitive and resistant

### **2.1 Preparation, curation, and processing of the data set**

#### **Table 1.**

*Databases analyzed with chemoinformatic tools.*

### **2.2 Molecular descriptors**

The physicochemical descriptors, namely hydrogen bond acceptors (HBAs), hydrogen bond donors (HBDs), number of rotatable bonds (NRBs), the octanol/water partition coefficient (logP), topological polar surface area (TPSA), and molecular weight (MW), along with others such as molar refractivity, are important parameters for quantitative structure-activity relationship (QSAR) analysis. These molecular descriptors underlie Lipinski's rule and Veber's rule for predicting the drug-likeness of orally active compounds [65–67]. The statistical analysis of the physicochemical properties was performed with RStudio 1.0.136 (AGPL) [68].
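As an illustration of how these descriptors feed rule-based filters, the following minimal Python sketch counts Lipinski violations and applies the Veber criteria to precomputed descriptor values. It is not the workflow used in this chapter (which relied on MOE and RStudio); the `Descriptors` class, the function names, and the two example compounds are hypothetical, and the thresholds are the commonly cited ones (MW ≤ 500, logP ≤ 5, HBD ≤ 5, HBA ≤ 10; NRB ≤ 10, TPSA ≤ 140 Å²).

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed physicochemical descriptors for one compound (hypothetical container)."""
    mw: float    # molecular weight (Da)
    logp: float  # octanol/water partition coefficient
    hbd: int     # hydrogen bond donors
    hba: int     # hydrogen bond acceptors
    nrb: int     # number of rotatable bonds
    tpsa: float  # topological polar surface area (square angstroms)


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Lipinski criteria are violated."""
    return sum([d.mw > 500, d.logp > 5, d.hbd > 5, d.hba > 10])


def passes_veber(d: Descriptors) -> bool:
    """Veber criteria: at most 10 rotatable bonds and TPSA of at most 140."""
    return d.nrb <= 10 and d.tpsa <= 140


# Two invented compounds, for illustration only (not taken from any database).
small = Descriptors(mw=350.4, logp=2.1, hbd=2, hba=5, nrb=4, tpsa=78.0)
large = Descriptors(mw=812.0, logp=6.3, hbd=7, hba=14, nrb=15, tpsa=210.0)

for name, d in [("small", small), ("large", large)]:
    print(name, lipinski_violations(d), passes_veber(d))
```

Counting violations rather than returning a single pass/fail flag keeps the choice of tolerance (for example, allowing one violation) with the analyst.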

### **2.3 3D visualization of chemical space of compounds with antimalarial activity**

Principal component analyses (PCAs) were performed with MOE software [64]. The dominant characteristics are expressed as covariance and visualized as 2D or 3D score plots with the DataWarrior program v. 5.0 [69]. **Figures 2**–**8** show the distribution of the compounds with antimalarial activity in chemical space.

**Figures 2**–**8** indicate that NPs, drugs, and synthetic compounds generally occupy similar regions of chemical space and overlap in most of the evaluated databases.
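The idea behind a covariance-based PCA score plot can be sketched in a few lines of plain Python. This is an illustrative stand-in, not the MOE procedure: it builds the covariance matrix of a small descriptor table, extracts the leading components by power iteration with deflation (a simple technique chosen here to avoid external libraries), and projects each compound onto them. The function names and the descriptor rows are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Return (covariance matrix, mean-centered rows) for a list of descriptor rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    return cov, centered


def top_component(cov, iters=500):
    """Leading eigenpair of a symmetric matrix by power iteration."""
    p = len(cov)
    # Non-uniform start vector; it could still be orthogonal to the leading
    # eigenvector in contrived cases, which this sketch does not handle.
    start = [float(i + 1) for i in range(p)]
    norm = math.sqrt(sum(x * x for x in start))
    v = [x / norm for x in start]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left in this matrix
            break
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigval, v


def pca(rows, k=2):
    """Return (eigenvalues, components, scores) for the first k components."""
    cov, centered = covariance_matrix(rows)
    p = len(cov)
    eigvals, components = [], []
    for _ in range(k):
        lam, v = top_component(cov)
        eigvals.append(lam)
        components.append(v)
        # Deflation: remove the variance explained by the component just found.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    scores = [[sum(c[j] * v[j] for j in range(p)) for v in components]
              for c in centered]
    return eigvals, components, scores


# Invented descriptor rows (MW, logP, TPSA), for illustration only.
table = [
    [310.2, 1.8, 65.0],
    [455.7, 3.9, 92.5],
    [289.1, 2.4, 48.3],
    [520.9, 4.6, 120.1],
    [398.4, 3.1, 85.7],
]
eigvals, components, scores = pca(table, k=2)
for row in scores:
    print([round(x, 2) for x in row])  # coordinates for a 2D score plot
```

Because the covariance matrix is computed on unscaled values, descriptors with large numeric ranges (such as MW) dominate the first component; scaling each column to unit variance beforehand is a common alternative.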

### **2.4 Molecular diversity based on fingerprints**

Three binary molecular fingerprints were calculated with the R package rcdk: extended connectivity fingerprints with diameter 4 (ECFP-4) for similarity searching, Molecular ACCess System (MACCS) keys of 166 bits for determining similarity and molecular diversity, and PubChem keys of 881 bits for encoding molecular fragment information [42–44]. The pairwise structural similarity of the compounds was calculated from the fingerprints with the Tanimoto coefficient and analyzed with the cumulative distribution function (CDF). This approach has been used to quantify and represent the molecular diversity of compound data sets [23].
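For a binary fingerprint, the Tanimoto coefficient is the number of bits set in both fingerprints divided by the number of bits set in either. The sketch below shows this on fingerprints represented as sets of on-bit positions; it stands in for the rcdk calculation only conceptually. The function names and the toy fingerprints are hypothetical, and returning 0.0 for two empty fingerprints is a convention assumed here.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as collections of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        return 0.0  # assumed convention for two all-zero fingerprints
    return len(a & b) / union


def pairwise_tanimoto(fingerprints):
    """Tanimoto coefficient for every unordered pair of fingerprints."""
    return [tanimoto(x, y) for x, y in combinations(fingerprints, 2)]


# Toy fingerprints (on-bit positions), invented for illustration.
fps = [
    {1, 5, 9, 42},
    {1, 5, 9, 77},
    {3, 8, 120},
]
print(pairwise_tanimoto(fps))
```

A data set with N compounds yields N(N-1)/2 pairwise values, and it is this distribution that the CDF curves summarize.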

**Figures 9**–**11** show the CDFs of the pairwise Tanimoto similarity of the different data sets evaluated with ECFP-4, MACCS keys, and PubChem fingerprints, respectively.
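A CDF curve of this kind can be read as follows: for a similarity threshold t, the curve gives the fraction of compound pairs whose Tanimoto similarity is at most t, so a curve that rises at lower similarity values corresponds to a more diverse data set. The sketch below builds an empirical CDF from a list of pairwise similarities; the function name `ecdf` and the similarity values are hypothetical.

```python
from bisect import bisect_right
from statistics import median


def ecdf(values):
    """Return a function F where F(t) is the fraction of values less than or equal to t."""
    xs = sorted(values)
    n = len(xs)

    def cdf(t):
        return bisect_right(xs, t) / n

    return cdf


# Invented pairwise similarities, for illustration only.
sims = [0.1, 0.2, 0.2, 0.5, 0.9]
cdf = ecdf(sims)

# Evaluate the curve on a grid of thresholds, as one would to plot it.
for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print(t, cdf(t))
print("median similarity:", median(sims))
```

The median (the threshold at which the curve crosses 0.5) is one compact way to compare the curves of different data sets.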

**Figures 9**–**11** provide information on the structural diversity of the six databases. A similar approach has been published previously [23]. The curves obtained with ECFP-4 showed that it was not a suitable fingerprint representation for these data sets. All three fingerprint-based similarity plots show that the databases of natural products with antimalarial activity, OMS, and MMV have the lowest molecular diversity, while the GSK DB was the most diverse.

**Tables 2**–**4** show summary statistics of the pairwise Tanimoto similarity for the data sets analyzed. The CHEMBL and DrugBank databases are excluded from these tables because of the small amount of data.

**Figure 2.**

*3D visualization of the chemical space of natural product databases.*

**Figure 3.**

*3D visualization of the chemical spaces of natural products and GNF DBs.*

**Figure 4.**

*3D visualization of the chemical space of synthetic compounds.*

*Chemoinformatic Approach: The Case of Natural Products of Panama DOI: http://dx.doi.org/10.5772/intechopen.87779*
