**2. Application of chemoinformatic antimalarial databases: case of natural products from Panama**

#### **2.1 Preparation, curation, and processing of the data set**

In this chapter, we present a chemoinformatic analysis of natural products with in vitro antimalarial activity, expressed as pIC50 against sensitive and resistant strains. Databases of natural products with antimalarial activity (NPAs) were constructed in-house by reviewing published articles and including those compounds that had been isolated and characterized by nuclear magnetic resonance (NMR) spectroscopy. Around 1312 compounds were compared with 8 reference data sets: an open database, DrugBank (antimalarial drugs), European Bioinformatics Institute ChEMBL drug indications (antimalarial activities), Open Source Drug Discovery (OSDD) Malaria, Malaria Box (Medicines for Malaria Venture (MMV)), St. Jude Children's Research Hospital (St. Jude), Novartis (GNF Malaria Box), and the GlaxoSmithKline (GSK) Tres Cantos antimalarial set. All data sets were curated using the "Wash" function implemented in the Molecular Operating Environment (MOE 2018.0101) software [64]. The structures of the studied compounds were represented in simplified molecular input line entry system (SMILES) notation, yielding 20,364 unique molecules, which are summarized in **Table 1**. The difference between the initial and unique compound counts arises during data preparation (the curation process): duplicate compounds are eliminated, compounds carrying positive or negative charges are neutralized by adjusting their protonation states, metals are disconnected, and the energy is minimized using the MMFF94 molecular mechanics force field. Data curation therefore reduces the initial number of molecules in the databases evaluated in this work.
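The effect of curation can be checked by tallying the per-database counts reported in Table 1. The following is a minimal sketch, not part of the original workflow: the short database labels and the function names are illustrative, and the pairing of each count with a database assumes the row order of Table 1.

```python
# Per-database compound counts from Table 1 as (initial, unique).
# Labels are shortened for illustration; pairing counts with labels
# assumes the row order of Table 1.
TABLE_1 = {
    "NPAs (in-house)": (1353, 1312),
    "DrugBank": (26, 4),
    "EBI ChEMBL drug indications": (27, 24),
    "OSDD Malaria": (93, 88),
    "MMV Malaria Box": (124, 124),
    "St. Jude": (1478, 1478),
    "Novartis-GNF Malaria Box": (4878, 4868),
    "GSK Tres Cantos": (12470, 12466),
}


def tally(table):
    """Return (total_initial, total_unique, total_removed) over all databases."""
    total_initial = sum(initial for initial, _ in table.values())
    total_unique = sum(unique for _, unique in table.values())
    return total_initial, total_unique, total_initial - total_unique


def removed_per_database(table):
    """Return the number of compounds dropped by curation for each database."""
    return {name: initial - unique for name, (initial, unique) in table.items()}


if __name__ == "__main__":
    total_initial, total_unique, removed = tally(TABLE_1)
    print(f"initial={total_initial} unique={total_unique} removed={removed}")
    for name, count in removed_per_database(TABLE_1).items():
        print(f"{name}: {count} removed")
```

The unique counts sum to 20,364, which matches the total quoted in the text. The initial counts sum to 20,449, so curation removed 85 entries overall, most of them from the in-house NPA and DrugBank sets.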

| Databases | Initial compounds | Unique compounds | Source |
|---|---|---|---|
| Natural Products Antimalarial (NPAs) | 1,353 | 1,312 | Databases of NP in house |
| DrugBank Version 5.0 (Drug Antimalarial) | 26 | 4 | https://www.drugbank.ca |
| European Bioinformatics Institute (CHEMBL Drugs Indications) (Antimalarial activities) | 27 | 24 | https://www.ebi.ac.uk/chembl |
| Open Source Drug Discovery (OSDD) Malaria | 93 | 88 | http://opensourcemalaria.org/ |
| Malaria Box, Medicines for Malaria Venture (MMV) | 124 | 124 | https://www.ebi.ac.uk/chemblntd |
| St. Jude Children's Research Hospital | 1,478 | 1,478 | https://www.ebi.ac.uk/chembl/malaria/source |
| Novartis-GNF Malaria Box | 4,878 | 4,868 | Available in: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3941073/. Available in: https://www.ebi.ac.uk/chemblntd |
| GlaxoSmithKline Tres Cantos Antimalarial | 12,470 | 12,466 | Open Source Malaria (GSK-TCMDC). Available in: https://www.ebi.ac.uk/chemblntd |

**Table 1.** *Databases analyzed with chemoinformatic tools.*

*Chemoinformatic Approach: The Case of Natural Products of Panama DOI: http://dx.doi.org/10.5772/intechopen.87779*

#### **2.2 Molecular descriptors**

The descriptors of physicochemical properties, hydrogen bond acceptors (HBAs), hydrogen bond donors (HBDs), number of rotatable bonds (NRBs), the octanol/water partition coefficient (logP), topological polar surface area (TPSA),

