## **4. Protein databases**

A protein database is a body of data derived from physical, chemical and biological information about proteins, covering their sequence, domain structure, function, three-dimensional structure, and protein-protein interactions. Collectively, protein databases serve as a repository of protein sequences and related data. It is therefore important to use suitable protein databases that can store and analyze data relating to protein science and that facilitate the use of the analytical software available to the scientific community. Protein databases can be broadly grouped into two types. The first is the universal type, which covers proteins found in all identified biological species. The second is the specialized type, which deals with proteins belonging to a specific group or family, or to certain species. In addition, each protein database can be further categorized according to the type of information required [79].

#### **4.1 Categories of a protein database**

Since protein datasets are generated by different experimental groups, suitable databases are needed to meet their respective needs. Several types of protein databases are currently accessible to the public, and these can be classified into more specialized categories based on the type of information sought [79].

#### *4.1.1 Protein sequence database*

Proteins are built from 20 different amino acids, and the order of these amino acids along the chain is known as the primary structure of a protein. A database that collects the amino acid sequences of proteins together with related information is termed a protein sequence database. Examples include Swiss-Prot [80], TrEMBL [80], PIR [81] and DDBJ [82].
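As an illustration of how sequence records from such databases are commonly handled, the following minimal sketch parses FASTA-formatted text and checks each sequence against the 20 standard one-letter amino acid codes. The records and the function names `parse_fasta` and `nonstandard_residues` are hypothetical and are not taken from any of the databases cited above.

```python
# Minimal sketch: parse FASTA-formatted text and check that each sequence
# uses only the 20 standard amino acid one-letter codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text):
    """Return a dict mapping each header line (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped; join them into one string.
            records[header] += line.upper()
    return records


def nonstandard_residues(sequence):
    """Return the set of characters that are not standard amino acid codes."""
    return set(sequence) - STANDARD_AMINO_ACIDS


# Hypothetical records for illustration only (not real database entries).
EXAMPLE_FASTA = """>example_1 hypothetical protein
MKTAYIAKQR
QISFVKSHFS
>example_2 hypothetical protein with an ambiguous residue
MKXLLV
"""

records = parse_fasta(EXAMPLE_FASTA)
```

Calling `nonstandard_residues` on each parsed sequence flags entries that contain ambiguity codes such as `X`, which is often a useful first check before further analysis.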

#### *4.1.2 Protein structure databases*

Protein structure governs function, given that the specificity of active sites and binding sites hinges on the exact three-dimensional conformation. Protein structure databases contain information on the three-dimensional and secondary structure of proteins obtained by X-ray crystallography, electron microscopy and NMR. An example is the Protein Data Bank (PDB) [83].
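Structures from the PDB are commonly distributed as text files in which each atom occupies one fixed-column `ATOM` line. The sketch below reads the coordinate columns of such lines and computes a geometric centre. It is a simplified illustration: the helper `format_atom_line` and the coordinates are hypothetical, and only a subset of the `ATOM` columns is handled.

```python
def format_atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build a simplified ATOM line using the PDB fixed-column layout."""
    return (
        f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


def parse_atom_line(line):
    """Extract selected fields from an ATOM line by column position."""
    return {
        "name": line[12:16].strip(),      # columns 13-16: atom name
        "resname": line[17:20].strip(),   # columns 18-20: residue name
        "chain": line[21],                # column 22: chain identifier
        "resseq": int(line[22:26]),       # columns 23-26: residue number
        "xyz": (
            float(line[30:38]),           # columns 31-38: x coordinate
            float(line[38:46]),           # columns 39-46: y coordinate
            float(line[46:54]),           # columns 47-54: z coordinate
        ),
    }


def centroid(atoms):
    """Return the geometric centre of a list of parsed atoms."""
    n = len(atoms)
    return tuple(sum(a["xyz"][i] for a in atoms) / n for i in range(3))


# Hypothetical coordinates for illustration only.
lines = [
    format_atom_line(1, "N", "MET", "A", 1, 1.0, 2.0, 3.0),
    format_atom_line(2, "CA", "MET", "A", 1, 3.0, 4.0, 5.0),
]
atoms = [parse_atom_line(line) for line in lines if line.startswith("ATOM")]
centre = centroid(atoms)
```

In practice an established parser would normally be preferred over hand-written column slicing; the point here is only to show where the three-dimensional information sits in a record.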

*Therapeutic Inhibitors: Natural Product Options through Computer-Aided Drug Design DOI: http://dx.doi.org/10.5772/intechopen.104412*

#### *4.1.3 Protein-protein interaction databases*

A protein-protein interaction database is built from protein-protein interaction information obtained by yeast two-hybrid, co-purification, affinity column chromatography, in vitro binding and immunoprecipitation (IP)/co-immunoprecipitation (Co-IP) methods. Examples include BIND (Biomolecular Interaction Network Database) [84], DIP (Database of Interacting Proteins) [85] and MINT (Molecular INTeraction database) [86].
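Interaction data of this kind are naturally represented as an undirected network. The following minimal sketch, which assumes a list of pairwise interactions between hypothetical proteins `P1` to `P4`, builds such a network and finds the most connected protein. It does not reflect the actual record format of BIND, DIP or MINT.

```python
from collections import defaultdict


def build_interaction_network(pairs):
    """Build an undirected network: protein -> set of interaction partners."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return network


def most_connected(network):
    """Return the protein with the most partners (ties broken alphabetically)."""
    return max(sorted(network), key=lambda protein: len(network[protein]))


# Hypothetical interaction pairs for illustration only.
PAIRS = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]

network = build_interaction_network(PAIRS)
hub = most_connected(network)
```

Highly connected proteins in such a network are often of interest as candidate targets, although connectivity alone says nothing about whether a protein can be inhibited.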

#### *4.1.4 Protein pattern and profile databases*

Motifs can be identified in protein, DNA, and RNA sequences, but the most familiar use of motif-based analysis is the identification of sequence motifs that correspond to structural or functional features in proteins. One of the essential tools of sequence analysis is the use of protein sequence patterns or profiles to infer protein function [87, 88]. An example is InterPro [89].
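A sequence pattern can be searched for with an ordinary regular expression. The sketch below assumes PROSITE-style pattern syntax (not named in the text above, and only a commonly used subset of it) and uses the N-glycosylation site pattern `N-{P}-[ST]-{P}` on a hypothetical sequence. The function names are illustrative.

```python
import re


def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern (common subset) to a Python regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {ABC} means "any residue except A, B or C".
        element = element.replace("{", "[^").replace("}", "]")
        # x(2) or x(2,4) are repetition counts.
        element = element.replace("(", "{").replace(")", "}")
        # x means "any residue"; residue codes are upper case.
        element = element.replace("x", ".")
        # < and > anchor the pattern to the N- or C-terminus.
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


def find_motif(pattern, sequence):
    """Return (0-based start, matched text) for all matches, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical sequence for illustration only.
SEQUENCE = "MNGSAANPTLNWTK"
hits = find_motif("N-{P}-[ST]-{P}", SEQUENCE)
```

The candidate site starting at `NPTL` is rejected because proline follows the asparagine, which is exactly what the `{P}` element of the pattern excludes.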

#### *4.1.5 2-D PAGE databases*

2-D PAGE databases comprise gel image data acquired by two-dimensional gel electrophoresis (2-DE), together with documented data on each gel spot: molecular weight (MW), isoelectric point (pI), a status report on the identified spot location, and cross-reference links [90].
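Because each spot is described by its molecular weight and isoelectric point, a typical query is a search for spots within a tolerance window around expected values. The following sketch assumes a hypothetical in-memory spot list; the field names and tolerances are illustrative and do not correspond to any particular 2-D PAGE database.

```python
# Hypothetical gel spot records for illustration only.
SPOTS = [
    {"spot_id": "S1", "mw_kda": 25.0, "pi": 5.2},
    {"spot_id": "S2", "mw_kda": 26.5, "pi": 7.8},
    {"spot_id": "S3", "mw_kda": 60.0, "pi": 5.1},
]


def find_spots(spots, mw_kda, pi, mw_tol=2.0, pi_tol=0.3):
    """Return IDs of spots whose MW and pI both fall within the tolerances."""
    return [
        spot["spot_id"]
        for spot in spots
        if abs(spot["mw_kda"] - mw_kda) <= mw_tol
        and abs(spot["pi"] - pi) <= pi_tol
    ]


# Look for a protein expected at about 25.5 kDa and pI 5.0.
candidates = find_spots(SPOTS, mw_kda=25.5, pi=5.0)
```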

#### *4.1.6 Metabolic pathway databases*

Metabolic pathway databases provide descriptive data on enzymes, biochemical reactions and metabolic pathways. Examples include BioCyc [91] and MetaCyc [92].

#### *4.1.7 Signaling pathway databases*

Signaling pathway databases are intended to stimulate complementary investigations in individual laboratories and to provide access to essential information on biological signaling pathways. These databases can be classified according to their format, since they contain both graph-type and tree-type data structures.

An example is TRANSPATH [93].
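The graph-type structure mentioned above can be illustrated with a small directed graph in which each edge points from a signaling component to the components it acts on. The pathway below is hypothetical and is not drawn from TRANSPATH; the sketch uses a breadth-first search to list everything downstream of a chosen component.

```python
from collections import deque

# Hypothetical signaling cascade for illustration only.
PATHWAY = {
    "Ligand": ["Receptor"],
    "Receptor": ["KinaseA", "KinaseB"],
    "KinaseA": ["TranscriptionFactor"],
    "KinaseB": ["TranscriptionFactor"],
    "TranscriptionFactor": [],
}


def downstream(pathway, start):
    """Return all components reachable from `start`, in breadth-first order."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in pathway.get(node, []):
            if target not in visited:
                visited.add(target)
                order.append(target)
                queue.append(target)
    return order


affected = downstream(PATHWAY, "Receptor")
```

A query of this kind indicates which components might be affected if the starting component were inhibited.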

With these receptor/protein/enzyme databases and natural product databases, more *in-silico* research aimed at the discovery and development of therapeutic inhibitors from natural products can be initiated. *In-silico* approaches have now become an essential part of the drug discovery process. The use of *in-silico*/computational approaches to discover, develop, and analyze drugs and similar biologically active molecules is referred to as computer-aided drug design.

## **5. Computer-aided drug design/repurposing**

Computer-aided drug design, which commenced in about the early 1970s, is a process where new drug molecules are designed/identified, redesigned or repurposed to bind with a biological target of known or predictable 3D structure and express substantial affinity/specificity [94]. The core purpose of drug design methods is to utilize the receptor/ligand tertiary structures for accelerating the drug discovery process and also repurposing or enhancing the inhibition properties of a ligand, which could act as a therapeutic inhibitor. In performing computer-aided drug design, two approaches can be implemented. The first is structure-based (target-based), while the second is ligand-based (analogue-based) [95].

Methods by which the 3D structure of a protein can be generated include X-ray crystallography, NMR, electron microscopy, or *in-silico* prediction based on homology. Once the 3D structure has been resolved, the protein's binding site or active site is identified. Structure-based drug design methods identify or design an inhibitor whose functional properties are complementary to the protein binding site. These methods include molecular docking and de novo molecular design. Molecular docking techniques assess a molecule's most viable binding geometries at the binding site of a target protein in 3D space. These binding geometries are termed binding poses, and each pose comprises both the molecule's position in the receptor-binding site and its conformation. The poses are scored using molecular mechanics and ranked according to the strength of their interaction with the receptor. This process can be applied at high speed to large databases (virtual screening), allowing rapid molecular screening to identify suitable inhibitors. De novo design approaches generate ligands that have not been synthesized before. In this approach, the functional groups responsible for interactions with the target receptor are positioned in the 3D space of the protein binding site and are then linked to a scaffold. This technique assumes that only the functional groups of a molecule are responsible for its activity and not the scaffold [96].
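The idea of evaluating binding poses with a molecular-mechanics term and ordering them by interaction strength can be sketched with a toy example. The code below uses only a Lennard-Jones 12-6 term with arbitrary parameters (`epsilon`, `sigma`) and hypothetical coordinates. Real docking programs use much richer scoring functions and search procedures, so this is an illustration of the ranking principle rather than a docking method.

```python
import math


def lj_energy(distance, epsilon=0.2, sigma=3.5):
    """Lennard-Jones 12-6 energy for one atom pair (arbitrary parameters)."""
    ratio = sigma / distance
    return 4.0 * epsilon * (ratio ** 12 - ratio ** 6)


def score_pose(ligand_atoms, receptor_atoms):
    """Sum pairwise energies between ligand and receptor atoms (lower is better)."""
    return sum(
        lj_energy(math.dist(lig, rec))
        for lig in ligand_atoms
        for rec in receptor_atoms
    )


def rank_poses(poses, receptor_atoms):
    """Return pose names sorted from most to least favourable score."""
    return sorted(poses, key=lambda name: score_pose(poses[name], receptor_atoms))


# Hypothetical binding-site atoms and single-atom ligand poses (angstroms).
RECEPTOR = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
POSES = {
    "pose_clash": [(1.0, 0.0, 0.0)],     # overlaps a receptor atom
    "pose_contact": [(2.0, 3.4, 0.0)],   # close to the optimal contact distance
    "pose_far": [(2.0, 20.0, 0.0)],      # too far away to interact
}

ranking = rank_poses(POSES, RECEPTOR)
```

In a virtual screening setting, the same score-and-sort step would be repeated over the best pose of every compound in a library, which is why fast scoring functions matter.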

Ligand-based drug design approaches such as quantitative structure-activity relationship (QSAR) and pharmacophore modeling have proven effective in designing and predicting the activity of new molecules and in searching chemical databases to detect novel lead scaffolds in the absence of a 3D structure of the target receptor [97–99]. QSAR and quantitative structure-property relationship approaches develop a mathematical model of biological activity from numerous structural and functional properties [100–102]. This model, which relates activity (the dependent quantity) to properties (the independent quantities), can be used to predict the activity of novel molecules as inhibitors without knowing the 3D structure of the receptor. These relations can be obtained using statistical methods such as regression, neural networks, principal component analysis (PCA) and partial least squares (PLS).
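The regression idea behind QSAR can be shown with the simplest possible case: one descriptor and an ordinary least-squares line. The descriptor values and activities below are made up for illustration, and a real QSAR model would use many descriptors, more data and proper validation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predict activity for a new descriptor value."""
    return intercept + slope * x


# Hypothetical training data: one descriptor (independent quantity)
# and a measured activity (dependent quantity) for four molecules.
DESCRIPTOR = [1.0, 2.0, 3.0, 4.0]
ACTIVITY = [3.1, 4.9, 7.2, 8.8]

intercept, slope = fit_line(DESCRIPTOR, ACTIVITY)
predicted_activity = predict(intercept, slope, 5.0)
```

The fitted model is then applied to a molecule that was not in the training set, which is the step that allows activity to be estimated without the receptor structure.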

*In-silico* drug repurposing is currently attracting global attention as a result of the accessibility of a huge amount of data on protein structures, pharmacophores, diseases, clinical investigations, and gene expression profiles of medicines. In addition, the growth of public social networking technologies and computational access to genetic information has greatly helped computational approaches to predict new indications. As a result, most pharmaceutical companies use bioinformatics or modern computing resources to reposition drugs from various chemical spaces. The ultimate aim of each pharmaceutical company is to bring medication to market faster while lowering the cost of design and development, and *in-silico* technology can provide these benefits. With the increase in available drug-related data, new computational approaches with improved recall and precision for targeted profiling of small compounds have been developed. These approaches enhance the repurposing procedure by drawing on chemoinformatics, bioinformatics, network biology, systems biology or genomic information to uncover previously unidentified targets and mechanisms for approved drugs within accelerated timeframes.

## **6. New therapeutic inhibitors and natural products**

In utilizing existing drugs for repurposing and natural molecules for the discovery and development of new therapeutic inhibitors, everything discussed above (the classification of therapeutic inhibitors, protein databases, natural product databases and computer-aided drug design) has an essential role to play.

Before targeted drug design can begin, there must be a disease of interest. The next step is to understand the pathophysiology and pathogenesis of the disease. A good understanding of these processes makes it possible to identify the proteins and enzymes involved and the role(s) each of them plays. Apart from their role in the development of the disease of interest, these proteins and enzymes might also play other, beneficial role(s) in the system. With all this information, a proper decision can be made on the possibility of achieving a beneficial therapeutic effect without causing a chronic negative outcome to the system.

The protein databases described above make proteins and enzymes available in a format that can be downloaded and used for *in-silico* studies. These databases contain proteins and enzymes from humans, animals and other organisms. Some of the databases can be accessed for free, making them open to any interested researcher. An interesting feature of most of the proteins available in databases such as the Protein Data Bank is that their active sites are specified, with a ligand molecule attached, making it easier to carry out a specific study using these proteins and enzymes.

The natural product databases described above and libraries of existing drugs such as DrugBank provide the ligands (molecules) that can be used directly, or from which new therapeutic inhibitors can be sourced, including for the purpose of drug repurposing. Because of the number of natural products contained in these databases, handling such an enormous amount of data can be challenging. However, with the advances in *in-silico* high-throughput screening, there are drug design applications that can be deployed to narrow down the number of ligands to those most likely to yield hit molecule(s).

With the advances in computer-aided drug design and bioinformatics, certain steps can be undertaken to discover and develop better therapeutic inhibitors, both from natural products and by repurposing already existing drugs. Many studies using these steps are already in progress, and many applications can be used in *in-silico* studies for the discovery and development of new therapeutic inhibitors. With these applications, the drug-likeness and ADMET properties of thousands of natural compounds can be predicted, protein active sites can be established, molecular docking can be simulated, and molecular dynamics simulations can be carried out. These *in-silico* steps are essential in determining which molecule(s) have the lowest chance of failure if taken to in-vitro experiments.
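One widely used drug-likeness screen is Lipinski's rule of five, which is not named in the text above and is used here only as an assumed example of such a filter. The sketch below counts rule violations from precomputed descriptors and keeps compounds with at most one violation, a commonly applied threshold. The compounds and their descriptor values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    molecular_weight: float   # in daltons
    logp: float               # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(compound):
    """Count how many of the four rule-of-five criteria are violated."""
    return sum([
        compound.molecular_weight > 500,
        compound.logp > 5,
        compound.h_bond_donors > 5,
        compound.h_bond_acceptors > 10,
    ])


def is_drug_like(compound, max_violations=1):
    """Treat a compound as drug-like if it has at most `max_violations`."""
    return lipinski_violations(compound) <= max_violations


# Hypothetical natural-product descriptors for illustration only.
LIBRARY = [
    Compound("compound_A", 320.4, 2.1, 2, 5),
    Compound("compound_B", 612.7, 5.8, 3, 9),
    Compound("compound_C", 540.2, 3.3, 4, 8),
]

shortlist = [c.name for c in LIBRARY if is_drug_like(c)]
```

A filter like this is typically applied before docking so that computational effort is spent only on compounds with a reasonable chance of progressing.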
