Fundamentals of Molecular Docking and Comparative Analysis of Protein–Small-Molecule Docking Approaches

*Sefika Feyza Maden, Selin Sezer and Saliha Ece Acuner*

#### **Abstract**

Proteins (e.g., enzymes, receptors, hormones, antibodies, transporter proteins, etc.) seldom act alone in the cell, and their functions rely on their interactions with various partners such as small molecules, other proteins, and/or nucleic acids. Molecular docking is a computational method developed to model these interactions at the molecular level by predicting the 3D structures of complexes. Predicting the binding site and pose of a protein with its partner through docking can help unveil protein structure-function relationships and aid drug design in numerous ways. In this chapter, we focus on the fundamentals of protein docking by describing docking methods including search algorithm, scoring, and assessment steps as well as illustrating recent successful applications in drug discovery. We especially address protein–small-molecule (drug) docking by comparatively analyzing available tools implementing different approaches such as ab initio, structure-based, ligand-based (pharmacophore-/shape-based), information-driven, and machine learning approaches.

**Keywords:** molecular docking, drug design, drug discovery, protein interactions, machine learning

#### **1. Introduction**

The molecular machines of the cell, i.e., proteins, are essential to many cellular processes such as signal transduction and cell regulation. Proteins seldom act alone in the cell; they function by interacting with other small molecules or macromolecules. Therefore, understanding protein interactions at the atomic level is critical to understanding biological processes [1]. The primary structure, i.e., amino acid sequence, of the interacting proteins is a necessary but insufficient source of information at the atomic level. After being synthesized, proteins fold and acquire a stable native structure, i.e., a tertiary structure defined in three-dimensional (3D) space, in order to be functional. It is known that proteins with different sequences can have similar functional structures, that is, different amino acid sequences can show similar folding trends in 3D space, and structure is more conserved than sequence [2]. Therefore, it is crucial to understand the interaction details at the structural level. Proteins physically interact with their partners via non-covalent associations, namely H-bond, hydrophobic, and electrostatic interactions, with the exception of covalent disulfide bridges. These intermolecular physical forces also dominate the protein folding process.

The 3D structures of macromolecules can be determined using experimental methods such as X-ray crystallography, nuclear magnetic resonance (NMR), and cryo-EM and then deposited in the Protein Data Bank (PDB) (https://www.rcsb.org). However, there is a huge gap between the number of known protein sequences and structures [3, 4]. Computational modeling approaches that can predict 3D structures of macromolecules can help to bridge this gap. A recent machine learning algorithm developed by DeepMind, called AlphaFold [5], can predict 3D structures of proteins using the sequence information with high accuracy and has been accepted as a breakthrough in the structural biology field. In 1 year, approximately 1 million new structures have been predicted and deposited at the AlphaFold Protein Structure Database (https://alphafold.ebi.ac.uk/). In order to have a complete understanding of the proteome, computational techniques are needed not only for modeling single protein structures but also for modeling the interactions between them.

Molecular docking is a method used to predict the structures of proteins in complex with other proteins, nucleic acids, or small molecules. It can be defined as predicting the appropriate low-energy binding pose of the ligand in complex with the target structure, by randomly colliding proteins and their potential partners in space, first creating a rigid complex structure model, and then focusing on the binding sites of that model with flexible interface refinement [6]. Energy minimization of randomly docked conformations in space requires a multidimensional calculation. The initially developed molecular docking methods treated ligands and receptors as rigid bodies without considering any conformational changes [7]. However, interactions between proteins can become quite complex even with small changes in the conformation of the structures [7], and docking algorithms may not solve this complex problem in a physically correct way [8]. The main source of computational difficulty in docking algorithms is a significant change in the protein backbone conformation upon binding [9, 10]. To address this problem, different techniques that consider backbone flexibility have been successfully implemented in docking algorithms [10].

Many diseases, such as cancer, are likely to be linked to problems in protein-protein interactions, and targeting these interactions can therefore enable the development of next-generation therapeutic methods [11]. Modeling the complex structures formed by proteins with other proteins or small molecules holds the key to understanding many biological processes; for example, modeling enzyme-substrate or protein-drug interactions can reveal insights into binding sites/interface regions, function, and mechanism of action. The main protein–small-molecule docking applications in drug discovery include drug repositioning and structure- and ligand-based (pharmacophore-/shape-based) drug design approaches using virtual and reverse screening [11–14]. Today, with continuously developing technology, targeted drug design, drug target search, evaluation of the side effects of existing drugs, and the search for new targets for these drugs can be achieved with the help of molecular modeling and machine learning methods [12]. Deep learning neural network models have strong computational ability on big data and attract attention in the structural biology field [15]. There are antibiotic discovery studies using deep neural networks [16] and deep learning studies adapted to drug design [17].

*Fundamentals of Molecular Docking and Comparative Analysis of Protein–Small-Molecule… DOI: http://dx.doi.org/10.5772/intechopen.105815*

In this chapter, we focus on the protein–small-molecule docking fundamentals and the steps of the docking algorithm and procedure in detail. We then give recent successful applications in drug design and discovery that use different docking approaches, namely virtual screening, reverse screening, and machine learning. Lastly, we comparatively analyze some of the available protein–small-molecule docking tools using the structure of SARS-CoV-2 main protease in complex with a non-covalent inhibitor Jun8-76-3A as a case study.

#### **2. Fundamentals of protein–small-molecule docking**

Protein–small-molecule interactions are essential for the sustainability of biological processes such as enzymatic catalysis and overall homeostasis in the body [18]. The engineering of protein–small-molecule interactions is one of the computational approaches used to solve critical problems in biology [18]. Protein–small-molecule docking, i.e., modeling the interaction between chemical compounds and their target protein receptors at the atomic level, is an effective tool in drug design. In the structure-based design of small-molecule drugs, a good estimation of the binding pose is required to clearly demonstrate important interactions and design drugs with increased selectivity and efficacy [19]. The procedures that can be followed and the tools that can be used before, during, and after molecular docking are explained in the following subsections and summarized in **Figure 1**.

#### **2.1 Before docking: molecule preparation**

Before starting the docking studies, the most suitable protein and ligand structures should be selected [20]. Databases such as PDB, UniProt, and the Therapeutic Target Database (TTD) provide access to the experimentally determined structures of target proteins. If the experimental structure is not available, modeled structures can be obtained from the AlphaFold Database or can be modeled using relevant structure modeling software. The most frequently used databases for getting the small-molecule ligand/chemical structures are: DrugBank [21], PubChem [22], ZINC [23], ChEMBL [24], and Chemspider [25] (**Figure 1**). DrugBank, Chemspider, and ZINC databases include more than 500,000, 100 million, and 230 million compounds/drug molecules, respectively.

#### **Figure 1.**

*The procedures that can be followed and the tools that can be used before, during, and after protein-ligand molecular docking in drug design.*

The molecular docking algorithms may require preliminary preparation of the structures that are obtained in PDB format (lacking H atoms). There are tools available for such preliminary preparations such as Open Babel [26] and AutoDockTools (**Figure 1**) [27].

It is also crucially important to guide docking with preliminary information on the binding site. Without binding site constraints, blind docking takes place, and the correct binding poses are more difficult to detect because the ligand search space is large. Various algorithms for active site prediction can be used when binding sites are not known, including GRID [28], SurfNet [29], COACH [30], SCFbio [31], CASTp [32], DeepSite [33], and PUResNet (**Figure 1**) [34].

The capabilities of docking algorithms can differ from each other, and in this respect, it is important to carefully choose the algorithm to use in accordance with the purpose of the study before starting the docking.

#### **2.2 Docking algorithm steps**

There are many approaches and algorithms for molecular docking, based on different parameters, and they aim to perform the protein-ligand docking with the best performance [12]. The steps of molecular docking algorithms can be summarized as follows: molecule flexibility, conformational search algorithms (ligand sampling), and scoring functions (**Figure 2**) [12, 35].

#### *2.2.1 Molecule flexibility*

During molecular docking, structures can be considered rigid or flexible. Rigid docking takes into account only the translational and rotational degrees of freedom. Providing flexibility means also considering rotation about single bonds, so that conformations have the same bond lengths and angles but different torsion angles. Although the flexible docking approach is more realistic than rigid docking, when there are many rotatable bonds, the ligand conformational search space becomes so large that it is difficult to find the correct binding pose with the lowest binding free energy (global minimum solution). Some algorithms, such as HADDOCK [36], first treat the structures as rigid to increase time efficiency and then perform flexible refinement on the poses with the best energy scores. Molecular docking software can be grouped according to the flexibility treatment of molecules into Rigid Docking, Semi-Flexible Docking, and Soft Docking [35, 37].

#### **Figure 2.**

*Methods for protein-ligand molecular docking.*

In rigid docking, protein and ligand molecules are treated as rigid entities [37, 38]. During docking, the positions of the molecules change without losing their shape [37], i.e., only translation and rotation but no conformational degrees of freedom are considered.
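As a minimal sketch of what a rigid-body move means, the following Python snippet rotates and translates a small set of atoms and collects the interatomic distances that define the molecule's shape. The three-atom "ligand" and its coordinates are invented for illustration, and the rotation is restricted to the z-axis for brevity, whereas a general rigid-body move has three rotational and three translational degrees of freedom.

```python
import math


def rigid_transform(coords, angle_deg, translation):
    """Rotate points about the z-axis by angle_deg, then translate them."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    tx, ty, tz = translation
    moved = []
    for x, y, z in coords:
        moved.append((cos_t * x - sin_t * y + tx,
                      sin_t * x + cos_t * y + ty,
                      z + tz))
    return moved


def pairwise_distances(coords):
    """Return all interatomic distances, which together define the shape."""
    return [math.dist(a, b)
            for i, a in enumerate(coords)
            for b in coords[i + 1:]]


# Hypothetical three-atom "ligand" (coordinates in Å, for illustration only).
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3)]
posed = rigid_transform(ligand, angle_deg=90.0, translation=(10.0, -2.0, 5.0))
```

Comparing `pairwise_distances(ligand)` with `pairwise_distances(posed)` shows that the position changes while every internal distance is preserved, which is the defining property of rigid docking.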

Semi-flexible docking is based on the principle of keeping the protein structure rigid and letting the ligand structure be flexible by allowing rotatable bonds. Thus, various conformational poses of the ligand on the protein are sampled [35, 37, 38]. It gives more accurate results than rigid docking [37].

In soft docking, van der Waals interactions between atoms are softened, making the structures of both receptor and ligand molecules implicitly flexible as overlap is allowed to a small extent [39, 40]. Soft docking process is carried out realistically by ensuring that both the protein and the ligand are rotatable as in their natural states [37, 38]. It is an advantageous method due to its computational efficiency and ease of application [35, 37].
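The idea of softening the van der Waals term can be illustrated with a short Python sketch. It compares a standard 12-6 Lennard-Jones energy with a version whose repulsive wall is truncated at a fixed cap, so that small overlaps are penalized only mildly. The parameter values (`epsilon`, `sigma`, `cap`) are hypothetical, and capping is only one simple way to soften the potential; individual docking programs use their own functional forms.

```python
def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """Standard 12-6 Lennard-Jones energy for one atom pair at distance r (Å)."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


def softened_lj(r, epsilon=0.2, sigma=3.5, cap=1.0):
    """Same 12-6 term, but the repulsive wall is truncated at `cap`.

    Hypothetical softening scheme: overlapping atoms are still penalized,
    but the penalty can never exceed `cap`.
    """
    return min(lennard_jones(r, epsilon, sigma), cap)
```

At short distances the standard term rises steeply while the softened term stays at the cap; at larger distances the two functions are identical, so the attractive part of the interaction is unchanged.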

#### *2.2.2 Conformational search algorithms*

Conformational search algorithms can identify different conformational orientations (poses) of the ligand sampled around the experimentally determined active site or other binding sites on the protein [35, 41, 42]. These algorithms are generally classified as: shape matching, systematic, stochastic, and simulation methods [35, 38, 43].

Shape matching algorithms have the advantage of speed over other algorithms [35, 44] and adopt a sampling principle in which the conformation of the ligand should be structurally complementary to the protein binding site [38]. They ensure that the ligand is positioned so that it best complements the molecular surface of the binding site on the protein [35]. Some example software using shape matching are: DOCK [45], FLOG [46], EUDOC [47], Surflex [48], LibDOCK [49], SANDOCK [50], and MDock [51].

Using systematic search algorithms, a large number of possible binding poses can be obtained by gradually changing the degrees of freedom of the ligands [35, 52] toward the direction of minimum energy. Systematic search algorithms can be divided into two classes: exhaustive search and fragmentation (incremental construction) [35, 41, 53]. The exhaustive search algorithm is based on systematically generating flexible ligand conformations by rotating the rotatable bonds in the ligand [35]. If the number of rotatable bonds is large, there is a combinatorial explosion in the number of poses, i.e., the search space, so that some filtering and optimization procedures are applied for practical purposes [35]. Glide [54] and FRED [55] are example docking software using exhaustive conformational search algorithms. In the fragmentation method, the ligand is divided into smaller fragments, and each fragment is placed at the binding site and grown gradually through covalent bonding to the previous one [35]. DOCK [56], LUDI [57], FlexX [58], and eHiTs [59] are example software using fragmentation.
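The combinatorial explosion in exhaustive search can be made concrete with a small Python sketch. It assumes, purely for illustration, that every rotatable bond is sampled on a regular torsion grid with the same step size; real programs use their own sampling schemes and pruning rules.

```python
def count_conformations(n_rotatable_bonds, step_deg):
    """Number of torsion combinations on a regular grid of step_deg degrees."""
    if 360 % step_deg != 0:
        raise ValueError("step_deg must divide 360 evenly")
    states_per_bond = 360 // step_deg
    return states_per_bond ** n_rotatable_bonds


# With a 30-degree step there are 12 states per bond.
two_bonds = count_conformations(2, 30)    # 12 ** 2 = 144
ten_bonds = count_conformations(10, 30)   # 12 ** 10 = 61,917,364,224
```

Going from two to ten rotatable bonds raises the number of conformations from 144 to about 6.2 × 10^10, before translation and rotation of the ligand are even considered, which is why filtering and optimization become necessary.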

The algorithms used in stochastic search methods are more efficient but do not guarantee an accurate result as they are based on generating random ligand conformations, and therefore, the docking process is iterative in these algorithms [41, 44]. Monte Carlo, swarm optimization, evolutionary algorithms, and Tabu search methods are among the most used stochastic algorithms [35, 38, 52]. Example software using stochastic conformational search method include AutoDock [60], GOLD [61], DockThor [62], and MolDock [63].
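A minimal sketch of one stochastic strategy, a Metropolis Monte Carlo search, is given below in Python. The single torsion angle and the energy function `toy_energy` are hypothetical stand-ins for a real ligand pose and scoring function; the energy is constructed to have its global minimum at 60° and shallower local minima near 180° and 300°.

```python
import math
import random


def toy_energy(angle_deg):
    """Hypothetical torsional energy: global minimum at 60 degrees (-1.5)."""
    a = math.radians(angle_deg)
    return math.cos(3.0 * a) - 0.5 * math.cos(a - math.radians(60.0))


def metropolis_search(energy, start, n_steps=5000, max_move=30.0, kT=0.3, seed=0):
    """Random-walk search that accepts uphill moves with Boltzmann probability."""
    rng = random.Random(seed)
    current, e_current = start, energy(start)
    best, e_best = current, e_current
    for _ in range(n_steps):
        # Propose a random perturbation of the current torsion angle.
        trial = (current + rng.uniform(-max_move, max_move)) % 360.0
        e_trial = energy(trial)
        delta = e_trial - e_current
        # Always accept downhill moves; accept uphill moves occasionally.
        if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


best_angle, best_energy = metropolis_search(toy_energy, start=180.0)
```

Accepting some uphill moves is what allows the search to leave a local minimum, but, as noted above, nothing guarantees that the global minimum is found in a given run, so such searches are usually repeated with different random seeds.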

Simulation methods, i.e., simulations of the obtained ligand poses, represent protein and ligand flexibility better than the other algorithms but are slow and may sample the conformational space insufficiently [38, 44]. For this reason, they are used as a complement to other conformational search methods [38].

#### *2.2.3 Scoring functions*

In the previously described conformational search step, many structures are created and most of them should be eliminated by selecting the biologically appropriate structures. Therefore, the possible poses created by conformational search algorithms are evaluated and ranked by using a scoring function [35]. The scoring function is a measure to evaluate the docking poses obtained [35, 38, 52] in terms of their binding free energies [11, 44, 64].

With the scoring functions that estimate the binding energies of the created complex structures, various physicochemical properties should be evaluated in order to distinguish good results from the bad ones. These physicochemical properties can be intermolecular interactions, desolvation from solvent, electrostatic and entropic effects, etc. [65]. As the number of evaluated parameters increases, the accuracy of the scoring function will increase; but the computational load will also increase. Therefore, scoring functions with ideal efficiency, especially when working with large ligand sets, are those that are balanced in terms of accuracy and speed [11]. The scoring functions can be classified as: force-field-based, empirical, knowledge-based, and consensus scoring.

The Force Field Scoring Function (FFSF) is designed to work with force fields such as AMBER [66], CHARMM [67], GROMOS [68], and OPLS [69], individually or in combination. FFSFs estimate the free energy of ligand binding by considering energy terms such as van der Waals interactions, electrostatic interactions, and hydrogen bonds [35, 38].
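To illustrate the general form of a force-field-style score, the Python sketch below sums a 12-6 Lennard-Jones term and a Coulomb term over all protein-ligand atom pairs. It is a simplified illustration and not the scoring function of any particular program: a single `epsilon` and `sigma` are assumed for every atom pair, the dielectric constant is an arbitrary choice, and the atom representation (a coordinate tuple plus a partial charge) is hypothetical.

```python
import math

# Coulomb conversion factor in kcal·Å/(mol·e²).
COULOMB_CONSTANT = 332.06


def pair_energy(r, q_i, q_j, epsilon=0.2, sigma=3.5, dielectric=4.0):
    """Non-bonded energy of one atom pair: Lennard-Jones plus Coulomb."""
    ratio6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (ratio6 * ratio6 - ratio6)
    coulomb = COULOMB_CONSTANT * q_i * q_j / (dielectric * r)
    return lennard_jones + coulomb


def score_pose(protein_atoms, ligand_atoms):
    """Sum pair energies over all protein-ligand pairs (lower is better).

    Each atom is given as ((x, y, z), partial_charge).
    """
    total = 0.0
    for p_xyz, p_charge in protein_atoms:
        for l_xyz, l_charge in ligand_atoms:
            r = math.dist(p_xyz, l_xyz)
            total += pair_energy(r, p_charge, l_charge)
    return total


# Hypothetical one-atom "protein" and "ligand" placed 4 Å apart.
protein = [((0.0, 0.0, 0.0), 0.5)]
attractive = score_pose(protein, [((4.0, 0.0, 0.0), -0.5)])  # opposite charges
repulsive = score_pose(protein, [((4.0, 0.0, 0.0), 0.5)])    # like charges
```

Because every pair of atoms contributes to the sum, the cost of such a score grows with the product of the protein and ligand atom counts, which is one reason docking programs precompute interaction grids.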

Empirical scoring functions use simpler energy terms to estimate the free energy of ligand binding such as hydrogen bonds and ionic interaction, and they can be calculated more easily and faster than FFSFs [35, 38, 52]. Some examples of empirical scoring functions are GlideScore [54], PLP [70], LigScore [71], LUDI [72], SCORE [73], and X-Score [74].

Knowledge-based scoring functions use statistical analysis of protein-ligand complex structures to derive protein-ligand distance-dependent potentials [44]. These functions can show high performance in a short time [52]. They can also model some uncommon interactions, such as sulfur-aromatic, that other functions do not address [44].

Consensus scoring function, not a specific scoring system, aims at an effective scoring with a combination of multiple scoring functions with the idea of minimizing the possible error margins of existing scoring systems [35, 38, 44].
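One common way to combine several scoring functions is to average the ranks they assign to each pose. The Python sketch below shows this rank-averaging variant under the assumption that lower scores are better for every function; the score values and the names of the scoring functions and poses are invented for illustration.

```python
def consensus_rank(score_tables):
    """Order poses by their mean rank across several scoring functions.

    score_tables maps a scoring-function name to {pose: score},
    where a lower score is assumed to be better.
    """
    poses = sorted(next(iter(score_tables.values())))
    rank_sums = {pose: 0 for pose in poses}
    for scores in score_tables.values():
        ordered = sorted(poses, key=lambda pose: scores[pose])
        for rank, pose in enumerate(ordered, start=1):
            rank_sums[pose] += rank
    n_functions = len(score_tables)
    return sorted((rank_sums[pose] / n_functions, pose) for pose in poses)


# Hypothetical scores from three scoring functions on different scales.
scores = {
    "force_field": {"pose_a": -7.2, "pose_b": -8.1, "pose_c": -5.0},
    "empirical": {"pose_a": -6.9, "pose_b": -6.5, "pose_c": -4.1},
    "knowledge_based": {"pose_a": -30.0, "pose_b": -28.0, "pose_c": -31.0},
}
ranking = consensus_rank(scores)
```

Here no single function ranks `pose_a` first in all cases, yet it has the best mean rank, which illustrates how a consensus can dampen the errors of individual functions. Working with ranks rather than raw scores also avoids having to put the different scales on a common footing.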


#### **2.3 After docking: evaluation of the results**

After performing protein-ligand docking studies, the accuracy of the pose predictions needs to be evaluated [41, 52]. The best way to evaluate the docking algorithm is to compare the predicted binding pose of the ligand with the position of the reference ligand in the experimentally determined structure, if available. The structural comparison is quantified by using the root mean squared deviation (RMSD) (Eq. 1), in units of Å [41, 75]. It is preferred that this value is between 2 and 4 Å or less for a good docking. RMSD calculations are simple, but this metric is not normalized to the number of atoms and therefore should not be considered as an absolute measure [76]. As a more systematic approach, in order to ensure the consistency of the docking algorithm used, it should be checked whether the same poses are obtained by repeating the docking process [52] at least 50 times and clustering the poses of the side chains and references according to a certain threshold value [77]. With this method, it can be determined whether the docking algorithm correctly and consistently creates a pose in the right position [41, 44, 78].

$$RMSD = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left[ \left(x_{ai} - x_{bi}\right)^2 + \left(y_{ai} - y_{bi}\right)^2 + \left(z_{ai} - z_{bi}\right)^2 \right]} \tag{1}$$

Eq. (1) Root mean squared deviation for the coordinates of two molecules, a and b, with N atoms.
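The RMSD between a predicted and a reference ligand pose, i.e., the square root of the mean squared distance between matched atoms, can be computed with a few lines of Python. This sketch assumes that both poses list the same atoms in the same order and are already in the same coordinate frame (as is the case when the receptor is held fixed); it performs no superposition and no correction for symmetric atoms.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two matched coordinate lists (Å)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("both poses must contain the same, non-zero number of atoms")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical reference pose and a copy shifted by (1, 2, 2) Å.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3)]
shifted = [(x + 1.0, y + 2.0, z + 2.0) for x, y, z in reference]
```

In this example every atom is displaced by exactly 3 Å, so the RMSD between `reference` and `shifted` is 3 Å, while the RMSD of a pose with itself is zero.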

The modeling successes and capabilities of docking algorithms have been evaluated since 2001 in a competition called CAPRI (Critical Assessment of PRedicted Interactions) (https://www.capri-docking.org/) [79, 80]. Experimentally determined complex structures that have not yet been published in PDB are submitted to CAPRI, and without knowing the experimental structure of the complex, the participants try to predict the structure most similar to the experimentally determined complex structure through docking algorithms [79]. A solution set of 10 models is presented to the CAPRI committee for evaluation based on the geometric similarity and biological relevance of the predicted complex structures. The results of CAPRI show very good predictions for easy targets with simple conformational changes, but considerably worse ones for difficult targets with large conformational changes upon binding [9].

#### **3. Molecular docking approaches and applications in drug design**

Computational methods have become an important part of the drug discovery process with increasing accuracy of algorithms. Various docking methods based on different algorithms are constantly being developed to determine the structural relationships of potential drug molecules and their targets [44]. In addition, studies in this area shed light on the candidate drugs in terms of the pharmacodynamic properties, affinity, and selectivity [11]. The main molecular docking applications in drug discovery include drug repositioning (repurposing), structure- and ligand-based drug design approaches using virtual and reverse screening [11–14].

Drug repositioning seeks out new targets for natural compounds, drugs currently in use, or candidate ligands to reveal their unknown therapeutic potentials [81]. Many successful repositioning studies are available in the literature [81–83]. Virtual screening (VS) and reverse screening (RS) techniques are frequently used in drug discovery and repositioning. VS offers a more effective and rational approach compared with traditional methods [36]. The atomic-level results provided by virtual screening studies guide the understanding of target function and new drug discoveries [5, 36, 55]. In the RS approach, interest is on a single ligand molecule, and a biological target is sought for this molecule [12]. Unlike in VS, the search library consists of potential target receptors. The RS approach has the potential to lead studies such as testing toxicity or side effects of the existing drugs [38]. The potential side effects of a drug need to be evaluated in the drug discovery process. Molecular docking studies can offer an important perspective in this regard, and there are inverse (reverse) docking studies that provide bioactivity data by detecting off-target bindings [25]. Lastly, the subclasses of Artificial Intelligence (AI), namely Machine Learning (ML) and Deep Learning (DL) methods, have made significant contributions to the pharmaceutical industry [84]. AI can be applied to different steps such as drug design with VS, *de novo* generation of drug molecules, and computational planning of drug synthesis [85]. Recent developments suggest that molecular docking methods may benefit more from machine learning methods in the future [84].

#### **3.1 Virtual screening**

The virtual screening (VS) approach uses a target receptor and a library of small molecules. Libraries can be created manually, or already existing libraries can be used. The library consists of a large number of chemically diverse bioactive small molecules with a high probability of binding to the receptor. This virtual computing technique is considered the *in silico* equivalent of *in vitro* methods such as high-throughput screening (HTS) [11]. VS is preferred as a guide in scientific studies because its success rate is reported to be 400 times higher [86] and because it is less costly, faster, and less labor-intensive than high-throughput screening methods [87]. VS studies aim to reduce a large number of potential drug candidates to manageable numbers by applying various filters. The biggest challenge in VS is the detection of false negatives [19].

Ligand-based VS methods conduct research by identifying common properties of compound sequences, such as molecular volume and protonation state [11]. In addition to chemical similarity [88] and rule-based [89] software included in filtration strategies, there are also various software such as freely add-on pharmacophore and quantitative structure-activity relationship (QSAR) models [87, 90]. The most commonly used ligand-based virtual screening method is the QSAR method. Ligand-based VS does not contain structural information about the receptor; it only scans using receptor sites known to be active and tries to detect active ligand molecules [85].

Structure-based VS methods are often used when the receptor has different conformations. The aim is to predict receptor binding affinity by processing structural information using a variety of techniques, such as binding site similarity and pharmacophore mapping. By estimating the different binding modes, the molecules are sorted for evaluation [11]. Analysis of the predicted poses can be done manually using visualization programs. It has been reported that nAPOLI, a web server developed in recent years, analyzes results automatically [91].

Structure-based pharmacophore generation is one of the most frequently used methods for small molecules in virtual screening. Here, 3D pharmacophore model interfaces of the scaffolds of the ligands are created, and ligands that will adapt to the binding site and provide the desired bioactivity are selected. Some of the programs that use pharmacophore modeling are HipHop [92], PHASE [93], and MOE, which are commercial, and SCAMPI [94], PharmaGist [95], and ALADDIN [96], which are suitable for academic use.

A recent example of VS application on the non-structural protein of SARS-CoV-2, nsp1, one of the virulence factors causing viral infection, is by G. O. Timo *et al.* [74]. They estimated the exact pattern of nsp1 interaction through molecular simulation studies and analyzed 8694 potential inhibitors from the DrugBank database using the virtual screening method and proposed 16 inhibitor molecules with the best binding energy scores [74]. There is another recent study on the transcription factor BRF2, which is among the therapeutic targets as its upregulation is observed in the formation of various types of cancer, but there is no available specific drug targeting BRF2. By performing drug repositioning through virtual screening of drug molecules that are potential candidates for BRF2 inhibition, Rashidieh *et al.* found that the bexarotene molecule led to a serious decrease in the proliferation of this type of cancer cells [97].

#### **3.2 Reverse screening**

Reverse screening (RS) is also called inverse docking, reverse docking, inverse virtual screening, or target screening. Libraries are more limited for target hunting and profiling [12] and can be created manually using the most common accessible databases such as PDB [98] and TTD [12, 99]. But this process requires a long preparation time and effort. There are various algorithms used to detect interactions by reverse screening. Some web platforms (INVDOCK [100], idTarget [101], ACTP [102], etc.) have been developed for reverse docking, which use libraries prepared for specific diseases and docked using programs such as standard AutoDock and AutoDock Vina [12].
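The difference between virtual and reverse screening can be sketched in Python as two lookups over the same table of docking scores: VS fixes the target and ranks ligands, whereas RS fixes the ligand and ranks targets. The score table, the ligand and target names, and the assumption that lower scores mean better binding are all hypothetical; in practice each entry would come from a separate docking run.

```python
def virtual_screen(scores, target, top_n):
    """Rank ligands against one fixed target (lower score is better)."""
    hits = [(score, ligand)
            for (ligand, tgt), score in scores.items() if tgt == target]
    return [ligand for _, ligand in sorted(hits)[:top_n]]


def reverse_screen(scores, ligand, top_n):
    """Rank candidate targets for one fixed ligand (lower score is better)."""
    hits = [(score, target)
            for (lig, target), score in scores.items() if lig == ligand]
    return [target for _, target in sorted(hits)[:top_n]]


# Hypothetical docking scores keyed by (ligand, target).
docking_scores = {
    ("lig1", "T1"): -7.0,
    ("lig2", "T1"): -9.1,
    ("lig3", "T1"): -5.5,
    ("lig2", "T2"): -6.0,
    ("lig2", "T3"): -9.8,
}
top_ligands = virtual_screen(docking_scores, target="T1", top_n=2)
top_targets = reverse_screen(docking_scores, ligand="lig2", top_n=2)
```

With these invented values, `top_ligands` is `["lig2", "lig1"]` and `top_targets` is `["T3", "T1"]`. The computation is symmetric; what differs in practice is the library that must be prepared, which for RS is a set of target structures rather than a set of compounds.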

A recently developed Consensus Reverse Docking System (CRDS) detects potential binding sites by screening approximately 5200 candidate proteins for the ligand molecule using three different scoring methods [103]. In another example, Stepanova *et al.* tested the antimicrobial activity against *Mycobacterium tuberculosis* strain by reverse screening for chemicals that had been successful in experimental studies and determined the most appropriate target as aspartate 1-decarboxylase by performing docking studies using 35 different target protein structures [104]. Reverse screening was also used for Bazedoxifene, an FDA-approved drug for the prevention of postmenopausal osteoporosis, and Xiao *et al.* defined the inhibitory power of Bazedoxifene on IL-6/GP130 signaling pathway (critical for cancer survival) by using computational techniques and confirmed the result with *in vivo* studies [83].

#### **3.3 Machine-learning-based approaches**

Machine learning techniques take information from biological data and make predictions about them, thus contributing to building a structural model [9]. Once a model is built, it must be improved so that the state with the lowest potential energy (global minimum) can be reached. Global minimum means a stable and sterically acceptable structure, and reaching it without being stuck at the local minima is very important in the field of bioinformatics and computational structural biology. A recent machine learning algorithm developed by DeepMind, called AlphaFold [5], implements deep learning and can predict 3D structures of proteins using the sequence information with high accuracy and has been accepted as a breakthrough in the structural biology field.

**Figure 3.** *Schematic illustration of artificial intelligence subfields: Machine learning and deep learning.*

Machine learning makes classifications by learning on datasets and needs human intervention to evaluate possible outcomes. Deep learning is a more advanced model that uses neural networks with the ability to decide on the right result without human intervention (**Figure 3**). Machine learning can use supervised or unsupervised learning. Supervised learning trains on labeled datasets, i.e., datasets whose outcomes are known, whereas unsupervised learning detects similarities and groupings in unlabeled data [38, 90].

The training set used in machine learning determines the performance of the algorithm. Machine learning studies in the field of virtual screening are generally focused on improving the performance of the scoring function [85]. Studies have shown that working with small subsets of the same family, which consists of similar structures, gives better scoring results than working with large data from different complexes [105]. Working with subsets of interest is also a better approach in terms of computational requirements [38].

Machine learning and deep learning can describe more diverse data than other computational systems and can be representative of structural biology. Nonparametric machine learning has great potential to be the next step in computer-based programming to improve the accuracy of molecular docking studies [41]. Machine learning can be used to refine predetermined function data as well as provide high-quality data to complement pharmaceutical discovery research and development.

#### **4. Case study: comparison of docking tools**

As a case study for comparing different protein-ligand docking tools, the crystal structure of the SARS-CoV-2 (COVID-19) main protease in complex with its non-covalent inhibitor Jun8-76-3A (PDB ID: 7KX5) is used as the experimental reference structure to evaluate the accuracies of the complex structures predicted using AutoDock Vina, HADDOCK, and SwissDock, and some of the parameters are changed to test their effects on prediction capabilities. The inhibitor in the experimental protein structure is removed, and then molecular docking is performed using the initial coordinates of the SARS-CoV-2 main protease structure and its inhibitor Jun8-76-3A, separately.

#### **4.1 Docking with AutoDock Vina**

AutoDock is free software that predicts the binding compatibility of small ligands to macromolecule targets with a flexible-rigid (semi-flexible) docking approach [27]. It uses a grid-based method to place the ligand in the active region determined on the macromolecule [106]. AutoDockTools (http://mgltools.scripps.edu/downloads) is the user interface to produce and examine grid information required for the preparation of the protein and ligand structures in the relevant format and the configuration file [27].

As docking input, AutoDock Vina requires a configuration file that points to the prepared protein and ligand structures and defines the ligand-binding region on the receptor. For docking the case study ligand to the receptor using AutoDock Vina, the structure file was downloaded from the RCSB PDB database (https://www.rcsb.org) in .pdb format (PDB ID: 7KX5). The AutoDockTools (v1.5.6) interface was used to prepare the input files: water molecules in the protein structure were deleted, polar hydrogens were added, and both the receptor and ligand structures were saved in .pdbqt format. After the ligand and protein structures are prepared, the most important input for AutoDock is the docking parameter, that is, the coordinates of the ligand-binding region on the target protein. If the binding region on the protein is not known, blind docking can be performed by enclosing the whole protein in the grid box (**Figure 4A**); otherwise, a small grid box can be placed on the specific known or predicted ligand-binding region (**Figure 4B**). Lastly, after the region on the protein where the ligand is to be bound was determined using the "grid box" in AutoDockTools, the corresponding coordinates were specified in the input configuration file. With all the required inputs prepared, docking was performed using AutoDock Vina, and each docking run was repeated three times in order to observe the consistency of the algorithm (**Table 1**).

#### **Figure 4.**

*Grid box usage in docking: (A) blind docking with a grid box of size 44 × 72 × 68 and center coordinates 10.711, 0.0, 3.782; (B) specific docking with a grid box of size 14 × 14 × 16 and center coordinates 10.735, −2.409, 21.173.*

#### **Table 1.**

*Specific and blind docking studies with AutoDock were repeated three times.*
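As an illustration of the configuration step, the following sketch generates AutoDock Vina configuration text for the two grid boxes given in the Figure 4 caption. It is a minimal example under stated assumptions, not the exact file used in this case study: the file names (`receptor.pdbqt`, `ligand.pdbqt`, and the output names) are hypothetical, the box sizes are assumed to be expressed in ångströms, and `exhaustiveness = 8` and `num_modes = 9` are simply Vina's default values.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GridBox:
    """Search box: center coordinates and edge lengths along x, y, z."""
    center: Tuple[float, float, float]
    size: Tuple[float, float, float]


# Values taken from the Figure 4 caption (units assumed to be angstroms).
BLIND_BOX = GridBox(center=(10.711, 0.0, 3.782), size=(44, 72, 68))
SPECIFIC_BOX = GridBox(center=(10.735, -2.409, 21.173), size=(14, 14, 16))


def vina_config(receptor: str, ligand: str, box: GridBox, out: str,
                exhaustiveness: int = 8, num_modes: int = 9) -> str:
    """Return the text of a Vina configuration file for one docking run."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, value in zip("xyz", box.center):
        lines.append(f"center_{axis} = {value:g}")
    for axis, value in zip("xyz", box.size):
        lines.append(f"size_{axis} = {value:g}")
    lines += [f"exhaustiveness = {exhaustiveness}",
              f"num_modes = {num_modes}",
              f"out = {out}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names; replace them with the prepared .pdbqt files.
    for label, box in (("blind", BLIND_BOX), ("specific", SPECIFIC_BOX)):
        print(f"# {label} docking")
        print(vina_config("receptor.pdbqt", "ligand.pdbqt", box,
                          f"{label}_out.pdbqt"))
```

Each printed block can be saved to a text file and passed to Vina through its `--config` option. Keeping the two boxes as named constants makes it easy to repeat the blind and specific runs with otherwise identical settings.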

In order to examine the accuracy of the docking results, the poses obtained from AutoDock Vina were aligned with the original PDB structure using the PyMol program [107]. When the poses predicted with specific docking (i.e., using a specific grid on the binding site) and blind docking are compared, the energy scores of the blind docking results are better, yet comparison with the reference ligand shows that the most accurate binding is achieved with specific docking (**Figure 5**). Alignment of the top-ranked poses (those with the lowest energy score) from specific docking (green) and blind docking (blue) with the reference ligand (red) shows that the specific docking pose lies closer to the reference ligand (green vs. red) than the blind docking pose does (blue vs. red).
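The visual comparison described above can be complemented by a number. A common choice, assumed here rather than taken from this study, is the root-mean-square deviation (RMSD) between the docked ligand atoms and the reference ligand atoms in the receptor frame. The sketch below uses made-up coordinates and scores (they are not values from Table 1) and assumes that both poses list the same atoms in the same order; it does not handle ligand symmetry.

```python
import math
from typing import Dict, Sequence, Tuple

Coord = Tuple[float, float, float]


def pose_rmsd(pose: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """RMSD between two poses with identical atom ordering (no refitting)."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum((p - r) ** 2
                  for atom_p, atom_r in zip(pose, reference)
                  for p, r in zip(atom_p, atom_r))
    return math.sqrt(squared / len(pose))


# Hypothetical three-atom reference ligand and two hypothetical docked poses.
REFERENCE = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
POSES: Dict[str, dict] = {
    "pose_a": {"score": -7.9,  # illustrative score, kcal/mol
               "xyz": [(x + 0.5, y, z) for x, y, z in REFERENCE]},
    "pose_b": {"score": -8.4,  # better score, but far from the reference
               "xyz": [(x, y, z + 6.0) for x, y, z in REFERENCE]},
}

best_by_score = min(POSES, key=lambda name: POSES[name]["score"])
best_by_rmsd = min(POSES, key=lambda name: pose_rmsd(POSES[name]["xyz"], REFERENCE))

if __name__ == "__main__":
    for name, pose in POSES.items():
        print(name, pose["score"], round(pose_rmsd(pose["xyz"], REFERENCE), 3))
    print("lowest score:", best_by_score, "| lowest RMSD:", best_by_rmsd)
```

In this toy case the pose with the lowest score is not the pose closest to the reference, which is the same kind of disagreement observed between the blind and specific docking results.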

#### **4.2 Docking with HADDOCK**

High Ambiguity-Driven biomolecular DOCKing (HADDOCK) is an integrative platform for the molecular docking of two or more molecules [108] and is a popular algorithm [36]. Although it is mainly suited to protein-protein interactions, it can also be applied to model protein–small-molecule complexes [109]. HADDOCK automatically selects the most suitable configuration of the ligand according to the given restraints [108]. Protein-protein docking is more complex than protein–small-molecule docking, as the proteins are flexible and the conformational space is larger [110].

The HADDOCK web server does not require local CPU resources and allows the user to follow all the docking steps from start to finish. It should be noted that the success of HADDOCK studies is directly related to the amount of data entered into the system [36]. HADDOCK can process different types of molecules with the help of different platforms such as WHATIF, ProDRG, and PDB. There is no need to create different conformer sequences, as the system selects the most compatible conformers based on the shape restraints. With restraint files, one can set clear target sites and binding distances or select active and passive residues (areas that are likely to interact). Defining semi-flexible regions is also allowed.

*Fundamentals of Molecular Docking and Comparative Analysis of Protein–Small-Molecule… DOI: http://dx.doi.org/10.5772/intechopen.105815*

#### **Figure 5.**

*Crystal SARS-CoV-2 main protease structure (white, PDB ID: 7KX5\_chain A) in complex with the blind docking (blue) and specific docking (green) poses predicted with AutoDock Vina and the reference ligand Jun8-76-3A inhibitor (red, PDB ID: 7KX5\_chain B). This figure was drawn with PyMol 2.5.2.*

The HADDOCK algorithm consists of three stages: rigid-body minimization and randomization of orientations (it0), semi-flexible simulated annealing in torsion angle space (it1), and final refinement in Cartesian space with explicit solvent (https://www.bonvinlab.org/education/HADDOCK-protein-protein-basic/). The it0 stage treats the structures as rigid bodies, and the 1000 poses with the best scores are selected. The it1 stage optimizes the orientations by allowing the docking poses from it0 to have different flexible regions defined. The 200 models with the best energy pass to the final stage. In the final stage, an explicit solvent (water or DMSO) is considered to improve the interaction energy, and the final models are automatically clustered.

To dock the case study inhibitor-protein complex (PDB ID: 7KX5), the guideline tutorial (HADDOCK small-molecule binding site screening protocol) [111] was followed and two different approaches were tested: (i) using an unambiguous (distance) restraint file indicating the target region that should bind the ligand, and (ii) defining the active and passive residues. The case study consists of a pre-docking run for detecting the binding region and a second docking run for detecting the binding pose.

First, we tested HADDOCK's accuracy in binding site detection. Two different binding sites were detected among the top 10 clusters with the best energy scores, and 70% (7 out of 10) of the clusters were in the correct binding site (**Figure 6A**). Secondly, ambiguous and unambiguous restraint files were created by identifying the region with the highest number of interactions between the ligand and the receptor. The restraint files can be created manually or using the link in the protocol; however, it may be necessary to make corrections to the distance restraints. The structure with the best energy is visualized in **Figure 6B**. Lastly, active and passive residues were defined on the system, and the pose with the best energy is visualized in **Figure 6C**. HADDOCK results are summarized in **Table 2**.
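To make the idea of a manually created restraint file more concrete, the sketch below prints restraint statements in the CNS-style `assign` syntax that HADDOCK restraint (.tbl) files are assumed to use here. Everything specific in it is illustrative and not taken from this study: residues 41 and 145 (the catalytic dyad of the main protease) stand in for the active residues, the ligand is assumed to be residue 1 of segment B, the atom name `C1` is hypothetical, and the distances are placeholder values.

```python
from typing import Iterable, Tuple

Residue = Tuple[int, str]     # (residue number, segment ID)
Atom = Tuple[int, str, str]   # (residue number, segment ID, atom name)


def ambiguous_restraint(active: Residue, partners: Iterable[Residue],
                        distance: float = 2.0) -> str:
    """One ambiguous restraint: 'active' should contact any of 'partners'."""
    resid, segid = active
    alternatives = "\n     or\n".join(
        f"        ( resid {r} and segid {s})" for r, s in partners)
    return (f"assign ( resid {resid} and segid {segid})\n"
            f"       (\n{alternatives}\n"
            f"       ) {distance:.1f} {distance:.1f} 0.0\n")


def unambiguous_restraint(atom_a: Atom, atom_b: Atom, distance: float,
                          d_minus: float, d_plus: float) -> str:
    """One atom-atom distance restraint with lower and upper tolerances."""
    def selection(atom: Atom) -> str:
        resid, segid, name = atom
        return f"(segid {segid} and resid {resid} and name {name})"
    return (f"assign {selection(atom_a)} {selection(atom_b)} "
            f"{distance:.1f} {d_minus:.1f} {d_plus:.1f}\n")


# Illustrative selections only (see the assumptions stated above).
ACTIVE_SITE = [(41, "A"), (145, "A")]
LIGAND = (1, "B")

ambig = "".join(ambiguous_restraint(res, [LIGAND]) for res in ACTIVE_SITE)
unambig = unambiguous_restraint((145, "A", "SG"), (1, "B", "C1"), 4.0, 1.0, 1.0)

if __name__ == "__main__":
    print(ambig)
    print(unambig)
```

Generating the statements from a small list of residues makes it easier to correct individual distance restraints afterwards, which the protocol may require.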

Comparison of the results shows that HADDOCK is successful in detecting the binding site. However, according to the results obtained in the second stage, the algorithm was not successful enough in finding the correct conformation of the ligand in the binding site. Defining ambiguous/unambiguous restraint files or selecting active and passive residues did not make a significant contribution to detecting the correct binding pose (**Figure 6B** and **C**). Docking with both approaches was repeated several times and no significant similarity was detected.

#### **Figure 6.**

*Crystal SARS-CoV-2 main protease structure (gray, PDB ID: 7KX5\_chain A) in complex with the docking poses (blue) predicted with HADDOCK and the reference ligand Jun8-76-3A inhibitor (red, PDB ID: 7KX5\_chain B). A. Top 10 clusters for binding site determination. B. Pose with the best energy using ambiguous/unambiguous restraints. C. Pose with the best energy using active/passive residues. This figure was drawn with PyMol 2.5.2.*

#### **4.3 Docking with SwissDock**

SwissDock is a web service for protein–small-molecule docking based on the EADock DSS docking engine [112]. It is complemented by related tools such as SWISS-MODEL, which creates three-dimensional protein models from amino acid sequences, and the Swiss-Pdb Viewer (DeepView) user interface, which can be used to analyze several proteins simultaneously [113]. Using the SwissDock web server, the starting crystal structures of the target proteins can be searched and fetched from protein and ligand structure databases. If no crystal structure is available, homology modeling of the studied protein can be used instead. During the docking process, the user does not have to perform any calculations because all calculations are handled on the server side [112]. As a docking constraint, the ligand-binding region can be defined, or blind docking can be applied with no prior information.

#### **Table 2.**

*HADDOCK results.*

#### **Figure 7.**

*Crystal SARS-CoV-2 main protease structure (white, PDB ID: 7KX5\_chain A) in complex with the blind docking (blue) and specific docking (green) poses predicted by SwissDock and the reference ligand Jun8-76-3A inhibitor (red, PDB ID: 7KX5\_chain B). This figure was drawn with PyMol 2.5.2.*

Using the case study, both specific and blind dockings were performed on the SwissDock server, and the results were compared. The server presented 256 poses. The best scores obtained by specific docking (green) and blind docking (blue) were −9.88 and −9.35 kcal/mol, respectively (**Figure 7**). Although neither of the predicted poses reproduced the conformation of the reference ligand, the pose obtained from specific docking (green) was more similar to the reference ligand (red) (**Figure 7**).

#### **5. Conclusions**

Molecular docking is a computational method that predicts the 3D structures of receptor-ligand complexes. Modeling the atomic details of the ligand pose with the receptor protein by molecular docking can assist in understanding the protein structure-function relationship and in drug design studies in several ways. Computational modeling approaches complement and/or lead experiments by eliminating irrelevant drug candidates and selecting the ones with the best binding properties. With continuously developing technology, there are many different approaches and algorithms for molecular docking studies, and they are successfully used in therapeutic applications such as targeted drug design, drug target search, evaluation of the side effects of existing drugs, or finding new targets for these drugs.

The crystal structure of the SARS-CoV-2 (COVID-19) main protease in complex with its non-covalent inhibitor Jun8-76-3A (PDB ID: 7KX5) was used as an experimental reference case study to compare and evaluate the prediction accuracies of AutoDock Vina, HADDOCK, and SwissDock programs as well as to test the effects of some parameters on their prediction capabilities. One of the main observations is that the ligand poses with the lowest binding energy scores are not necessarily the best solution. Therefore, docking results should always be evaluated in terms of biological relevance. Moreover, when *a priori* information about the ligand-binding site is included as grid box placement and size in AutoDock Vina and as ligand binding residues in SwissDock, the binding accuracy is improved significantly.

In summary, before starting the molecular docking, it is of crucial importance to obtain detailed information on the target protein and ligand from various sources and servers and to decide which docking algorithm to use. Moreover, the top predicted poses with the best scores should not be unquestioningly accepted as the best solutions but further structural analyses and evaluations should be incorporated in the decision process.

#### **Acknowledgements**

We would like to thank dear Merve DEMİR AKYÜZ and Merve YÜCETÜRK (a.k.a Merves), who are fourth-year undergraduate students at Molecular Biology and Genetics Department of Istanbul Medeniyet University, for their contribution to the writing of the introduction section.


#### **Author details**

Sefika Feyza Maden, Selin Sezer and Saliha Ece Acuner\* Department of Bioengineering and Science and Advanced Technologies Research Center (BILTAM), Istanbul Medeniyet University, Istanbul, Turkey

\*Address all correspondence to: ece.ozbabacan@medeniyet.edu.tr

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

### **References**

[1] Russell RB, Alber F, Aloy P, Davis FP, Korkin D, Pichaud M, et al. A structural perspective on protein–protein interactions. Current Opinion in Structural Biology. 2004;**14**(3):313-324

[2] Sadowski MI, Jones DT. The sequence–structure relationship and protein function prediction. Current Opinion in Structural Biology. 2009;**19**(3):357-362

[3] Petrey D, Honig B. Structural bioinformatics of the interactome. Annual Review of Biophysics. 2014;**43**(1):193-210

[4] Stein A, Mosca R, Aloy P. Three-dimensional modeling of protein interactions and complexes is going 'omics. Current Opinion in Structural Biology. 2011;**21**(2):200-208

[5] Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, et al. Improved protein structure prediction using potentials from deep learning. Nature. 2020;**577**(7792):706-710

[6] Andrusier N, Mashiach E, Nussinov R, Wolfson HJ. Principles of flexible protein-protein docking. Proteins. 2008;**73**(2):271-289

[7] Bonvin AM. Flexible protein–protein docking. Current Opinion in Structural Biology. 2006;**16**(2):194-200

[8] Vakser IA. Protein-protein docking: From Interaction to Interactome. Biophysical Journal. 2014;**107**(8):1785-1793

[9] Harmalkar A, Gray JJ. Advances to tackle backbone flexibility in protein docking. Current Opinion in Structural Biology. 2021;**67**:178-186

[10] Wang C, Bradley P, Baker D. Protein–protein docking with backbone flexibility. Journal of Molecular Biology. 2007;**373**(2):503-519

[11] Ferreira L, dos Santos R, Oliva G, Andricopulo A. Molecular docking and structure-based drug design strategies. Molecules. 2015;**20**(7):13384-13421

[12] Pinzi L, Rastelli G. Molecular docking: Shifting paradigms in drug discovery. International Journal of Molecular Sciences. 2019;**20**(18):4331

[13] March-Vila E, Pinzi L, Sturm N, Tinivella A, Engkvist O, Chen H, et al. On the integration of in silico drug design methods for drug repurposing. Frontiers in Pharmacology. 2017;**23**(8):298

[14] Wilson GL, Lill MA. Integrating structure-based and ligand-based approaches for computational drug design. Future Medicinal Chemistry. 2011;**3**(6):735-750

[15] Anighoro A. Deep learning in structure-based drug design. Methods in Molecular Biology. 2022;**2390**:261-271

[16] Stokes JM, Yang K, Swanson K, Jin W, Cubillos-Ruiz A, Donghia NM, et al. A deep learning approach to antibiotic discovery. Cell. 2020;**180**(4):688-702.e13

[17] Elton DC, Boukouvalas Z, Fuge MD, Chung PW. Deep learning for molecular design—a review of the state of the art. Molecular System and Design Engineering. 2019;**4**(4):828-849

[18] Allison B, Combs S, DeLuca S, Lemmon G, Mizoue L, Meiler J. Computational design of protein-small molecule interfaces. Journal of Structural Biology. 2014;**185**(2):193-202


[19] Śledź P, Caflisch A. Protein structure-based drug design: From docking to molecular dynamics. Current Opinion in Structural Biology. 2018;**48**:93-102

[20] Guterres H, Im W. Improving protein-ligand docking results with high-throughput molecular dynamics simulations. Journal of Chemical Information and Modeling. 2020;**60**(4):2189-2198

[21] Wishart DS. DrugBank: A comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Research. 2006;**34**(90001):D668-D672

[22] Li Q, Cheng T, Wang Y, Bryant SH. PubChem as a public resource for drug discovery. Drug Discovery Today. 2010;**15**(23-24):1052-1057

[23] Irwin JJ, Sterling T, Mysinger MM, Bolstad ES, Coleman RG. ZINC: A Free Tool to Discover Chemistry for Biology. Journal of Chemical Information and Modeling. 2012;**52**(7):1757-1768

[24] Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, et al. ChEMBL: A large-scale bioactivity database for drug discovery. Nucleic Acids Research. 2012;**40**(D1):D1100-D1107

[25] Pence HE, Williams A. ChemSpider: An online chemical information resource. Journal of Chemical Education. 2010;**87**(11):1123-1124

[26] O'Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR. Open Babel: An open chemical toolbox. Journal of Cheminformatics. 2011;**3**(1):33

[27] Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, et al. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. Journal of Computational Chemistry. 2009;**30**(16):2785-2791

[28] Goodford PJ. A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. Journal of Medicinal Chemistry. 1985;**28**(7):849-857

[29] Laskowski RA. SURFNET: A program for visualizing molecular surfaces, cavities, and intermolecular interactions. Journal of Molecular Graphics. 1995;**13**(5):323-330

[30] Yang J, Roy A, Zhang Y. Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment. Bioinformatics. 2013;**29**(20):2588-2595

[31] Narang P, Bhushan K, Bose S, Jayaram B. Protein structure evaluation using an all-atom energy based empirical scoring function. Journal of Biomolecular Structure & Dynamics. 2006;**23**(4):385-406

[32] Binkowski TA, Naghibzadeh S, Liang J. CASTp: Computed Atlas of Surface Topography of proteins. Nucleic Acids Research. 2003;**31**(13):3352-3355

[33] Jiménez J, Doerr S, Martínez-Rosell G, Rose AS, De Fabritiis G. DeepSite: Protein-binding site predictor using 3D-convolutional neural networks. Bioinformatics. 2017;**33**(19):3036-3042

[34] Kandel J, Tayara H, Chong KT. PUResNet: Prediction of protein-ligand binding sites using deep residual neural network. Journal of Cheminformatics. 2021;**13**(1):65

[35] Huang SY, Zou X. Advances and Challenges in Protein-Ligand Docking. International Journal of Molecular Sciences. 2010;**11**(8):3016-3034

[36] de Vries SJ, van Dijk M, Bonvin AMJJ. The HADDOCK web server for data-driven biomolecular docking. Nature Protocols. 2010;**5**(5):883-897

[37] Fan J, Fu A, Zhang L. Progress in molecular docking. Quantitative Biology. 2019;**7**(2):83-89

[38] Crampon K, Giorkallos A, Deldossi M, Baud S, Steffenel LA. Machine-learning methods for ligand– protein molecular docking. Drug Discovery Today. 2022;**27**(1):151-164

[39] Jiang F, Kim SH. "Soft docking": Matching of molecular surface cubes. Journal of Molecular Biology. 1991;**219**(1):79-102

[40] Ferrari AM, Wei BQ, Costantino L, Shoichet BK. Soft docking and multiple receptor conformations in virtual screening. Journal of Medicinal Chemistry. 2004;**47**(21):5076-5084

[41] Torres PHM, Sodero ACR, Jofily P, Silva-Jr FP. Key topics in molecular docking for drug design. International Journal of Molecular Sciences. 2019;**20**(18):4574

[42] Gioia D, Bertazzo M, Recanatini M, Masetti M, Cavalli A. Dynamic docking: A paradigm shift in computational drug discovery. Molecules. 2017;**22**(11):2029

[43] Sousa SF, Fernandes PA, Ramos MJ. Protein-ligand docking: Current status and future challenges. Proteins. 2006;**65**(1):15-26

[44] Meng XY, Zhang HX, Mezei M, Cui M. Molecular docking: A powerful approach for structure-based drug discovery. Current Computer-Aided Drug Design. 2011;**7**(2):146-157

[45] Kuntz ID, Blaney JM, Oatley SJ, Langridge R, Ferrin TE. A geometric approach to macromolecule-ligand interactions. Journal of Molecular Biology. 1982;**161**(2):269-288

[46] Miller MD, Kearsley SK, Underwood DJ, Sheridan RP. FLOG: A system to select 'quasi-flexible' ligands complementary to a receptor of known three-dimensional structure. Journal of Computer-Aided Molecular Design. 1994;**8**(2):153-174

[47] Pang YP, Perola E, Xu K, Prendergast FG. EUDOC: A computer program for identification of drug interaction sites in macromolecules and drug leads from chemical databases. Journal of Computational Chemistry. 2001;**22**(15):1750-1771

[48] Jain AN. Surflex: Fully automatic flexible molecular docking using a molecular similarity-based search engine. Journal of Medicinal Chemistry. 2003;**46**(4):499-511

[49] Diller DJ, Merz KM. High throughput docking for library design and library prioritization. Proteins. 2001;**43**(2):113-124

[50] Burkhard P, Taylor P, Walkinshaw MD. An example of a protein ligand found by database mining: Description of the docking method and its verification by a 2.3 Å X-ray structure of a Thrombin-Ligand complex. Journal of Molecular Biology. 1998;**277**(2):449-466

[51] Huang SY, Zou X. Ensemble docking of multiple protein structures: Considering protein structural variations in molecular docking. Proteins. 2006;**66**(2):399-421

[52] Prieto-Martínez FD, Arciniega M, Medina-Franco JL. Acoplamiento Molecular: Avances Recientes y Retos. TIP RECQB. 2018. [cited 2022 May 15];21. Available from: http://tip.zaragoza.unam.mx/index.php/tip/article/view/143

[53] Guedes IA, de Magalhães CS, Dardenne LE. Receptor–ligand molecular docking. Biophysical Reviews. 2014;**6**(1):75-87

[54] Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, et al. Glide: A new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. Journal of Medicinal Chemistry. 2004;**47**(7):1739-1749

[55] McGann MR, Almond HR, Nicholls A, Grant JA, Brown FK. Gaussian docking functions. Biopolymers. 2003;**68**(1):76-90

[56] Ewing TJA, Kuntz ID. Critical evaluation of search algorithms for automated molecular docking and database screening. Journal of Computational Chemistry. 1997;**18**(9):1175-1189

[57] Böhm HJ. The computer program LUDI: A new method for the de novo design of enzyme inhibitors. Journal of Computer-Aided Molecular Design. 1992;**6**(1):61-78

[58] Rarey M, Kramer B, Lengauer T, Klebe G. A fast flexible docking method using an incremental construction algorithm. Journal of Molecular Biology. 1996;**261**(3):470-489

[59] Bentham Science Publisher BSP. eHiTS: An innovative approach to the docking and scoring function problems. Current Protein & Peptide Science. 2006;**7**(5):421-435

[60] Trott O, Olson AJ. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry. 2009

[61] Verdonk ML, Cole JC, Hartshorn MJ, Murray CW, Taylor RD. Improved protein-ligand docking using GOLD. Proteins. 2003;**52**(4):609-623

[62] de Magalhães CS, Almeida DM, Barbosa HJC, Dardenne LE. A dynamic niching genetic algorithm strategy for docking highly flexible ligands. Information Sciences. 2014;**289**:206-224

[63] Thomsen R, Christensen MH. MolDock: A new technique for high-accuracy molecular docking. Journal of Medicinal Chemistry. 2006;**49**(11):3315-3321

[64] Forli S, Huey R, Pique ME, Sanner MF, Goodsell DS, Olson AJ. Computational protein–ligand docking and virtual drug screening with the AutoDock suite. Nature Protocols. 2016;**11**(5):905-919

[65] Bentham Science Publisher BSP. Scoring functions for protein-ligand docking. Current Protein & Peptide Science. 2006;**7**(5):407-420

[66] Weiner PK, Kollman PA. AMBER: Assisted model building with energy refinement. A general program for modeling molecules and their interactions. Journal of Computational Chemistry. 1981;**2**(3):287-303

[67] Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. Journal of Computational Chemistry. 1983;**4**(2):187-217

[68] van Gunsteren WF, Berendsen HJC. Computer simulation of molecular dynamics: Methodology, applications, and perspectives in chemistry. Angewandte Chemie (International Ed. in English). 1990;**29**(9):992-1023

[69] Jorgensen WL, Tirado-Rives J. The OPLS Potential Functions for Proteins. Energy Minimizations for Crystals of Cyclic Peptides and Crambin. p. 10

[70] Parrill AL, Reddy MR. Rational Drug Design: Novel Methodology and Practical Applications. American Chemical Society; 1999 [cited 2022 May 23]. (ACS Symposium Series; vol. 719). Available from: https://pubs.acs.org/doi/ book/10.1021/bk-1999-0719

[71] Krammer A, Kirchhoff PD, Jiang X, Venkatachalam CM, Waldman M. LigScore: A novel scoring function for predicting binding affinities. Journal of Molecular Graphics & Modelling. 2005;**23**(5):395-407

[72] Böhm HJ. The development of a simple empirical scoring function to estimate the binding constant for a protein-ligand complex of known three-dimensional structure. Journal of Computer-Aided Molecular Design. 1994;**8**(3):243-256

[73] Wang R, Liu L, Lai L, Tang Y. SCORE: A new empirical method for estimating the binding affinity of a protein-ligand complex. Journal of Molecular Modeling. 1998;**4**(12):379-394

[74] Wang R, Lai L, Wang S. Further development and validation of empirical scoring functions for structure-based binding affinity prediction. Journal of Computer-Aided Molecular Design. 2002;**16**(1):11-26

[75] Dias R, de Azevedo W. Molecular docking algorithms. Current Drug Targets. 2008;**9**(12):1040-1047

[76] Waszkowycz B, Clark DE, Gancia E. Outstanding challenges in protein–ligand docking and structure-based virtual screening. WIREs Computational Molecular Science. 2011;**1**(2):229-259

[77] Morris GM, Lim-Wilby M. Molecular docking. In: Kukol A, editor. Molecular Modeling of Proteins. Totowa, NJ: Humana Press; 2008. pp. 365-382

[78] Verdonk ML, Taylor RD, Chessari G, Murray CW. Illustration of current challenges in molecular docking. In: Structure-Based Drug Discovery. Dordrecht: Springer Netherlands; 2007. pp. 201-221

[79] Janin J, Henrick K, Moult J, Eyck LT, Sternberg MJE, Vajda S, et al. CAPRI: A critical assessment of PRedicted interactions. Proteins. 2003;**52**(1):2-9

[80] Janin J. Protein–protein docking tested in blind predictions: The CAPRI experiment. Molecular BioSystems. 2010;**6**(12):2351

[81] Hurle MR, Yang L, Xie Q, Rajpal DK, Sanseau P, Agarwal P. Computational drug repositioning: From data to therapeutics. Clinical Pharmacology and Therapeutics. 2013;**93**(4):335-341

[82] Scherman D, Fetro C. Drug repositioning for rare diseases: Knowledge-based success stories. Thérapie. 2020;**75**(2):161-167

[83] Xiao H, Bid HK, Chen X, Wu X, Wei J, Bian Y, et al. Repositioning Bazedoxifene as a novel IL-6/GP130 signaling antagonist for human rhabdomyosarcoma therapy. PLoS ONE. 2017;**12**(7):e0180297

[84] Gupta RR. Application of artificial intelligence and machine learning in drug discovery. Methods in Molecular Biology. 2022;**2390**:113-124


[85] Thomas M, Boardman A, Garcia-Ortegon M, Yang H, de Graaf C, Bender A. Applications of artificial intelligence in drug design: Opportunities and challenges. Methods in Molecular Biology. 2022;**2390**:1-59

[86] Zhu T, Cao S, Su PC, Patel R, Shah D, Chokshi HB, et al. Hit identification and optimization in virtual screening: Practical recommendations based on a critical literature analysis: Miniperspective. Journal of Medicinal Chemistry. 2013;**56**(17):6560-6572

[87] Neves BJ, Mottin M, Moreira-Filho JT, Sousa BK de P, Mendonca SS, Andrade CH. Best practices for docking-based virtual screening. In: Molecular Docking for Computer-Aided Drug Design. Academic Press (Elsevier); 2021. pp. 75-98

[88] Lipinski CA, Lombardo F, Dominy BW, Feeney PJ. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Advanced Drug Delivery Reviews. 2001;**46**(1-3):3-26

[89] Veber DF, Johnson SR, Cheng HY, Smith BR, Ward KW, Kopple KD. Molecular properties that influence the oral bioavailability of drug candidates. Journal of Medicinal Chemistry. 2002;**45**(12):2615-2623

[90] Neves BJ, Braga RC, Melo-Filho CC, Moreira-Filho JT, Muratov EN, Andrade CH. QSAR-based virtual screening: Advances and applications in drug discovery. Frontiers in Pharmacology. 2018;**9**:1275

[91] Fassio AV, Santos LH, Silveira SA, Ferreira RS, de Melo-Minardi RC. nAPOLI: A graph-based strategy to detect and visualize conserved protein-ligand interactions in large-scale. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2019:1-1

[92] Kurogi Y, Güner OF. Pharmacophore modeling and three-dimensional database searching for drug design using catalyst. Current Medicinal Chemistry. 2001;**8**(9):1035-1055

[93] Dixon SL, Smondyrev AM, Knoll EH, Rao SN, Shaw DE, Friesner RA. PHASE: A new engine for pharmacophore perception, 3D QSAR model development, and 3D database screening: 1. Methodology and preliminary results. Journal of Computer-Aided Molecular Design. 2006;**20**(10-11):647-671

[94] Chen X, Rusinko A, Tropsha A, Young SS. Automated pharmacophore identification for large chemical data sets. Journal of Chemical Information and Computer Sciences. 1999;**39**(5):887-896

[95] Schneidman-Duhovny D, Dror O, Inbar Y, Nussinov R, Wolfson HJ. PharmaGist: A webserver for ligand-based pharmacophore detection. Nucleic Acids Research. 2008;**36**:W223-W228

[96] Fan N, Bauer CA, Stork C, de Bruyn KC, Kirchmair J. ALADDIN: Docking approach augmented by machine learning for protein structure selection yields superior virtual screening performance. Molecular Informatics. 2020;**39**(4):e1900103

[97] Rashidieh B, Molakarimi M, Mohseni A, Tria SM, Truong H, Srihari S, et al. Targeting BRF2 in cancer using repurposed drugs. Cancers. 2021;**13**(15):3778

[98] Berman HM. The protein data bank. Nucleic Acids Research. 2000;**28**(1):235-242

[99] Chen X. TTD: Therapeutic target database. Nucleic Acids Research. 2002;**30**(1):412-415

[100] Chen YZ, Zhi DG. Ligand-protein inverse docking and its potential use in the computer search of protein targets of a small molecule. Proteins. 2001;**43**(2):217-226

[101] Wang JC, Chu PY, Chen CM, Lin JH. idTarget: A web server for identifying protein targets of small chemical molecules with robust scoring functions and a divide-and-conquer docking approach. Nucleic Acids Research. 2012;**40**:W393-W399

[102] Xie T, Zhang L, Zhang S, Ouyang L, Cai H, Liu B. ACTP: A webserver for predicting potential targets and relevant pathways of autophagy-modulating compounds. Oncotarget. 2016;**7**(9):10015-10022

[103] Lee A, Kim D. CRDS: Consensus reverse docking system for target fishing. Bioinformatics. 2019

[104] Stepanova EE, Balandina SY, Drobkova VA, Dmitriev MV, Mashevskaya IV, Maslivets AN. Synthesis, in vitro antibacterial activity against *Mycobacterium tuberculosis*, and reverse docking-based target fishing of 1,4-benzoxazin-2-one derivatives. Archiv der Pharmazie. 2021;**354**(2):2000199

[105] Imrie F, Bradley AR, van der Schaar M, Deane CM. Protein family-specific models using deep neural networks and transfer learning improve virtual screening and highlight the need for more data. Journal of Chemical Information and Modeling. 2018;**58**(11):2319-2330

[106] Kitchen DB, Decornez H, Furr JR, Bajorath J. Docking and scoring in virtual screening for drug discovery: Methods and applications. Nature Reviews Drug Discovery. 2004;**3**(11):935-949

[107] Yuan S, Chan HCS, Hu Z. Using PyMOL as a platform for computational drug design. WIREs Computational Molecular Science. 2017;**30**(2):70

[108] Koukos PI, Réau M, Bonvin AMJJ. Shape-restrained modeling of protein– small-molecule complexes with high ambiguity driven DOCKing. Journal of Chemical Information and Modeling. 2021;**61**(9):4807-4818

[109] Koukos PI, Xue LC, Bonvin AMJJ. Protein–ligand pose and affinity prediction: Lessons from D3R Grand Challenge 3. Journal of Computer-Aided Molecular Design. 2019;**33**(1):83-91

[110] Stanzione F, Giangreco I, Cole JC. Use of molecular docking computational tools in drug discovery. In: Progress in Medicinal Chemistry. Elsevier; 2021. pp. 273-343

[111] Sennhauser G, Amstutz P, Briand C, Storchenegger O, Grütter MG. Drug export pathway of multidrug exporter AcrB revealed by DARPin inhibitors. PLoS Biology. 2007;**5**(1):e7

[112] Grosdidier A, Zoete V, Michielin O. SwissDock, a protein-small molecule docking web service based on EADock DSS. Nucleic Acids Research. 2011;**39**(suppl):W270-W277

[113] Guex N, Peitsch MC. SWISS-MODEL and the Swiss-Pdb Viewer: An environment for comparative protein modeling. Electrophoresis. 1997;**18**(15):2714-2723

#### **Chapter 3**

## Molecular Docking: Metamorphosis in Drug Discovery

*Kishor Danao, Deweshri Nandurkar, Vijayshri Rokde, Ruchi Shivhare and Ujwala Mahajan*

#### **Abstract**

Molecular docking is recognized as a part of computer-aided drug design and is widely used in medicinal chemistry. It has proven to be an effective, quick, and low-cost technique in both scientific and corporate contexts. It helps in rationalizing a ligand's activity towards a target in order to perform structure-based drug design (SBDD). Docking assists in revealing novel compounds of therapeutic interest, predicting ligand-protein interactions at the molecular level, and delineating structure-activity relationships (SARs). Molecular docking is a boon for identifying promising agents against emerging diseases that endanger human health. In this chapter, we focus on the techniques, types, opportunities, challenges, and success stories of molecular docking in drug development.

**Keywords:** molecular docking, drug discovery, ligand-protein interaction, SAR, molecular recognition, drug design

#### **1. Introduction**

Medicinal chemistry relates to the design and production of compounds that can be used in medicine for the prevention, treatment or cure of human and animal diseases. It includes the study of existing drugs for their biological properties and structure activity relationships (SARs) [1, 2]. The discovery and development of a new drug with the desired therapeutic activity is a long, tedious and expensive process. Industry statistics suggest that, for every drug approved for medical use, up to 10,000 compounds are synthesized and tested, up to 100 compounds are assessed for safety and only 10 compounds are tested clinically in humans. Today it takes approximately ten years and a high cost to bring a new drug to market. In spite of the tremendous costs involved, the payoff is also high, as is the improvement made in preventing and controlling human disease. Even when a new drug reaches the market, its success is not assured [3, 4].

Human beings started using chemicals to treat diseases many centuries ago. Hippocrates recommended the use of metallic salts such as copper and zinc, iron sulphate and cadmium oxide as drugs. In 1500 A.D., Carpensis employed mercuric compounds to treat syphilis. Urea was the first organic compound to be synthesized in the laboratory, by Wöhler in 1828. From the nineteenth century onwards, several organic compounds were synthesized, including drugs such as salicylic acid (Kolbe), antipyrine (Knorr), aspirin (Dreser), barbital (Emil Fischer and Mering), prontosil, the first sulpha drug (G. Domagk), chlorpromazine (Charpentier), phenyl magnesium bromide (Victor Grignard), polyethers (Charles J. Pedersen) and others [5]. Apart from the therapeutic utility of these agents, nothing more was known about their mechanism of action, and it was only believed that they were effective because of physicochemical parameters such as partition coefficient, hydrogen bonding, van der Waals forces, dipole-dipole interactions and ionic bonds [6, 7].

Before the chemical era, it was natural products, mostly from plant sources, that were used in therapeutics. Later, progress in chemistry helped to isolate and identify the active ingredients in plants. Some of the outstanding achievements of this phytochemical approach include the discovery of digitalis glycosides from the foxglove plant by William Withering in 1785; opium alkaloids such as morphine and codeine from the poppy plant by Serturner in 1806; antimalarials such as quinine, quinidine and cinchonine from cinchona bark by Pelletier and Dumas in 1823; belladonna alkaloids such as atropine and scopolamine by Mein in 1833; and rauwolfia alkaloids (reserpine and deserpidine) by Muller et al. in 1952. In addition, many important natural products such as antibiotics, steroids, peptide hormones, vitamins, enzymes, prostaglandins and pheromones were discovered in the same period [8, 9]. The synthesis of compounds is followed by screening of their pharmacological actions. The observation of interesting and repeatable biological activity in such screening has always opened pathways for additional chemical research to prepare analogs and so obtain significant newer medicinal products.
A small change in structure frequently leads to a profound change in the pharmacological effect. This logic has prompted the synthesis of derivatives of natural compounds and of structural analogues of biologically interesting substances, starting from a "lead" (prototype) compound [10]. Many of the currently used antispasmodics [11–14] (dicyclomine, cyclopentolate, clidinium bromide, mebeverine, metoclopramide, tropicamide), antibiotics [15–20] (penicillins, cloxacillin, amoxicillin, ampicillin, cefadroxil, cefaclor, cefixime, cefepime), sulphonamides [21–25] (sulphacetamide, sulphadiazine, sulphasalazine, sulphamethoxazole), anthelmintics [26–28] (albendazole, mebendazole, pyrantel pamoate, piperazine, diethylcarbamazine citrate, praziquantel, niclosamide), antimycobacterials [29–31] (clofazimine, dapsone, ethambutol, isoniazid, benzothiazole, sulphonamide, rifampin), analgesics [32–35] (aspirin, diclofenac sodium, ibuprofen, indomethacin, ketoprofen, naproxen, piroxicam), anticonvulsants [36–40] (phenytoin, ethosuximide, carbamazepine, sodium valproate, riluzole), antitumours [41–46] (amsacrine, azacitidine, chlorambucil, cyclosporine, fluorouracil), diuretics [47–51] (acetazolamide, chlorothiazide, furosemide, triamterene, spironolactone), antimalarials [52–56] (chloroquine, primaquine, amodiaquine, proguanil, pyrimethamine), antifungals [57–60] (griseofulvin, nystatin, miconazole, tolnaftate, clotrimazole) and antihistaminics [61–65] (chlorpheniramine maleate, promethazine, astemizole, cetirizine hydrochloride, fexofenadine) have been obtained by synthetic or semi-synthetic approaches. In recent years, molecular studies have been directed more towards discovering new targets for better treatment of disease. In addition, newer screening assays, studies of the effect of drugs on cell lines, the availability of purified or recombinant enzymes and an improved understanding of the nature and properties of receptor systems have immensely boosted drug research.
It is well recognized that the medicinal chemist has been a key person in the discovery of new drugs. The medicinal chemist synthesizes new drugs, isolates and characterizes natural products and, in association with the pharmacologist, establishes a rational SAR. Moreover, SAR has proved to be vital and fundamental to drug discovery [66].

#### **1.1 Discovery of drugs of the future**

Traditionally, new medications have been discovered by screening a large number of synthetic chemical compounds or natural products for desired effects. Although this method of developing novel pharmacological agents has been successful in the past, it is not optimal for several reasons. The most significant disadvantage of the screening approach is the need for a suitable screening procedure. Another problem is that, because of its random nature, screening is inherently repetitious and time consuming even just to find a chemical with the desired activity [67, 68]. Once the disease process is understood at the molecular level and the target molecule(s) are defined, drugs can be designed specifically to interact with the target in such a way that the disease is disrupted. Because of the large quantity of data that must be gathered in order to produce medications by this method, this is where computer-aided drug design will have the most influence [69, 70].

In discussing the various techniques of finding new drugs described in **Figure 1**, it is important to remember that drug discovery is both a cumulative and an iterative process. Drugs developed mechanistically are likely to be screened and later modified in order to produce the best candidate design [71]. In the early stages of using molecular modelling to design drugs, it is common to use rigid models of structures and targets. In drug design, however, the flexibility of molecules, both in isolation and when interacting with each other, is a crucial and difficult subject.

Since the discovery of morphine in 1806, many important drugs have become available for human therapy; important results in drug discovery during the last three centuries are shown in **Table 1**.

**Figure 1.** *Lead optimization cycle.*


*Molecular Docking: Metamorphosis in Drug Discovery DOI: http://dx.doi.org/10.5772/intechopen.105972*


#### **Table 1.**

*Important results in drug discovery.*

#### **1.2 Computer-aided drug design**

Drug research and discovery is a time-consuming and costly process. Bringing a medicine to market takes an average of 10–15 years and \$500–800 million [72]. This is why computer-aided drug design (CADD) technologies have become popular in the pharmaceutical industry as a way to speed up the process. CADD, as shown in **Figure 2**, helps scientists to focus on the most promising compounds in order to reduce the time and money spent on synthesis and biological testing.

In practice, the availability of experimentally determined three-dimensional (3D) structures of target proteins usually determines which CADD techniques are used. If the structure of the target is unknown, ligand-based drug design methods such as quantitative structure activity relationship (QSAR) and pharmacophore analysis can be used. If the target structure is known, structure-based techniques such as molecular docking can be used to design novel active molecules with improved potency. The accuracy of prediction is expected to improve as more structures become available. In the absence of receptor 3D information, lead identification and optimization depend on available pharmacologically relevant agents and their bioactivities [73, 74]. The computational approaches include QSAR, pharmacophore modelling and database mining. QSAR can be taken as an example to illustrate the workflow. A QSAR describes a mathematical relationship between the structural features and a target property of a group of compounds. Over the last few decades, many 2D (two-dimensional) and 3D QSAR techniques have been developed [75]. These techniques differ mainly in two respects: the chemical descriptors used, and the mathematical procedures used to relate the descriptors to the target property.

**Figure 2.** *Computer-aided drug design.*

Many 2D QSAR methods based on graph theoretic indices have been thoroughly researched. Although the physical significance of these indices is unclear, they do capture various characteristics of molecular structure. They have been used to predict biological activity and in analytical chemistry, toxicology analysis and other fields. To overcome the shortcomings of 2D QSAR techniques, such as their inability to differentiate stereoisomers, 3D QSAR approaches have been developed. Molecular shape analysis (MSA), distance geometry and Voronoi procedures are examples of 3D methodologies. The best-known example of 3D QSAR is comparative molecular field analysis (CoMFA), which combines molecular graphics with the partial least squares (PLS) technique and has been widely employed in medicinal chemistry and toxicity studies. QSAR approaches frequently assume a linear relationship between a target property and molecular descriptors. However, the rapid growth of structural and biological data has put this assumption to the test. To this end, a number of nonlinear QSAR algorithms have been proposed, the majority of which are based on artificial neural networks (ANNs) or other machine learning techniques [76]. Much work has concentrated on the development and application of automated algorithms for QSAR studies, including genetic algorithm-partial least squares (GA-PLS), k-nearest neighbour (k-NN) and support vector machine (SVM) methods. Machine learning approaches have been widely used in cheminformatics and molecular modelling. For instance, SVM was found to yield better results than multiple linear regression (MLR) and radial basis functions (RBF).
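The linear assumption behind many QSAR models can be illustrated with a one-descriptor least-squares fit. The following is only a minimal sketch: the descriptor (a logP-like value), the activities (pIC50-like values) and the function names are hypothetical and are not taken from any cited study.

```python
from statistics import mean

def fit_linear_qsar(descriptor, activity):
    """Ordinary least squares fit of activity = slope * descriptor + intercept."""
    x_bar, y_bar = mean(descriptor), mean(activity)
    sxx = sum((x - x_bar) ** 2 for x in descriptor)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(descriptor, activity))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict_activity(model, descriptor_value):
    """Predict the activity of a new compound from its descriptor value."""
    slope, intercept = model
    return slope * descriptor_value + intercept

# Hypothetical training set (illustrative numbers only).
logp_values = [1.0, 2.0, 3.0, 4.0]
pic50_values = [5.1, 5.9, 7.1, 7.9]

model = fit_linear_qsar(logp_values, pic50_values)
predicted = predict_activity(model, 2.5)
```

Real QSAR models use many descriptors and multivariate methods such as PLS; the single-descriptor case only shows the form of the relationship being fitted.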

SBDD (structure-based drug design) has played a significant role in drug development and discovery [76]. Understanding receptor–ligand interactions is required for this strategy. The target 3D structure can be used to develop new ligands if it is known. X-ray crystallography, NMR, and homology modelling are all used to obtain structural information. SBDD methods are used to assess complementarities and anticipate potential binding modes and affinities between small compounds and their macromolecular receptors. SBDD's success is extensively proven, and computational approaches differ greatly in methodology, performance, and speed. Some can provide accurate binding modes, while others are better suited to scanning vast datasets quickly [77].

#### **2. Molecular docking study**

The generation, manipulation or representation of the 3D structures of molecules and their associated physicochemical properties is referred to here as molecular docking. It entails a variety of computational strategies for predicting chemical and biological properties based on theoretical chemistry methods and experimental data. Depending on the context and rigour, the subject is sometimes referred to as "molecular graphics," "molecular visualisation," "computational chemistry," or "computational quantum chemistry." Molecular docking techniques build on Hückel's and Mulliken's concepts of molecular orbitals and on the classical mechanical programs of Westheimer et al. The foundation of SBDD is the 3D molecular structure [78, 79].

**Figure 3.** *Molecular docking process.*

Data on protein structures and data on drugs are available separately, but no correlated data are accessible. Docking is the process of fitting two molecules together in a complementary fashion in 3D space so that molecules can be designed rationally, as seen in **Figure 3**. Modelling a drug's interaction with its receptor is a difficult task. Hydrophobic, dispersion (van der Waals), hydrogen bonding and electrostatic forces all play a role in intermolecular interaction. Hydrophobic interactions appear to be the dominant driving force for binding, whereas hydrogen bonding and electrostatic interactions appear to influence the specificity of binding [80, 81].

#### **2.1 Theory of docking**

The objective of molecular docking is to predict the structure of the ligand-receptor complex using computational methods. Docking is divided into two steps: sampling of ligand conformations and scoring. Sampling algorithms generate candidate conformations and binding modes of the ligand in the active site of the protein; a scoring function then ranks these conformations in order to identify the energetically most favorable ones.

#### *2.1.1 Sampling algorithms*

There are a great number of potential binding modes between two molecules owing to the six degrees of translational and rotational freedom as well as the conformational degrees of freedom of both the ligand and the protein [82]. Computing all of the conceivable conformations would be too expensive, so various sampling techniques have been developed and are widely used in molecular docking software.

Matching algorithms (MAs) based on molecular shape map a ligand onto the active site of a protein in terms of shape features and chemical information [83]. The protein and the ligand are represented as pharmacophores. Each pharmacophore distance within the protein and the ligand is calculated for a match; new ligand conformations are governed by the distance matrix between the pharmacophore and the corresponding ligand atoms. During the match, chemical properties such as hydrogen-bond donors and acceptors can be taken into account. Because MAs are fast, they can be used to enrich active compounds from vast libraries. The programs DOCK, FLOG, LibDock and SANDOCK provide MAs for ligand docking [84–86].

Incremental construction methods (ICMs) place the ligand in the active site in a fragment-wise, incremental manner. The ligand is divided into several fragments by breaking its rotatable bonds, and one of the fragments is chosen to dock into the active site first. This anchor is typically the largest fragment or the one that plays a functional role or interacts with the protein. The remaining fragments are then added incrementally. Ligand flexibility is realized by generating different orientations of the fragments to fit in the active site. DOCK 4.0, FlexX and SLIDE all use incremental construction.

In addition to ICMs, fragment-based approaches such as multiple copy simultaneous search (MCSS) and LUDI are used to design new ligands and to modify existing ligands to improve their binding to the target protein. In MCSS, 1000–5000 copies of a functional group are placed at random in the binding site of interest and subjected to simultaneous energy minimization and/or quenched molecular dynamics in the force field of the protein. The copies interact only with the protein; interactions between copies are not included. Based on the interaction energies, a set of energetically favorable binding sites and orientations for the functional group is identified. The binding site is mapped with different functional groups, and these groups can then be linked to create new molecules that match the binding site [87].

LUDI focuses on the hydrogen bonds and hydrophobic interactions that can potentially form between the ligand and the protein. Its core notion is the interaction site, a discrete position in space suitable for forming a hydrogen bond or filling a hydrophobic pocket. A set of interaction sites is generated either from rules or by searching a database. Fragments are then fitted onto the interaction sites and evaluated using distance criteria. The final stage is the connection of some or all of the fitted fragments into a single molecule.

Stochastic methods search the conformational space by randomly modifying a ligand conformation or a population of ligands. A well-known class of stochastic approaches is the genetic algorithm (GA), which was inspired by Darwin's theory of evolution. The ligand's degrees of freedom are encoded as binary strings called genes. These genes make up the "chromosome," which represents the pose of the ligand. A GA has two types of genetic operators: mutation and crossover. Mutation makes random changes to the genes, while crossover exchanges genes between two chromosomes. When the genetic operators act on the genes, a new ligand structure results. New structures are evaluated with a scoring function, and those that survive are used in the next generation. AutoDock, GOLD, DIVALI and DARWIN all use GAs [88–91].
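The genetic algorithm loop described above (binary chromosomes, crossover, mutation, survival of the best-scoring structures) can be sketched as follows. This is an illustrative toy and not the implementation used in AutoDock, GOLD or any other program: the scoring function is a hypothetical stand-in (Hamming distance to an arbitrary target bit string), and all names and parameter values are assumptions.

```python
import random

def crossover(parent_a, parent_b, rng):
    """One-point crossover: exchange gene segments between two chromosomes."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:], parent_b[:point] + parent_a[point:]

def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

def evolve(score, n_bits=16, pop_size=30, generations=60, rate=0.05, seed=0):
    """Minimise `score` over bit strings; lower scores are better."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score)
        survivors = population[: pop_size // 2]   # best half survives unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            for child in crossover(a, b, rng):
                children.append(mutate(child, rate, rng))
        population = survivors + children[: pop_size - len(survivors)]
    return min(population, key=score)

# Hypothetical stand-in for a docking score: distance to an arbitrary "ideal pose".
TARGET = [1, 0] * 8

def toy_score(chromosome):
    return sum(a != b for a, b in zip(chromosome, TARGET))

best = evolve(toy_score)
```

In a docking program each group of bits would decode to a translation, a rotation or a torsion angle, and `score` would be the docking scoring function; the Lamarckian variant mentioned later additionally applies a local energy minimization to each individual.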

#### *2.1.2 Scoring functions*

The goal of the scoring function is to distinguish correct poses from incorrect ones, or binders from inactive compounds, in a very short time. Scoring functions, however, estimate rather than rigorously compute the protein-ligand binding affinity, and they rely on numerous assumptions and simplifications. There are three types of scoring functions: force-field-based, empirical, and knowledge-based.

Basic force-field-based scoring functions estimate the binding energy as the sum of the non-bonded (electrostatic and van der Waals) interactions. The electrostatic terms are calculated with a Coulombic formulation. Because point-charge calculations represent the true environment of the protein poorly, a distance-dependent dielectric function is commonly used to modulate the contribution of charge–charge interactions [92–94]. The van der Waals terms are described by a Lennard-Jones potential function. The "hardness" of the potential, which controls how close a contact between protein and ligand atoms can be tolerated, can be varied by using different parameter sets for the Lennard-Jones potential. Computational speed is also an issue for force-field-based scoring functions. A cut-off distance is therefore applied to the non-bonded interactions, at the price of reduced accuracy for the long-range effects involved in binding. Extensions of force-field-based scoring functions take hydrogen bonds, solvation and entropy contributions into account. DOCK, GOLD, and AutoDock are examples of programs that provide such functions [95]. They differ in their treatment of hydrogen bonding, the form of the energy function and other aspects. Furthermore, the results of docking with force-field-based functions can be refined with other techniques, such as linear interaction energy and free-energy perturbation (FEP) methods, to improve the accuracy of the estimated binding energies.

In empirical scoring functions, the binding energy is decomposed into several energy components, including hydrogen bonds, ionic interactions, the hydrophobic effect, and binding entropy. Each component is multiplied by a coefficient, and the products are summed to give the final score. The coefficients are obtained by regression analysis fitted to a training set of ligand-protein complexes with known binding affinities. The energy terms in empirical scoring functions are relatively simple to evaluate. However, it is unclear how well they apply to ligand-protein complexes outside the training set. Furthermore, different programs may treat each term differently, and the number of terms included may also differ. Examples of empirical scoring functions include LUDI, piecewise linear potential (PLP), and ChemScore.

Knowledge-based scoring functions use statistical analysis of the crystal structures of ligand-protein complexes to obtain the interatomic contact frequencies and/or distances between the ligand and the protein. They are founded on the notion that the more favorable an interaction is, the more frequently it will occur [96, 97]. These frequency distributions are converted into pairwise atom-type potentials. The score is calculated by favoring preferred contacts and penalizing repulsive interactions between each pair of ligand and protein atoms within a given cutoff. Knowledge-based functions are appealing because of their computational simplicity, which allows them to be used to screen very large compound databases. They can also model some uncommon interactions, such as sulphur-aromatic or cation-π interactions, that are often poorly handled in empirical approaches. However, some interactions are underrepresented in the limited training sets of crystal structures, and the selection of proteins for which structures have been determined successfully is inherently biased, so the derived parameters may not be suitable for general use, particularly for interactions involving metals or halogens. Knowledge-based functions such as DrugScore, SMoG, and Bleep differ mostly in training set size, form of the energy function, atom type definitions, distance cutoff, and other characteristics [98–100].

Consensus scoring is a more recent technique that combines several different scores to assess a docked conformation. A ligand pose or potential binder is accepted when it scores well under a number of different scoring schemes. In virtual screening, consensus scoring usually enhances enrichment and improves the prediction of bound conformations and poses. However, the predicted binding energies may still be inaccurate. The value of consensus scoring decreases when the terms in the different scoring functions are strongly correlated. CScore, for example, combines the DOCK, ChemScore, PMF, GOLD, and FlexX scoring functions [101–103].
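As an illustration of the basic force-field-based score described in this section, the sketch below sums a 12-6 Lennard-Jones term and a Coulomb term with a distance-dependent dielectric over all ligand-protein atom pairs within a cut-off. It is a simplified sketch under stated assumptions: a single pair of Lennard-Jones parameters is used for every atom pair, the dielectric is taken as ε(r) = 4r, and the parameter values are illustrative rather than those of DOCK, GOLD or AutoDock.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2), conversion factor for point charges

def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy for one atom pair at distance r."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)

def coulomb_distance_dependent(r, q_i, q_j, dielectric_slope=4.0):
    """Coulomb energy with a distance-dependent dielectric eps(r) = slope * r (assumed form)."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric_slope * r * r)

def force_field_score(ligand, protein, cutoff=8.0, epsilon=0.1, sigma=3.4):
    """Sum non-bonded energies over ligand-protein atom pairs within `cutoff`.

    Each atom is a tuple ((x, y, z), partial_charge). Using one epsilon/sigma
    for every pair is a simplification made for this sketch.
    """
    total = 0.0
    for pos_l, q_l in ligand:
        for pos_p, q_p in protein:
            r = math.dist(pos_l, pos_p)
            if r > cutoff:
                continue  # pairs beyond the cut-off are ignored
            total += lennard_jones(r, epsilon, sigma)
            total += coulomb_distance_dependent(r, q_l, q_p)
    return total

# Hypothetical two-atom example: one ligand atom and one protein atom 4 Angstrom apart.
example_score = force_field_score([((0.0, 0.0, 0.0), 0.3)], [((4.0, 0.0, 0.0), -0.3)])
```

Softer or harder potentials correspond to changing the exponents or the `epsilon`/`sigma` parameters, which is how the "hardness" mentioned above is tuned in practice.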

#### **2.2 Docking methodologies**

#### *2.2.1 Docking of rigid ligand and rigid receptor*

When the ligand and the receptor are both treated as rigid bodies, the search space is highly constrained, comprising only three translational and three rotational degrees of freedom. In this case, ligand flexibility can be addressed by using a precomputed set of ligand conformations or by allowing a degree of atom–atom overlap between the protein and the ligand. Early versions of DOCK, FLOG, and some protein-protein docking programs such as FTDOCK kept both the ligand and the receptor rigid during the docking process [104, 105].
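The six rigid-body degrees of freedom can be made concrete with a small sketch that applies a pose (three translations and three rotation angles) to a set of ligand coordinates. The function names and the Euler-angle convention (rotation about x, then y, then z, about the origin) are assumptions chosen for illustration; docking programs typically rotate about the ligand centre and often use quaternions instead.

```python
import math

def rotation_matrix(alpha, beta, gamma):
    """Rotation R = Rz(gamma) * Ry(beta) * Rx(alpha), angles in radians."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [cg * cb, cg * sb * sa - sg * ca, cg * sb * ca + sg * sa],
        [sg * cb, sg * sb * sa + cg * ca, sg * sb * ca - cg * sa],
        [-sb, cb * sa, cb * ca],
    ]

def place_ligand(coords, pose):
    """Apply a rigid-body pose (tx, ty, tz, alpha, beta, gamma) to atom coordinates."""
    tx, ty, tz, alpha, beta, gamma = pose
    rot = rotation_matrix(alpha, beta, gamma)
    placed = []
    for x, y, z in coords:
        placed.append((
            rot[0][0] * x + rot[0][1] * y + rot[0][2] * z + tx,
            rot[1][0] * x + rot[1][1] * y + rot[1][2] * z + ty,
            rot[2][0] * x + rot[2][1] * y + rot[2][2] * z + tz,
        ))
    return placed

# Hypothetical three-atom ligand and an arbitrary pose.
ligand = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.5, 0.5)]
posed = place_ligand(ligand, (2.0, -1.0, 0.5, 0.3, 1.1, -0.7))
```

Because the transformation is rigid, all interatomic distances within the ligand are unchanged; a rigid docking search only has to explore the six numbers in `pose`.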

DOCK was the first automated procedure for docking a molecule into a receptor site, and it is still being developed. The ligand and the receptor are represented as sets of spheres that can be superimposed using a clique detection procedure. The ligand-receptor complexes are scored using geometrical and chemical MAs, taking steric fit, chemical complementarity, and pharmacophore similarity into account. To account for ligand flexibility, incremental construction and exhaustive search have been added in later versions.

The exhaustive search randomly generates a user-defined number of conformers as a multiple of the number of rotatable bonds in the ligand. In terms of scoring, DOCK 6.4 includes AMBER-derived force field scoring with implicit solvent. In addition, molecular mechanics combined with Poisson–Boltzmann or generalized Born and surface area continuum solvation (MM/PBSA and MM/GBSA) methods is used to estimate the free energy of binding of small ligands to biological macromolecules [106].
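For orientation, the MM/PBSA and MM/GBSA estimates are usually written in the following general form. This is the textbook decomposition and not a description of how any particular DOCK version implements it; the subscript names are chosen here for readability.

$$
\Delta G_{\text{bind}} = G_{\text{complex}} - G_{\text{receptor}} - G_{\text{ligand}}, \qquad G = E_{\text{MM}} + G_{\text{solv}} - TS
$$

$$
E_{\text{MM}} = E_{\text{bonded}} + E_{\text{elec}} + E_{\text{vdW}}, \qquad G_{\text{solv}} = G_{\text{polar}} + G_{\text{nonpolar}}
$$

Here $G_{\text{polar}}$ is obtained from the Poisson–Boltzmann equation (MM/PBSA) or a generalized Born model (MM/GBSA), $G_{\text{nonpolar}}$ is estimated from the solvent-accessible surface area, and the entropy term $-TS$ is often approximated or omitted.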

FLOG generates ligand conformations on the basis of distance geometry and calculates the sets of distances using a search technique. For some flexibility, up to 25 explicit conformations of the ligand can be used in docking. FLOG allows users to define critical points that must be paired with ligand atoms, which is useful when a critical interaction is already known before docking. Van der Waals, electrostatic, hydrogen bonding, and hydrophobic interactions are all taken into account when scoring conformations [107].

#### *2.2.2 Docking of flexible ligand and rigid receptor*

In systems that follow the induced fit paradigm, both the ligand and the receptor change conformation to form a minimum-energy, perfect-fit complex, so it is important to consider the flexibility of both. However, treating the receptor as flexible as well is computationally very expensive. As a result, the most common approach is to treat the ligand as flexible while keeping the receptor rigid during docking, which is a trade-off between accuracy and computational time. Almost all docking programs, such as AutoDock and FlexX, have adopted this approach [108–110].

To model ligand flexibility while keeping the receptor rigid, AutoDock 3.0 offers Monte Carlo simulated annealing, evolutionary, genetic, and Lamarckian genetic algorithm (LGA) search methods. The scoring function is based on the AMBER force field and includes van der Waals, hydrogen bonding, electrostatic, conformational entropy, and desolvation terms. Each term is weighted by an empirical scaling factor derived from experimental data. AutoDock 4.0 can model receptor flexibility by allowing side chains to move. This version can also be used to examine protein-protein docking [111–114]. AutoDock Vina is a more recent program for molecular docking and virtual screening. By redocking the 190 receptor-ligand complexes that had been used as the training set for AutoDock 4, AutoDock Vina demonstrated a speed-up of approximately two orders of magnitude as well as a considerable improvement in the accuracy of binding mode prediction [115].

FlexX samples ligand conformations using an incremental construction approach. The base fragment is docked into the active site by matching hydrogen bond pairs and metal and aromatic ring interactions between the ligand and the protein. The remaining fragments are then built up incrementally according to a set of predefined rotatable torsion angles. The current version of the scoring function includes electrostatic interactions, directional hydrogen bonds, rotational entropy, and aromatic and lipophilic interactions. The interactions between functional groups are also taken into account through the assignment of group types and geometries [116].

#### *2.2.3 Docking of flexible ligand and flexible receptor*

Docking a flexible ligand to a flexible receptor is a difficult task because the intrinsic mobility of the protein is closely coupled to ligand binding. In principle, MD simulations can model all the degrees of freedom of the ligand-receptor complex. However, MD suffers from insufficient sampling. Another stumbling block is its high computational cost, which prevents it from being used to screen large chemical databases [117]. In addition to the classical induced fit, several theoretical models, including conformer selection and conformational induction, have been proposed to describe the flexible ligand-protein binding process. Conformer selection refers to a process in which the ligand selects a favourable conformation from a number of protein conformations, while conformational induction describes a process in which the ligand induces the protein to adopt a conformation that it would not adopt spontaneously in its unbound state. This conformational change is sometimes compared to a partial refolding of the protein [118].

The simplest way to account for receptor flexibility is "soft-docking," which reduces the van der Waals repulsion term in the scoring function to allow some atom–atom overlap between the receptor and the ligand. This approach accommodates only limited flexibility. Nonetheless, it has the advantage of computational efficiency, because the receptor coordinates are fixed and only the van der Waals parameters need to be adjusted. To deal with side-chain flexibility, AutoDock 4 uses a simultaneous sampling technique. Users can select several side chains of the receptor and sample them simultaneously with the ligand using the same methods. During sampling, the other parts of the receptor are treated as rigid using a grid energy map. Grid energy maps store the receptor energy information and simplify the calculation of ligand-receptor interaction energies [119].

Another approach to protein flexibility is to use an ensemble of protein conformations, which corresponds to the conformer selection theory. Instead of docking into a single rigid protein conformation, the ligand is docked into a set of rigid protein conformations and the results are merged using the method of choice. This approach was first implemented in DOCK, which constructs an average potential energy grid of the ensemble, and it has since been extended in a variety of programmes. In one such extension, discrete protein conformations are sampled combinatorially during the incremental construction of the ligand, and the highest-scoring protein structure is selected on the basis of a comparison between the ligand and each alternative (**Table 2**).
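The ensemble idea can be sketched in a few lines: dock the ligand into each rigid receptor conformation and merge the results, here simply by keeping the best score. Everything in this sketch is hypothetical, including the `toy_dock` stand-in and the "pocket width" data; a real workflow would call a docking program for each conformation, and other merging rules (such as the averaged energy grid mentioned above) are equally possible.

```python
def ensemble_dock(ligand, receptor_conformations, dock):
    """Dock `ligand` into every rigid receptor conformation and keep the best score.

    `dock` is any callable returning a score where lower means more favorable.
    """
    scores = {
        name: dock(ligand, receptor)
        for name, receptor in receptor_conformations.items()
    }
    best_name = min(scores, key=scores.get)
    return best_name, scores

def toy_dock(ligand, receptor):
    """Hypothetical stand-in for a docking run: penalise size mismatch with the pocket."""
    return abs(receptor["pocket_width"] - ligand["width"])

# Hypothetical ensemble, e.g. crystal structures or MD snapshots of one target.
conformations = {
    "apo": {"pocket_width": 4.0},
    "holo": {"pocket_width": 6.5},
    "md_snapshot": {"pocket_width": 5.0},
}

best, all_scores = ensemble_dock({"width": 6.0}, conformations, toy_dock)
```

Taking the minimum over conformations mirrors conformer selection: the ligand is assumed to pick out the receptor state that accommodates it best.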

Because there are so many degrees of freedom and the effect of the solvent on binding is poorly understood, modelling the intermolecular interactions in a ligand-protein complex is difficult. Docking a ligand into a binding site attempts to mimic the natural course of interaction between the ligand and its receptor along the lowest-energy pathway. Although there are straightforward methods for docking rigid ligands to rigid receptors and flexible ligands to rigid receptors, docking conformationally flexible ligands to flexible receptors is more difficult. The interaction between macromolecular receptors and small drug molecules is a crucial step in regulatory systems, drug pharmacology, toxic side effects, and other processes.

SBDD exploits the structure of protein-ligand or protein-protein binding sites; however, the site is not always known at the outset. Even if the site is known, researchers may want to look for other potential binding sites that could lead to distinct biological effects or a new class of drugs. In lead optimization, it is also important to know how well known binders or docking hits satisfy or violate complementarity with the receptor. One component of molecular modelling is molecular mechanics, which uses classical (Newtonian) mechanics to describe the physical basis of the models. In most molecular models, atoms (the nucleus and electrons combined) are described as point charges with a mass. The interactions between neighbouring atoms are described by spring-like interactions (representing chemical bonds) and van der Waals forces. The Lennard-Jones potential is often used to describe van der Waals forces, and Coulomb's law is used to calculate electrostatic interactions. Atoms are assigned coordinates in Cartesian space or in internal coordinates and, in dynamical simulations, they can also be assigned velocities. The atomic velocities are related to the temperature of the system (a macroscopic quantity). The potential function is a mathematical expression related to the internal energy (*U*) of the system, a thermodynamic quantity equal to the sum of the potential and kinetic energies. Energy minimization techniques (e.g., steepest descent and conjugate gradient) are used to find configurations of low potential energy, whereas molecular dynamics methods are used to predict the behaviour of a system as it evolves in time [120–130].
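Steepest descent, one of the energy minimization techniques named above, can be illustrated on the simplest possible system: two atoms interacting through a Lennard-Jones potential, with their separation as the only coordinate. The sketch uses reduced units (ε = σ = 1), a fixed step size and a gradient-based stopping rule, all of which are assumptions made for illustration; production codes use line searches or adaptive steps over thousands of coordinates.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones potential energy of an atom pair at separation r."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)

def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative dE/dr of the Lennard-Jones energy."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * s6 * s6 + 6.0 * s6) / r

def steepest_descent(r0, step=0.01, tol=1e-8, max_iter=10000):
    """Move downhill along the negative gradient until the gradient is ~0."""
    r = r0
    for _ in range(max_iter):
        gradient = lj_gradient(r)
        if abs(gradient) < tol:
            break
        r -= step * gradient
    return r

# Start from a stretched pair and relax to the energy minimum.
r_min = steepest_descent(1.5)
```

For this potential the minimum lies at r = 2^(1/6) σ with energy −ε, which gives a simple check on the result. Molecular dynamics differs in that it integrates Newton's equations of motion with velocities rather than simply following the gradient downhill.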

As previously stated, molecular docking's role in drug design has been divided into two paradigms: one focused on the structure-activity problem, which attempts to rationalise activity in the absence of detailed 3D structural information about the receptor, and the other focused on understanding the interactions seen in the receptor-ligand complex, which uses the known 3D structure of the therapeutic target to design novel drugs. Binding of a small-molecule ligand to an enzyme can activate or inhibit the enzyme. If the protein is a receptor, ligand binding may cause agonism or antagonism. The most common application of docking is in drug design. Most drugs are small organic compounds, and docking may be applied as follows:


Molecular docking not only contributes to the design of potent compounds but also assists various steps in the development of new drugs from laboratory to clinic. Examples of contributions of molecular modeling include the design of thymidylate synthase inhibitors as anticancer agents, HIV protease inhibitors as antiviral agents, neutrophil elastase inhibitors as agents for emphysema, and carbonic anhydrase inhibitors as antiglaucoma agents, as well as the discovery of novel sweeteners through taste receptor models [131–133].

In addition to the large number of existing docking programs, many molecular mechanics programs are applicable to these problems. Some programs are very widely used; nevertheless, they are not easy to use and require some understanding of the underlying computational principles. Some of these software systems are described below [134–139].




#### **Table 2.**

*The successful application of computer assisted drug design approach to biological targets.*

*AutoDock:* To generate a set of candidate conformations, AutoDock uses Monte Carlo simulated annealing or a Lamarckian genetic algorithm (LGA), in which a genetic algorithm serves as the global optimizer and energy minimization is employed as a local search strategy. The AMBER force field is used in conjunction with free-energy scoring functions, calibrated on a large set of protein-ligand complexes with known binding constants, to evaluate possible orientations. AutoDock's web pages are more informative than those of its competitors, and its free academic licence makes it a good starting point for newcomers to molecular docking software.
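The Monte Carlo simulated annealing idea mentioned above can be illustrated with a short, generic sketch. This is not AutoDock's implementation: the energy function is a toy quadratic, the pose is a plain list of coordinates, the temperature is expressed directly in energy units, and the cooling schedule and all names are assumptions chosen for illustration.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng):
    """Metropolis criterion: always accept downhill moves, and accept
    uphill moves with probability exp(-delta_e / temperature)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def simulated_annealing(energy, start, step=0.5, t_start=5.0, t_end=0.01,
                        cooling=0.95, moves_per_temp=50, seed=0):
    """Minimise `energy` over a list of coordinates by random perturbation
    while the temperature is lowered geometrically (hypothetical schedule)."""
    rng = random.Random(seed)
    current = list(start)
    e_current = energy(current)
    best, e_best = list(current), e_current
    temperature = t_start
    while temperature > t_end:
        for _ in range(moves_per_temp):
            # Propose a random displacement of every coordinate.
            trial = [x + rng.uniform(-step, step) for x in current]
            e_trial = energy(trial)
            if metropolis_accept(e_trial - e_current, temperature, rng):
                current, e_current = trial, e_trial
                if e_current < e_best:
                    best, e_best = list(current), e_current
        temperature *= cooling
    return best, e_best


def toy_energy(pose):
    """Toy stand-in for a docking score with its minimum at (1, 1)."""
    return sum((x - 1.0) ** 2 for x in pose)


if __name__ == "__main__":
    pose, score = simulated_annealing(toy_energy, [4.0, -3.0])
    print(pose, score)
```

In a docking setting the coordinate list would be replaced by the ligand's translation, orientation, and torsion angles, and the toy energy by the program's scoring function.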

*DOCK:* DOCK is one of the best-known and most widely used ligand-protein docking tools. The initial version treated ligands as rigid; flexibility was later added by building the ligand incrementally in the binding pocket. DOCK, as previously stated, is a fragment-based technique that uses shape and chemical complementarity to generate candidate ligand orientations. Three distinct scoring functions can be used to score these orientations; however, none of them includes explicit hydrogen-bonding terms, solvation/desolvation terms, or hydrophobicity parameters, which limits their usefulness. DOCK appears to handle polar binding sites well and is useful for fast docking, but it is not the most accurate program available.

*FlexX*: FlexX is a fragment-based approach that uses rigid proteins and flexible ligands. It creates conformers using the MIMUMBA torsion angle database, and an interaction geometry database is used to describe intermolecular interaction patterns precisely. The Boehm function is used for scoring (with slight adjustments for docking). Comparing FlexX with DOCK illustrates the importance of scoring functions: although both are fragment-based approaches, they give very different results. Whereas DOCK works well with polar binding sites, FlexX behaves quite differently. It has a slightly lower hit rate than DOCK, but it produces better root-mean-square deviation (RMSD) values for compounds with correctly predicted binding modes. FlexE, a FlexX extension with flexible receptors, has been shown to yield better results with substantially shorter run times.
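The root-mean-square measure used above to judge predicted binding modes can be computed directly from matched atom coordinates. The sketch below assumes that the docked and reference poses list the same atoms in the same order and share the receptor's coordinate frame, so no superposition is performed; the function name and the example coordinates are hypothetical.

```python
import math


def pose_rmsd(predicted, reference):
    """Root-mean-square deviation (Å) between two poses given as lists of
    (x, y, z) tuples with identical atom ordering."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = 0.0
    for (xp, yp, zp), (xr, yr, zr) in zip(predicted, reference):
        squared += (xp - xr) ** 2 + (yp - yr) ** 2 + (zp - zr) ** 2
    return math.sqrt(squared / len(predicted))


if __name__ == "__main__":
    crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    docked = [(x + 1.0, y, z) for x, y, z in crystal]  # rigid 1 Å shift
    print(pose_rmsd(docked, crystal))
```

A cutoff of about 2 Å relative to the experimental pose is a commonly used convention for calling a binding mode correctly predicted, although the appropriate threshold depends on ligand size and the study design.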

*GOLD*: Because of its strong results in independent tests, GOLD has gained many new users in recent years. It has a good overall hit rate, although it struggles somewhat with hydrophobic binding pockets. GOLD uses a genetic algorithm (GA) to dock a flexible ligand into a protein with flexible hydroxyl groups; the rest of the protein is treated as rigid. This makes it a good choice when the binding pocket contains amino acids that form hydrogen bonds with the ligand. GOLD employs a scoring function based on favourable conformations found in the Cambridge Structural Database as well as empirical data on weak chemical interactions. The current focus of GOLD development is on improving the computational algorithm and introducing parallel processing capability.

#### **3. Toxicity prediction and prediction of adverse drug reactions**

The harmful or adverse effects of a chemical are referred to as toxicity. Toxicity endpoints, such as carcinogenicity or genotoxicity, can be quantitative (e.g., LD50, the dose lethal to 50% of tested individuals) or qualitative (e.g., toxic or nontoxic). Toxicity studies use acute exposure (single dose) or multiple exposure (multiple doses) to determine the detrimental effects of chemicals on humans, animals, plants, or the environment. Chemical toxicity is determined by several factors, such as the route of exposure (oral, cutaneous, or inhalation), dose, exposure frequency (single or multiple), exposure duration, properties of the chemical, biological properties (age, gender), and absorption, distribution, metabolism, and excretion (ADME). Animal models have long been used for toxicity testing. With advances in high-throughput screening, *in vitro* toxicity testing is now readily achievable. Computational toxicology is a toxicity assessment approach that analyses, models, simulates, visualizes, or predicts chemical toxicity. Simulation tools such as algorithms, software, and data complement *in vitro* toxicity experiments, help to avoid animal models, and make toxicity testing more cost-effective, which expands toxicity prediction and safety evaluation. Moreover, computational tools have the distinct benefit of being able to predict the toxicity of substances even before they are synthesized (**Figure 4**) [140].

Software (generating molecular descriptors):


#### **Figure 4.**

*In silico toxicology tools, steps to generate prediction models, and categories of prediction models [140].*


By and large, modeling approaches comprise five major steps while developing prediction models.

#### **3.1 Why is exploration of toxicity prediction important?**

Optimization of a molecule during early drug development is important for good efficacy as well as for pharmacokinetic (PK) and toxicological properties. An appropriate balance of target potency, selectivity, suitable ADME, and safe preclinical properties leads to the selection and clinical development of a potential new drug. A typical compound entering a phase I clinical trial has undergone years of preclinical testing and has only an 8% chance of reaching the market. Toxicity is a cause of failure in new drug development. Performing toxicity analysis in the early phase of the development process therefore has significant potential to add value.

Several factors keep pharmaceutical companies from screening for toxicity earlier: the large amount of compound required for *in vivo* studies, the lack of high-throughput *in vitro* assays with predictive value, and the inability of *in vitro* and animal models to predict toxicity in humans properly. Computational (*in silico*) tools for toxicity prediction are needed to overcome these hurdles. These tools are structure-based or apply modelling techniques to human data, and they offer ways to eliminate toxic effects in humans before a compound physically exists. The importance of computational tools arises from their applicability early in development. Over the last few years, computational toxicology prediction systems have greatly improved their predictive ability, but they still fall short because of the lack of large datasets covering toxicological effects such as hepatotoxicity and teratogenicity. Generating such data, even at low throughput, through coordinated efforts that build on the large historical body of experience could, with small additional effort, save a large investment and avoid the use of animals (**Table 3**) [141].


#### **Table 3.**

*In silico tools used for predicting toxicity endpoints of chemicals/drugs.*

Studies in laboratory animals have traditionally been used to determine the possible risks of chemicals, with changes in clinical pathology and histology compared to untreated controls defining an adverse effect. In recent decades, there has been a greater degree of agreement on the definition of adversity for chemically induced effects in experimental animals, as well as on the assessment of human relevance. More recently, a paradigm change in toxicity testing has been proposed, largely as a result of animal welfare concerns, but also as a result of the development of new technologies. *In vitro* methods, toxicogenomic technologies, and computational tools are already available to provide mechanistic insight into the toxicological mode of action (MOA) of adverse effects found in laboratory animals. Tox21c (toxicity testing in the twenty-first century) is a concept that aims to predict *in vivo* toxicity using a bottom-up strategy, starting with an understanding of MOA based on *in vitro* data and eventually predicting adverse effects in humans [142].

Approaches to drug side effect prediction can be classified into three categories:

	- Docking-based approaches
	- Network-based approaches
	- Machine learning-based approaches

**Figure 5** depicts the categorization as well as the numerous approaches within each of the categories. The next sections discuss each of these categories and describe some of the most important efforts in the field of drug side effect prediction that have been done in each of these categories.

• *Docking-based approaches:* Docking predicts the preferred orientation of one molecule relative to another when they bind to form a stable complex. It is one of the most widely used strategies for designing drugs from structural data. The ability of targets to bind to one another is a critical property that affects the efficiency of biochemical processes. When a drug binds to a certain protein, it can produce side effects. Drug side effect prediction using docking-based techniques identifies possible drug binding sites. Many adverse effects are thought to be caused by an unexpected interaction of a drug molecule with a specific protein [144]. Side effects occur when a drug molecule over-regulates a protein or interacts with it in an unexpected way. INVDOCK, a molecular docking-based method for finding these target proteins, has been presented. Various side effect–protein relationships were identified during the method's evaluation, and publications supporting these relationships were found by searching PubMed.

**Figure 5.** *Classification of drug side effect prediction approaches [143].*


#### **4. Polypharmacology and drug repositioning**

Polypharmacology, a new paradigm in drug discovery that focuses on multi-target drugs (MTDs), has applications in drug repurposing (the process of finding new uses for already-approved drugs), off-target toxicity prediction, and rational MTD design. Computational approaches have shown great promise in predicting polypharmacology and assisting with drug repurposing [145].

The goal of polypharmacology is to identify small ligands with off-target activities. Polypharmacology and chemogenomics are closely related. Chemogenomics is the study of the relationship between targets and their ligands in terms of structure and activity. Information about a target's ligands and its distance from other targets in biological space can aid the evaluation of new compounds for one or more novel targets. Both approaches can be employed in the early stages of development to screen out compounds and reduce the probability of failure due to significant adverse effects. When applied to known drugs, polypharmacological approaches can lead to a compound's repurposing for a new indication. Drug repurposing is suitable for marketed drugs or development candidates that have failed in clinical trials due to lack of efficacy but have a strong safety profile and PK features [146]. Because prior clinical trials provide valuable data on pharmacokinetics/pharmacodynamics (PK/PD) and toxicity profiles, repurposing previously approved drugs saves time and money compared with developing novel drugs from scratch. Sildenafil (Viagra®), a drug that was originally developed to treat hypertension but is now marketed to treat erectile dysfunction, is a well-known example of drug repositioning [147].

Most pharmaceutical companies and specialized service providers are increasing their drug repurposing activities in response to the present productivity problem and the need to minimize attrition rates in drug development. Because large pharmaceutical companies, in particular, have a large pool of unsuccessful drug candidates, dedicated divisions have been formed and collaboration agreements have been negotiated. As a result, there has been a rise in the development and application of *in silico* approaches in this field. Due to computational constraints, *in silico* approaches for polypharmacology analysis and drug repurposing have primarily relied on 2D representations of small molecules. The first 3D approaches have already been described, and further research will allow the discovery of target–target relationships that cannot be detected in 2D. This, together with recent advances in the computational throughput of 3D tools, suggests that these methods will be usable on the same scale as 2D tools in the near future [148]. Because of its potential applications and recent successes, polypharmacology has attracted a lot of interest in drug discovery [149]. Kinase inhibitors exemplify polypharmacology. Imatinib, for example, was developed to target the BCR-ABL protein and was approved by the Food and Drug Administration to treat chronic myelogenous leukaemia [150].

High-throughput virtual screening (HTVS) is a straightforward tool for detecting hits in a single-target drug discovery project, but it is insufficient when several targets are investigated at the same time. To address polypharmacology, a multi-target approach must be developed. To identify the "magic shotgun" that can target numerous receptors at the same time, inverse docking techniques must be used. This enables the bioactivity and secondary effects of a potential new drug to be predicted, as well as the repositioning of existing drugs. Structure-based and ligand-based approaches are used *in silico* to predict the polypharmacology of known drugs and novel compounds and to support the rational design of MTDs.

*In silico* approaches have advanced as a valuable strategy in early drug development, and as additional target structures, structural bioactivity data, and therefore enhanced chemoinformatic tools become accessible, their influence will certainly grow. Because medications with a certain polypharmacologic profile will allow for better treatment of certain diseases, one of the most important computational challenges ahead is the application and development of algorithms for identifying suitable molecules (**Figure 6**).

Polypharmacology can be predicted using computational methods. Statistical data analysis and bioinformatics, ligand-based, and structure-based approaches can be used singly or in combination to take advantage of each approach's unique characteristics and strengths. The lower half of the figure depicts three separate proteins (A–C) interacting with the same ligand, emphasising that the ligand's final pharmacological effect is the product of synergistic effects emerging from interactions with all targets.

*Molecular Docking: Metamorphosis in Drug Discovery DOI: http://dx.doi.org/10.5772/intechopen.105972*

**Figure 6.** *Polypharmacology can be predicted using computational methods [148].*

Structure-based approaches, ligand-based methods, and systems biology methods are the three categories of methodologies that can be used to anticipate unknown targets for small compounds.

• *Structure-based methods*: Inverse docking, binding site similarity, inverse pharmacophore modelling, molecular dynamics simulations, and fragment-based multi-target drug design are examples of structure-based techniques. The Protein Data Bank (PDB) currently contains a substantial number of 3D protein structures determined by protein crystallography, nuclear magnetic resonance spectroscopy, and electron microscopy. The availability of such structural data has led to the development of inverse docking algorithms, whose primary goal is to dock a small molecule into the binding sites of many targets for hit identification. INVDOCK, TarFisDock, and idTarget are some of the tools, with modified scoring functions, that have been developed specifically for target ranking in recent years. In addition to inverse docking, binding site similarity-based search is commonly employed for target prediction. It is based on the idea that structurally similar proteins have similar functions and will therefore probably bind structurally similar compounds. The Fingerprints for Ligands and Proteins (FLAP) algorithm, which combines GRID Molecular Interaction Fields with pharmacophoric features, was recently developed. Both drug repurposing and hit identification can benefit from binding site similarity tools, which can also be employed in lead optimization by comparing binding sites. Advanced pharmacophore approaches have recently been developed to match structure-based pharmacophore models of targets with small-molecule pharmacophoric features. Fragments are smaller, simpler chemical entities than drug- or lead-like compounds and are more promiscuous. Fragment-based techniques increase the likelihood of obtaining hits and aid the discovery of novel compounds because a small number of fragments can cover a large chemical space. As a result, they can be utilised for hit detection, lead generation, and lead optimization.
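Inverse docking ranks targets rather than ligands, and raw docking scores are generally not comparable across different binding sites. One possible remedy, sketched below, is to express the query ligand's score for each target as a z-score relative to reference ligands docked to the same target. This normalization is an illustrative assumption and is not the scoring scheme of INVDOCK, TarFisDock, or idTarget; the target names and scores are hypothetical, and lower (more negative) scores are taken to mean stronger predicted binding.

```python
from statistics import mean, stdev


def rank_targets(query_scores, reference_scores):
    """Rank targets for one query ligand.

    query_scores:     {target: docking score of the query ligand}
    reference_scores: {target: list of scores of reference ligands}
    Returns (target, z_score) pairs, most favourable (lowest z) first.
    """
    ranked = []
    for target, score in query_scores.items():
        refs = reference_scores[target]
        # How unusual is the query's score compared with typical ligands
        # docked into this particular binding site?
        z_score = (score - mean(refs)) / stdev(refs)
        ranked.append((target, z_score))
    return sorted(ranked, key=lambda pair: pair[1])


if __name__ == "__main__":
    query = {"target_A": -9.0, "target_B": -9.5}
    references = {
        "target_A": [-6.0, -7.0, -8.0],
        "target_B": [-9.0, -10.0, -11.0],
    }
    for target, z in rank_targets(query, references):
        print(target, round(z, 2))
```

In this made-up example the query scores slightly better against target_B in absolute terms, yet target_A ranks first because the query stands out much more from the ligands that typically dock there.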


The concept that similar drugs bind to similar targets still underpins the majority of polypharmacology research. Developing precise and robust scoring functions that can rank targets rather than small molecules is a major challenge. Novel approaches to the rational design of multi-targeting small molecules are now being investigated. Apart from traditional structure- and ligand-based approaches, there has been an upsurge of interest in systems biology and bioinformatics-based methodologies, as well as community-wide activities. These approaches have been shown not only to predict new small-molecule targets but also to aid the understanding of disease dynamics and the underlying molecular interaction pathways. Polypharmacology, which can predict both on-target and off-target effects, could help in disease targeting. As a result, rational polypharmacological drug design (PDD) holds a lot of promise for future drug discovery. However, to reach such ambitious aims and, eventually, translate this information into successful patient therapy, a number of shortcomings and obstacles must be overcome [151].
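The similarity principle stated at the start of this paragraph is often put into practice by comparing molecular fingerprints. The sketch below represents each fingerprint as a set of "on" bit indices, scores pairs with the Tanimoto coefficient, and proposes targets for a query from its most similar annotated ligands. The fingerprints, target names, and the 0.5 threshold are hypothetical; a real workflow would generate fingerprints with a cheminformatics toolkit and choose a threshold by validation.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def predict_targets(query_fp, annotated_ligands, threshold=0.5):
    """Suggest targets whose known ligands resemble the query.

    annotated_ligands: iterable of (fingerprint, target_name) pairs.
    Returns (target, best_similarity) pairs, most similar first.
    """
    best_per_target = {}
    for fingerprint, target in annotated_ligands:
        similarity = tanimoto(query_fp, fingerprint)
        if similarity >= threshold and similarity > best_per_target.get(target, 0.0):
            best_per_target[target] = similarity
    return sorted(best_per_target.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    known = [
        ({1, 2, 3, 4, 5}, "kinase_X"),
        ({7, 8}, "protease_Y"),
    ]
    print(predict_targets({1, 2, 3, 4}, known))
```

Keeping only the best similarity per target is one simple aggregation choice; averaging over the top few neighbours is another that is less sensitive to a single outlier.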

The field of computational polypharmacology has progressed to the point where concrete hypotheses can be formulated from prediction results to guide wet-lab research. Furthermore, the majority of contemporary approaches are implemented as web servers or standalone applications. As community efforts become more important, it will be necessary to create portable programming libraries that community developers can use to modify existing toolkits or create new ones. More cell-free, cell-based, and animal models are needed in experimental assays to examine the impact of drugs on multiple targets or functions at the same time.

#### **5. Opportunities and challenges**

There are six components to the CADD challenges. Chemical space and biological space are the two major categories. The term "chemical space" refers to the large number of possibilities for discovering hit substances. The third is methodological challenges, which concern the computational methods that can be used to design and optimize drug candidates. The last is the proper training of newcomers, such as CADD investigators, for multidisciplinary work (**Figure 7**) [152–155].

**Figure 7.** *In silico methods showing outstanding challenges during drug discovery and design.*

Drug repurposing is gaining momentum as a route to novel therapeutic molecules, aided by an ever-increasing number of innovative computational techniques and enormous sequencing databases. Antibiotic resistance among key clinical pathogens is a grim prospect, as the infection-related death rate continues to rise while the rate of new antibiotic discovery slows.

#### **6. Applications and limitations**

CADD is useful in the treatment of neurodegenerative disorders, particularly in targeting amyloid-β in Alzheimer's disease. Docking calculations have been used in pharmaceutical research for nearly two decades. Virtual screening based on protein templates differs from virtual screening approaches based on molecular similarity and ligands, and it is beneficial for de novo identification of active compounds. CADD pays close attention to three important tasks: (1) screening a large number of molecules against the target structure, which can then be assessed using both experimental and computational techniques; (2) guiding the optimization of lead compounds according to affinity, toxicity criteria, and PK studies; and (3) supporting the structure-based design of novel compounds to improve drug function. The CADD approach is extremely helpful for drug modelling. Computational chemistry and bioinformatics, as well as combinatorial chemistry, are used to handle the many issues connected with the drug discovery pipeline in less time and at lower expense. As shown in **Figure 8**, the general advantages of CADD are cost-effectiveness, higher efficiency, speed, and accuracy of results [156–159].

FDA-approved drugs identified by SBDD, such as drugs that inhibit human immunodeficiency virus (HIV)-1, are available on the market. Other examples are the thymidylate synthase inhibitor raltitrexed and the HIV protease inhibitor amprenavir, which were discovered with the aid of protein modelling. Computer-assisted techniques yield hypotheses, so their results must be confirmed in real-world systems, and pharmacological activities predicted through CADD for lead compounds have sometimes failed. Most CADD methods, such as QSAR, molecular dynamics, and molecular docking, have their specific restrictions. Limitations include multi-domain proteins, whose flexibility is the most problematic challenge; the assessment of multi-drug effects; and, in some cases, a lack of quality datasets (**Figure 9**).

**Figure 9.** *Limitations of CADD.*

One example of an SBDD failure is RPX00023, which was reported to have antidepressant activity as an agonist of the 5-HT1A receptor. However, it was found to be an inhibitor of the 5-HT1A receptor [160–164].

#### **Conflict of interest**

We confirm that there is no conflict of interest.

### **Author details**

Kishor Danao1\*, Deweshri Nandurkar1, Vijayshri Rokde1, Ruchi Shivhare1 and Ujwala Mahajan2

1 Department of Pharmaceutical Chemistry, Dadasaheb Balpande College of Pharmacy, Nagpur, Maharashtra, India

2 Department of Quality Assurance, Dadasaheb Balpande College of Pharmacy, Nagpur, Maharashtra, India

\*Address all correspondence to: kerzarepritee@gmail.com

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

### **References**

[1] Houston JG, Banks MN. In: Abraham DJ, editor. Burger's Medicinal Chemistry and Drug Discovery. 6th ed. Hoboken, New Jersey: Wiley-Interscience; 2003. p. 38

[2] Camille G, Wermuth CG. The Practice of Medicinal Chemistry. London: Academic Press; 1996. p. 4

[3] Martin YC, Kuffer E, Austel A. Modern Drug Research, Paths to Better and Safer Drugs. New York: Marcel Dekker Inc; 1989. pp. 243-273

[4] Trickle IJ, Sibanda BL, Pearl CH, Hemming AM, Blundell TL. X-Ray Crystallography and Drug Action. Oxford: Clarendon Press; 1984. pp. 427-440

[5] Greer J, Erickson JW, Baldwin JJ, Varney MD. Application of the three-dimensional structures of protein target molecules in structure-based drug design. Journal of Medicinal Chemistry. 1994;**37**(8):1035-1054

[6] Tollenaere JP. In: Gund P, editor. Guidebook on Molecular Modeling in Drug Design. New York: Academic Press; 1996. p. 352

[7] Ariens EJ. Molecular Pharmacology. New York: Academic Press; 1964. p. 176

[8] Glenn S. In: Osol A, editor. Remington's Pharmaceutical Sciences. 16th ed. Easton, Pennsylvania: Mack Publishing Company; 1980. p. 8

[9] Webb ML. In: Gennaro AR, editor. Remington: The Science and Practice of Pharmacy. Vol. I. 20th ed. Philadelphia: Lippincott Williams and Wilkins; 2000. p. 81

[10] Newall C. In: Roberts S, Price B, editors. Medicinal Chemistry - The role of Organic Chemistry in Drug Research. 1st ed. London: Academic Press; 1985. p. 209

[11] Sternbach LH, Kaiser S. Antispasmodic. II. Esters of basic bicyclic alcohols. Journal of the American Chemical Society. 1952;**74**:2219-2221

[12] Treves GR, Testa FC. Basic esters and quaternary derivatives of β-hydroxy acids as antispasmodics. Journal of the American Chemical Society. 1952;**74**:46-48

[13] Tilford CH. Aminoesters of substituted alicyclic carboxylic acids. Journal of the American Chemical Society. 1947;**69**:2902-2906

[14] Karczmar AG. Ganglionic Blocking and Stimulating Agents. International Encyclopedia of Pharmacology and Therapeutics. Vol. I. Oxford: Pergamon Press; 1966. p. 342

[15] Hou JP, Poole JW. β-Lactam antibiotics: Their physicochemical properties and biological activities in relation to structure. Journal of Pharmaceutical Sciences. 1971;**60**:503-532

[16] Mayersohn M, Endrenyi L. Relative bioavailability of commercial ampicillin formulations in man. Canadian Medical Association Journal. 1973;**109**:989-993

[17] Hill SA, Jones KH, Seager H, Taskis CB. Dissolution and bioavailability of the anhydrate and trihydrate forms of ampicillin. The Journal of Pharmacy and Pharmacology. 1975;**27**:594-598

[18] Fong I, Engelking ER, Kirbi WM. Relative inactivation by *Staphylococcus aureus* of eight cephalosporin antibiotics. Antimicrobial Agents and Chemotherapy. 1976;**9**:939-944

[19] Yamana T, Tsuji A. Comparative stability of cephalosporins in aqueous solution: Kinetics and mechanisms of degradation. Journal of Pharmaceutical Sciences. 1976;**65**:1563-1574

[20] Neu HC, Aswapokee N, Fu KP, Aswapokee P. Antibacterial activity of a new 1-oxa cephalosporin compared with that of other beta-lactam compounds. Antimicrobial Agents and Chemotherapy. 1979;**16**:141-149

[21] Domagk GJ. Ein Beitrag zur Chemotherapie der bakteriellen Infektionen. Deutsche Medizinische Wochenschrift. 1935;**61**:250-253

[22] Anand N. In: Wolff ME, editor. Burger's Medicinal Chemistry and Discovery. 5th ed. Vol. II. New York: Wiley-Interscience; 1996. p. 255

[23] Macdonald L, Kazanijan P. Opportunistic infections in patients with AIDS - treatment and prophylaxis. Formulary. 1996;**31**:470

[24] Jawetz E. In: Katzung BG, editor. Basic and Clinical Pharmacology. 6th ed. Norwalk, CT: Appleton and Lange; 1995. p. 478

[25] Shepard CC. Leprosy today. The New England Journal of Medicine. 1982;**307**:1640-1641

[26] Miner NA, McDowell JW, Willcockson GW, Bruckner NI, Stark RL, Whitmore EJ. Antimicrobial and other properties of a new stabilized alkaline glutaraldehyde disinfectant/sterilizer. American Journal of Hospital Pharmacy. 1977;**34**:376-382

[27] Domagala JM. Structure-activity and structure-side-effect relationships for the quinolone antibacterials. Antimicrobial Agents and Chemotherapy. 1994;**33**:685-706

[28] Heifets LB, Flory MA, Lindholm-Levy P. Does pyrazinoic acid as an active moiety of pyrazinamide have specific activity against *Mycobacterium tuberculosis*? Antimicrobial Agents and Chemotherapy. 1989;**33**:1252-1254

[29] Werli W. Rifampin: Mechanisms of action and resistance. Reviews of Infectious Diseases. 1983;**55**:407-411

[30] Hartmann GR. Molecular mechanism of action of the antibiotic rifampicin. Angewandte Chemie (International Ed. in English). 1985;**24**:1009-1014

[31] Zhang Y, Heym B, Allen B, Young D, Cole S. The catalase-peroxidase gene and isoniazid resistance of *Mycobacterium tuberculosis*. Nature. 1992;**358**:591-593

[32] Scherrer RA. In: Scherrer RA, Whitehouse MW, editors. Anti-Inflammatory Agents. New York: Academic Press; 1974. p. 132

[33] Dornan J, Reynolds W. Comparison of ibuprofen and acetylsalicylic acid in the treatment of rheumatoid arthritis. Canadian Medical Association Journal. 1974;**110**:1370-1372

[34] Brogden RN, Heel RC, Speight TM, Avery GS. Fenoprofen: A review of its pharmacological properties and therapeutic efficacy in rheumatic disease. Drugs. 1977;**13**:241-265

[35] Chernish SM, Rosenak BD, Brunelie RL, Crabtree R. Comparison of gastrointestinal effects of aspirin and fenoprofen. Arthritis and Rheumatism. 1979;**22**:376-383

[36] Winters WD, Ferrar AT, Guzman FC, Alcaraz M. The cataleptic state induced by ketamine: A review of the neuropharmacology of anesthesia. Neuropharmacology. 1972;**11**:303-315

[37] Greenblatt DJ, Shader RI. Benzodiazepine in Clinical Practice. New York: Raven Press; 1974. p. 17

[38] Greenblatt DJ, Shader RI, Abernethy DR. Drug therapy: Current status of benzodiazepines. Part One. New England Journal of Medicine. 1983;**309**:354-358

[39] Gastaut H, Broughton R. Anticonvulsant drugs. In: Radouco-Thomas C, editor. International Encyclopedia of Pharmacology and Therapeutics. Vol. I. New York: Pergamon Press; 1973. p. 3

[40] Spinks A, Waring WS. In: Ellis GP, West GB, editors. Progress in Medicinal Chemistry. Vol. III. Washington, DC: Butterworth; 1963. p. 345

[41] Hanka LJ, Evans JS, Mason DJ, Dietz A. Microbiological production of 5-azacytidine. I. Production and biological activity. Antimicrobial Agents and Chemotherapy. 1966;**6**:619-624

[42] Schaeffer HJ, Schwender CF. Enzyme inhibitors. 26. Bridging hydrophobic and hydrophilic regions on adenosine deaminase with some 9-(2-hydroxy-3-alkyl)adenines. Journal of Medicinal Chemistry. 1974;**17**:6-8

[43] Eckle E, Stezowski JJ. The crystal and molecular structure of 7-con-O-methylnogarol. Tetrahedron Letters. 1980;**21**:507-510

[44] Fujiwara K, Hiromi S, Masahiro H. Enyne[3]cumulene. Synthesis and mode of aromatization. The Journal of Organic Chemistry. 1991;**56**:1688-1689

[45] Harrison RC, McAuliffe CA. An efficient route for the preparation of highly soluble platinum (II) antitumour agents. Inorganica Chimica Acta. 1980;**46**:L15-L16

[46] Levitzki A, Gazit A. Tyrosine kinase inhibition: An approach to drug development. Science. 1995;**267**:1782-1788

[47] Moller JV, Sheikh MI. Renal organic anion transport system: Pharmacological, physiological and biochemical aspects. Pharmacological Reviews. 1983;**34**:315-356

[48] Mann T, Keilin K. Sulphanilamide as a specific inhibitor of carbonic anhydrase. Nature. 1940;**146**:164-165

[49] Leaf A, Cotran RS. In: Leaf A, Cortan RS, editors. Renal Pathophysiology. 2nd ed. New York: Oxford University Press; 1980. p. 145

[50] Shinkawa T, Fumiaki Y, Notsu T, Nakakuki M, Nishijima K, Yoshitomi K, et al. Loop and distal actions of a novel diuretic, M17055. European Journal of Pharmacology. 1993;**238**:317-325

[51] Cragoe EJ. In: Cragoe EJ, editor. Chemistry, Pharmacology and Medicine. New York: John Wiley and Sons; 1983. p. 303

[52] Roberts LS, Schmidt GD. Foundations of Parasitology. USA: William C Brown Pub; 1995. p. 324

[53] Hardman JG, Limbiad LE. The Pharmacological Basis of Therapeutics. 9th ed. New York: Macmillan; 1996. p. 576

[54] Foye WO. In: Foye WO, Lemke TL, Williams DA, editors. Principles of Medicinal Chemistry. 4th ed. Philadelphia: Lea and Febiger; 1995. p. 348

*Molecular Docking: Metamorphosis in Drug Discovery DOI: http://dx.doi.org/10.5772/intechopen.105972*

[55] Banks BJ. Antiparasitic agents. In: Bailey DM, editor. Annual Reports in Medicinal Chemistry. Vol. 19. New York: Academic Press; 1984. p. 198

[56] Cox FEG. Which way for malaria? Nature. 1988;**332**:486-487

[57] Walsh C. Antibiotics: Actions, Origins, Resistance. 1st ed. New York: ASM Press; 1956. pp. 223-324

[58] Mechlinski W, Schaffner CP, Ganis P, Avitabile G. Structure and absolute configuration of the polyene macrolide antibiotic amphotericin B. Tetrahedron Letters. 1970;**11**:3873-3876

[59] Pandey RC, Rinehart K. Carbon-13 nuclear magnetic resonance evidence for cyclic hemiketals in the polyene antibiotics amphotericin B, nystatin A1, tetrin A, tetrin B, lucensomycin and pimaricin 1,2. The Journal of Antibiotics. 1976;**29**:1035-1342

[60] Mitscher LA, Sharma PM, Chu DT, Shen LL, Pernet AG. Chiral DNA gyrase inhibitors 2. Asymmetric synthesis and biological activity of the enantiomers of 9-fluoro-3-methyl-10-(4-methyl-1-piperazinyl)-7-oxo-2,3-dihydro-7H-pyrido[1,2,3-de]-1,4-benzoxazine-6-carboxylic acid (ofloxacin). Journal of Medicinal Chemistry. 1987;**30**:2283-2286

[61] Garrison JC. In: Gilman AG, Rall TW, Nies AS, Taylor P, editors. Goodman and Gilman's The Pharmacological Basis of Therapeutics. 8th ed. New York: Pergamon Press; 1990. p. 398

[62] Mann KV, Crowe JP, Tietze KJ. Non sedating histamine H1-receptor antagonists. Clinical Pharmacy. 1989;**8**:331-344

[63] Barouh V, Dall H, Patel D, Hite G. Stereochemical aspects of antihistamine action. 4. Absolute configuration of carbinoxamine antipodes. Journal of Medicinal Chemistry. 1971;**14**:834-836

[64] Leurs R, Timmerman H. Progress in Drug Research. Vol. 39. Boston: Birkhauser Verlag; 1992. p. 127

[65] Saxena AK, Saxena M. In: Jucker E, editor. Progress in Drug Research. Vol. 39. Boston: Birkhauser Verlag; 1992. p. 35

[66] Wermuth CG. Drug Design-Fact or Fantasy. 1st ed. New York: Academic Press; 1984. p. 47

[67] Gerhard K, Abrahum UJ. Comparative molecular similarity index analysis (CoMSIA) to study hydrogen bonding properties and to store combinatorial libraries. Computer Aided Molecular Design. 1999;**13**:1-10

[68] Venger BH, Hanch C, Hathwan GJ, Amerein YV. Ames-test of 1-(X-phenyl)- 3,3-dialkyl triazines. A quantitative structure activity study. Journal of Medicinal Chemistry. 1979;**22**:473-476

[69] Carter RC, Grassy G, Kubinyl H, Martin YC, Willett P. Chapter 37. Glossary of terms used in computational drug design (IUPAC Recommendations 1997). Annual Reports in Medicinal Chemistry. 1998;**33**:397-409

[70] Propst CL, Perun TJ. In: Marcel D, Perun TJ, Propst CK, editors. Computer Aided Drug Design Methods and Application. New York: Marcel Dekker Inc; 1989. p. 12

[71] Leow GH, Villar HO, Alkorta I. Strategies for indirect computer-aided drug design. Pharmaceutical Research. 1993;**10**:475-486

[72] Workman P. How much gets there and what does it do?: The need for better pharmacokinetic and pharmacodynamic endpoints in contemporary drug discovery and development. Current Pharmaceutical Design. 2003;**9**:891-902

[73] Stahura FL, Bajorath J. Virtual screening methods that complement HTS. Combinatorial Chemistry & High Throughput Screening. 2004;**7**:259-269

[74] Guner O, Clement O, Kurogi Y. Pharmacophore modeling and three dimensional database searching for drug design using catalyst: Recent advances. Current Medicinal Chemistry. 2004;**11**:2991-3005

[75] Leo AJ, Hansch C. Role of hydrophobic effects in mechanistic QSAR. Perspectives in Drug Discovery and Design. 1999;**17**:1-25

[76] Bolis G, Dipace L, Fabrocini F. A machine learning approach to computer aided molecular design. Journal of Computer-Aided Molecular Design. 1991;**5**:617-628

[77] Zhang S, Du-Cuny L. Development and evaluation of a new statistical model for structure-based high-throughput virtual screening. International Journal of Bioinformatics Research and Applications. 2009;**5**:269-279

[78] Beusen DD, Marshall GR. In: Guner OF, editor. Pharmacophore Perception, Development, and Use in Drug Design. La Jolla, CA: International University Line; 2000. pp. 23-45

[79] Van Drie JH. "Shrink-Wrap" surfaces: A new method for incorporating shape into pharmacophoric 3D database searching. Journal of Chemical Information and Computer Sciences. 1997;**37**:38-42

[80] Patel Y, Gillet VJ, Bravi G, Leach AR. A comparison of the pharmacophore identification programs: Catalyst, DISCO and GASP. Journal of Computer-Aided Molecular Design. 2002;**16**:653-681

[81] Cho AE, Guallar V, Berne B, Friesner RA. Importance of accurate charges in molecular docking: Quantum mechanical/molecular mechanical (QM/MM) approach. Journal of Computational Chemistry. 2005;**26**:915-931

[82] Brint AT, Willett P. Algorithms for the identification of three-dimensional maximal common substructures. Journal of Chemical Information and Computer Sciences. 1987;**27**:152-158

[83] Fischer D, Norel R, Wolfson H, Nussinov R. Surface motifs by a computer vision technique: Searches, detection, and implications for protein-ligand recognition. Proteins. 1993;**16**(3):278-292

[84] Norel R, Fischer D, Wolfson HJ, Nussinov R. Molecular surface recognition by a computer vision-based technique. Protein Engineering. 1994;**7**(1):39-46

[85] Miller MD, Kearsley SK, Underwood DJ, Sheridan RP. FLOG: A system to select 'quasi-flexible' ligands complementary to a receptor of known three-dimensional structure. Journal of Computer-Aided Molecular Design. 1994;**8**(2):153-174

[86] Diller DJ, Merz KM Jr. High throughput docking for library design and library prioritization. Proteins. 2001;**43**(2):113-124

[87] Burkhard P, Taylor P, Walkinshaw MD. An example of a protein ligand found by database mining: Description of the docking method and its verification by a 2.3 A X-ray structure of a thrombin-ligand complex. Journal of Molecular Biology. 1998;**277**(2):449-466

[88] DesJarlais RL, Sheridan RP, Dixon JS, Kuntz ID, Venkataraghavan R. Docking flexible ligands to macromolecular receptors by molecular shape. Journal of Medicinal Chemistry. 1986;**29**(11):2149-2153

[89] Kuntz ID, Leach AR. Conformational analysis of flexible ligands in macromolecular receptor sites. Journal of Computational Chemistry. 1992;**13**:730-748

[90] Ewing TJ, Makino S, Skillman AG, Kuntz ID. DOCK 4.0: Search strategies for automated molecular docking of flexible molecule databases. Journal of Computer-Aided Molecular Design. 2001;**15**(5):411

[91] Welch W, Ruppert J, Jain AN. Hammerhead: Fast, fully automated docking of flexible ligands to protein binding sites. Chemistry & Biology. 1996;**3**(6):449-462

[92] Kollman PA. Free energy calculations: Applications to chemical and biochemical phenomena. Chemical Reviews. 1993;**93**:2395-2417

[93] Aqvist J, Luzhkov VB, Brandsdal BO. Ligand binding affinities from MD simulations. Accounts of Chemical Research. 2002;**35**(6):358-365

[94] Carlson HA, Jorgensen WL. An extended linear response method for determining free energies of hydration. The Journal of Physical Chemistry. 1995;**99**:10667-10673

[95] Shoichet BK, Stroud RM, Santi DV, Kuntz ID, Perry KM. Structure-based discovery of inhibitors of thymidylate synthase. Science. 1993;**259**(5100):1445-1450

[96] Michel J, Verdonk ML, Essex JW. Protein-ligand binding affinity predictions by implicit solvent simulations: A tool for lead optimization? Journal of Medicinal Chemistry. 2006;**49**(25):7427-7439

[97] Briggs JM, Marrone TJ, McCammon JA. Computational science new horizons and relevance to pharmaceutical design. Trends in Cardiovascular Medicine. 1996;**6**:198-206

[98] Gehlhaar DK, Verkhivker GM, Rejto PA, Sherman CJ, Fogel DB, Fogel LJ, et al. Molecular recognition of the inhibitor AG-1343 by HIV-1 protease: Conformationally flexible docking by evolutionary programming. Chemistry & Biology. 1995;**2**(5):317-324

[99] Verkhivker GM, Bouzida D, Gehlhaar DK, Rejto PA, Arthurs S, Colson AB, et al. Deciphering common failures in molecular docking of ligand-protein complexes. Journal of Computer-Aided Molecular Design. 2000;**14**(8):731-751

[100] Jain AN. Scoring noncovalent protein-ligand interactions: A continuous differentiable function tuned to compute binding affinities. Journal of Computer-Aided Molecular Design. 1996;**10**(5):427-440

[101] Head RD, Smythe ML, Oprea TI, Waller CL, Green SM, Marshall GR. VALIDATE: A new method for the receptor-based prediction of binding affinities of novel ligands. Journal of the American Chemical Society. 1996;**118**:3959-3969

[102] Gehlhaar DK, Moerder KE, Zichi D, Sherman CJ, Ogden RC, Freer ST. De novo design of enzyme inhibitors by Monte Carlo ligand generation. Journal of Medicinal Chemistry. 1995;**38**(3):466-472

[103] Eldridge MD, Murray CW, Auton TR, Paolini GV, Mee RP. Empirical scoring functions: I. The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes. Journal of Computer-Aided Molecular Design. 1997;**11**(5):425-445

[104] Muegge I, Martin YC. A general and fast scoring function for protein-ligand interactions: A simplified potential approach. Journal of Medicinal Chemistry. 1999;**42**(5):791-804

[105] Still WC, Tempczyk A, Hawley RC, Hendrickson T. Semianalytical treatment of solvation for molecular mechanics and dynamics. Journal of the American Chemical Society. 1990;**112**(16):6127-6129

[106] Guimaraes CR, Mathiowetz AM. Addressing limitations with the MM-GB/SA scoring procedure using the WaterMap method and free energy perturbation calculations. Journal of Chemical Information and Modeling. 2010;**50**(4):547-559

[107] Singh N, Warshel A. Absolute binding free energy calculations: On the accuracy of computational scoring of protein-ligand interactions. Proteins. 2010;**78**(7):1705-1723

[108] Gabb HA, Jackson RM, Sternberg MJ. Modelling protein docking using shape complementarity, electrostatics and biochemical information. Journal of Molecular Biology. 1997;**272**(1):106-120

[109] Bron C, Kerbosch J. Algorithm 457: Finding all cliques of an undirected graph. Communications of the ACM. 1973;**16**(9):575-576

[110] Meng EC, Shoichet BK, Kuntz ID. Automated docking with grid-based energy evaluation. Journal of Computational Chemistry. 1992;**13**:505-524

[111] Meng XY, Zheng QC, Zhang HX. A comparative analysis of binding sites between mouse CYP2C38 and CYP2C39 based on homology modeling, molecular dynamics simulation and docking studies. Biochimica et Biophysica Acta. 2009;**1794**(7):1066-1072

[112] Boehm HJ, Boehringer M, Bur D, Gmuender H, Huber W, Klaus W, et al. Novel inhibitors of DNA gyrase: 3D structure based biased needle screening, hit validation by biophysical methods, and 3D guided optimization. A promising alternative to random screening. Journal of Medicinal Chemistry. 2000;**43**(14):2664-2674

[113] Kirton SB, Murray CW, Verdonk ML, Taylor RD. Prediction of binding modes for ligands in the cytochromes P450 and other heme-containing proteins. Proteins. 2005;**58**(4):836-844

[114] Doman TN, McGovern SL, Witherbee BJ, Kasten TP, Kurumbail R, Stallings WC, et al. Molecular docking and high-throughput screening for novel inhibitors of protein tyrosine phosphatase-1B. Journal of Medicinal Chemistry. 2002;**45**(11):2213-2221

[115] Shoichet BK, Leach AR, Kuntz ID. Ligand solvation in molecular docking. Proteins. 1999;**34**(1):4-16

[116] Lorber DM, Shoichet BK. Flexible ligand docking using conformational ensembles. Protein Science. 1998;**7**(4):938-950

[117] Freymann DM, Wenck MA, Engel JC, Feng J, Focia PJ, Eakin AE, et al. Efficient identification of inhibitors targeting the closed active site conformation of the HPRT from *Trypanosoma cruzi*. Chemistry & Biology. 2000;**7**(12):957-968

[118] Su AI, Lorber DM, Weston GS, Baase WA, Matthews BW, Shoichet BK. Docking molecules by families to increase the diversity of hits in database screens: Computational strategy and experimental evaluation. Proteins. 2001;**42**(2):279-293

[119] Gschwend DA, Kuntz ID. Orientational sampling and rigid-body minimization in molecular docking revisited: On-the-fly optimization and degeneracy removal. Journal of Computer-Aided Molecular Design. 1996;**10**(2):123-132

[120] Krovat EM, Steindl T, Langer T. Recent advances in docking and scoring. Journal of Computer-Aided Molecular Design. 2005;**19**:93-102

[121] Kontoyianni M, Sokol GS, McClellan LM. Evaluation of library ranking efficacy in virtual screening. Journal of Computational Chemistry. 2005;**26**:11-22

[122] Kirkpatrick P. Gliding to success. Nature Reviews Drug Discovery. 2004;**3**:299-303

[123] Halgren TA, Murphy RB, Friesner RA, Beard HS, Frye LL, Pollard WT, et al. Glide: A new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening. Journal of Medicinal Chemistry. 2004;**47**:1750-1759

[124] Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, et al. Glide: A new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. Journal of Medicinal Chemistry. 2004;**47**:1739-1749

[125] Klon AE, Glick M, Davies JW. Application of machine learning to improve the results of high-throughput docking against the HIV-1 protease. Journal of Chemical Information and Computer Sciences. 2004;**44**:2216-2224

[126] Ruddat VC, Mogul R, Chorny I, Chen C, Perrin N, Whitman S, et al. Tryptophan 500 and Arginine 707 define product and substrate active site binding in soybean lipoxygenase-1. Biochemistry. 2004;**43**:13063-13071

[127] Kellenberger E, Rodrigo J, Muller P, Rognan D. Comparative evaluation of eight docking tools for docking and virtual screening accuracy. Proteins. 2004;**57**:225-242

[128] Perola E, Walters WP, Charifson PS. A detailed comparison of current docking and scoring methods on systems of pharmaceutical relevance. Proteins. 2004;**56**:235-249

[129] Klon AE, Glick M, Thoma M, Acklin P, Davies JW. Finding more needles in the haystack: A simple and efficient method for improving high throughput docking results. Journal of Medicinal Chemistry. 2004;**47**:2743-2749

[130] Bytheway I, Cochran S. Validation of molecular docking calculations involving FGF-1 and FGF-2. Journal of Medicinal Chemistry. 2004;**47**:1683-1693

[131] Kontoyianni M, McClellan LM, Sokol GS. Evaluation of docking performance: Comparative data on docking algorithms. Journal of Medicinal Chemistry. 2004;**47**:558-565

[132] Schulz-Gasch T, Stahl M. Binding site characteristics in structure-based virtual screening: Evaluation of current docking tools. Journal of Molecular Modeling. 2003;**9**:47-57

[133] Wu TYH, Wagner KW, Bursulaya B, Schultz PG, Deveraux QL. Development and characterization of nonpeptidic small molecule inhibitors of the XIAP/caspase-3 interaction. Chemistry and Biology. 2003;**10**:759-767

[134] Kuo GH, Prouty C, DeAngelis A, Shen L, O'Neill DJ, Shah C, et al. Synthesis and discovery of macrocyclic polyoxygenated bis-7-azaindolylmaleimides as a novel series of potent and highly selective glycogen synthase kinase-3β inhibitors. Journal of Medicinal Chemistry. 2003;**46**:4021-4031

[135] Nilsson JW, Kvarnstrom I, Musil D, Nilsson I, Samulesson B. Synthesis and SAR of thrombin inhibitors incorporating a novel 4-aminomorpholinone scaffold: Analysis of x-ray crystal structure of enzyme inhibitor complex. Journal of Medicinal Chemistry. 2003;**46**:3985-4001

[136] Bjerrum EJ, Kristensen AS, Pickering DS, Greenwood JR, Nielsen B, Liljefors T, et al. Design, synthesis, and pharmacology of a highly subtype-selective GluR1/2 agonist, (RS)-2-amino-3-(4-chloro-3-hydroxy-5-isoxazolyl) propionic acid (Cl-HIBO). Journal of Medicinal Chemistry. 2003;**46**:2246-2249

[137] Brehm L, Greenwood JR, Hansen KB, Nielsen B, Egebjerg J, Stensbol TB, et al. (S)-2-amino-3-(3-hydroxy-7,8-dihydro-6H-cyclohepta[d]isoxazol-4-yl)propionic acid, a potent and selective agonist at the GluR5 subtype of ionotropic glutamate receptors. Synthesis, modeling, and molecular pharmacology. Journal of Medicinal Chemistry. 2003;**46**:1350-1358

[138] Thorstensson F, Kvarnstrom I, Musil D, Nilsson I, Samuelsson B. Synthesis of novel thrombin inhibitors. Use of ring-closing metathesis reactions for synthesis of P2 cyclopentene- and cyclohexenedicarboxylic acid derivatives. Journal of Medicinal Chemistry. 2003;**46**:1165-1179

[139] Bunch L, Liljefors T, Greenwood JR, Frydenvang K, Brauner-Osborne H, Krogsgaard-Larsen P, et al. The Journal of Organic Chemistry. 2003;**68**:1489-1495

[140] Raies AB, Bajic VB. In silico toxicology: Computational methods for the prediction of chemical toxicity. Wiley Interdisciplinary Reviews: Computational Molecular Science. 2016;**6**(April):147-172. DOI: 10.1002/wcms.1240

[141] Devillers J. Methods for building QSARs. Methods in Molecular Biology. 2013;**930**:3-27

[142] Parthasarathi R, Dhawan A. In silico approaches for predictive toxicology. In: In Vitro Toxicology. Academic Press. 2018:91-109. DOI: 10.1016/B978-0-12-804667-8.00005-5

[143] Sachdev K, Gupta MK. A comprehensive review of computational techniques for the prediction of drug side effects. Drug Development Research. 2020;**81**(6):650-670. DOI: 10.1002/ddr.21669

[144] Proschak E, Stark H, Merk D. Polypharmacology by design: A medicinal chemist's perspective on multitargeting compounds [review-article]. Journal of Medicinal Chemistry. 2019;**62**(2):420-444. DOI: 10.1021/acs.jmedchem.8b00760

[145] Lavecchia A, Cerchia C. In silico methods to address polypharmacology: Current status, applications and future perspectives. Drug Discovery Today. 2016;**21**(2):288-298. DOI: 10.1016/j.drudis.2015.12.007

[146] Achenbach J, Tiikkainen P, Franke L, Proschak E. Computational tools for polypharmacology and repurposing. Future Medicinal Chemistry. 2011;**3**(8):961-968. DOI: 10.4155/fmc.11.62

[147] Chaudhari R, Tan Z, Huang B, Zhang S. Computational polypharmacology: A new paradigm for drug discovery. Expert Opinion on Drug Discovery. 2017;**12**(3):279-291. DOI: 10.1080/17460441.2017.1280024

[148] Rastelli G, Pinzi L. Computational polypharmacology comes of age. Frontiers in Pharmacology. 2015;**6**(Jul): 1-4. DOI: 10.3389/fphar.2015.00157

[149] Anighoro A, Bajorath J, Rastelli G. Polypharmacology: Challenges and opportunities in drug discovery. Journal of Medicinal Chemistry. 2014;**57**(19):7874-7887

[150] Capdeville R, Buchdunger E, Zimmermann J, Matter A. Glivec (STI571, imatinib), a rationally developed, targeted anticancer drug. Nature Reviews Drug Discovery. 2002;**1**(7):493-502. DOI: 10.1038/nrd839

[151] Chaudhari R, Fong LW, Tan Z, Huang B, Zhang S. An up-to-date overview of computational polypharmacology in modern drug discovery. Expert Opinion on Drug Discovery. 2020;**15**(9):1025-1044. DOI: 10.1080/17460441.2020.1767063

[152] Medina-Franco JL, Martinez-Mayorga K, Fernández-de Gortari E, Kirchmair J, Bajorath J. Rationality over fashion and hype in drug design. F1000Research. 2021;**10**(397):397

[153] McInnes G, Sharo AG, Koleske ML, Brown JE, Norstad M, Adhikari AN, et al. Opportunities and challenges for the computational interpretation of rare variation in clinically important genes. The American Journal of Human Genetics. 2021;**108**:535-548

[154] Gautam P, Pal MK, Chaudhry V. In silico drug repurposing for MDR bacteria: Opportunities and challenges. In: In Silico Drug Design. Academic Press. 2019. pp. 781-799

[155] Marshall BM, Levy SB. Food animals and antimicrobials: Impacts on human health. Clinical Microbiology Reviews. 2011;**24**(4):718-733

[156] Makhouri FR, Ghasemi JB. In silico studies in drug research against neurodegenerative diseases. Current Neuropharmacology. 2018;**16**:664-725

[157] Baig MH, Ahmad K, Rabbani G, Danishuddin M, Choi I. Computer aided drug design and its application to the development of potential drugs for neurodegenerative disorders. Current Neuropharmacology. 2018;**16**(6):740-748

[158] Verma S, Pathak RK. Discovery and optimization of lead molecules in drug designing. Bioinformatics Methods and Applications. Academic Press. 2022:253-267

[159] Wlodawer A, Vondrasek J. Inhibitors of HIV-1 protease: A major success of structure-assisted drug design. Annual Review of Biophysics and Biomolecular Structure. 1998;**27**:249-284

[160] Anderson AC. The process of structure-based drug design. Chemistry & Biology. 2003;**10**:787-797

[161] Douglas B, Kitchen DB, Decornez HY, Furr JR, Bajorath J. Docking and scoring in virtual screening for drug discovery: Methods and applications. Nature Reviews Drug Discovery. 2004;**3**:935-949

[162] Chen D, Martin ZS, Soto C, Schein CH. Computational selection of inhibitors of Abeta aggregation and neuronal toxicity. Bioorganic & Medicinal Chemistry. 2009;**17**(14):5189-5197

[163] Cheatham TE, Young MA. Molecular dynamics simulation of nucleic acids: Successes, limitations, and promise. Biopolymers. 2001;**56**(4):232-256

[164] De Paulis T. Drug evaluation: Prx-00023, a selective 5-ht1a receptor agonist for depression. Current Opinion in Investigational Drugs. 2007;**8**:78-86

#### **Chapter 4**

## Molecular Docking in the Study of Ligand-Protein Recognition: An Overview

*Iqbal Azad*

#### **Abstract**

Molecular docking is a bioinformatics-based theoretical simulation strategy. It is employed to study ligand-protein interaction profiles and to predict binding conformations and affinities with computational tools. Computational tools have been used in the drug discovery process since the 1980s. Because of limited computational capabilities, the molecular modeling approaches initially available took a rigid view of the ligand-protein interaction. Advances in hardware have since made it possible to simulate the dynamic character of ligand-protein interactions over time. This chapter outlines the progression of structure-based drug discovery methodologies for investigating ligand-protein interaction profiles, from static to improved molecular docking strategies.

**Keywords:** Molecular docking, AutoDock, Vina, AutoDockFR, iGEMDOCK, Drug discovery process, Virtual screening

#### **1. Introduction**

Docking tools have simplified the study of interactions between drug molecules and receptor proteins, DNA, or other biological molecules [1]. These interactions are predominantly non-covalent, although some ligands bind their targets covalently. Furthermore, critical molecular mechanisms, ligand binding modes, and factors influencing the ligand-protein interaction profile can be estimated from docking results [2, 3]. Docking suites can be used to calculate the binding energy associated with the most stable conformation of a drug-receptor complex (**Figure 1**) [4, 5].

#### **2. Types of docking**

**Figure 1.** *General modes of molecular docking simulation.*

In 1982, Kuntz et al. developed the first molecular docking algorithm, which was based on estimating the binding energy released [6, 7]. Docking evaluations are performed to determine the interaction profile between the ligand and the target and to search for the most suitable conformation of the ligand in the complex. Empirical scoring functions, which transform the binding energy into a docking score, are also used [8]. Numerous freely available tools can display 3D ligand-target interaction profiles, such as BIOVIA Discovery Studio Visualizer (DSV), PyMOL, Chimera, RasMol, and Swiss-PdbViewer. Docking is broadly classified into three classes, discussed below:

#### **2.1 Flexible docking**

In flexible docking, both the protein side chains and the ligand are kept flexible. Its general principle is the induced-fit hypothesis proposed by Daniel Koshland in 1958 [9], and it is therefore also known as "induced-fit docking." The binding energies of various conformations of the proposed ligand are calculated at the protein or receptor pocket [10, 11]. Furthermore, the target chain should be flexible enough to accommodate the conformational changes of the receptor and ligand. Because many alternative conformations of the ligand can be predicted, flexible docking is the most widely accepted and accurate technique, but it is also time-consuming and computationally costly [12].

#### **2.2 Semi-flexible docking**

In this approach, the ligand is the only flexible element, while the protein is rigid [13]. In addition to the six translational and rotational degrees of freedom, the conformational degrees of freedom of the ligand are sampled [14]. These approaches assume that a single fixed conformation of the protein is capable of recognizing the ligands to be docked. This assumption does not always hold [15].
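The pose parameterization described above can be sketched in a few lines of Python. This is an illustration only: the function names (`rotate_about_axis`, `place_ligand`, `n_degrees_of_freedom`) and the axis-angle encoding of the rotation are assumptions made here and do not come from any particular docking program. The sketch applies the six rigid-body degrees of freedom to ligand coordinates and counts the search dimensions once the ligand's rotatable bonds are included.

```python
import math


def rotate_about_axis(point, axis, angle):
    """Rotate a 3D point about an axis through the origin (Rodrigues' formula)."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    ux, uy, uz = ax / norm, ay / norm, az / norm
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    dot = ux * x + uy * y + uz * z
    # Cross product (unit axis) x (point)
    cx, cy, cz = uy * z - uz * y, uz * x - ux * z, ux * y - uy * x
    return (
        x * c + cx * s + ux * dot * (1.0 - c),
        y * c + cy * s + uy * dot * (1.0 - c),
        z * c + cz * s + uz * dot * (1.0 - c),
    )


def place_ligand(coords, translation, axis, angle):
    """Apply three rotational and three translational degrees of freedom.

    Assumption: the rotation is about the origin; a real program would
    typically rotate about the ligand's center.
    """
    tx, ty, tz = translation
    placed = []
    for point in coords:
        rx, ry, rz = rotate_about_axis(point, axis, angle)
        placed.append((rx + tx, ry + ty, rz + tz))
    return placed


def n_degrees_of_freedom(n_rotatable_bonds):
    """Search dimensions in semi-flexible docking: 6 rigid-body + ligand torsions."""
    return 6 + n_rotatable_bonds
```

Under this counting, a ligand with four rotatable bonds gives a ten-dimensional search space, while the rigid protein contributes no additional dimensions.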

#### **2.3 Rigid docking**

In rigid docking, the geometry of both the target and the ligand is frozen during the docking analysis [16]. This type of docking is based on the 'Lock and Key' hypothesis proposed by Emil Fischer in 1894 [17], and it is therefore also called lock-and-key docking. The approach has several problems. Analyzing ligand-target docking is important for observing drug-target interactions, but difficulties arise when the ligand is docked into the pocket of a receptor protein: because both structures are rigid, observing the interactions becomes challenging, and the most suitable conformation of the ligand is not easily obtained [18]. Sometimes the ligand does not enter the pocket at all, leading to weak interactions that do not give satisfactory results. Internal flexibility is necessary for a good docking interaction, and in many cases the structural changes that are essential for binding are neglected in rigid docking. Rigid docking is therefore sufficient only for observing the interaction [11]. Its benefits are its simplicity and short run time.

*Molecular Docking in the Study of Ligand-Protein Recognition: An Overview DOI: http://dx.doi.org/10.5772/intechopen.106583*

#### **3. Docking interactions**

Docking is performed to establish the most suitable interaction profile for a ligand inside the target protein. It is also employed to estimate the energy released during the interaction between the ligand and the protein [19]. Various forces influence docking interactions. The total energy released during these interactions is calculated with an empirical formula and reported as the total binding energy [11, 18]. Based on the forces involved, docking interactions are categorized as electrodynamic forces (such as van der Waals forces), electrostatic forces (charge-charge, dipole-dipole, and charge-dipole), steric forces (which arise between molecules in close proximity and influence reactivity), solvent-related forces (which arise from interactions between the solvent and the protein or ligand), and forces associated with conformational modifications in the ligand [20].

#### **4. Types of energies**

The primary objective of docking analysis is to obtain the conformation of the drug in the drug-receptor complex that has the lowest binding free energy [21]. Molecular docking tools use scoring functions to estimate the binding energies of drug-receptor interactions [11]. The resulting binding energy (ΔG bind) is calculated as a combination of different energy terms, such as hydrogen-bond, torsional free, electrostatic, desolvation of the unbound system, total internal, dispersion, and repulsion energies. The dissociation constant (K*d*) is used to express the binding energy in terms of the Gibbs free energy (ΔG) [22]. The prediction of drug-receptor binding depends on factors such as intermolecular interactions, desolvation, and entropic effects. As more physicochemical parameters are taken into account, the accuracy of the scoring function increases [23].
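The link between the dissociation constant and the Gibbs free energy is the standard thermodynamic relation ΔG° = RT ln(K*d*/c°), with a standard-state concentration c° of 1 M. The short Python sketch below converts between the two; the function names and the default temperature of 298.15 K are illustrative assumptions, not taken from any docking package.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from Kd in mol/L (1 M standard state)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def kd_from_delta_g(delta_g_kcal, temperature_k=298.15):
    """Inverse relation: Kd in mol/L from a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))
```

A nanomolar binder (K*d* = 1e-9 M) corresponds to roughly -12 kcal/mol at room temperature, and each tenfold tightening of K*d* lowers ΔG by about 1.4 kcal/mol.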

An example of a scoring function is as follows.

A generic empirical scoring function of a docking program:

$$\text{Fitness} = \text{vdW} + \text{H bond} + \text{Elec.}$$

Binding energy:

$$
\Delta G_{\text{bind}} = \Delta G_{\text{vdw}} + \Delta G_{\text{hbond}} + \Delta G_{\text{elect}} + \Delta G_{\text{conform}} + \Delta G_{\text{tor}} + \Delta G_{\text{sol}} \tag{1}
$$
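As a minimal illustration of how the additive form of Eq. (1) is used to compare poses, the Python sketch below sums the six contributions and ranks poses by the result. The term names mirror the subscripts of Eq. (1); the pose names and energy values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
TERMS = ("vdw", "hbond", "elect", "conform", "tor", "sol")


def binding_energy(terms):
    """Sum the six contributions of Eq. (1); missing terms count as zero."""
    return sum(terms.get(name, 0.0) for name in TERMS)


def rank_poses(poses):
    """Return pose names sorted from most to least favorable (lowest energy first)."""
    return sorted(poses, key=lambda name: binding_energy(poses[name]))


# Hypothetical energy terms (kcal/mol) for two poses of the same ligand
poses = {
    "pose_a": {"vdw": -6.0, "hbond": -2.0, "elect": -1.0,
               "conform": 0.5, "tor": 1.0, "sol": 1.5},
    "pose_b": {"vdw": -4.0, "hbond": -1.0, "elect": -0.5,
               "conform": 0.25, "tor": 1.0, "sol": 0.75},
}
```

Here `rank_poses(poses)` places `pose_a` first, because its favorable van der Waals and hydrogen-bond terms outweigh its larger desolvation penalty.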

#### **5. Docking algorithms**

Docking algorithms add a new dimension to evaluating the interaction profile of the ligand-receptor complex [24]. They sample the possible conformations of the ligand under investigation as it interacts with the receptor and deliver the most suitable conformational pose, i.e., the one with the minimum binding energy [24, 25]. The most common algorithms applied in the various docking evaluations (flexible, semi-flexible, and rigid docking) are the following (**Figure 2**):

#### **5.1 Flexible docking with single protein conformation**

#### *5.1.1 Side-chain flexibility docking*

The side-chain flexibility docking approach introduces alternative conformations for the protein side chains [26], usually by drawing on rotamer libraries. Some docking programs, such as GOLD, use their own search engine to sample a limited number of degrees of freedom. Because only side-chain flexibility is considered, these approaches ignore large conformational fluctuations of the protein [27].

#### **5.2 Soft docking**

In 1991, Jiang and Kim first described the soft docking strategy, which accounts for protein flexibility implicitly [28]. The van der Waals (vdW) repulsion term in force-field scoring functions is softened, which tolerates small steric clashes and allows for more compact ligand-protein packing. In this way, an induced fit is mimicked. As a drawback, this method can only simulate small protein motions, and it can lead to erroneous poses [24].
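One simple way to picture a softened repulsion is to shrink the contact distance of a 12-6 Lennard-Jones term, so that a ligand atom can approach a protein atom more closely before it is penalized. The Python sketch below is an assumption made for illustration: the radius-scaling scheme, the function names, and the default scale of 0.8 are not taken from the original soft docking method, which may implement the softening differently.

```python
def lennard_jones(r, r_min, epsilon):
    """12-6 Lennard-Jones energy with its minimum of -epsilon at r = r_min."""
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 * ratio6 - 2.0 * ratio6)


def soft_lennard_jones(r, r_min, epsilon, radius_scale=0.8):
    """Softened variant: scale down the contact distance so that close
    approaches (small steric clashes) are penalized less."""
    return lennard_jones(r, r_min * radius_scale, epsilon)
```

With `r_min = 4.0` Å and `epsilon = 0.2` kcal/mol, a contact at 3.2 Å is repulsive under the standard term but sits at the energy minimum of the softened one, which is how a slightly clashing pose can still score well.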

#### **5.3 Flexible docking with multiple protein conformations**

**Figure 2.** *Various docking algorithms.*

For the same target, multiple experimental structures may be available [29]. Furthermore, computational approaches such as Monte Carlo or Molecular Dynamics simulations can be used to obtain an ensemble of protein conformations [30]. The goal behind multiple protein conformation docking is to consider all of the potential configurations by employing various strategies:

#### *5.3.1 Individual conformations*

The target structures are treated as individual conformations to which the ligand could bind. Therefore, several docking runs are undertaken, assessing the ligands on all of the target conformations [31]. Furthermore, to filter the structures, a preliminary benchmark can be performed to evaluate the performance of the distinct target structures in a docking investigation [32, 33].
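The bookkeeping behind this strategy can be sketched in a few lines. The example below assumes that every ligand has already been docked to every target conformation and that lower scores are better; the ligand names, conformation labels, and scores are hypothetical.

```python
def best_conformation_per_ligand(scores):
    """For each ligand, return the target conformation with the lowest docking score."""
    best = {}
    for ligand, per_conformation in scores.items():
        conformation = min(per_conformation, key=per_conformation.get)
        best[ligand] = (conformation, per_conformation[conformation])
    return best

# Hypothetical docking scores in kcal/mol (ligand -> target conformation -> score).
scores = {
    "ligand_1": {"conf_A": -7.2, "conf_B": -8.1},
    "ligand_2": {"conf_A": -6.5, "conf_B": -5.9},
}
best = best_conformation_per_ligand(scores)
for ligand, (conformation, score) in best.items():
    print(ligand, conformation, score)
```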

#### *5.3.2 United description of the protein*

Instead of collapsing the structures into an average grid, they are used to build the best-performing "chimaera" proteins [34]. FlexE, for example, selects structurally conserved regions from the ensemble's structures to build a rigid core. This core is joined to the ensemble's flexible portions in a combinatorial manner, resulting in a pool of "chimaeras" that can be docked [35].

#### *5.3.3 Average grid*

The ensemble's structures are combined to form a single average grid [36].
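As a minimal illustration, the grids computed for each member of the ensemble can be averaged point by point. The sketch assumes that all grids share the same shape and spacing and uses an unweighted mean; published average-grid schemes may weight the structures differently.

```python
def average_grid(grids):
    """Average several equally shaped 3D grids (nested lists) point by point."""
    if not grids:
        raise ValueError("at least one grid is required")
    n = len(grids)
    nx, ny, nz = len(grids[0]), len(grids[0][0]), len(grids[0][0][0])
    return [[[sum(g[i][j][k] for g in grids) / n
              for k in range(nz)]
             for j in range(ny)]
            for i in range(nx)]

# Two hypothetical 1 x 2 x 2 energy grids from two receptor conformations.
grid_a = [[[0.0, -1.0], [2.0, 0.5]]]
grid_b = [[[1.0, -3.0], [0.0, 0.5]]]
mean_grid = average_grid([grid_a, grid_b])
print(mean_grid)
```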

#### **5.4 Semi-flexible docking algorithm with simulation approaches**

A well-known method of this class is molecular dynamics, which describes a system's temporal evolution [37]. A more detailed explanation is provided in the molecular dynamics unit [38]. Energy minimization strategies are also included in this category, but they are rarely utilized as standalone search engines [39]. Energy minimization is a local optimization approach for bringing a system to a nearby minimum of its potential energy [40].

#### **5.5 Semi-flexible docking algorithm with stochastic methods**

In this approach, the values of the degrees of freedom of a system are changed randomly rather than systematically [41]. The speed of these procedures is beneficial, as they can potentially locate the best solution very quickly. The main disadvantage is that they do not guarantee a comprehensive exploration of the conformational space, so the actual solution may be overlooked. Increasing the number of iterations partially compensates for the lack of convergence. The following are the most well-known stochastic algorithms [42]:

#### *5.5.1 Swarm optimization (SO) methods*

Swarm optimization approaches are based on the behavior of swarms [43]. The sampling of a ligand's degrees of freedom is guided by the knowledge gained from previously sampled good poses. PLANTS uses an Ant Colony Optimization (ACO) algorithm, which simulates the behavior of ants that use pheromones to find the shortest path to a food source [44]. Each degree of freedom is coupled with a pheromone in this system. Virtual ants choose conformations based on pheromone values, and successful ants contribute to pheromone deposition.
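The pheromone mechanism can be sketched on a toy problem. The example below is a simplified ant colony scheme and not the PLANTS implementation: the torsion angles are discretized in 30° steps, the scoring function `toy_score` is hypothetical, and the colony size, evaporation rate, and deposition rule are arbitrary choices made for illustration.

```python
import math
import random

ANGLES = list(range(0, 360, 30))  # discretized torsion values (illustrative choice)

def toy_score(torsions):
    """Hypothetical energy with its minimum (0) when every torsion is 60 degrees."""
    return sum(1.0 - math.cos(math.radians(t - 60.0)) for t in torsions)

def aco_search(score, n_torsions=3, n_ants=20, iterations=40, evaporation=0.1, seed=0):
    rng = random.Random(seed)
    # One pheromone value per (degree of freedom, candidate angle).
    pheromone = [[1.0] * len(ANGLES) for _ in range(n_torsions)]
    best, best_e = None, float("inf")
    for _ in range(iterations):
        ants = []
        for _ in range(n_ants):
            # Each virtual ant picks every torsion with probability proportional to pheromone.
            pose = [rng.choices(ANGLES, weights=pheromone[d])[0] for d in range(n_torsions)]
            ants.append((score(pose), pose))
        ants.sort(key=lambda pair: pair[0])
        if ants[0][0] < best_e:
            best_e, best = ants[0][0], list(ants[0][1])
        for d in range(n_torsions):  # evaporation of old pheromone
            pheromone[d] = [(1.0 - evaporation) * p for p in pheromone[d]]
        for energy, pose in ants[: n_ants // 4]:  # only the most successful ants deposit
            for d, angle in enumerate(pose):
                pheromone[d][ANGLES.index(angle)] += 1.0 / (1.0 + energy)
    return best, best_e

best_pose, best_energy = aco_search(toy_score)
print(best_pose, round(best_energy, 3))
```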

#### *5.5.2 Evolutionary algorithms (EA)*

The most prominent evolutionary algorithms are genetic algorithms (GAs), which are based on the idea of biological evolution [45]. The concepts of genes, chromosomes, mutations, and crossover are all taken from biology. The degrees of freedom are encoded as genes, and a ligand conformation is defined by a chromosome, which is assigned a fitness score [46]. Within a population of chromosomes, mutations and crossovers occur, and the chromosomes with greater fitness survive and replace the ones with lower fitness. rDock, PSI-DOCK, AutoDock, and GOLD are the most well-known examples [46–50].
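A minimal genetic algorithm along these lines is sketched below. It is a toy example rather than the implementation of any of the programs named above: each chromosome holds three torsion angles as genes, the fitness function `toy_fitness` is hypothetical, and the population size, mutation rate, and selection rule are arbitrary. At least two genes are assumed so that a one-point crossover is possible.

```python
import math
import random

def toy_fitness(chromosome):
    """Hypothetical fitness (higher is better); the maximum, 0, is reached at 60 degrees."""
    return -sum(1.0 - math.cos(math.radians(g - 60.0)) for g in chromosome)

def genetic_search(fitness, n_genes=3, pop_size=30, generations=60,
                   mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(0.0, 360.0) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]  # fitter half survives
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_genes)  # one-point crossover
            child = mother[:cut] + father[cut:]
            for i in range(n_genes):  # random mutation of individual genes
                if rng.random() < mutation_rate:
                    child[i] = (child[i] + rng.gauss(0.0, 20.0)) % 360.0
            children.append(child)
        population = survivors + children  # children replace the less fit half
    return max(population, key=fitness)

best_chromosome = genetic_search(toy_fitness)
print([round(g, 1) for g in best_chromosome], round(toy_fitness(best_chromosome), 3))
```

Because the fitter half of the population is carried over unchanged, the best fitness never decreases from one generation to the next.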

#### *5.5.3 Tabu search methods*

Tabu search strategies are used to avoid exploring zones of the conformational/positional space that have already been explored. At each cycle, random alterations are made to the ligand's degrees of freedom. The previously sampled conformations are recorded, and a new pose is accepted only if it is distinct from any previously investigated pose. This category includes programs like PRO_LEADS and PSI-DOCK [47, 51–54].
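The role of the tabu list can be shown with a small sketch. The example is a simplified scheme, not the algorithm of PRO_LEADS or PSI-DOCK: the scoring function `toy_score` is hypothetical, and the tabu list length, the 5° similarity tolerance, and the number of moves per cycle are assumptions chosen for illustration.

```python
import math
import random

def toy_score(torsions):
    """Hypothetical energy with its minimum (0) when every torsion is 60 degrees."""
    return sum(1.0 - math.cos(math.radians(t - 60.0)) for t in torsions)

def is_tabu(pose, tabu_list, tolerance):
    """A pose is tabu when all of its torsions lie within `tolerance` of a stored pose."""
    for old in tabu_list:
        if all(abs((a - b + 180.0) % 360.0 - 180.0) <= tolerance
               for a, b in zip(pose, old)):
            return True
    return False

def tabu_search(score, start, cycles=300, moves_per_cycle=20,
                tabu_size=25, tolerance=5.0, seed=0):
    rng = random.Random(seed)
    current = list(start)
    best, best_e = list(current), score(current)
    tabu_list = [list(current)]
    for _ in range(cycles):
        candidates = []
        for _ in range(moves_per_cycle):  # random alterations of the degrees of freedom
            trial = [(t + rng.uniform(-30.0, 30.0)) % 360.0 for t in current]
            if not is_tabu(trial, tabu_list, tolerance):
                candidates.append(trial)
        if not candidates:
            continue
        current = min(candidates, key=score)  # best non-tabu move, even if uphill
        tabu_list = (tabu_list + [list(current)])[-tabu_size:]
        energy = score(current)
        if energy < best_e:
            best, best_e = list(current), energy
    return best, best_e

start_pose = [180.0, 300.0, 0.0]
best_pose, best_energy = tabu_search(toy_score, start_pose)
print([round(t, 1) for t in best_pose], round(best_energy, 3))
```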

#### *5.5.4 Monte Carlo (MC) methods*

The Metropolis Monte Carlo algorithm, which represents a recognized milestone in the development of docking searches, is the basis for Monte Carlo approaches [55]. Each iteration of the algorithm involves a random adjustment of the ligand's degrees of freedom. The Metropolis algorithm, in its basic form or in one of several variants, is implemented in docking software such as AutoDock Vina, MCDOCK, QXP, ICM, and AutoDock [30, 42, 44, 56].
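In its basic form, the Metropolis criterion accepts every random move that lowers the energy and accepts an uphill move of size ΔE with probability exp(−ΔE/kT). The sketch below applies this rule to a toy problem; the scoring function `toy_score`, the step size, and the value of kT are hypothetical, and the docking programs listed above use their own, more elaborate variants.

```python
import math
import random

def toy_score(torsions):
    """Hypothetical energy with its minimum (0) when every torsion is 60 degrees."""
    return sum(1.0 - math.cos(math.radians(t - 60.0)) for t in torsions)

def metropolis_search(score, start, steps=5000, max_step=30.0, kT=0.6, seed=0):
    rng = random.Random(seed)
    current = list(start)
    current_e = score(current)
    best, best_e = list(current), current_e
    for _ in range(steps):
        trial = list(current)
        i = rng.randrange(len(trial))  # perturb one randomly chosen degree of freedom
        trial[i] = (trial[i] + rng.uniform(-max_step, max_step)) % 360.0
        trial_e = score(trial)
        delta = trial_e - current_e
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e

start_pose = [180.0, 300.0, 0.0]
best_pose, best_energy = metropolis_search(toy_score, start_pose)
print([round(t, 1) for t in best_pose], round(best_energy, 3))
```

The occasional acceptance of uphill moves is what allows the search to escape local minima, at the price of not guaranteeing convergence.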

#### **5.6 Semi-flexible docking algorithm with efficient exploration techniques**

In an efficient (systematic) exploration, a set of discrete values is associated with each degree of freedom, and all the values of each coordinate are examined in a combinatorial manner [56]. These approaches are classified into the following categories:

#### *5.6.1 Conformational ensemble*

Rigid docking approaches can easily be supplemented with a certain amount of flexibility. An ensemble of previously generated ligand conformers is docked rigidly to the target, so that ligand flexibility is represented by the set of conformers; an example is MS-DOCK [57].

#### *5.6.2 Fragmentation*

DesJarlais et al. described an approach based on fragmentation of the ligand in 1986. The first application of ligand flexibility in docking was the rigid docking of fragments into the binding site and the subsequent linking of the fragments [58]. In this manner, partial flexibility is achieved at the junctions between the fragments. Other approaches, known as incremental construction, initially dock one fragment and then add the rest, one by one. FlexX [59] and Hammerhead [60] are two programs that use fragmentation [61].

#### *5.6.3 Exhaustive search*

Exhaustive search is a systematic exploration method in the strict sense, as it examines all of the ligand's rotatable bonds systematically. To limit the search space and avoid a combinatorial explosion, several constraints and termination criteria are usually defined. The docking pipeline of the Glide software [62, 63] includes an exhaustive search stage.
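The combinatorial growth of an exhaustive search is easy to see in a short sketch. The example enumerates every combination of torsion values on a 30° grid and refuses to run when the number of poses exceeds a limit; the scoring function `toy_score`, the grid step, and the limit are hypothetical and do not reflect Glide's actual pipeline.

```python
import itertools
import math

def toy_score(torsions):
    """Hypothetical energy with its minimum (0) when every torsion is 60 degrees."""
    return sum(1.0 - math.cos(math.radians(t - 60.0)) for t in torsions)

def exhaustive_search(score, n_torsions, step_deg=30, max_evaluations=100_000):
    angles = range(0, 360, step_deg)
    n_poses = len(angles) ** n_torsions
    if n_poses > max_evaluations:  # guard against a combinatorial explosion
        raise ValueError(f"{n_poses} poses exceed the limit of {max_evaluations}")
    # Examine every combination of values for every rotatable bond.
    return min(itertools.product(angles, repeat=n_torsions), key=score)

best_pose = exhaustive_search(toy_score, n_torsions=3)  # 12**3 = 1728 poses
print(best_pose)
```

With 12 values per torsion, three rotatable bonds give 1728 poses, whereas six already give almost three million, which is why practical implementations restrict the search.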

#### **6. Some common docking software**

#### **6.1 AutoDock**

AutoDock is an open-source, automated docking package introduced by the Molecular Graphics Lab, Scripps Research Institute, La Jolla, CA 92037, USA. It is effectively applied to predict the binding of ligands (small molecules) to biological macromolecules such as proteins and enzymes [25]. The AutoDock suite reports the minimum binding energy of the interaction between the ligand and the receptor protein, calculated from its scoring function. The search is performed with the Lamarckian genetic algorithm (LGA), and the AutoDock scoring function is based on the AMBER force field and calibrated through linear regression analysis [64]. It supports docking of ligands with roughly zero to ten flexible bonds. The default settings of AutoDock are effective and are commonly applied to explore the interaction profile of a drug candidate. Furthermore, it is extensively used for virtual screening. Each AutoDock run takes a considerable time and provides the frequently docked conformations of the ligand with respect to the receptor protein [65]. Applications include drug-receptor docking, protein-protein docking, molecule optimization, structure-based drug design, and validation of the mechanism of action of drug molecules.

#### **6.2 Handling tips of AutoDock**

AutoDock tools offer multiple approaches for docking simulation, ranging from simple docking to advanced docking procedures [66]. A successful AutoDock run requires four files: the ligand coordinates, the target coordinates, the grid parameters, and the docking parameters [67, 68]. These files are prepared with the help of AutoDock Tools (ADT)/MGL Tools, and the preparatory procedures are as follows:

#### **6.3 Preparation of ligand coordinate file**

AutoDock accepts PDB or mol2 files as input. For a novel compound, the three-dimensional (3D) structure must first be prepared. The two-dimensional (2D) structure of the proposed compound can be drawn with ChemDraw or ChemDoodle (https://web.chemdoodle.com/demos/sketcher/) and saved as a SMILES file. The SMILES string is pasted into the online CORINA Classic service (https://www.mn-am.com/online\_demos/corina\_demo) to prepare mol or .pdb files, but the result needs further structural optimization through a suitable method such as the Merck Molecular Force Field (MMFF). Alternatively, for simple preparation of optimized 3D structures, the online Molsoft tool (https://www.molsoft.com/2dto3d.html) is recommended. It can prepare 2D as well as 3D structures in a single place, and it automatically optimizes the structure through MMFF during the conversion from 2D to 3D. The most accurate optimized structures can be obtained by density functional theory (DFT), but MMFF is still useful for organic molecules. If the proposed compound has a known structure, its 3D structure can be obtained from PubChem (https://pubchem.ncbi.nlm.nih.gov), ChemSpider (http://www.chemspider.com/), etc. The coordinate file of the proposed compound must include hydrogen atoms in the 3D structure [69]. The 3D structure of the proposed compound is opened and selected as the ligand in ADT, and the 'Edit' button is clicked to add polar hydrogens, assign Gasteiger charges, set the number of torsions, and detect the root. At this point, the ligand is visible on the screen, with aromatic carbons shown in green and the other fragments in red. Now click 'OK' and save it as a ligand pdbqt file.

#### **6.4 Preparation of target coordinate file**

ADT also requires the coordinates of the biological macromolecule, such as a protein or enzyme, to be prepared. The PDB file of the receptor can be downloaded from the Protein Data Bank (www.pdb.org), the Cambridge Crystallographic Database (ccdc.cam.ac.uk), etc. The 3D coordinates taken from the PDB first require the removal of water, ligands, cofactors, ions, etc. To generate the target coordinate file, hydrogen atoms need to be added: click on 'Edit' to add polar hydrogens, assign Kollman charges, and merge nonpolar hydrogens, and then save the macromolecule as the target pdbqt file.

#### **6.5 Preparation of grid parameter file**

ADT needs the pdbqt files to prepare the grid parameter file (gpf). To set the grid in a new window, click on Grid > Macromolecule > Open and open the target pdbqt file as the macromolecule. Similarly, click on Grid > Set map types > Open and open the ligand pdbqt file of the proposed small molecule or ligand, and then set the grid map, grid size, and grid center in the x, y, and z directions by clicking on Grid > Grid box. After that, the output can be saved as a gpf file.

#### **6.6 Preparation of docking parameter file**

For the preparation of the docking parameter file (dpf), click on Docking > Macromolecule > Set rigid filename > Open in the ADT window to open the target pdbqt file. Similarly, the ligand pdbqt file can be opened by clicking on Docking > Ligand > Open. Then, set the algorithm by clicking on Docking > Search Parameters > Genetic algorithm and set the docking parameters. Finally, click on Docking > Output > Lamarckian GA and save it as a dpf file. ADT is then ready to run. The run takes some time and needs the grid parameter file as well as the docking parameter file.

*Molecular Docking in the Study of Ligand-Protein Recognition: An Overview DOI: http://dx.doi.org/10.5772/intechopen.106583*

#### **6.7 Analysis of docking result**

ADT can also be used to evaluate the docking interactions and binding energies of a minimum of ten conformations, along with the estimated inhibition constant (K*i*). To view the results, select Analyze > Docking > Open and open the dlg file. A pop-up will open; click "OK" and then click Analyze > Conformation > Play > & > Show info.

The AutoDock scoring function can be calculated based on the following formula:

$$
\text{Free binding energy} = \text{Final intermolecular energy} + \text{Final total internal energy} + \text{Torsional free energy} - \text{Unbound system's energy}
$$

where the final intermolecular energy is the sum of the van der Waals, hydrogen bond, electrostatic, and desolvation energies.
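The arithmetic of the formula above can be checked with a short sketch. The function names are illustrative and the energy values are invented for the example; they are not taken from a real docking log.

```python
def intermolecular_energy(vdw, hbond, electrostatic, desolvation):
    """Final intermolecular energy as the sum of its four contributions (kcal/mol)."""
    return vdw + hbond + electrostatic + desolvation

def free_binding_energy(intermolecular, total_internal, torsional, unbound):
    """Intermolecular + total internal + torsional free energy - unbound system's energy."""
    return intermolecular + total_internal + torsional - unbound

# Hypothetical energy terms in kcal/mol.
inter = intermolecular_energy(vdw=-6.0, hbond=-1.5, electrostatic=-0.5, desolvation=0.8)
dg = free_binding_energy(inter, total_internal=-0.9, torsional=1.2, unbound=-0.9)
print(f"intermolecular = {inter:.2f}, free binding energy = {dg:.2f} kcal/mol")
```

In this example the internal energy and the unbound system's energy cancel, so the result is the intermolecular energy plus the torsional penalty.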

#### **7. AutoDock Vina**

AutoDock Vina was developed by Oleg Trott in the Molecular Graphics Lab at the Scripps Research Institute in 2010 [70]. It is a relatively new, freely available tool for molecular docking, drug discovery, and virtual screening. It offers high performance, multi-core capability, greater accuracy, and a simple handling protocol. Vina calculates the grid maps and clusters the results by itself. Vina considerably enhances the accuracy of the binding mode predictions compared with AutoDock. Vina has been found to predict more accurate results as compared to other tools [71].

#### **7.1 Handling tips of AutoDock Vina**

The input and output files of Vina are in pdbqt format. It is essential to prepare the ligand as well as the target coordinate file in pdbqt format. Both coordinate files are prepared in the same way as for AutoDock. Vina does not require a grid parameter file or a docking parameter file [72]. Instead, it requires a text configuration file. The complete handling of AutoDock Vina is discussed below.

#### **7.2 Preparation of configuration file**

A new window of ADT is opened after the preparation of the ligand and target coordinate files. Click on Grid > Macromolecule > Open and open the target pdbqt file. Click "YES" to save the present modifications in the folder, and then press "OK" to accept them. Sometimes a warning window also opens if there are minor irregularities in the charges. Ignore it by pressing "OK."

Then, click Grid > Set map types > Open and open the ligand pdbqt file. The grid map, grid size, and grid center of the search space are then defined in a new window that is opened by selecting Grid > Grid box. To center the box on the ligand, click Center > Center on ligand. Here, thumbwheels are available for manual changes to the values of the grid size and center, along with other options. After adjusting the grid's size and center, press File > Close saving current. To complete the setup, select Docking > Output > Vina Configuration and click "SAVE" to write a text configuration file with the default name of config.txt.
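For readers who prefer to write the configuration file by hand, the sketch below derives a box from the ligand coordinates and formats it with the option names that Vina reads from a configuration file (`receptor`, `ligand`, `center_x`, `size_x`, `exhaustiveness`, and so on). Centering on the midpoint of the ligand's bounding box and padding by 5 Å are assumptions made for this example and may differ from what ADT does internally; the coordinates and file names are hypothetical.

```python
def grid_box_from_ligand(coords, padding=5.0):
    """Center the box on the ligand and enclose all atoms plus `padding` angstroms."""
    xs, ys, zs = zip(*coords)
    center = tuple((min(axis) + max(axis)) / 2.0 for axis in (xs, ys, zs))
    size = tuple((max(axis) - min(axis)) + 2.0 * padding for axis in (xs, ys, zs))
    return center, size

def vina_config_text(receptor, ligand, center, size, exhaustiveness=8):
    """Return the text of a Vina configuration file."""
    keys = ["center_x", "center_y", "center_z", "size_x", "size_y", "size_z"]
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    lines += [f"{key} = {value:.3f}" for key, value in zip(keys, center + size)]
    lines.append(f"exhaustiveness = {exhaustiveness}")
    return "\n".join(lines) + "\n"

ligand_atoms = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]  # hypothetical coordinates in angstroms
center, size = grid_box_from_ligand(ligand_atoms)
config_text = vina_config_text("target.pdbqt", "ligand.pdbqt", center, size)
print(config_text)
```

The returned text can be written to config.txt with an ordinary `open("config.txt", "w")` call.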

#### **7.3 Run AutoDock Vina**

The default settings of AutoDock Vina may not be sufficient to evaluate the interaction profile and binding energies accurately. Vina offers a parameter known as exhaustiveness to adjust the computational effort spent during a docking analysis. In Vina, the default value of exhaustiveness is 8. For greater accuracy, the exhaustiveness can be raised to about 24, which provides more accurate docking results. The most well-known way to run Vina is via ADT, which offers a "run" option for AutoDock Vina. A window opens in which the path to the Vina executable file is set with the browse option; then press the launch button to run Vina. The second way is through the command line. Open a terminal window and change to the directory that contains the coordinate files as well as the configuration file. The exhaustiveness value is set on the command line (e.g., ./vina --config config.txt --exhaustiveness 24). This command assumes that the AutoDock Vina executable, vina, is located in the same directory.
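The command-line route can also be scripted. The sketch below only assembles the argument list with Vina's `--config` and `--exhaustiveness` options; the helper names are illustrative, and `run_vina` is not called here because it requires the Vina executable and the input files to be present in the working directory.

```python
import subprocess

def vina_command(executable="./vina", config="config.txt", exhaustiveness=24):
    """Build the argument list for a Vina run with a raised exhaustiveness."""
    return [executable, "--config", config, "--exhaustiveness", str(exhaustiveness)]

def run_vina(**kwargs):
    """Run Vina and raise an error if it exits with a non-zero status."""
    return subprocess.run(vina_command(**kwargs), check=True)

command = vina_command()
print(" ".join(command))
```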

#### **7.4 Analysis of Vina docking result**

ADT can also visualize the docking results from AutoDock Vina. Open a new ADT window and select the working directory. Select Analyze > Docking > Open AutoDock Vina result and choose the output file obtained from step II. Then select the default option, a single molecule with multiple conformations, and press "OK" to step through the coordinates of all docked results with the arrow keys. To visualize the target coordinate file, select Analyze > Macromolecule > Open and open the target pdbqt file. Similarly, open the ligand coordinate file by clicking on File > Read molecule > Open and open the ligand pdbqt file to read the crystallographic location of the ligand. This makes it possible to compare the crystallographic ligand with the docked conformations. Select Analyze > Docking > Show interactions to examine the interaction profile of the ligand-target complex.
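The binding affinities can also be read directly from the Vina output file, in which each docked model carries a `REMARK VINA RESULT:` line listing the affinity (kcal/mol) followed by the lower and upper RMSD bounds. The sketch below parses such lines; the sample text is invented for illustration and contains no atom records.

```python
def parse_vina_results(pdbqt_text):
    """Collect affinity and RMSD bounds from the REMARK VINA RESULT lines."""
    modes = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            affinity, rmsd_lb, rmsd_ub = (float(v) for v in line.split()[3:6])
            modes.append({"mode": len(modes) + 1, "affinity": affinity,
                          "rmsd_lb": rmsd_lb, "rmsd_ub": rmsd_ub})
    return modes

# Hypothetical excerpt of a Vina output pdbqt file.
sample_output = """MODEL 1
REMARK VINA RESULT:      -8.3      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -7.9      1.512      2.204
ENDMDL
"""
modes = parse_vina_results(sample_output)
best_mode = min(modes, key=lambda m: m["affinity"])
print(best_mode)
```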

The estimated scoring function of AutoDock Vina is based on the following formula:

$$
\Delta \text{G (binding)} = \Delta \text{G (vdW)} + \Delta \text{G (H bond)} + \Delta \text{G (Elec.)} + \Delta \text{G (E desolv.)}\tag{2}
$$

where ΔG denotes Gibbs' free energy, ΔG (vdW) denotes the van der Waals free energy, ΔG (H bond) denotes the hydrogen bond free energy, ΔG (Elec.) stands for the electrostatic free energy, and ΔG (E desolv.) stands for the desolvation free energy. Torsional free energy is denoted by the symbol ΔG (tors).

#### **8. AutoDock FR**

AutoDock FR (ADFR: AutoDock for Flexible Receptors) was developed by Dr. Pradeep Anand Ravindranath in the Integrative Structural and Computational Biology Lab at the Scripps Research Institute in 2015. ADFR is a newly designed docking tool built on the AutoDock scoring function. It was deliberately designed to study the interaction of small flexible ligands with the target protein [69, 73]. It allows the side-chains of target proteins to be treated flexibly in order to simulate induced fit without prior knowledge of the side-chain conformational changes [73]. ADFR handles targets with up to 14 flexible side-chains. The reported improvement in the docking success rate is more than 50%. Cross-docking has been investigated with up to 12 flexible receptor side-chains. ADFR displays superior results as compared to AutoDock Vina. The run time of Vina grows rapidly as the number of flexible receptor side-chains increases, whereas the run time of ADFR grows linearly [73].

#### **8.1 Handling tips of AutoDockFR**

The input format of ADFR is pdbqt. ADFR requires the preparation of coordinate files of the ligand and the target, which are prepared with the help of ADT. Docking through ADFR also requires the generation of affinity maps and translational points, which mark probable ligand binding areas. The step-by-step handling protocol of ADFR is discussed below.

#### **8.2 Prepare affinity maps and translational points**

Open a new ADFR window, select receptor PDBQT > Open, and upload the target coordinate file in pdbqt format to run the docking analysis. Similarly, the ligand pdbqt file is uploaded by selecting Open under ligand PDBQT. Then press the "box entire ligand" button to surround the ligand with a docking box (grid box), followed by clicking on "center view on the docking box" to center the docking position. In the docking box, along with the ligand, the amino acid residues can also be labeled by clicking on "show receptor residue labels." ADFR is the only tool that allows up to 14 amino acid residues to be selected at a time with a single click. To select the amino acid residues for the docking investigation, click on flexible residues and select the amino acids from the list. The selected side-chains are presented as orange balls and sticks, and the other portions remain the same. Then click the green checkmark.

For the prediction of binding pockets, click on the 'compute pockets' button. AutoSite recognizes multiple pockets in the docking box and selects those in which the actual ligand occupies the largest volume. The binding pocket fill-points appear as a green mesh and are denoted as translational points. If the binding pocket fill-points button is green, map generation is enabled. To generate the affinity maps, press the Generate maps button and save the maps as a zip file in the working folder.

#### **8.3 Run ADFR**

Open the command window, change to the working directory, and type the following Windows command to run ADFR: "c:\Program Files\MGL Tools 2-latest\adfr.bat" random pdbqt -m generate.zip -r ligand pdbqt -job Name Result –seed 1. To visualize the docking result, a visualization tool such as BIOVIA Discovery Studio Visualizer (DSV) is used to generate the interaction profile of the ligand-target complex.

#### **9. iGEMDOCK**

The iGEMDOCK tool was developed by the Institute of Bioinformatics at National Chiao Tung University, Taiwan, for docking, drug design, screening, and post-screening analysis. It is an automatic multipurpose graphical package [74]. For docking evaluation with iGEMDOCK, first prepare the coordinate files of the ligand as well as the target. The coordinate files are prepared in the same way as for AutoDock by adding torsions, bond orders, hydrogen atoms, and charges. These parameters are assigned to both the ligand and the target. The input and output files of iGEMDOCK are in PDB and mol formats. iGEMDOCK automatically selects the most suitable conformation of the ligand and gives the total binding energy [74]. The iGEMDOCK scores are calculated using an empirical fitness score:

$$\text{Fitness} = \text{vdW} + \text{H bond} + \text{Elec.}$$

where vdW is the van der Waals energy, H bond is the hydrogen bond energy, and Elec. is the electrostatic energy.

During the docking evaluation, the estimation of target binding sites and structure optimization are very significant. The hydrogen bonds found in the docked complex strongly impact the scoring function. This possibility reduces the number of suspected H bonds significantly. Additionally, internal H bonds, as well as internal electrostatic interaction, are predicted as sp2-sp2 torsions from the interaction complex. iGEMDOCK, which is based on the generic evolutionary method (GA), provides three effective docking protocols, *viz.,* standard docking, stable docking, and accurate docking. Accurate docking is a very slow protocol and offers a maximum of 80 generations, a population size of 800, 8000 interactions, and 10 solutions, along with a threshold energy of 100. At every step, torsions, translations, and rotations are evaluated. For a better result, the hydrophobic and electrostatic preferences are set to 1.00. iGEMDOCK automatically selects the lowest-energy conformation. When iGEMDOCK calculates an unfavorable electrostatic interaction, a positive energy value is obtained. To rectify this problem, check the docked pose and restart; or, if the docked pose is close to the listed ligands, define the RMSD threshold and add an energy penalty (e.g., an energy penalty of 100, an RMSD threshold of 2.00, and atom ID (fast) RMSD calculations). In the scoring function, the docking tool reuses and emphasizes the results of its previous search and explores their variations. Ligand-target docking then proceeds, and the results are obtained in the form of binding affinities (kcal/mol) and docking run time. The minimum binding energy conformation is automatically selected as the best result. Overall, iGEMDOCK is simple to use and performs well compared with other docking tools.

#### **9.1 Handling tips of iGEMDOCK**

iGEMDOCK is a complete package for automated docking and screening. It combines two main parts: the first part predicts the interaction profile of the ligand-target complex in the 3D structure, while the second part predicts the suitable pose of the ligand-target complex and performs the post-analysis. The docking evaluation with iGEMDOCK begins with the preparation of the ligand and target protein coordinate files. Both coordinate files are prepared as for AutoDock. The iGEMDOCK input and output file formats are mol, mol2, and PDB.

#### **9.2 Target binding site preparation**

#### **Table 1.**

*The server/software of the molecular docking analysis.*

In iGEMDOCK, a distinct binding site of the target protein/enzyme or the complete target structure is selected. If the target's input file contains a natural physiological ligand, iGEMDOCK will automatically determine the target's binding site. To begin docking, upload the target's coordinate file (PDB) by clicking "Prepare binding site > Browse > Open" in the "Protein-ligand docking/screening" window. To select the binding site of the target, click on "By bounded ligand" and then define the binding site center by selecting the available ligand that you want to study. The binding site radius can also be set; by default, its value is 8.0 Å. Uncheck the "Retain reference ligand" box, and then click "OK" to save the defined parameters for the chosen binding site. This will delete the physiological ligand. Select "by a current file" to specify the binding sites of a new target protein.
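The idea of defining a binding site from a bound ligand and a radius can be sketched as follows. The example keeps every residue that has at least one atom within the radius of any ligand atom; whether iGEMDOCK measures the distance from each ligand atom or from the ligand center is not stated here, so this rule is an assumption, and the residue names and coordinates are hypothetical.

```python
import math

def binding_site_residues(protein_atoms, ligand_atoms, radius=8.0):
    """Return residues with at least one atom within `radius` angstroms of the ligand.

    protein_atoms: list of (residue_id, (x, y, z)); ligand_atoms: list of (x, y, z).
    """
    site = set()
    for residue_id, xyz in protein_atoms:
        if any(math.dist(xyz, ligand_xyz) <= radius for ligand_xyz in ligand_atoms):
            site.add(residue_id)
    return sorted(site)

# Hypothetical coordinates in angstroms.
protein_atoms = [
    ("ALA10", (0.0, 0.0, 0.0)),
    ("GLY55", (20.0, 0.0, 0.0)),
    ("SER12", (5.0, 5.0, 0.0)),
]
ligand_atoms = [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
site = binding_site_residues(protein_atoms, ligand_atoms)
print(site)
```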

#### **9.3 Ligand preparation**

iGEMDOCK provides two methods for ligand preparation. For a "single ligand," upload the ligand coordinate file (single/many) directly by clicking "Prepare compounds > Ligands > Open" and pressing "OK" in the "docking/screening" window. iGEMDOCK recommends preparing the ligand coordinate file in mol format. It does not assign charges and hydrogens to all of the ligand's atoms. For a "ligand database", the ligand library is also prepared in mol format. To upload the list of compounds, click "Prepare compounds," then "import list," "Open," and "OK."

#### **9.4 Run iGEMDOCK**

Set the output path before the start of docking evaluation. Set the output path by clicking on the "Set output path". Then choose the desired file and press "OK."

*Set the GA Parameters:* iGEMDOCK uses the generic evolutionary method (GA) for docking. It automatically calculates the ligand conformation as well as its orientation relative to the interaction site of the target. The following default GA parameters are generally recommended: population size: 200; generations: 70; and number of solutions: 3.

*Advanced Options:* iGEMDOCK offers advanced options for adjusting the scoring function, saving/loading configurations, and generating docking poses. It also allows the internal energy of the ligand to be included in the docking prediction and certain molecular filters to be added. It automatically produces a configuration file with the name config.dock in the directory "/bin/". Set up all the parameters in the configuration file and run it in command mode.

*Start Docking:* After setting the coordinate files of the ligand and target along with the output path and docking parameters, press "start docking" and observe the status of the job on the screen. After the completion of docking, a default alert opens. To close it, click "OK," then press "View docked poses > post-analyze" to visualize the docking poses and the complete binding energy of the docked complex. The docking poses and the binding energies are saved in "best\_pose" and "fitness.txt" at the output path, respectively (**Table 1**).

#### **10. Use of molecular docking**

In the last decade, technologies like high-throughput sequencing and X-ray crystallography have been steadily improved, and the crystal structures of large numbers of proteins have been determined. Consequently, knowledge of the structural and functional significance of biological macromolecules (like proteins and enzymes) has expanded, and many novel drug targets have been identified [75]. Due to the revolution of computational science in various fields of research, the utilization of virtual screening and molecular docking in DDD has been significantly stimulated. The development of a novel drug is time-consuming, costly, and labor-intensive [76]. Currently, computer-aided technology has become a key tool in DDD. Molecular docking simulation makes the analysis of the mutual interaction of drug and receptor straightforward and accurate, and it accelerates the drug development procedure by reducing the time required [77].

Reverse molecular docking is a relatively new and innovative application of molecular docking. It takes a small molecule as the query structure, docks it against a database of 3D target structures, and evaluates the candidate macromolecules by the geometric and energetic fit of the resulting complexes. That is to say, it identifies the most suitable target with minimum binding energy. For that reason, the development of reverse molecular docking provides a new route to discover the suitable target of a drug compound and to reveal the mechanism of drug action [78].
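The final ranking step of reverse docking can be sketched in a few lines, assuming that one ligand has already been docked against every entry of a target database. The target names and binding energies below are hypothetical.

```python
def rank_targets(binding_energies, top_n=3):
    """Rank candidate targets by binding energy (most negative first)."""
    ranked = sorted(binding_energies.items(), key=lambda item: item[1])
    return ranked[:top_n]

# Hypothetical binding energies (kcal/mol) of one ligand against several targets.
binding_energies = {
    "target_A": -6.1,
    "target_B": -9.4,
    "target_C": -7.8,
    "target_D": -5.2,
}
top_targets = rank_targets(binding_energies, top_n=2)
print(top_targets)
```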

#### **11. Conclusion**

The findings of this chapter demonstrate that docking programs are strongly focused on the development of new pharmaceutical compounds using molecular modeling. In this decade, new docking software designs have been emphasized. These trends focus on improving docking accuracy by using more accurate molecular energy calculations without any fitting parameters, such as quantum-chemical methods, implicit solvent models, and new global optimization algorithms that can treat ligand flexibility and protein atom mobility at the same time. Current docking applications are not reliable enough to estimate binding affinity, owing to insufficient molecular structure information and the inadequacies of the scoring algorithms. However, by including a large amount of biological data in the scoring function, the present molecular docking technique can be improved. Finally, it is shown that all of the conditions for improving docking accuracy may be met in practice. Furthermore, some enhanced sampling strategies are no longer an exclusively methodological exercise but have become accessible to a wide range of research organizations, with real-world applications in drug discovery. Molecular docking, technological advancements, and novel MD computational approaches have together made it possible to simulate increasingly large conformational changes. By providing a mechanistic understanding of binding pathways, the ability to reproduce folding and binding processes can be used to address the long-standing debate regarding the "induced-fit" and "conformational selection" binding theories.

#### **Acknowledgements**

The authors gratefully acknowledge the R&D wing of Integral University, Lucknow, for their support and guidance.

#### **Conflict of interest**

The authors declare no conflict of interest.

*Molecular Docking - Recent Advances*

### **Author details**

Iqbal Azad Department of Chemistry, Integral University, Lucknow, UP, India

\*Address all correspondence to: iazad@iul.ac.in

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **References**

[1] Chen G, Seukep AJ, Guo M. Recent advances in molecular docking for the research and discovery of potential marine drugs. Marine Drugs. 2020; **18**(11):545-566

[2] Guedes IA, de Magalhães CS, Dardenne LE. Receptor–ligand molecular docking. Biophysical Reviews. 2014;**6**(1):75

[3] Pantsar T, Poso A. Binding affinity via docking: Fact and fiction. Mol A J Synth Chem Nat Prod Chem. 2018; **23**(8):1899-1909

[4] Mazumder M, Ponnan P, Das U, Gourinath S, Khan HA, Yang J, et al. Investigations on binding pattern of kinase inhibitors with PPAR γ: Molecular docking, molecular dynamic simulations, and free energy calculation studies. PPAR Research. 2017;**2017**: 6397836

[5] Ramírez D, Caballero J. Is it reliable to use common molecular docking methods for comparing the binding affinities of enantiomer pairs for their protein target? International Journal of Molecular Sciences. 2016;**17**(4):525-539

[6] Kuntz ID, Blaney JM, Oatley SJ, Langridge R, Ferrin TE. A geometric approach to macromolecule-ligand interactions. Journal of Molecular Biology. 1982;**161**(2):269-288

[7] Ahmed A, Mam B, Sowdhamini R. DEELIG: A deep learning approach to predict protein-ligand binding affinity. Bioinformatics and Biology Insights. 2021;**15**:1-9

[8] Guedes IA, Pereira FSS, Dardenne LE. Empirical scoring functions for structure-based virtual screening: Applications, critical aspects, and

challenges. Frontiers in Pharmacology. 2018;**9**:1089

[9] Koshland DE. The key–lock theory and the induced fit theory. Angew Chemie Int Ed English. 1995;**33**(23–24): 2375-2378

[10] Lexa KW, Carlson HA. Protein flexibility in docking and surface mapping. Quarterly Reviews of Biophysics. 2012;**45**(3):301-343

[11] Tripathi A, Bankaitis VA. Molecular docking: From lock and key to combination lock. J Mol Med Clin Appl. 2017;**2**(1). DOI: 10.16966/ 2575-0305.106

[12] Andrusier N, Mashiach E, Nussinov R, Wolfson HJ. Principles of flexible protein-protein docking. Proteins: Structure, Function and Genetics. 2008;**73**(2):271-289

[13] Anderson AC, O'Neil RH, Surti TS, Stroud RM. Approaches to solving the rigid receptor problem by identifying a minimal set of flexible residues during ligand docking. Chemistry & Biology. 2001;**8**(5):445-457

[14] Meng X-Y, Zhang H-X, Mezei M, Cui M. Molecular docking: A powerful approach for structure-based drug discovery. Current Computer-Aided Drug Design. 2011;**7**(2):146-157

[15] Du X, Li Y, Xia YL, Ai SM, Liang J, Sang P, et al. Insights into protein–ligand interactions: Mechanisms, models, and methods. International Journal of Molecular Sciences. 2016;**17**(2):144-177

[16] Cramer F. Biochemical correctness: Emil Fischer's lock and key hypothesis, a hundred years after — An essay.

Pharmaceutica Acta Helvetiae. 1995; **69**(4):193-203

[17] Salmaso V, Moro S. Bridging molecular docking to molecular dynamics in exploring ligand-protein recognition process: An overview. Frontiers in Pharmacology. 2018;**9**:923

[18] Meng X-Y, Zhang H-X, Mezei M, Cui M. Molecular docking: A powerful approach for structure-based drug discovery. Curr Comput Aided-Drug Des. 2012;**7**(2):146-157

[19] Pinzi L, Rastelli G. Molecular docking: Shifting paradigms in drug discovery. International Journal of Molecular Sciences. 2019;**20**(18):4331

[20] Hetényi C, Van Der Spoel D. Blind docking of drug-sized compounds to proteins with up to a thousand residues. FEBS Letters. 2006;**580**(5):1447-1450

[21] Ferreira LG, Dos Santos RN, Oliva G, Andricopulo AD. Molecular docking and structure-based drug design strategies. Molecules. 2015;**20**(7):13384

[22] Hall R, Dixon T, Dickson A. On calculating free energy differences using ensembles of transition paths. Frontiers in Molecular Biosciences. 2020;**7**:106

[23] Agrawal P, Singh H, Srivastava HK, Singh S, Kishore G, Raghava GPS. Benchmarking of different molecular docking methods for protein-peptide docking. BMC Bioinformatics. 2019;**19**: 426

[24] Torres PHM, Sodero ACR, Jofily P, Silva-Jr FP. Key topics in molecular docking for drug design. International Journal of Molecular Sciences. 2019; **20**(18):4574

[25] Morris GM, Goodsell DS, Halliday RS, Huey R, Hart WE, Belew RK, et al. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. Journal of Computational Chemistry. 1998;**19**(14):1639-1662

[26] Zavodszky MI, Kuhn LA. Side-chain flexibility in protein–ligand binding: The minimal rotation hypothesis. Protein Science. 2005;**14**(4):1104

[27] Miao Z, Cao Y. Quantifying sidechain conformational variations in protein structure. Sci Reports. 2016;**6**(1): 1-10

[28] Jiang F, Kim SH. "Soft docking": Matching of molecular surface cubes. Journal of Molecular Biology. 1991; **219**(1):79-102

[29] Lepore R, Kryshtafovych A, Alahuhta M, Veraszto HA, Bomble YJ, Bufton JC, et al. Target highlights in CASP13: Experimental target structures through the eyes of their authors. Proteins Struct Funct Bioinforma. 2019; **87**(12):1037-1057

[30] Hospital A, Goñi JR, Orozco M, Gelpí JL. Molecular dynamics simulations: Advances and applications. Adv Appl Bioinform Chem. 2015;**8**(1):37

[31] Huang SY, Zou X. Ensemble docking of multiple protein structures: Considering protein structural variations in molecular docking. Proteins. 2007;**66**(2):399-421

[32] Salmaso N, Stevens HE, McNeill J, ElSayed M, Ren Q, Maragnoli ME, et al. Fibroblast growth factor 2 modulates hypothalamic pituitary Axis activity and anxiety behavior through glucocorticoid receptors. Biological Psychiatry. 2016; **80**(6):479-489

[33] Al-Karmalawy AA, Dahab MA, Metwaly AM, Elhady SS, Elkaeed EB, Eissa IH, et al. Molecular docking and *Molecular Docking in the Study of Ligand-Protein Recognition: An Overview DOI: http://dx.doi.org/10.5772/intechopen.106583*

dynamics simulation revealed the potential inhibitory activity of ACEIs against SARS-CoV-2 targeting the hACE2 receptor. Frontiers in Chemistry. 2021;**9**:661230

[34] Seidel R, Blumer M, Chaumel J, Amini S, Dean MN. Endoskeletal mineralization in chimaera and a comparative guide to tessellated cartilage in chondrichthyan fishes (sharks, rays and chimaera). J R Soc Interface. 2020; **17**:20200474

[35] Gallego-Yerga L, Ochoa R, Lans I, Peña-Varas C, Alegría-Arcos M, Cossio P, et al. Application of ensemble pharmacophore-based virtual screening to the discovery of novel antimitotic tubulin inhibitors. Computational and Structural Biotechnology Journal. 2021; **19**:4360-4372

[36] Knegtel RMA, Kuntz ID, Oshiro CM. Molecular docking to ensembles of protein structures. Journal of Molecular Biology. 1997;**266**(2):424-440

[37] Adcock SA, McCammon JA. Molecular dynamics: Survey of methods for simulating the activity of proteins. Chemical Reviews. 2006;**106**(5):1589

[38] Hollingsworth SA, Dror RO. Molecular dynamics simulation for all. Neuron. 2018;**99**(6):1129

[39] Kitchen DB, Decornez H, Furr JR, Bajorath J. Docking and scoring in virtual screening for drug discovery: Methods and applications. Nature Reviews. Drug Discovery. 2004;**3**(11):935-949

[40] Gautam B. Energy Minimization. In: Maia RT, de Moraes Filho Ra, Campos M. editors. Homology Molecular Modeling - Perspectives and Applications. London: IntechOpen; 2020. DOI: 10.5772/intechopen.94809

[41] Araki M, Matsumoto S, Bekker GJ, et al. Exploring ligand binding pathways on proteins using hypersoundaccelerated molecular dynamics. Nature Communications. 2021;**12**:2793

[42] Huang SY, Zou X. Advances and challenges in protein-ligand docking. International Journal of Molecular Sciences. 2010;**11**(8):3016-3034

[43] Korb O, Stützle T, Exner TE. PLANTS: Application of ant Colony optimization to structure-based drug design. Lecture Notes in Computer Science. 2006;**4150**:247-258

[44] Katoch S, Chauhan SS, Kumar V. A review on genetic algorithm: Past, present, and future. Multimedia Tools and Applications. 2021;**80**(5):8091-8126

[45] Tessaro F, Tessaro F, Scapozza L, Scapozza L. How 'protein-docking' translates into the new emerging field of docking small molecules to nucleic acids? Molecules. 2020;**25**(12):2749

[46] Xu J, Zhang L, Ye Y, Shan Y, Wan C, Wang J, et al. SNX16 regulates the recycling of E-cadherin through a unique mechanism of coordinated membrane and cargo binding. Structure. 2017;**25**(8): 1251-1263.e5

[47] Pei J, Wang Q, Liu Z, Li Q, Yang K, Lai L. PSI-DOCK: Towards highly efficient and accurate flexible ligand docking. Proteins Struct Funct Bioinforma. 2006;**62**(4):934-946

[48] Jones S, Thornton JM. Prediction of protein-protein interaction sites using patch analysis. Journal of Molecular Biology. 1997;**272**(1):133-143

[49] Yan Y, He J, Feng Y, Lin P, Tao H, Huang SY. Challenges and opportunities of automated protein-protein docking: HDOCK server vs human predictions in

CAPRI rounds 38-46. Proteins Struct Funct Bioinforma. 2020;**88**(8): 1055-1069

[50] Ruiz-Carmona S, Alvarez-Garcia D, Foloppe N, Garmendia-Doval AB, Juhos S, Schmidtke P, et al. rDock: A fast, versatile and open source program for docking ligands to proteins and nucleic acids. PLoS Computational Biology. 2014;**10**(4):e1003571

[51] Verdonk ML, Cole JC, Hartshorn MJ, Murray CW, Taylor RD. Improved protein-ligand docking using GOLD. Proteins: Structure, Function, and Genetics. 2003;**52**(4):609-623

[52] Trott O, Olson AJ. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry. 2010;**31**(2):455-461

[53] Kozakov D, Hall DR, Xia B, Porter KA, Padhorny D, Yueh C, et al. The ClusPro web server for proteinprotein docking. Nature Protocols. 2017; **12**(2):255

[54] del Alamo D, Sala D, Mchaourab HS, Meiler J. Sampling alternative conformational states of transporters and receptors with AlphaFold2. eLife (2022);**11**:e75751. DOI.org/10.7554/ eLife.75751

[55] Pagadala NS, Syed K, Tuszynski J. Software for molecular docking: A review. Biophysical Reviews. 2017; **9**(2):91

[56] Brooijmans N, Kuntz ID. Molecular recognition and docking algorithms. Annual Review of Biophysics and Biomolecular Structure. 2003;**32**:335-373

[57] Sauton N, Lagorce D, Villoutreix BO, Miteva MA. MS-DOCK: Accurate

multiple conformation generator and rigid docking protocol for multi-step virtual ligand screening. BMC Bioinformatics. 2008;**9**(1):1-12

[58] DesJarlais RL, Kuntz ID, Sheridan RP, Venkataraghavan R, Dixon JS. Docking flexible ligands to macromolecular receptors by molecular shape. Journal of Medicinal Chemistry. 1986;**29**(11):2149-2153

[59] Rarey M, Kramer B, Lengauer T, Klebe G. A fast flexible docking method using an incremental construction algorithm. Journal of Molecular Biology. 1996;**261**(3):470-489

[60] Lengauer T, Rarey M. Computational methods for biomolecular docking. Current Opinion in Structural Biology. 1996;**6**(3):402-406

[61] Welch WJ, Brown CR. Influence of molecular and chemical chaperones on protein folding. Cell Stress & Chaperones. 1996;**1**(2):109

[62] Halgren TA, Murphy RB, Friesner RA, Beard HS, Frye LL, Pollard WT, Banks JL. Glide: A new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening. Journal of Medicinal Chemistry 2004;47(7):1750–1759.

[63] Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, et al. Glide: A new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. Journal of Medicinal Chemistry. 2004; **47**(7):1739-1749

[64] Morris GM, Ruth H, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, et al. Software news and updates AutoDock4 and AutoDockTools4: Automated docking with selective receptor

*Molecular Docking in the Study of Ligand-Protein Recognition: An Overview DOI: http://dx.doi.org/10.5772/intechopen.106583*

flexibility. Journal of Computational Chemistry. 2009;**30**(16):2785-2791

[65] Dar AM, Mir S. Molecular docking: Approaches, types, applications and basic challenges. J Anal Bioanal Tech. 2017;**8**(2):1-3

[66] Hernández-Santoyo A, Tenorio-Barajas AY, VictorAltuzar V, Vivanco-Cid H, Mendoza-Barrera C. Protein-Protein and Protein-Ligand Docking. In: Ogawa T. editor. Protein Engineering - Technology and Application. London: IntechOpen; 2013. DOI: 10.5772/56376

[67] Cosconati S, Forli S, Perryman AL, Harris R, Goodsell DS, Olson AJ. Virtual screening with AutoDock: Theory and practice. Expert Opinion on Drug Discovery. 2010;**5**(6):597-607

[68] Rizvi SMD, Shakil S, Haneef M. A simple click by click protocol to perform docking: AutoDock 4.2 made easy for non-bioinformaticians. EXCLI Journal. 2013;**12**:831-857

[69] Forli S, Huey R, Pique ME, Sanner MF, Goodsell DS, Olson AJ. Computational protein-ligand docking and virtual drug screening with the AutoDock suite. Nature Protocols. 2016; **11**(5):905-919

[70] Trott O, Olson AJ. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry. 2009;**31**(2):NA-NA

[71] Sandeep G, Nagasree KP, Hanisha M, Kumar MMK. AUDocker LE: A GUI for virtual screening with AUTODOCK Vina. BMC Research Notes. 2011; **4**(1):445

[72] Seeliger D, De Groot BL. Ligand docking and binding site analysis with PyMOL and Autodock/Vina. Journal of Computer-Aided Molecular Design. 2010;**24**(5):417-422

[73] Ravindranath PA, Forli S, Goodsell DS, Olson AJ, Sanner MF. AutoDockFR: Advances in proteinligand docking with explicitly specified binding site flexibility. PLoS Computational Biology. 2015;**11**(12): e1004586

[74] Hsu KC, Chen YF, Lin SR, Yang JM. Igemdock: A graphical environment of enhancing gemdock using pharmacological interactions and postscreening analysis. BMC Bioinformatics. 2011;**12**(Suppl. 1):S33

[75] Jamkhande PG, Ghante MH, Ajgunde BR. Software based approaches for drug designing and development: A systematic review on commonly used software and its applications. Bull Fac Pharmacy, Cairo Univ. 2017;**55**(2): 203-210

[76] Talluri S. Molecular docking and virtual screening based prediction of drugs for COVID-19. Combinatorial Chemistry & High Throughput Screening. 2021;**24**(5):716-728

[77] Oferkin IV, Katkova EV, Sulimov AV, Kutov DC, Sobolev SI, Voevodin VV, et al. Evaluation of docking target functions by the comprehensive investigation of proteinligand energy minima. Advances in Bioinformatics. 2015;**2015**:20151-20112

[78] Malmstrom RD, Watowich SJ. Using free energy of binding calculations to improve the accuracy of virtual screening predictions. Journal of Chemical Information and Modeling. 2011;**51**(7):1648-1655

## Section 3
