**4. Molecular docking**

The many diseases under investigation by scientists all over the world share one ultimate objective: finding effective treatments. In most cases the therapeutic targets are proteins. Once their mechanisms of action are understood, that is, how the proteins work and what goes wrong in the diseased state, the next step is to challenge their function by designing inhibitors. This falls within the domain of drug discovery, and drug design and development is one of its most challenging fields. The complete clinical trials take about 10–15 years and cost billions of dollars before a single drug reaches the market. The completion of the Human Genome Project led to the identification of an ever-increasing number of new drug targets (mainly proteins), which strengthened the efforts to find treatments. In addition, the availability of 3D structures of proteins and protein-ligand complexes made research in this area feasible. However, experimentally screening millions of compounds and their conformers against a single therapeutic target requires an enormous amount of time and resources. Computational techniques can shorten the pre-clinical period and save valuable resources: *in-silico* approaches can significantly reduce the time needed for hit identification and improve the chances of finding the desired drug molecules.

To facilitate drug design and discovery, several modeling techniques are available. They fall into two main categories: structure-based and ligand-based drug design. The structure-based approach relies mainly on 3D data for the target and the ligand. The ligand-based approach is chiefly adopted when no experimental structure of the target is known. In this approach, ligands known to bind the target are investigated to decipher their physicochemical and structural properties, which are then correlated with the anticipated pharmacological activity of the ligands in hand [60].

One of the most widely used computational techniques in structure-based drug design is molecular docking. Docking usually proceeds in two steps: first predicting the orientation, or pose, of a ligand within the active site of a target, and then assessing the binding affinity of that pose with a scoring function. The technique is used to decipher target-ligand interactions at the atomic level, which allows us to describe the behavior of ligands within the active sites of targets and to reveal fundamental biochemical processes. Since the first docking algorithms were developed in the 1980s, molecular docking has become an indispensable tool in drug discovery [61].
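The two-step procedure described above (pose prediction followed by scoring) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the method of any particular docking program: the toy coordinates and partial charges, the helper names `rotate_z`, `place`, `score` and `dock`, the restriction to rotation about a single axis, the random sampling strategy, and the simple pairwise Lennard-Jones plus Coulomb energy with arbitrary parameters.

```python
import math
import random

# Toy data (hypothetical): atoms as (x, y, z, partial_charge), distances in angstroms.
RECEPTOR = [(0.0, 0.0, 0.0, -0.5), (3.0, 0.0, 0.0, 0.3), (0.0, 3.0, 0.0, 0.2)]
LIGAND = [(0.0, 0.0, 0.0, 0.4), (1.5, 0.0, 0.0, -0.4)]


def rotate_z(point, angle):
    """Rotate an (x, y, z) point about the z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def place(ligand, shift, angle):
    """Apply a rigid-body move (rotation about z, then translation) to the ligand."""
    posed = []
    for x, y, z, q in ligand:
        rx, ry, rz = rotate_z((x, y, z), angle)
        posed.append((rx + shift[0], ry + shift[1], rz + shift[2], q))
    return posed


def score(receptor, ligand, epsilon=0.2, sigma=3.0, coulomb_k=332.0):
    """Toy pairwise energy: 12-6 Lennard-Jones plus Coulomb; lower is better.

    epsilon, sigma and coulomb_k are illustrative values, not fitted parameters.
    """
    total = 0.0
    for rx, ry, rz, rq in receptor:
        for lx, ly, lz, lq in ligand:
            # Clamp the distance so overlapping atoms do not divide by zero.
            r = max(math.dist((rx, ry, rz), (lx, ly, lz)), 0.5)
            sr6 = (sigma / r) ** 6
            total += 4.0 * epsilon * (sr6 * sr6 - sr6)  # van der Waals term
            total += coulomb_k * rq * lq / r            # electrostatic term
    return total


def dock(receptor, ligand, n_poses=500, box=6.0, seed=0):
    """Step 1: sample random poses. Step 2: score each pose. Keep the best."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_poses):
        shift = tuple(rng.uniform(-box, box) for _ in range(3))
        angle = rng.uniform(0.0, 2.0 * math.pi)
        pose = place(ligand, shift, angle)
        energy = score(receptor, pose)
        if best is None or energy < best[0]:
            best = (energy, pose)
    return best


best_energy, best_pose = dock(RECEPTOR, LIGAND)
print(f"best toy score: {best_energy:.2f}")
```

Real docking programs differ mainly in how they replace the two placeholders used here: the random pose generator is replaced by a dedicated sampling algorithm, and the toy energy by a calibrated scoring function.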

### **4.1 Types of molecular docking**

Molecular docking can be broadly divided into three types: rigid docking, semi-flexible docking and flexible docking. In rigid docking, the structures of both the target and the ligand remain unchanged. The computation is relatively modest and chiefly evaluates the degree of conformational matching, so it is better suited to macromolecular systems such as protein-nucleic acid and protein-protein complexes. Semi-flexible (quasi-flexible) docking takes flexibility into account while docking the ligand and is therefore more appropriate for the intermolecular interactions of small molecules with proteins. Usually the ligand can change conformation freely while the target remains rigid or retains only a few rotatable residues, which keeps the docking computationally efficient. Flexible docking is based on the idea that a protein is not always a rigid entity during ligand binding, and it therefore treats both the protein and the ligand as flexible. Over the years various such methods have been introduced, based on the induced-fit model and/or conformational sampling.
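One way to see why the three types differ in computational cost is to count the degrees of freedom the search has to explore. The sketch below is only a rough illustration under stated assumptions: the function name `search_dimensions` is hypothetical, each flexible unit is counted as one rotatable torsion, and semi-flexible docking is simplified to a fully rigid target even though some programs keep a few receptor residues rotatable.

```python
def search_dimensions(docking_type, ligand_torsions=0, receptor_torsions=0):
    """Count the degrees of freedom a docking search must explore."""
    rigid_body = 6  # three translations + three rotations of the ligand
    if docking_type == "rigid":
        return rigid_body
    if docking_type == "semi-flexible":
        # Assumption: ligand torsions are free, target is held rigid.
        return rigid_body + ligand_torsions
    if docking_type == "flexible":
        return rigid_body + ligand_torsions + receptor_torsions
    raise ValueError(f"unknown docking type: {docking_type!r}")


# Hypothetical ligand with 8 rotatable bonds and 20 flexible receptor torsions.
for kind in ("rigid", "semi-flexible", "flexible"):
    print(kind, search_dimensions(kind, ligand_torsions=8, receptor_torsions=20))
```

Because the number of poses to sample grows rapidly with each added dimension, keeping the target rigid is the usual compromise between realism and efficiency.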

#### *4.1.1 Scoring function*

One crucial element of any docking algorithm is the scoring function. It guides pose selection: it distinguishes putative correct binding modes and filters out non-binders among the *N* poses generated during a docking run. The speed and accuracy of a docking program also depend on its scoring function, so computational efficiency and reliability must both be kept in mind when developing one. There are three categories of scoring functions:

#### i. Force-field based scoring function

This scoring function is based on molecular mechanics, which estimates the potential energy of a system as a combination of intramolecular and intermolecular terms. In molecular docking, the intermolecular terms are usually considered, possibly together with the ligand bonded terms, especially the torsional components. The non-bonded components include the van der Waals term, which is defined by the Lennard-Jones potential, and the electrostatic term, specified by the Coulomb function. GoldScore [62], AutoDock [63] and GBVI/WSA [60] are a few examples of this type of scoring function.

#### ii. Empirical scoring function

An empirical function is the sum of different empirical terms such as van der Waals, hydrogen-bond, electrostatic, entropy, desolvation and hydrophobicity terms. Using least-squares fitting, these terms are optimized on a training set of target-ligand complexes to reproduce binding affinity data. Owing to their simple energy terms, empirical scoring functions are computationally much more efficient than force-field based ones. The first example of an empirical scoring function is the LUDI scoring function [64]; GlideScore [65] and ChemScore [66] are other examples.

#### iii. Knowledge-based scoring function

Knowledge-based functions are derived directly from the structural information in experimentally solved protein-ligand complexes. The frequencies of interatomic contacts and/or distances between the target and the ligand are collected, on the assumption that more favorable interactions occur more frequently. Pairwise atom-type potentials are then generated from the resulting frequency distributions. The score is computed by rewarding preferred interactions and penalizing repulsive contacts between each pair of target and ligand atoms within a set cutoff. Examples of this type of scoring function are DrugScore [67] and GOLD/ASP functions [68].

With advances in high-performance computing, scientists have also applied artificial intelligence based and machine learning based scoring functions in virtual screening, with promising outcomes [69].

#### *4.1.2 Sampling algorithms*

Sampling plays the next crucial role in any molecular docking program. For a given therapeutic target, the sampling algorithm generates a number of conformations (poses) of the small molecule within the binding site of the target. The binding site is either known from experimental data or predicted with active-site prediction software. Because the speed and accuracy of molecular docking matter in large virtual screening studies, developing new sampling algorithms and improving existing ones has provided ample opportunities for computational scientists. Sampling algorithms can be categorized as shape matching, systematic search and stochastic algorithms.

#### i. Shape matching

Shape matching was one of the earliest sampling methods to be designed. Its criterion is that the molecular surface of the small molecule must complement the molecular surface of the binding region of the target. The six degrees of freedom of the small molecule (three translational and three rotational) give rise to many possible orientations. The goal of the algorithm is therefore to place the small molecule into the binding site as smoothly and quickly as possible on the basis of shape complementarity. In this method, the conformation of the small molecule is usually fixed and therefore, this method along with

*Role of Force Fields in Protein Function Prediction DOI: http://dx.doi.org/10.5772/intechopen.93901*


*Homology Molecular Modeling - Perspectives and Applications*

