**3.1. Thermodynamics of host-guest binding**

The determination of binding affinities for host-guest complexes arising from the non-covalent association of two molecules is of paramount importance in different fields, including drug discovery. The host-guest binding affinity is related to the difference in Gibbs free energies of binding (∆Gbind) corresponding to the associated and dissociated states of the system under study (see **Figure 2**).

*∆Gbind* is composed of enthalpic and entropic terms. While the enthalpic contribution reflects changes in the inner energy of the systems, i.e. energies related to atomic motions and interactions, entropic contributions are also related with conformational changes upon binding [16].

The common methods for binding affinity estimation include very fast but less accurate similarity-based regression models and scoring functions and accurate but computationally demanding physical methods, which require a large number of MD (or Monte Carlo) simulations [57]. This suggests that the most convenient methods and algorithms are those that offer a good balance between computational costs and accuracy. Most accurate strategies provide estimates for thermodynamic quantities, with intrinsic difficulties associated to fundamental aspects. The quantification of atomic or ionic interactions, associated to repulsive and attractive forces, requires physical models assigned to different force fields (FF). The latter can be detailed using quantum mechanical (QM) *ab-initio* methods. However, a quantum mechanical representation of solvated biological systems, such as membranes and proteins consisting of large amounts of atoms is useless, even with approximations, such as the density functional theory (DFT) [58, 59]. Relevant considerations have been made [24] on the accuracy of FF for the calculation of binding thermodynamics. Some FF combinations may provide the most accurate binding enthalpies but the least accurate binding free energies, with implications in the development of new FF. These have been evaluated for a wide range of host-guest complexes and water models, using different partial charge assignment methods and host FF parameters, and resorting to the attach-pull-release (APR) method [24], which allows computing the absolute binding free energies, from a series of umbrella sampling simulations. In the

**Figure 2.** Schematic representations of a reversible non-covalent association/dissociation process of a guest molecule (green) and a host (orange) molecule, possessing a certain binding energy difference (∆Gbind) between the associated and dissociated states.

latter, restraints controlling the host-guest complex are activated cumulatively for separating host and guest molecules, and then released to leave the guest at standard concentration.

intermediate states. The respective equations have also been used as the core of adaptive umbrella sampling approaches [42, 43], in which the efficiency of free-energy calculations are

The determination of binding affinities for host-guest complexes arising from the non-covalent association of two molecules is of paramount importance in different fields, including drug discovery. The host-guest binding affinity is related to the difference in Gibbs free energies of binding (∆Gbind) corresponding to the associated and dissociated states of the system

*∆Gbind* is composed of enthalpic and entropic terms. While the enthalpic contribution reflects changes in the inner energy of the systems, i.e. energies related to atomic motions and interactions, entropic contributions are also related with conformational changes upon binding [16]. The common methods for binding affinity estimation include very fast but less accurate similarity-based regression models and scoring functions and accurate but computationally demanding physical methods, which require a large number of MD (or Monte Carlo) simulations [57]. This suggests that the most convenient methods and algorithms are those that offer a good balance between computational costs and accuracy. Most accurate strategies provide estimates for thermodynamic quantities, with intrinsic difficulties associated to fundamental aspects. The quantification of atomic or ionic interactions, associated to repulsive and attractive forces, requires physical models assigned to different force fields (FF). The latter can be detailed using quantum mechanical (QM) *ab-initio* methods. However, a quantum mechanical representation of solvated biological systems, such as membranes and proteins consisting of large amounts of atoms is useless, even with approximations, such as the density functional theory (DFT) [58, 59]. Relevant considerations have been made [24] on the accuracy of FF for the calculation of binding thermodynamics. Some FF combinations may provide the most accurate binding enthalpies but the least accurate binding free energies, with implications in the development of new FF. These have been evaluated for a wide range of host-guest complexes and water models, using different partial charge assignment methods and host FF parameters, and resorting to the attach-pull-release (APR) method [24], which allows computing the absolute binding free energies, from a series of umbrella sampling simulations. In the

**Figure 2.** Schematic representations of a reversible non-covalent association/dissociation process of a guest molecule (green) and a host (orange) molecule, possessing a certain binding energy difference (∆Gbind) between the associated and

improved through refinement of the biasing potentials as the simulation progressed.

**3.1. Thermodynamics of host-guest binding**

under study (see **Figure 2**).

14 Molecular Dynamics

dissociated states.

In fact, the selection of FF and setup parameters (e.g. protonation states and slow motions) can be a difficult task, as new variants primarily concerned to the treatment of proteins, are constantly being produced. There are standard options such as OPLS [60], AMBER [61], CHARMM [62], and GROMOS [63] and other more specific FF, such as the Kirkwood-Buff [64] FF, directed at aromatic amino acids, the CHARMM polarizable FF, based on the classical Drude oscillator [65, 66], the induced dipole [67], for modeling polarization in proteins and protein-ligand complexes, the Jorgensen's approach [68] for parameterization by atoms-in-molecule electron density partitioning, AMOEBA [69] and GEM\* [70], among many others [71–73]. Additionally, several water models such as TIP3P [74], TIP4Pew [75], and SPC/E [76], and the recently introduced OPC [77], TIP3P-FB and TIP4P-FB [78], TIP4P-D [79], and iAMOEBA [80] are also available. It is worth mentioning that the assessement of the ability of these FF to reproduce experimental data are very sparse, and based on the free-energy of hydration of small molecules [81], and on the structure of nucleic acids [82] and proteins [83]. With regard to explicit solvent methods, only a few studies have focused on thermodynamic data from non-covalent binding for validating the FF. Recently [84, 85], the binding free energies of cyclodextrin hostguest complexes were also used, but in the context of implicit solvent models.

It is worth to note that there are other issues in this type of systems that transcend the accuracy of the selected FF, and affect the simulation results. These are related to restrictions for the sampling and estimation of reliable thermodynamic quantities, including the respective binding constants. The latter can be determined from experimental techniques such as <sup>1</sup> H NMR, isothermal titration calorimetry and solubility experiments. The calculations of binding constants that provide a convenient counterpart to the experimental observations in inclusion complexes is sometimes addressed using a "flexible molecule" approximation [5], instead of the conventional rigid rotor harmonic oscillator [86–88]. The available volume for a guest molecule is described in terms of a cylindrical, PMF weighted region, in which the respective motion is measured by the positioning of the center of mass. The "flexible molecule" approximation is not as widespread as might be expected, and relatively recent publications [86, 89] both divulge the approximation and highlight controversies on the topic. In simple terms, the binding constant, *Kbind* can be estimated by integrating the PMF values *(∆GPMF(ξ)* along the association/dissociation coordinate, *ξ*, from

$$K\_{\rm bind} = \pi N\_A \int r \, \{\xi\}^2 \, e^{\left(\frac{-4C\_{\rm rel}(\xi)}{RT}\right)} d\xi \tag{3}$$

where *NA* and *R* corresponds to the Avogadro and the ideal gas constants, respectively and r is the average radius of the cross section of the cylinder in each sampling window, which is also a function of *ξ*.

The key factors affecting the thermodynamic signatures in host-guest systems can also be assessed along with an accurate description of the NCI including their spatial features. This can be carried out resorting to a recently developed Independent Gradient Method (IGM) of Lefebvre and co-workers [90] (see Section 4 for details), which allows the visualization of regions of low charge density corresponding to stabilizing/destabilizing NCI, based on the analysis of the electronic charge density of the interacting molecules and the respective gradients. Although the original NCI method of Johnson and co-workers [91] provide a very similar qualitative analysis of NCI, the reduced density gradient, s, used to identify the interaction types is a dimensionless quantity and therefore difficults the evaluation of the respective strengths. The new IGM allows for a quantitative comparison of the strength of NCI through the calculation of the IGM descriptor, δ<sup>g</sup> , which corresponds directly to the charge density gradient(s) in real-space.
