**3. Using potentials of mean force to address host-guest recognition and association**

Several research groups have opened the way for future progress through innovative applications of free-energy methods to physical and organic chemistry, as well as structural biology. An exhaustive account of the wide range of works published in the early years of free-energy calculations falls beyond the scope of this section. The reader is referred to Refs. [16, 52] for a full description of these efforts.

A complete thermodynamic characterization of the binding process implies the knowledge of the enthalpy and entropy of association. This is one of the key elements in identifying the stabilizing factors and in understanding how is the guest and host assemble. Standard molecular simulation methods have reproduced the absolute thermodynamic properties of binding (standard free-energy, enthalpy and entropy) between host-guest systems [53]. The absolute binding free-energy can be expressed as the sum of separate free-energy contributions corresponding to a step-by-step process describing the association process between the host and guest [5]. This can be accomplished by calculating the PMF profile along a specific reaction coordinate characterizing the association process (this reaction coordinate can be the distance between the center of mass of the host and that of the guest).

Different methodologies have been successfully applied for the calculation of the PMF profile. Among them, free-energy perturbation (FEP) [47, 50, 54, 55], thermodynamic integration (TI) [45, 47], umbrella sampling, with the weighted histogram analysis method (WHAM) [5, 56] and force constraint methods [16] are the most commonly used in the literature for chemically and biologically relevant systems. PMF calculations play a considerable role in understanding how chemical species recognize and associate. One might argue that they actually provide a more satisfactory picture of these phenomena than alchemical free-energy calculations, because they follow the molecules of interest from their free, unbound state, to the bound state of the complex. This point of view is obviously misleading, as the model reaction coordinate followed to describe association/dissociation bears certain arbitrariness. Furthermore, PMF have helped to decode complex recognition and association phenomena in thermodynamic signatures with a detailed atomic description of the underlying processes. A detailed description of the theoretical basis of PMF calculation is given in ref. [16]

The PMF, W(ξ) may be interpreted as the potential that is produced by the average forces over all the configurations of a given system, in which a set of molecules is kept fixed (at a certain value of a reaction coordinate, *ξ*). This quantity was devised by Kirkwood [19, 32] and can be defined based on the average distribution function, *ρ(ξ*), along the coordinate *ξ*, obtained from a Boltzmann weighted average,

\begin{tabular}{l l l} \begin{array}{l l} \text{---} \text{---} \text{---} \text{---} \text{---} \text{---} \text{---} \text{---} \text{---} \text{---} \text{---} \text{---} \text{---} \\\\ \left\langle \rho(\xi) \right\rangle & = \begin{array}{l} \frac{\int dr \delta(\xi'(r) - \xi) \int e^{\left(\frac{-40r}{\xi\_1 \overline{\xi}}\right)}}{\int dr e^{\left(\frac{-40r}{\xi\_1 \overline{\xi}}\right)}} \end{array} \tag{1}

center of mass (COM) distances from the center of the host cavity, using an umbrella biasing

molecule to sample the configurational space in a defined region along the association/dissociation coordinate. Then the biased system (**Figure 1**, middle) will sample configurations

In general, the potential energy (*U(r) + w(ξ*)) is used to generate the biased simulations. The mean force, as a function of position, is calculated in each window, and the respective poten-

Statistical techniques, such as WHAM [56] are used to remove the umbrella bias and combine the local distributions, allowing free-energy to be computed (**Figure 1**, top). WHAM is the basic tool for constructing free-energy profiles from distributions derived through stratification, for which the path connecting the reference and the target states is separated into

**Figure 1.** Schematic representation of the umbrella sampling procedure applied to a system along the pathway for the

association/dissociation of a model guest with a host cavity.

*(ξ) = 0.5 K(ξ − ξ<sup>i</sup>*

*)2*

Modeling Soft Supramolecular Nanostructures by Molecular Simulations

, even when these would not be sampled in the unbiased system.

. This restraint allows the guest

13

http://dx.doi.org/10.5772/intechopen.74939

potential, *w(ξ)*, often of quadratic form, *wi*

tials, *W(ξ)*, are derived using an unbiasing procedure.

close to a defined position, *z<sup>0</sup>*

$$\mathcal{W}(\xi) = \mathcal{W}(\xi^\*) - k\_{\mathfrak{g}} \operatorname{Tin} \left[ \frac{\rho \langle \xi \rangle}{\rho \langle \xi^\* \rangle} \right] \tag{2}$$

where *W*(*ξ*<sup>∗</sup> ) and *ξ*<sup>∗</sup> are arbitrary reference values, *δ*(*ξ*′ (*r*) − *ξ*) represents the Dirac delta function for the coordinate *ξ*, *U(r)* corresponds the total energy of the system as a function of the coordinates *r*, and *ξ(r)* is a function depending on the number of degrees of freedom, such as an angle, a distance or a more complex function of the cartesian coordinates of the system.

The distribution function *ρ(ξ)* cannot be computed by standard MC or MD simulations, due to low sampling frequency in higher-energy configurations. Specific sampling techniques (non-Boltzmann sampling) have been designed for calculating the PMF along a coordinate *ξ*, from a MD trajectory. The umbrella sampling technique, proposed in the 1970s by Torrie and Valleau [22], is one of these approaches, and is commonly used to overcome the difficulty of sampling rare events, by modifying the potential function so that the unfavorable states are sampled appropriately.

Umbrella sampling is used to describe simulations in which an order parameter connecting the initial and final ensembles is divided into mutually overlapping regions, or windows, which are sampled using non-Boltzmann weights. To explain briefly, several independent simulations are performed in each of the imposed regions of the reaction coordinate (**Figure 1**, bottom), using a bias potential to constrain the simulations to adjacent windows. Such initial configurations are generated, each corresponding to a location wherein the guest molecule is restrained and centered at subsequent values of *ξ<sup>i</sup>* , corresponding to decreasing or increasing center of mass (COM) distances from the center of the host cavity, using an umbrella biasing potential, *w(ξ)*, often of quadratic form, *wi (ξ) = 0.5 K(ξ − ξ<sup>i</sup> )2* . This restraint allows the guest molecule to sample the configurational space in a defined region along the association/dissociation coordinate. Then the biased system (**Figure 1**, middle) will sample configurations close to a defined position, *z<sup>0</sup>* , even when these would not be sampled in the unbiased system. In general, the potential energy (*U(r) + w(ξ*)) is used to generate the biased simulations. The mean force, as a function of position, is calculated in each window, and the respective potentials, *W(ξ)*, are derived using an unbiasing procedure.

Different methodologies have been successfully applied for the calculation of the PMF profile. Among them, free-energy perturbation (FEP) [47, 50, 54, 55], thermodynamic integration (TI) [45, 47], umbrella sampling, with the weighted histogram analysis method (WHAM) [5, 56] and force constraint methods [16] are the most commonly used in the literature for chemically and biologically relevant systems. PMF calculations play a considerable role in understanding how chemical species recognize and associate. One might argue that they actually provide a more satisfactory picture of these phenomena than alchemical free-energy calculations, because they follow the molecules of interest from their free, unbound state, to the bound state of the complex. This point of view is obviously misleading, as the model reaction coordinate followed to describe association/dissociation bears certain arbitrariness. Furthermore, PMF have helped to decode complex recognition and association phenomena in thermodynamic signatures with a detailed atomic description of the underlying processes. A detailed

The PMF, W(ξ) may be interpreted as the potential that is produced by the average forces over all the configurations of a given system, in which a set of molecules is kept fixed (at a certain value of a reaction coordinate, *ξ*). This quantity was devised by Kirkwood [19, 32] and can be defined based on the average distribution function, *ρ(ξ*), along the coordinate *ξ*, obtained

(*r*) − *ξ*) *e* (

∫*dre* ( −*U*(*r*) \_\_\_\_\_ k*BT* )

for the coordinate *ξ*, *U(r)* corresponds the total energy of the system as a function of the coordinates *r*, and *ξ(r)* is a function depending on the number of degrees of freedom, such as an angle, a distance or a more complex function of the cartesian coordinates of the system.

The distribution function *ρ(ξ)* cannot be computed by standard MC or MD simulations, due to low sampling frequency in higher-energy configurations. Specific sampling techniques (non-Boltzmann sampling) have been designed for calculating the PMF along a coordinate *ξ*, from a MD trajectory. The umbrella sampling technique, proposed in the 1970s by Torrie and Valleau [22], is one of these approaches, and is commonly used to overcome the difficulty of sampling rare events, by modifying the potential function so that the unfavorable states are

Umbrella sampling is used to describe simulations in which an order parameter connecting the initial and final ensembles is divided into mutually overlapping regions, or windows, which are sampled using non-Boltzmann weights. To explain briefly, several independent simulations are performed in each of the imposed regions of the reaction coordinate (**Figure 1**, bottom), using a bias potential to constrain the simulations to adjacent windows. Such initial configurations are generated, each corresponding to a location wherein the guest molecule is

\_\_\_\_\_\_\_\_\_\_\_\_\_\_

−*U*(*r*) \_\_\_\_\_ *kBT* )

*ρ*〈*ξ*〉 \_\_\_\_\_ *ρ*〈*ξ*<sup>∗</sup>

(1)

〉] (2)

(*r*) − *ξ*) represents the Dirac delta function

, corresponding to decreasing or increasing

description of the theoretical basis of PMF calculation is given in ref. [16]

from a Boltzmann weighted average,

where *W*(*ξ*<sup>∗</sup>

12 Molecular Dynamics

) and *ξ*<sup>∗</sup>

sampled appropriately.

〈*ρ*(*ξ*)〉 <sup>=</sup> <sup>∫</sup>*dr*(*ξ*′

*<sup>W</sup>*(*ξ*) <sup>=</sup> *<sup>W</sup>*(*ξ*∗) <sup>−</sup> *kB Tln*[

restrained and centered at subsequent values of *ξ<sup>i</sup>*

are arbitrary reference values, *δ*(*ξ*′

Statistical techniques, such as WHAM [56] are used to remove the umbrella bias and combine the local distributions, allowing free-energy to be computed (**Figure 1**, top). WHAM is the basic tool for constructing free-energy profiles from distributions derived through stratification, for which the path connecting the reference and the target states is separated into

**Figure 1.** Schematic representation of the umbrella sampling procedure applied to a system along the pathway for the association/dissociation of a model guest with a host cavity.

intermediate states. The respective equations have also been used as the core of adaptive umbrella sampling approaches [42, 43], in which the efficiency of free-energy calculations are improved through refinement of the biasing potentials as the simulation progressed.

latter, restraints controlling the host-guest complex are activated cumulatively for separating host and guest molecules, and then released to leave the guest at standard concentration.

Modeling Soft Supramolecular Nanostructures by Molecular Simulations

http://dx.doi.org/10.5772/intechopen.74939

In fact, the selection of FF and setup parameters (e.g. protonation states and slow motions) can be a difficult task, as new variants primarily concerned to the treatment of proteins, are constantly being produced. There are standard options such as OPLS [60], AMBER [61], CHARMM [62], and GROMOS [63] and other more specific FF, such as the Kirkwood-Buff [64] FF, directed at aromatic amino acids, the CHARMM polarizable FF, based on the classical Drude oscillator [65, 66], the induced dipole [67], for modeling polarization in proteins and protein-ligand complexes, the Jorgensen's approach [68] for parameterization by atoms-in-molecule electron density partitioning, AMOEBA [69] and GEM\* [70], among many others [71–73]. Additionally, several water models such as TIP3P [74], TIP4Pew [75], and SPC/E [76], and the recently introduced OPC [77], TIP3P-FB and TIP4P-FB [78], TIP4P-D [79], and iAMOEBA [80] are also available. It is worth mentioning that the assessement of the ability of these FF to reproduce experimental data are very sparse, and based on the free-energy of hydration of small molecules [81], and on the structure of nucleic acids [82] and proteins [83]. With regard to explicit solvent methods, only a few studies have focused on thermodynamic data from non-covalent binding for validating the FF. Recently [84, 85], the binding free energies of cyclodextrin host-

guest complexes were also used, but in the context of implicit solvent models.

association/dissociation coordinate, *ξ*, from

also a function of *ξ*.

*Kbind* = *πNA*∫*r* (*ξ*)<sup>2</sup> *e* (

It is worth to note that there are other issues in this type of systems that transcend the accuracy of the selected FF, and affect the simulation results. These are related to restrictions for the sampling and estimation of reliable thermodynamic quantities, including the respective binding constants. The latter can be determined from experimental techniques such as <sup>1</sup>

NMR, isothermal titration calorimetry and solubility experiments. The calculations of binding constants that provide a convenient counterpart to the experimental observations in inclusion complexes is sometimes addressed using a "flexible molecule" approximation [5], instead of the conventional rigid rotor harmonic oscillator [86–88]. The available volume for a guest molecule is described in terms of a cylindrical, PMF weighted region, in which the respective motion is measured by the positioning of the center of mass. The "flexible molecule" approximation is not as widespread as might be expected, and relatively recent publications [86, 89] both divulge the approximation and highlight controversies on the topic. In simple terms, the binding constant, *Kbind* can be estimated by integrating the PMF values *(∆GPMF(ξ)* along the

where *NA* and *R* corresponds to the Avogadro and the ideal gas constants, respectively and r is the average radius of the cross section of the cylinder in each sampling window, which is

The key factors affecting the thermodynamic signatures in host-guest systems can also be assessed along with an accurate description of the NCI including their spatial features. This can be carried out resorting to a recently developed Independent Gradient Method (IGM) of Lefebvre and co-workers [90] (see Section 4 for details), which allows the visualization

<sup>−</sup>*GPMF*(*ξ*) \_\_\_\_\_\_\_\_

*RT* ) *d* (3)

H

15
