Δ*G* can be very similar regardless of the driving force, which can be very different from one case to another.

As mentioned earlier, the observed overall Δ*G* can be the same for an interaction with positive Δ*S* and Δ*H* (entropy-driven, binding signature dominated by the classical hydrophobic effect), an interaction with negative Δ*S* and Δ*H* (enthalpy-driven binding signature), or all sorts of combinations of negative Δ*H* and positive Δ*S*. As described in the previous section, ligand-protein complexes tend to compensate for enthalpic and entropic contributions, making changes in Δ*G* less sensitive to the molecular details of the interactions. Therefore, dissection of Δ*G* into enthalpic and entropic contributions is of fundamental importance for understanding the binding energetics.

*3.1.1.2.1 Enthalpic contributions*

The change in enthalpy represents the changes in energy associated with specific, non-covalent interactions. However, such an interpretation is too simplistic to describe experimental Δ*H* values, and the physical meaning of the observed Δ*H* seems to be more complex. The measured changes in enthalpy are the result of the formation and breaking of many individual bonds; they reflect the loss of protein-solvent hydrogen bonds and van der Waals interactions, the loss of ligand-solvent interactions, the formation of ligand-protein bonds, salt bridges and van der Waals contacts, the re-organisation of the intra-molecular hydrogen-bonding network of the protein, solvent reorganisation near the protein surface, conformational changes at the binding site due to the binding event, and many more. These individual components may produce either favourable or unfavourable contributions, depending on the system.

The treatment of each component individually is very challenging, since the global heat effect of a particular interaction is a balance between the enthalpy of the ligand binding to the protein and to the solvent. Several approaches have been employed to investigate the energetics of individual bonds, including alanine scanning mutagenesis (Perozzo *et al.*, 2004 and references therein) and removal of particular hydrogen bonds at the binding site (Connelly *et al.*, 1994). However, these approaches suffer from a major bottleneck: a direct relation between the change in enthalpy and the removal of the corresponding specific interactions cannot be made *a priori*.

A large part of the observed Δ*H* is due to a bulk hydration effect, as emerged from ITC studies carried out in water and deuterium oxide (Connelly *et al.*, 1993). Frequently, water molecules are located at complex interfaces, improving the complementarity of the surfaces and extending hydrogen-bonding networks. This should contribute favourably to the enthalpy, but it may be offset by an unfavourable entropic contribution (Perozzo *et al.*, 2004). The role of interfacial water was studied by lowering water activity through the addition of osmolytes such as glycerol to the solution. It was found that complexes with a low degree of surface complementarity and no change in hydration are tolerant to osmotic pressure (Perozzo *et al.*, 2004, and references therein).

*3.1.1.2.2 Entropic contributions*

Δ*S* may be calculated directly from Δ*G* and Δ*H*, according to the Gibbs equation. Its physical representation is not straightforward. It is often related to the dynamics and flexibility of the system (Diehl *et al.*, 2010, Homans, 2007), and is sometimes dubbed a 'measure of the system's disorder' (which is incorrect). It has been proposed that the Δ*S* associated [...]

*3.1.1.2.3 Enthalpy-entropy compensation*

As mentioned in the previous section of this chapter, enthalpy-entropy compensation is described by a linear relationship between the change in enthalpy and the change in entropy. This means that favourable changes in binding enthalpy are compensated by opposite changes in binding entropy and vice versa, resulting in very small changes in the overall free energy of binding. Enthalpy-entropy compensation is an illustration of the 'motion opposes binding' rule, and it is believed to be a consequence of altering the weak inter-molecular interactions as well as being related to solvent effects. Since both Δ*H* and Δ*S* are connected to Δ*C*<sub>p</sub>, the correlation between enthalpy/entropy and heat capacity changes is clear.

Enthalpy-entropy compensation is a difficult problem to address in the context of rational molecular design. In such a framework, the goal is to maximise the binding affinity of the complex between the designed compound and the protein target. The optimisation strategy requires simultaneous minimisation of both enthalpic and entropic penalties. However, reducing one of them usually means increasing the other.
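The compensation described above can be made concrete with the Gibbs relation Δ*G* = Δ*H* − *T*Δ*S*. The following minimal Python sketch uses two hypothetical ligands (the numerical values and the names `enthalpy_driven` and `entropy_assisted` are illustrative assumptions, not data from this chapter) whose thermodynamic signatures differ markedly while their binding free energies nearly coincide.

```python
def gibbs_free_energy(delta_h_kj, delta_s_j_per_k, temperature_k=298.15):
    """Return dG in kJ/mol from dH (kJ/mol) and dS (J/(mol*K)): dG = dH - T*dS."""
    return delta_h_kj - temperature_k * delta_s_j_per_k / 1000.0


# Hypothetical thermodynamic signatures (illustrative values only).
ligands = {
    # Strongly favourable enthalpy, unfavourable entropy.
    "enthalpy_driven": {"delta_h_kj": -40.0, "delta_s_j_per_k": -33.5},
    # Weakly favourable enthalpy, favourable entropy.
    "entropy_assisted": {"delta_h_kj": -10.0, "delta_s_j_per_k": 67.0},
}

delta_g = {name: gibbs_free_energy(**sig) for name, sig in ligands.items()}

for name, value in delta_g.items():
    print(f"{name}: dG = {value:.2f} kJ/mol")
```

Both entries come out close to −30 kJ/mol even though Δ*H* differs by 30 kJ/mol, which is why Δ*G* alone says little about the driving force.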

### **3.1.2 Nuclear Magnetic Resonance (NMR) spectroscopy**

The thermodynamics of biologically relevant macromolecules and their complexes can be characterised by NMR spectroscopy. The basis of NMR spectroscopy is the non-zero nuclear magnetic moment of many nuclei, such as <sup>1</sup>H, <sup>13</sup>C, <sup>15</sup>N, or <sup>19</sup>F. When placed in an external static magnetic field (**B**), the nuclear spin states of these nuclei become quantised, with energies proportional to the projections of their magnetic moments onto the vector **B**. The energy differences are also proportional to the field strength and depend on the chemical environment of the nucleus, which makes NMR an ideal technique for studying the three-dimensional structure and dynamics of such systems.
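The proportionality between the energy difference and the field strength can be illustrated with a short Python sketch. The gyromagnetic ratios below are approximate literature values supplied here as assumptions (they are not given in this chapter), and the function names are hypothetical; chemical-shift effects, which are of the order of parts per million, are ignored.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla
# (assumed textbook values; the sign for 15N is negative).
GAMMA_OVER_2PI_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "15N": -4.316}

PLANCK_J_S = 6.62607015e-34  # Planck constant in J*s


def larmor_frequency_mhz(nucleus, field_t):
    """Magnitude of the resonance frequency (MHz) at a static field in tesla."""
    return abs(GAMMA_OVER_2PI_MHZ_PER_T[nucleus]) * field_t


def transition_energy_j(nucleus, field_t):
    """Energy gap (J) between adjacent spin states: dE = h * nu."""
    return PLANCK_J_S * larmor_frequency_mhz(nucleus, field_t) * 1.0e6


for nucleus in GAMMA_OVER_2PI_MHZ_PER_T:
    print(nucleus, f"{larmor_frequency_mhz(nucleus, 14.1):.1f} MHz at 14.1 T")
```

Doubling the field doubles both the resonance frequency and the energy gap, which is the linear dependence described above.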

A variety of NMR methods have been introduced to study ligand-protein interactions. These methods include one-, two- and three-dimensional NMR experiments. Many studies to date have demonstrated the power of stable-isotope labelling and isotope-edited NMR in the investigation of ligand-protein interactions. Recent developments in techniques have allowed for the study of ligand-induced conformational changes, the investigation of the positions and dynamic behaviour of bound water molecules, and the quantification of conformational entropy. The steady-state heteronuclear Overhauser effects (NOEs) are very useful for structural analysis of macromolecules in solution (Boehr *et al.*, 2006, Meyer and Peters, 2003). It is important to note that the NOE occurs through space, not through chemical bonds, which makes it applicable to the characterisation of non-covalent binding events. When a ligand binds, the NOEs change dramatically, and transferred NOEs (trNOEs), which rely on the different tumbling times of the free and bound interactors, can be observed.

Another NMR technique commonly used to identify ligand binding is chemical shift mapping (Meyer and Peters, 2003). Briefly, chemical shifts describe the dependence of nuclear magnetic energy levels on the electronic environment in a given macromolecule. Electron density, electronegativity, and aromaticity are among the factors affecting chemical shifts. Not surprisingly, a binding event changes the chemical shifts of both interacting partners, particularly in the region of association (e.g. a protein binding pocket or a protein-peptide interaction interface). Thus, changes in chemical shifts can be used to identify binding events and to locate the binding site.

Ligand-protein thermodynamics can be investigated using NMR relaxation analysis, which provides insight into protein dynamics in the presence and absence of the ligand. These results can be integrated with thermodynamic data obtained from isothermal titration calorimetry (ITC) experiments and with computational results (e.g. MD simulations). For proteins, the relaxation rates of the backbone (<sup>15</sup>N) and side chains (<sup>2</sup>H and <sup>13</sup>C) can be obtained. The time scales accessible to NMR range over 17 orders of magnitude, encompassing protein motions on timescales from picoseconds to milliseconds (Boehr *et al.*, 2006). This covers all the relevant motions of proteins and their complexes.

Backbone and side chain (methyl group) NMR relaxation measurements have revealed the role of protein dynamics in ligand binding and protein stability (Boehr *et al.*, 2006). The development of molecular biology techniques for incorporating the stable isotopes <sup>13</sup>C and <sup>15</sup>N into expressed proteins allowed for the design and application of modern multidimensional heteronuclear NMR techniques. As a consequence, the maximum size of macromolecules studied using these techniques rose from about 10 kDa (when <sup>1</sup>H homonuclear NMR is used) to 50 kDa and beyond (using <sup>13</sup>C and <sup>15</sup>N heteronuclear NMR with fractional <sup>2</sup>H enrichment). The application of modern TROSY (transverse relaxation optimized spectroscopy) techniques further expanded the size limit of NMR, reaching up to 900 kDa (Fernandez and Wider, 2003).

NMR methodologies are also being developed to study ligand-protein complexes in the solid state, and special techniques have been developed to study protein stability and folding (Baldus, 2006) and for in-cell NMR (Burz *et al.*, 2006), providing information complementary to fluorescence studies in biological settings. In this chapter I will briefly discuss only the application of relaxation analysis in solution to the study of ligand-protein thermodynamics, specifically intrinsic entropic contributions.

### **3.1.2.1 Slow and fast dynamics: from dynamics to entropy**

Conformational changes that may be associated with ligand binding events generally occur on 'slow' (microsecond to millisecond) time scales and thus report on slower motions than protein backbone and side chain fluctuations (pico-to-nanoseconds). There is no straightforward relationship between 'slow' and 'fast' motions. Experiments on several ligand-enzyme systems have shown that binding events, which decrease the 'fast' motions, may increase, decrease, or not affect the 'slow' motions (Boehr *et al.*, 2006). This obviously has an effect on the overall entropy contribution, but this has not been fully explored.

NMR relaxation techniques have been used to study the dynamics of ligand-protein complexes on multiple time scales. The results show that even though large conformational changes occur on the 'slow' time scale, 'fast' (pico-to-nanosecond) protein motions play important roles in all aspects of the binding event. These motions are typically probed by measuring three relaxation parameters: the longitudinal relaxation rate (*R*<sub>1</sub>), the transverse relaxation rate (*R*<sub>2</sub>), and the NOE. These relaxation parameters are directly related to the spectral density function, *J*(*ω*). This function is proportional to the amplitude of the fluctuating magnetic field at the frequency *ω*. Such fluctuating magnetic fields are caused by molecular motion in an external magnetic field, which is closely coupled to nuclear spin relaxation (Boehr *et al.*, 2006, and references therein). In early studies of ps-ns time scale protein dynamics, various models for protein internal motion were used to generate different spectral density functions that were then compared to the experimental data. Subsequently, Lipari and Szabo (Lipari and Szabo, 1996) generated a spectral density function, shown in equation 5, that is independent of any specific physical model of motion; this approach is referred to as the model-free formalism.

$$J(\omega) = \frac{S^2 \tau_m}{1 + \omega^2 \tau_m^2} + \frac{\left(1 - S^2\right) \tau}{1 + \omega^2 \tau^2} \tag{5}$$

Equation 5 holds for isotropic tumbling (the ligand-protein complex tumbles in aqueous solution), where *τ*<sub>m</sub> is the correlation time for the overall rotational diffusion of the macromolecule, *S*<sup>2</sup> is the order parameter, and *τ*<sup>−1</sup> = *τ*<sub>m</sub><sup>−1</sup> + *τ*<sub>e</sub><sup>−1</sup>, where *τ*<sub>e</sub> is the time scale (ns) for the internal motions of the bond vector. An order parameter of 1 indicates complete restriction of internal motion, and *S*<sup>2</sup> = 0 indicates unrestricted isotropic internal motion. It should be emphasised that *S*<sup>2</sup> parameters have a straightforward physical interpretation. The simplest model relates *S*<sup>2</sup> to 'diffusion in a cone' with semi-angle *θ*, and is shown in Figure 3.
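Equation 5 is straightforward to evaluate numerically. The following Python sketch is a minimal illustration that follows the form of equation 5 as written in this chapter (without any additional prefactor); the function names and the example parameter values are hypothetical and chosen only to show the order of magnitude involved.

```python
import math


def effective_correlation_time(tau_m, tau_e):
    """Combine overall and internal correlation times: 1/tau = 1/tau_m + 1/tau_e."""
    return 1.0 / (1.0 / tau_m + 1.0 / tau_e)


def spectral_density(omega, s2, tau_m, tau_e):
    """Model-free spectral density J(omega) in the form of equation 5.

    omega is an angular frequency (rad/s); tau_m and tau_e are in seconds.
    """
    if not 0.0 <= s2 <= 1.0:
        raise ValueError("the order parameter S^2 must lie between 0 and 1")
    tau = effective_correlation_time(tau_m, tau_e)
    overall = s2 * tau_m / (1.0 + (omega * tau_m) ** 2)
    internal = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return overall + internal


# Hypothetical example: 10 ns overall tumbling, 50 ps internal motion,
# evaluated at an angular frequency corresponding to about 60.8 MHz.
omega_example = 2.0 * math.pi * 60.8e6
for s2_example in (0.5, 0.85, 1.0):
    j_value = spectral_density(omega_example, s2_example, 10e-9, 50e-12)
    print(f"S2 = {s2_example:.2f}: J = {j_value:.3e} s")
```

For *S*<sup>2</sup> = 1 the second term vanishes and only the overall tumbling contributes, which matches the interpretation of a fully restricted bond vector.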

Fig. 3. Physical interpretation of *S*<sup>2</sup> order parameters. *S*<sup>2</sup> can be interpreted as a measure of the free rotation of a bond vector (here, N-H) in a cone. The semi-angle *θ* is displayed, with *S*<sup>2</sup> = cos<sup>2</sup>*θ* (1 + cos *θ*)<sup>2</sup>/4. Smaller *S*<sup>2</sup> parameters correspond to more flexible bond vectors; *S*<sup>2</sup> = 0 means unrestricted rotation of the bond vector.
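The 'diffusion in a cone' picture can be turned into numbers with the standard relation *S*<sup>2</sup> = [cos *θ* (1 + cos *θ*)/2]<sup>2</sup>. The Python sketch below is a minimal illustration with hypothetical function names; it assumes the semi-angle is restricted to 0-90°, where *S*<sup>2</sup> decreases monotonically from 1 to 0, so that the relation can be inverted by simple bisection.

```python
import math


def s2_from_cone_angle(theta_deg):
    """Order parameter S^2 for free diffusion in a cone of semi-angle theta (degrees)."""
    if not 0.0 <= theta_deg <= 90.0:
        raise ValueError("this sketch assumes a semi-angle between 0 and 90 degrees")
    c = math.cos(math.radians(theta_deg))
    return (c * (1.0 + c) / 2.0) ** 2


def cone_angle_from_s2(s2, tol=1e-10):
    """Invert s2_from_cone_angle by bisection on the interval [0, 90] degrees."""
    if not 0.0 <= s2 <= 1.0:
        raise ValueError("S^2 must lie between 0 and 1")
    low, high = 0.0, 90.0
    while high - low > tol:
        mid = 0.5 * (low + high)
        # S^2 decreases with theta, so a value that is still too large
        # means the cone must be opened further.
        if s2_from_cone_angle(mid) > s2:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


for theta in (0.0, 15.0, 30.0, 60.0, 90.0):
    print(f"theta = {theta:5.1f} deg -> S2 = {s2_from_cone_angle(theta):.3f}")
```

A rigid bond vector (*θ* = 0°) gives *S*<sup>2</sup> = 1, and opening the cone lowers the order parameter, in line with the statement that smaller *S*<sup>2</sup> values correspond to more flexible bond vectors.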

There have been attempts to relate order parameters to structural characteristics of proteins and ligand-protein complexes. It was observed that amino acids with smaller side chains tend, as intuition suggests, to show greater backbone flexibility than those with bulkier side chains (Goodman *et al.*, 2000). However, the variation of backbone amide *S*<sup>2</sup> parameters is larger than the differences between the averages for different amino acid types. Backbone amide order parameters are also only weakly affected by secondary structure elements, with loops having only slightly smaller average N-H *S*<sup>2</sup> values than helices or beta-turns (Kay *et al.*, 1989). Backbone N-H *S*<sup>2</sup> values can be predicted from structures using a simple model that takes account of local contacts to the N-H and C=O atoms of each peptide group (Zhang and Brüschweiler, 2002).

A more sophisticated model for predicting dynamics from structure has recently been reported (tCONCOORD) (Seeliger *et al.*, 2007). tCONCOORD allows for fast and efficient sampling of a protein's conformational degrees of freedom based on geometrical restraints. Weak correlations of side chain order parameters with the contact distance between the methyl carbon and neighbouring atoms, with solvent exposure (Ming and Brüschweiler, 2004), and with amino acid sequence conservation patterns (Mittermaier *et al.*, 2003) have been reported in the literature. These results demonstrate that protein dynamics are strongly affected by the unique architecture of the protein as well as by the environment. Thus, they cannot be readily predicted by bioinformatic techniques based on the analysis of primary sequence and secondary structure. Developing a fast and reliable method of assessing protein dynamics is, nevertheless, crucial for predictions of ligand-protein interactions: as will be shown in the course of this chapter, dynamics affect all stages of molecular recognition events.

Order parameters can be related to entropy through the relationship developed by Yang and Kay (1996). This formalism quantifies the conformational entropy associated with observable protein motions by means of a specific motion model. For a wide range of motion models, the functional dependence of entropy on the *S*<sup>2</sup> parameter was demonstrated to be similar (Yang and Kay, 1996). This suggests that changes in *S*<sup>2</sup> can be related to the conformational entropy change in a model-independent manner. This approach has many advantages: it is straightforward, relatively free of assumptions (the requirement is that the internal motions are uncorrelated with the global tumbling of the macromolecule), and applicable to both NMR experiments and theoretical approaches (MD simulations). Moreover, since *S*<sup>2</sup> parameters are measured per bond vector, this approach enables site-specific reporting of any losses, gains, and redistributions of conformational entropy through different dynamic states of the ligand-protein complex.

However, the model-free formalism can give only a qualitative view of micro-to-millisecond time scale motions. Failure to correctly account for anisotropic molecular tumbling and the assumption that all motions are uncorrelated seriously compromise the usefulness of this approach for studying dynamics associated with large conformational changes or concerted motions. Because of the time scales involved, alternative approaches must be implemented to study motions occurring on a millisecond time scale (e.g. *R*<sub>2</sub> relaxation dispersion).

### **3.1.3 Combination of ITC and NMR**

As described, ITC yields the free energy as a global parameter; thus, effects such as ligand-induced conformational changes, domain-swapping, or protein oligomerisation, which contribute to the overall Δ*G*, will not be resolved. In order to assess the role those factors play in a binding event, a combination of ITC and other techniques (such as NMR) needs to be used. A combination of ITC and NMR proves useful in studying cooperativity phenomena. Heteronuclear NMR spectroscopy is one of the few experimental techniques capable of measuring the occupancies of individual binding sites on proteins and therefore of determining microscopic binding affinities. Coupling these site-specific data (e.g. chemical shift mapping and/or relaxation analysis data) with the macroscopic binding data from ITC allows a complete description of the binding properties of the system. A method of determining cooperativity using heteronuclear solution NMR spectroscopy has been described using an isotope-enriched two-dimensional heteronuclear single-quantum coherence experiment (2D HSQC) (Tochtrop *et al.*, 2002). The ligands are isotopically labelled (usually <sup>1</sup>H, <sup>15</sup>N, or <sup>13</sup>C), while the receptor remains unlabelled. Spectra are acquired at different molar ratios and the peak volumes are integrated. Isotherms are generated by plotting the integrated peak volumes against molar ratio. The data are then fitted to site-specific binding models to obtain the thermodynamic parameters (Brown, 2009).

### **3.2 Computational approaches**

Computational approaches to ligand-protein interaction studies have great potential, and the development of the various methods briefly described in this chapter has been truly outstanding. However, every method, computational and experimental alike, has its limitations, and computational methods should not be used in a 'black box' manner; one should beware of the 'Garbage In Garbage Out' phenomenon. Yet it is evident that theoretical approaches have finally come to the stage that makes rational molecular design truly rational.

During a binding event, the ligand may bind in multiple orientations. The conformation of either of the interacting partners can change significantly upon association. The network of intramolecular interactions (e.g. hydrogen bonds, salt bridges) can change dramatically (breaking and/or creating new contacts), and new intermolecular interactions occur. Water molecules and ions can be expelled upon binding or, on the contrary, bind more tightly. Finally, conformational or solvation entropic contributions may play a significant role, affecting the free energy in a way which is difficult to predict.

The growing amount of available calorimetric data allows the thermodynamic profiles of many ligand-protein complexes to be investigated in detail. When structural data (crystal, NMR) are available as well, as is often the case, it is very appealing to speculate about the link between the structure of the complex and the thermodynamics of the binding event. However, such speculations are challenging. It is important to bear in mind that both enthalpic and entropic contributions to the free energy terms obtained from ITC experiments are global parameters, containing a mixture of different contributions, which can have either equal or opposing signs and different magnitudes. This may lead to various thermodynamic signatures of a binding event. Moreover, 'structural' interpretation of intrinsic entropic contributions is notoriously difficult. Hence, the experimental thermodynamic data cannot be easily interpreted on the basis of structural information alone. Last but not least, the contribution from solvation effects is difficult to gain insight into, and although direct experimental estimations of solvation free energy have been attempted, these always require additional assumptions (Homans, 2007, Shimokhina *et al.*, 2006).

No doubt, a great advantage of theoretical approaches lies in gaining insight into each of those contributions and their deconvolution. Binding events (ligand-protein
