**1. Introduction**

Often there is a high demand for the structures of biologically important proteins, especially those which are large or part of complex systems. However, it is not always possible, for many reasons, to get a high-resolution structure experimentally using X-ray crystallography, NMR, or cryo-electron microscopy. Among the many problems, we can mention low protein expression, low protein stability, high aggregation or poorly diffracting crystals [1]. In this situation, *in silico* models provide a good starting point for experimental research. One of the common techniques for obtaining a reasonable structural model of a protein is homology modeling (HM). Homology modeling techniques are predominantly used to construct a hypothetical structure of a protein of interest (the target) where only the aminoacid sequence is available using the known structural features or 3D structure of one or several homologous proteins (the templates) [2–4]. However, building a static model often does not answer all questions regarding the function of the protein and its role in the cell and organism, nor does it clarify its relationship with other cellular components nucleic acids, proteins, ions and other molecules. It is important to

understand protein dynamics and flexibility, as proteins during the fulfillment of their role in the cell often change their shape, oligomeric state or even fold. As it is often very challenging to observe protein dynamics *in vivo* or *in vitro,* to understand better all these changes, a useful computational method normal mode analysis (NMA) can be used [5–9].

where *aik* is the amplitude of oscillation, *ω<sup>k</sup>* is the frequency, and *δ<sup>k</sup>* is a phase factor. By substituting this into Eq. (4), the equation of motion can be rewritten as

*Normal Mode Analysis: A Tool for Better Understanding Protein Flexibility and Dynamics…*

where the matrix *A* contains the *Ak* eigenvectors of the Hessian matrix *V* and *λ* is a diagonal matrix containing the *λ<sup>k</sup>* eigenvalues. The *Ak* eigenvectors are the normal mode vectors and describe in which direction and how far each particle in the system moves with respect to each other particle; the *λ<sup>k</sup>* eigenvalues give the squares of the frequencies with which all the particles involved with a particular mode vibrate. While the eigenvectors can tell in which direction and how far each particle moves with respect to the others, it does not give absolute displacements. NMA alone therefore cannot normally be used to get the displacement amplitudes

The vibrational energy of the system is generally equally divided so that every vibrational mode has the same energy and the average oscillation amplitude of a given mode scales as the inverse of its frequency. Thus, modes with higher frequencies, which will have energetically greater displacements, typically describe rapid but small amplitude local motions involving relatively few atoms, while those with lower frequencies will describe slower displacements involving larger numbers of atoms and describe large-scale conformational changes. As the name of the method indicates, these vibrational modes are normal to one another, meaning that they move independently: the excitation of one mode does not trigger the motion of a second one and the general motion of the system can be described by a superposition of all the modes. These normal modes yield analytical solutions to the equations of motion: for a given set of initial positions and velocities, NMA allows us to calculate where each atom of the system in question will be at any subsequent time subject to the small oscillation approximation. (A more complete treatment of the theory behind NMA and its advantages and limitations may be found in [12]). NMA was first applied to peptides in 1979 [13] and was subsequently used to study the whole proteins bovine pancreatic trypsin inhibitor (BPTI) [14, 15], hexokinase [16], crambin [17], human lysozyme [17, 18], ribonuclease [17], and myoglobin [19, 20]. Application of the method to larger systems was hampered by

its computational expense. With advances in computer technology and the

development of more efficient algorithms, it has become possible to examine larger structures, including the skeletal ryanodine receptor [21], Ca-ATPase [22], GroEL [23–25], the ribosome [26–28], the yeast nuclear pore complex [29], and virus capsids [30–32]. All these will be examined in more detail in Section 3 below.

NMA is less computationally expensive than molecular dynamics simulation, but it is still not trivial for proteins containing many thousands of atoms. The first problem is that the structure to be studied must be energy minimized to ensure that the starting conformation is in a true minimum relative to the chosen force field. The minimization must then proceed until machine precision is reached, typically below 0.001 kJ/mol-nm, which is much more computationally demanding than the minimizations normally employed for other tasks. Frequently, the results of this process distort the structure, leading to NMA being carried out on a structure different from the experimentally determined one. The second problem, and the computationally limiting factor, is the diagonalization of the Hessian matrix. For classical, all-atom NMA, all *N* atoms in a structure, including the hydrogen atoms, must be used, making the total Hessian 3*N* � 3*N* in size. For large proteins with

of a given normal mode [11].

*DOI: http://dx.doi.org/10.5772/intechopen.94139*

**2.1 The elastic network model**

**15**

*VA* ¼ *λA* (6)

## **2. Normal mode analysis**

Normal mode analysis is a technique, based on the physics of small oscillations, that can be used to describe the flexible states accessible to a protein around an equilibrium position. The idea is that when an oscillating system at equilibrium, for example a protein in an energy minimum conformation, is slightly perturbed, a restoring force acts to bring the perturbed system back to its equilibrium conformation. A system is defined to be in equilibrium or at the bottom of a potential minimum when the generalized forces acting on it are equal to zero. At the minimum energy conformation, represented by the generalized coordinates *q*0, the potential energy equation can be written as a power series in *q*:

$$V(q) = V(q^0) + \left(\frac{\partial V}{\partial q\_i}\right)^0 \eta\_i + \frac{1}{2} \left(\frac{\partial^2 V}{\partial q\_i \partial q\_j}\right)^0 \eta\_i \eta\_j + \dots \tag{1}$$

where *qi* and *q <sup>j</sup>* are the instantaneous configuration of components *i* and *j* and the deviation of component *i* from its equilibrium configuration is given by *η<sup>i</sup>* ¼ *qi* � *<sup>q</sup>*<sup>0</sup> *<sup>i</sup>* . *V q*ð Þ is the potential energy equation of the system and, for proteins, usually takes the form of one of the commonly used molecular dynamics force fields [10]. The first term in the series represents the minimum value of the potential and may be set to zero and the second term will be zero at any local minimum, so the potential can be written as

$$V(q) = \frac{1}{2} \left( \frac{\partial^2 V}{\partial q\_i \partial q\_j} \right)^0 \eta\_i \eta\_j = \frac{1}{2} \eta\_i V\_{ij} \eta\_j \tag{2}$$

where *Vij* is the Hessian matrix which contains the second derivatives of the potential with respect to the components of the system.

It is also necessary to consider the kinetic energy (*T*) of the system since we are interested in dynamics. For component *i*, this can be given by

$$T(q) = \frac{1}{2}M\frac{d^2\eta\_i}{dt^2} \tag{3}$$

where *M* is a diagonal matrix containing the mass of each particle. The entire equation of motion can be written as

$$\frac{1}{2}M\frac{d^2\eta\_i}{dt^2} + \frac{1}{2}\eta\_i V\_{ij}\eta\_j = 0\tag{4}$$

One solution of this equation is the oscillatory equation

$$
\eta\_i = \mathfrak{a}\_{ik}\cos\left(a\mathfrak{g}\_{\mathfrak{k}}t + \delta\_{\mathfrak{k}}\right) \tag{5}
$$

*Normal Mode Analysis: A Tool for Better Understanding Protein Flexibility and Dynamics… DOI: http://dx.doi.org/10.5772/intechopen.94139*

where *aik* is the amplitude of oscillation, *ω<sup>k</sup>* is the frequency, and *δ<sup>k</sup>* is a phase factor. By substituting this into Eq. (4), the equation of motion can be rewritten as

$$VA = \lambda A \tag{6}$$

where the matrix *A* contains the *Ak* eigenvectors of the Hessian matrix *V* and *λ* is a diagonal matrix containing the *λ<sup>k</sup>* eigenvalues. The *Ak* eigenvectors are the normal mode vectors and describe in which direction and how far each particle in the system moves with respect to each other particle; the *λ<sup>k</sup>* eigenvalues give the squares of the frequencies with which all the particles involved with a particular mode vibrate. While the eigenvectors can tell in which direction and how far each particle moves with respect to the others, it does not give absolute displacements. NMA alone therefore cannot normally be used to get the displacement amplitudes of a given normal mode [11].

The vibrational energy of the system is generally equally divided so that every vibrational mode has the same energy and the average oscillation amplitude of a given mode scales as the inverse of its frequency. Thus, modes with higher frequencies, which will have energetically greater displacements, typically describe rapid but small amplitude local motions involving relatively few atoms, while those with lower frequencies will describe slower displacements involving larger numbers of atoms and describe large-scale conformational changes. As the name of the method indicates, these vibrational modes are normal to one another, meaning that they move independently: the excitation of one mode does not trigger the motion of a second one and the general motion of the system can be described by a superposition of all the modes. These normal modes yield analytical solutions to the equations of motion: for a given set of initial positions and velocities, NMA allows us to calculate where each atom of the system in question will be at any subsequent time subject to the small oscillation approximation. (A more complete treatment of the theory behind NMA and its advantages and limitations may be found in [12]).

NMA was first applied to peptides in 1979 [13] and was subsequently used to study the whole proteins bovine pancreatic trypsin inhibitor (BPTI) [14, 15], hexokinase [16], crambin [17], human lysozyme [17, 18], ribonuclease [17], and myoglobin [19, 20]. Application of the method to larger systems was hampered by its computational expense. With advances in computer technology and the development of more efficient algorithms, it has become possible to examine larger structures, including the skeletal ryanodine receptor [21], Ca-ATPase [22], GroEL [23–25], the ribosome [26–28], the yeast nuclear pore complex [29], and virus capsids [30–32]. All these will be examined in more detail in Section 3 below.

#### **2.1 The elastic network model**

NMA is less computationally expensive than molecular dynamics simulation, but it is still not trivial for proteins containing many thousands of atoms. The first problem is that the structure to be studied must be energy minimized to ensure that the starting conformation is in a true minimum relative to the chosen force field. The minimization must then proceed until machine precision is reached, typically below 0.001 kJ/mol-nm, which is much more computationally demanding than the minimizations normally employed for other tasks. Frequently, the results of this process distort the structure, leading to NMA being carried out on a structure different from the experimentally determined one. The second problem, and the computationally limiting factor, is the diagonalization of the Hessian matrix. For classical, all-atom NMA, all *N* atoms in a structure, including the hydrogen atoms, must be used, making the total Hessian 3*N* � 3*N* in size. For large proteins with

understand protein dynamics and flexibility, as proteins during the fulfillment of their role in the cell often change their shape, oligomeric state or even fold. As it is often very challenging to observe protein dynamics *in vivo* or *in vitro,* to understand better all these changes, a useful computational method normal mode analysis

Normal mode analysis is a technique, based on the physics of small oscillations, that can be used to describe the flexible states accessible to a protein around an equilibrium position. The idea is that when an oscillating system at equilibrium, for example a protein in an energy minimum conformation, is slightly perturbed, a restoring force acts to bring the perturbed system back to its equilibrium conformation. A system is defined to be in equilibrium or at the bottom of a potential minimum when the generalized forces acting on it are equal to zero. At the minimum energy conformation, represented by the generalized coordinates *q*0, the

> *η<sup>i</sup>* þ 1 2

where *qi* and *q <sup>j</sup>* are the instantaneous configuration of components *i* and *j* and the deviation of component *i* from its equilibrium configuration is given by *η<sup>i</sup>* ¼

*<sup>i</sup>* . *V q*ð Þ is the potential energy equation of the system and, for proteins, usually takes the form of one of the commonly used molecular dynamics force fields [10]. The first term in the series represents the minimum value of the potential and may be set to zero and the second term will be zero at any local minimum, so the

where *Vij* is the Hessian matrix which contains the second derivatives of the

It is also necessary to consider the kinetic energy (*T*) of the system since we are

where *M* is a diagonal matrix containing the mass of each particle. The entire

1 2

*∂*2 *V ∂qi ∂q j* !<sup>0</sup>

*T q*ð Þ¼ <sup>1</sup> 2 *<sup>M</sup> <sup>d</sup>*<sup>2</sup> *ηi*

*∂*2 *V ∂qi ∂q j* !<sup>0</sup>

*<sup>η</sup>i<sup>η</sup> <sup>j</sup>* <sup>¼</sup> <sup>1</sup> 2 *ηiη <sup>j</sup>* þ … (1)

*ηiVijη <sup>j</sup>* (2)

*dt*<sup>2</sup> (3)

*ηiVijη <sup>j</sup>* ¼ 0 (4)

*η<sup>i</sup>* ¼ *aik* cosð Þ *ωkt* þ *δ<sup>k</sup>* (5)

potential energy equation can be written as a power series in *q*:

*∂V ∂qi* � �<sup>0</sup>

*V q*ð Þ¼ *V q*<sup>0</sup> � � <sup>þ</sup>

*Homology Molecular Modeling - Perspectives and Applications*

*V q*ð Þ¼ <sup>1</sup> 2

potential with respect to the components of the system.

interested in dynamics. For component *i*, this can be given by

1 2 *<sup>M</sup> <sup>d</sup>*<sup>2</sup> *ηi dt*<sup>2</sup> <sup>þ</sup>

One solution of this equation is the oscillatory equation

(NMA) can be used [5–9].

**2. Normal mode analysis**

*qi* � *<sup>q</sup>*<sup>0</sup>

**14**

potential can be written as

equation of motion can be written as

thousands of atoms this can become computationally difficult very quickly. Consequently, a number of coarse-grained approximate methods have been developed to overcome both of these limitations [6, 11]. The most common and widely used of these is the elastic network model (ENM).

The general idea of ENMs, first put forward by Tirion in 1996 [33], is to replace the complicated semi-empirical force fields used in standard NMA with a simple harmonic potential:

$$V(q) = \sum\_{d\_{\vec{\eta}} < R\_{\vec{\epsilon}}} \mathbb{C} \left( d\_{\vec{\eta}} - d\_{\vec{\eta}}^{0} \right)^{2} \tag{7}$$

translating blocks (RTB) model of Sanejouand [41] was constructed to alleviate this. In this approach, the protein or other macromolecule is divided into *n<sup>β</sup>* blocks made up of one or a few residues connected by elastic springs. Next, it is assumed that a good approximation to the low-frequency normal modes can be made by forming linear combinations of the local rotations and translations of these individual blocks. Consequently, a 3*N* � 6*n<sup>β</sup>* projection matrix *P* is constructed and used to build a

*Normal Mode Analysis: A Tool for Better Understanding Protein Flexibility and Dynamics…*

the eigenvector matrix diagonalizing *H<sup>β</sup>* and Λ*<sup>β</sup>* is the corresponding eigenvalue matrix. The resulting eigenvectors can be projected back into the full 3*N*-dimensional space using *AP* ¼ *PAβ*, where *AP* is a 3*N* � 6*n<sup>β</sup>* matrix containing the 6*n<sup>β</sup>*

In a survey of the recent structural biology literature, Bauer *et al.* [12] found that NMA, as a component of a structural study, most often was used for descrying the overall flexible motions of a molecule or for determining how those motions might be correlated with ligand binding or catalytic activation. Another use for NMA is to study the motions of macromolecules or macromolecular assemblies that are too large for treatment using conventional MD simulation. It has been shown that the low-frequency normal modes and the first few principal components calculated from a MD simulation overlap considerably, meaning that NMA can plausibly substitute for MD for large systems when only the overall collective motions of the system are needed [42–44]. Below, we will examine the application of NMA to experimentally obtained protein structures as well as to protein HM studies.

The ryanodine receptors (RyRs) are the largest presently known ion channels, with molecular weights of 2.2 MDa. They are homotetramers embedded in the membrane of the sarcoplasmic reticulum of myocytes, where they play a key role in excitation-contraction coupling. They regulate Ca<sup>2</sup><sup>þ</sup> release from the sarcoplasmic reticulum by undergoing a closed-to-open gating transition in response to an action potential or calcium binding. RyRs are found in all animals. Three isoforms have been identified in mammals: RyR1 (predominantly expressed in skeletal muscle), RyR2 (cardiac muscle) and RyR3 (present in several tissues including the brain, diaphragm, and testes) [45–47]. RyR malfunction leads to severe muscular disorders, including malignant hyperthermia, central core disease, tachycardia, dysplasia, and others [45]. The first high-resolution cryo-EM structures of the complete rabbit skeletal RyR were reported in 2015 [48–50], and were followed by a number of other high-resolution structures of the skeletal and cardiac isoforms in both their open and closed conformations, either alone or bound to regulators (For review see

A number of MD simulation studies have been reported for both the skeletal and cardiac RyR isoforms, but only covering small parts of the whole channel; in particular, the N-terminal domain (roughly the first 600 amino acids) [52–55] and parts of the channel domain [56–58] were studied in this way. Generally, these studies focused on identifying how known disease-causing mutations affected the

*<sup>β</sup> HβA<sup>β</sup>* ¼ Λ*β*, where *A<sup>β</sup>* is

projected Hessian matrix *Hβ*, which is diagonalized with *AT*

lowest-frequency approximate normal modes.

*DOI: http://dx.doi.org/10.5772/intechopen.94139*

**3.1 Application to experimental structures**

**3. Applications**

*3.1.1 Ryanodine receptor*

Bauerová-Hlinková *et al.* [51]).

dynamics of these fragments.

**17**

where *dij* is the distance between atoms or nodes *i* and *j*, *d*<sup>0</sup> *ij* is the distance in the initial structure, and *C* is a spring constant assumed to be the same for each *i*–*j* pair. It should be noted that by design the input configuration is assumed to be a minimum energy one, and energy minimization against a potential is therefore unnecessary. *Rc* in this equation refers to a cut-off radius and the sum is only over all pairs less than this value. *Rc* is somewhat arbitrary, but in practice, values of between 7.0–8.0 Å are used based on the observed distances between non-bonded atoms in experimental structures [34, 35]. Most frequently, only the C*<sup>α</sup>* atoms are used for these calculations because they are sufficient for studying the backbone motions of the protein and are all that is necessary for characterizing the lowest-frequency normal modes.

A number of different ENM formulations have been developed. The simplest one is the Gaussian network model (GNM) developed by Bahar and co-workers [36]. The GNM replaces the 3*N* � 3*N* Hessian matrix with an *N* � *N* Kirchoff matrix (Γ). Γ is defined in terms of spring constants *γij*, which are created based on the assumption that the separation distance ∣*Ri* � *Rj*∣ ¼ *Rij* between the *i*th and *j*th C*<sup>α</sup>* atoms in the protein follows a Gaussian distribution. The potential is given by

$$W\_{\rm GNMM} = \frac{1}{2} \sum\_{\vec{\imath}\vec{\jmath}} \chi\_{\vec{\imath}\vec{\jmath}} \left(\Delta \overleftarrow{R\_{\vec{\imath}\vec{\jmath}}}\right)^2 \tag{8}$$

where Δ *Rij* � is a vector expressing the fluctuations in distance between the *i*th and *j*th C*<sup>α</sup>* atoms. The model assumes that these fluctuations are isotropic; consequently, no information about the three-dimensional directions of motion can be obtained. Eigenvalue decomposition of Γ does allow the contribution of individual modes to the equilibrium dynamics to be calculated, as well as the relative displacement of residues along each mode axis, the cross-correlation between the residues in the individual modes, and square displacement profiles.

Some form of the anisotropic network model (ANM) is perhaps the most commonly encountered ENM. This is the form originally suggested by Tirion [33] and incorporated into the Molecular Modeling Toolkit (MMTK) by Hinsen [37], and widely used in a number of other tools. The ANM gives the same information as the GNM, but also provides information on the directionality of the fluctuations. On the other hand, the mean-square fluctuations (*B*-factors) and cross-correlations it produces do not agree quite as well with experiment as GNM [38, 39]. In compensation, ANMs can be used to generate alternative conformations in the close neighborhood of the starting structure by deforming the structure along the lowest frequency modes [9]. Two groups led by Zheng [35] and Lin and Song [40] developed models which combined the best features of GNM and ANM into a single method.

While coarse-graining does allow ENMs to be scaled to very large models, it does so by losing detailed information on local structural movements. The rotating*Normal Mode Analysis: A Tool for Better Understanding Protein Flexibility and Dynamics… DOI: http://dx.doi.org/10.5772/intechopen.94139*

translating blocks (RTB) model of Sanejouand [41] was constructed to alleviate this. In this approach, the protein or other macromolecule is divided into *n<sup>β</sup>* blocks made up of one or a few residues connected by elastic springs. Next, it is assumed that a good approximation to the low-frequency normal modes can be made by forming linear combinations of the local rotations and translations of these individual blocks. Consequently, a 3*N* � 6*n<sup>β</sup>* projection matrix *P* is constructed and used to build a projected Hessian matrix *Hβ*, which is diagonalized with *AT <sup>β</sup> HβA<sup>β</sup>* ¼ Λ*β*, where *A<sup>β</sup>* is the eigenvector matrix diagonalizing *H<sup>β</sup>* and Λ*<sup>β</sup>* is the corresponding eigenvalue matrix. The resulting eigenvectors can be projected back into the full 3*N*-dimensional space using *AP* ¼ *PAβ*, where *AP* is a 3*N* � 6*n<sup>β</sup>* matrix containing the 6*n<sup>β</sup>* lowest-frequency approximate normal modes.
