**2. Methodology**

### **2.1 Sequence analysis and homology modeling of tubulin isotypes**

In biological research, sequence analysis plays a major role as it has wide range of applications such as, (i) whole genome sequencing and annotation, (ii) identification of functional elements in the sequence, (iii) gene prediction, (iv) comparative genomics, (v) protein classification, (vi) protein and RNA structure prediction, (vii) evolutionary studies, etc. Protein sequences reveal the evolutionary history and hence, the events occurred during evolutions can be traced from the protein sequences.

The structure 6CVN.pdb is used as template structure for homology modeling of neuronal specific human tubulin isotypes namely βI, βIIb and βIII tubulin (uniprot IDs Q9H4B7, Q9BVA1 and Q136509). The multiple sequence alignment of these sequence was performed using 'clustal omega' tool [37]. The multiple sequence alignment reveals that residue variations is mainly at the C-terminal tail when compared to the other regions of the protein. The high-resolution cryo-EM structure of β/α/β-tubulin bound with TauR2 was recently deposited in the RCSB structural databases [8] was used as a template to build βI/α/βI-TauR2, βIIb/α/βIIb-TauR2 and βIII/α/βIII-TauR2 complexes. The structure for C-terminal tail was absent in the template structure (6CVN.pdb), therefore the tail region was modeled using the Modeler 9v20. Hereafter, template structure with modeled C-terminal tail region would be referred as 6CVN\* in the further discussion.

Template based homology models for neuronal specific tubulin isotypes βI, βIIb, βIII was build using Modeler 9v20 [38]. The least discrete optimized potential energy (DOPE) score model was selected for further use. The stereo-chemical properties of these modeled subunits were evaluated and validated using the GMQE score [39], verify3D [40], ERRAT score [41] and Ramachandran plot through PROCHECK [42]. The selected subunit models were further used to 6CVN\*-TauR2, build βI/α/ βI-TauR2, βIIb/α/βIIb-TauR2 and βIII/α/βIII-TauR2. These complexes namely 6CVN-TauR2, 6CVN\*-TauR2, βI/α/βI-TauR2, βIIb/α/βIIb-TauR2 and βIII/α/βIII-TauR2 were further subjected for energy minimization to get their least energy state. Here we used Steepest Descent and Conjugate Gradient methods in Gromacs 2018.1 software for minimization [43]. The process of energy minimization is a numerical procedure aimed to find a minimum on the potential energy surface (PES) of the newly modeled conformation which mostly exists at a higher energy level. These minimized models were used as a starting input structures for molecular dynamics simulations to understand the binding mode and binding affinity of TauR2 towards neuronal specific tubulin isotypes βI, βII and βIII.

#### **2.2 Molecular dynamics simulations of tubulin-TauR2 complexes**

Molecular dynamics (MD) simulation plays a key role in exploring the structure and function of biological macromolecules [44]. In MD simulations, the dynamic behavior of the molecule is studied as a function of time. Molecular dynamics is being routinely used to address various biological problems such as biomolecular interactions (Protein–protein, protein-DNA/RNA), molecular pathways, Drugreceptor interactions, dynamics of protein folding, protein aggregations, protein

**73**

**Table 1.**

**S. no.**

*Homology Modeling of Tubulin Isotypes to Investigate MT-Tau Interactions*

structure prediction, etc. Tremendous development in high performance computing and simplicity of the basic MD algorithm has shortened the time required to perform molecular dynamics simulation and hence, studying larger systems became

All atom molecular dynamics (MD) simulation was performed in explicit solvent

The primary sequence of a protein is a linear chain of amino acids linked by peptide bonds. There is a direct link between the protein sequence, structure and function. The secondary structure of a protein is comprised of coils, α-helices, β-sheets,

> <sup>=</sup> <sup>=</sup> <sup>∑</sup> is the distance, 'N' is number of atoms

<sup>=</sup> <sup>−</sup> <sup>=</sup> <sup>∑</sup> is atom position at time , reference

<sup>=</sup> <sup>−</sup> <sup>=</sup> ∑ centre of mass, is the distance

position is *xi*

at time from their centre of mass

**Parameter Equation Component**

2

2

1 *N <sup>i</sup> <sup>i</sup> d*

1 *t ij i <sup>j</sup> xt x RMSF <sup>t</sup>*

> 1 *N i i com*

*x x <sup>R</sup> <sup>N</sup>*

*RMSD <sup>N</sup>*

on the modeled tubulin-TauR2 complexes (i.e. 6CVN-TauR2, 6CVN\*-tau, βI/α/ βI-TauR2, βIIb/α/βIIb-TauR2 and βIII/α/βIII-TauR2) using GROMACS 2018.1 [43, 46]. The 'Amber99SB-ILDN' force field [47] was chosen for the MD simulation because it is well customized to handle the parameters for GTP,GDP and MG atoms which are the functional players of all the modeled tubulin-TauR2 complexes. The force field parameters for the GDP and GTP molecules were retrieved from the amber parameter database [48, 49]. The '*xleap*' module of AmberTools was used to generate topology files and initial starting coordinates for all the complexes [50]. All the modeled tau-tubulin complexes were placed at the centre of a cubic shaped solvation box having dimension of 15 Å from the extent of the molecule and TIP3P water model was used for solvation. All the systems were neutralized by adding appropriate number of required counterions. The topology files generated using xleap module of AmberTools were converted to Gromacs compatible topologies with ParmEd tool [51]. Energy minimization was carried out in two steps, In the first step, steepest descent algorithm was used for 50,000 followed by the conjugate gradient [46]. The energy minimized models were equilibrated using canonical ensemble (NVT) followed by isothermal-isobaric ensemble (NPT) for 1 ns. In the NVT equilibration systems were heated to 300 K using V-rescale, a modified Berendsen thermostat [46]. These heated systems were further equilibrated using the Parrinello-Rahman barostat to maintain constant pressure of 1 bar. The production MD simulations were performed for 100 ns without restraining any atoms over all the tubulin-TauR2 complexes using parameters discussed in earlier study [52]. The PME method was used to treat long range electrostatic interactions [53, 54] and covalent bonds involving H-atoms were constrained by using 'LINCS' algorithm [55]. The 2 fs time step was set for integrating the newtonian equation during the MD simulation. Similar protocol was adopted to perform MD simulation on three additional systems (i) 6CVN\* (without tau), (ii) free tau and (iii) 6CVN\*-polyA (as negative control) having 27 amino acids residues. All the MD simulation trajectories were further analyzed by using the GROMACS 2018.1 inbuilt tools [43, 46]. The general parameters explaining the conformational stability such as Root Mean Square Deviation (RMSD), Root Mean Square Fluctuations (RMSF) and Radius of Gyration (Rg) were measured, and the equations used for calculation of

*DOI: http://dx.doi.org/10.5772/intechopen.95792*

these parameters are tabulated in the **Table 1**.

1 RMSD <sup>2</sup>

<sup>2</sup> RMSF ( ( ) )

*g*

3 Rg ( )

*Equations used to calculate RMSD, RMSF and Rg.*

an easier task [45].

*Homology Modeling of Tubulin Isotypes to Investigate MT-Tau Interactions DOI: http://dx.doi.org/10.5772/intechopen.95792*

*Homology Molecular Modeling - Perspectives and Applications*

would be referred as 6CVN\* in the further discussion.

specific tubulin isotypes βI, βII and βIII.

**2.2 Molecular dynamics simulations of tubulin-TauR2 complexes**

Molecular dynamics (MD) simulation plays a key role in exploring the structure and function of biological macromolecules [44]. In MD simulations, the dynamic behavior of the molecule is studied as a function of time. Molecular dynamics is being routinely used to address various biological problems such as biomolecular interactions (Protein–protein, protein-DNA/RNA), molecular pathways, Drugreceptor interactions, dynamics of protein folding, protein aggregations, protein

**2. Methodology**

sequences.

isotypes namely βI, βIIb, and βIII using molecular modeling [36].

**2.1 Sequence analysis and homology modeling of tubulin isotypes**

expressed in different types of cells is completely unknown. Therefore, we studied relative binding affinity of Tau repeat region R2 with neuronal specific β-tubulin

In biological research, sequence analysis plays a major role as it has wide range of applications such as, (i) whole genome sequencing and annotation, (ii) identification of functional elements in the sequence, (iii) gene prediction, (iv) comparative genomics, (v) protein classification, (vi) protein and RNA structure prediction, (vii) evolutionary studies, etc. Protein sequences reveal the evolutionary history and hence, the events occurred during evolutions can be traced from the protein

The structure 6CVN.pdb is used as template structure for homology modeling of neuronal specific human tubulin isotypes namely βI, βIIb and βIII tubulin (uniprot IDs Q9H4B7, Q9BVA1 and Q136509). The multiple sequence alignment of these sequence was performed using 'clustal omega' tool [37]. The multiple sequence alignment reveals that residue variations is mainly at the C-terminal tail when compared to the other regions of the protein. The high-resolution cryo-EM structure of β/α/β-tubulin bound with TauR2 was recently deposited in the RCSB structural databases [8] was used as a template to build βI/α/βI-TauR2, βIIb/α/βIIb-TauR2 and βIII/α/βIII-TauR2 complexes. The structure for C-terminal tail was absent in the template structure (6CVN.pdb), therefore the tail region was modeled using the Modeler 9v20. Hereafter, template structure with modeled C-terminal tail region

Template based homology models for neuronal specific tubulin isotypes βI, βIIb, βIII was build using Modeler 9v20 [38]. The least discrete optimized potential energy (DOPE) score model was selected for further use. The stereo-chemical properties of these modeled subunits were evaluated and validated using the GMQE score [39], verify3D [40], ERRAT score [41] and Ramachandran plot through PROCHECK [42]. The selected subunit models were further used to 6CVN\*-TauR2, build βI/α/ βI-TauR2, βIIb/α/βIIb-TauR2 and βIII/α/βIII-TauR2. These complexes namely 6CVN-TauR2, 6CVN\*-TauR2, βI/α/βI-TauR2, βIIb/α/βIIb-TauR2 and βIII/α/βIII-TauR2 were further subjected for energy minimization to get their least energy state. Here we used Steepest Descent and Conjugate Gradient methods in Gromacs 2018.1 software for minimization [43]. The process of energy minimization is a numerical procedure aimed to find a minimum on the potential energy surface (PES) of the newly modeled conformation which mostly exists at a higher energy level. These minimized models were used as a starting input structures for molecular dynamics simulations to understand the binding mode and binding affinity of TauR2 towards neuronal

**72**

structure prediction, etc. Tremendous development in high performance computing and simplicity of the basic MD algorithm has shortened the time required to perform molecular dynamics simulation and hence, studying larger systems became an easier task [45].

All atom molecular dynamics (MD) simulation was performed in explicit solvent on the modeled tubulin-TauR2 complexes (i.e. 6CVN-TauR2, 6CVN\*-tau, βI/α/ βI-TauR2, βIIb/α/βIIb-TauR2 and βIII/α/βIII-TauR2) using GROMACS 2018.1 [43, 46]. The 'Amber99SB-ILDN' force field [47] was chosen for the MD simulation because it is well customized to handle the parameters for GTP,GDP and MG atoms which are the functional players of all the modeled tubulin-TauR2 complexes. The force field parameters for the GDP and GTP molecules were retrieved from the amber parameter database [48, 49]. The '*xleap*' module of AmberTools was used to generate topology files and initial starting coordinates for all the complexes [50]. All the modeled tau-tubulin complexes were placed at the centre of a cubic shaped solvation box having dimension of 15 Å from the extent of the molecule and TIP3P water model was used for solvation. All the systems were neutralized by adding appropriate number of required counterions. The topology files generated using xleap module of AmberTools were converted to Gromacs compatible topologies with ParmEd tool [51]. Energy minimization was carried out in two steps, In the first step, steepest descent algorithm was used for 50,000 followed by the conjugate gradient [46]. The energy minimized models were equilibrated using canonical ensemble (NVT) followed by isothermal-isobaric ensemble (NPT) for 1 ns. In the NVT equilibration systems were heated to 300 K using V-rescale, a modified Berendsen thermostat [46]. These heated systems were further equilibrated using the Parrinello-Rahman barostat to maintain constant pressure of 1 bar. The production MD simulations were performed for 100 ns without restraining any atoms over all the tubulin-TauR2 complexes using parameters discussed in earlier study [52]. The PME method was used to treat long range electrostatic interactions [53, 54] and covalent bonds involving H-atoms were constrained by using 'LINCS' algorithm [55]. The 2 fs time step was set for integrating the newtonian equation during the MD simulation. Similar protocol was adopted to perform MD simulation on three additional systems (i) 6CVN\* (without tau), (ii) free tau and (iii) 6CVN\*-polyA (as negative control) having 27 amino acids residues. All the MD simulation trajectories were further analyzed by using the GROMACS 2018.1 inbuilt tools [43, 46]. The general parameters explaining the conformational stability such as Root Mean Square Deviation (RMSD), Root Mean Square Fluctuations (RMSF) and Radius of Gyration (Rg) were measured, and the equations used for calculation of these parameters are tabulated in the **Table 1**.

The primary sequence of a protein is a linear chain of amino acids linked by peptide bonds. There is a direct link between the protein sequence, structure and function. The secondary structure of a protein is comprised of coils, α-helices, β-sheets,


**Table 1.**

*Equations used to calculate RMSD, RMSF and Rg.*

β-bridge, bend, turn, coils, π-helix and 310-helices. DSSP is an algorithm developed by Kabsch and Sander to extract the secondary structural features based on atomic coordinates [56]. The overall stability of the structure is highly determined by the stable dynamics of these secondary structures and any significant changes in secondary structure attributes to the structural flexibility/fold as well as functional diversity of the protein. Hence, conformational changes in the secondary structure during MD simulation were analyzed using the DSSP programme [56]. The simulation movies over the entire trajectories were generated using the VMD software [57] and publication quality images were generated using the Biovia Discovery studio visualizer [58] and Chimera software [59].

#### **2.3 Calculations of contact surface area (CSA) for tubulin-TauR2 complexes**

Solvent Accessible Surface Area (SASA) is used to represent the degree of hydration of a biomolecule. SASA also be especially useful to quantify the stability of the biomolecular complexes in the aqueous medium. The C-terminal tail of the tubulin subunits is highly dynamic in nature and has no definite secondary structure, hence it affects the overall hydrophobic SASA. Therefore, interface of the MT (in this case tubulin trimer made up of β/α/β subunits) where TauR2 binds at the exterior surface has been selected for the calculating the precise CSA. The in-built gromacs tool *"gmx sasa*" [60] was used to calculate the SASA. In addition, SASA is also calculated for the tubulin subunits and the TauR2.

### **2.4 Binding affinity of tauR2 towards different neuronal specific tubulin isotypes**

The biomolecular recognition pattern mainly depends on the binding ability of the interacting biomolecules. The binding affinity as well as the energy between the two interacting molecules can be calculated using various theoretical approaches like (i) Pathway methods such as Thermodynamic integration (TI) as well as Free energy perturbation (FEP) and (ii) End point methods such as Molecular Mechanics Poission-Boltzman Surface Area (MM/PBSA) and Molecular Mechanics Generalized Born Surface Area (MM/GBSA) [61]. In the present study, MM/PBSA approach was used to calculate relative binding energies of the simulated molecules. This MMPBSA approach is very popular, computationally less expensive, and has better accuracy even for the larger systems [62].

Here, the binding affinity between different neuronal specific tubulin isotypes and TauR2 was estimated by performing relative binding energy calculation similar to earlier studies [63–65]. The stable trajectory observed in between 70 ns to 100 ns was chosen to perform the binding energy calculations for all the tubulin-TauR2 complexes. The 'g\_mmpbsa' tool v1.6 was used to perform binding energy calculation using MM/PBSA approach [66]. The parameters for the binding energy calculations were chosen from the earlier similar studies [52, 65, 67–69]. In the MMPBSA methods binding energy (ΔGbind) of tubulin and TauR2 was calculated by using the following Eq. (1),

$$
\Delta G\_{bind} = \Delta G\_{nubuliu - TauR2} - \left(\Delta G\_{nubulin} + \Delta G\_{TauR2}\right) \tag{1}
$$

**75**

*Homology Modeling of Tubulin Isotypes to Investigate MT-Tau Interactions*

omitted as similar to previous studies [21, 22, 70–72].

computationally expensive for a larger biomolecular complexes and hence it is

tive binding affinity between neuronal specific tubulin isotypes and TauR2.

were used as starting structures to perform MD simulations.

**3.2 Structural stability of the tubulin-TauR2 complexes**

well converged during the simulation period of 100 ns (**Figure 3**).

The all atom MD simulations were performed on tubulin-TauR2 complexes namely 6CVN-TauR2, 6CVN\*-TauR2, βI/α/βI-TauR2, βIIb/α/βIIb-TauR2, βIII/α/βIII-TauR2 using Gromacs 2018.1 [78]. The stability of these tubulin-TauR2 complexes is accessed by plotting the potential energy during the simulation period, which highlight that all the complexes are well minimized, and simulation trajectories are

The parameters describing the stability of tau-tubulin complex such as RMSD

(root mean square deviation), RMSF (root mean square fluctuation), and R*g* (radius of gyration) was studied. The RMSD values for simulated tubulin-TauR2 complexes, tauR2 and backbone atoms of tubulin trimer without considering

**3.1 Sequence analysis and homology modeling of neuronal specific tubulin** 

The residue composition of different β-tubulin isotypes mostly varies at the carboxy-terminal tail region as revealed by the multiple sequence alignment (**Figure 2**). The βI and βIII tubulin isotypes have longer C-terminal tail regions when compared with the βIIb tubulin isotype. The β-tubulin sequence in the template structure i.e., 6CVN (chain A) and human βIIb tubulin isotypes show 98.65% sequence identity. These sequence variations in the tubulin isotypes are reported to regulate number of protofilaments in the MT and their stability [73]. These β-tubulin isotypes sequences were used to generate three-dimensional homology models using 6CVN as the template. The structures of βI, βIIb, βIII tubulin isotypes were modeled using Modeler 9v20 [38]. The best homology model generated is selected using DOPE score. The DOPE score value for A and C chain of (i) βI subunits are −54299.89 and − 54291.24 (ii) βIIb subunits are −53487.13 and − 53054.42 and (iii) βIII subunits are −53725.86 and − 53054.42. The quality of these models is accessed using GMQE score [74], Verify3D [40], Errat score [41], Z-score [75] and Ramachandran plot [76, 77]. The parameters describing the overall quality of the modeled neuronal specific β-tubulin subunits are shorn in **Table 2**. The GMQE score provides an estimate of the accuracy of the modeled tertiary structure of neuronal specific β-tubulins. Here, GMQE score for all the modeled β-subunits is 0.98, which represents the accuracy of the generated model. Further, verify3D and ERRAT score also validates the quality of the generated models (**Table 1**). Ramachandran plots for all the modeled β-tubulin isotypes represents more than 98% of the residues occupy a favored region. The occupancy of amino acid residues in the Ramachandran plot is given in **Table 3**. These modeled structures of neuronal specific β-tubulin isotypes were used further to build the tubulin and TauR2 complexes such as βI/α/βI-TauR2, βIIb/α/βIIb-TauR2 and βIII/α/βIII-TauR2 using 6CVN.pdb as a template structure. These modeled complexes

In this chapter we employ sequence analysis, homology modeling, MD simulations, and binding energy calculation to (i) gain structural insights to the detailed binding mode, (ii) study atomic level tubulin isotypes-tauR2 interactions and (iii) study rela-

*DOI: http://dx.doi.org/10.5772/intechopen.95792*

**3. Results and discussion**

**isotypes**

Where, the ∆*Gtubulin TauR* <sup>−</sup> <sup>2</sup> , ∆*Gtubulin* and ∆*GTauR*2 represents the average free energies of the complex (tubulin-TauR2), receptor (tubulin) and ligand (TauR2), respectively. The calculation of the entropic contribution in binding energy is

computationally expensive for a larger biomolecular complexes and hence it is omitted as similar to previous studies [21, 22, 70–72].
