*2.1.2. RMSD clustering to extract receptor ensembles from an all-atom MD simulation*

The holo system with oseltamivir removed was used for RMSD clustering and the docking experiments. Clustering analyses were performed on 20 ns MD trajectories using the g\_cluster tool in the Gromacs package. In brief, snapshots at every 10 ps over the 20 ns simulation were recorded. 2000 resulting structures were superimposed, using all Cα atoms to remove possible rotational and translational movements of the whole system. We have visually verified that the binding-site residues of avian H5N1 neuraminidase, which cover the active site and are responsible for interaction with putative inhibitors, can also be used for the swine flu H1N1pdm neuraminidase. The four most populated clusters are shown in Figure 2. The RMSD clustering analysis was performed on this subset (117-119, 133-138, 146-152, 156, 179, 180, 196-200, 223-228, 243-247, 277, 278, 293, 295, 344-347, 368, 401, 402, and 426-441) using allatoms (including side chains and hydrogen atoms) with the cut-off of 1.5 Å. A total of 13

representative clusters were obtained, which account for 96.2% of the configuration space from the 20 ns MD trajectories. These 13 clusters were used in the docking experiments.

Incorporating Molecular Dynamics Simulations

(1)

(2)

into Rational Drug Design: A Case Study on Influenza a Neuraminidases 163

In the docking experiments, the 13 most representative configurations were used as receptors. AutoDockTools 1.5.2 was used to add polar hydrogens, assign Gasteiger charges11 and create grid binding boxes. The volume of each grid box was 72 x 72 x 72, with the default 0.375 Å spacing. The binding box was positioned to encompass all three possible binding sites, namely the SA, 150 and 430 cavities (Cheng et al., 2008). AutoGrid version 4.2.1 was used to calculate the binding affinities using the following atom types: A (aromatic carbon), C, N, NA (hydrogen bond accepting N), OA (hydrogen bond accepting O), P, S, SA (hydrogen bond accepting S), Cl, HD (polar hydrogen) and e (electrostatics). AutoDockTools version 1.5.2 was also used to merge nonpolar hydrogens, add Gasteiger charges and visually set up rotatable bonds for each ligand via AutoTors. For screening larger ligand libraries, Autodock Vina (Trott et al., 2010), which run more efficiently than this version, is highly suggested. The Lamarckian genetic algorithm was used to do the docking experiments using AutoDock 4.2.1.16 Docking parameters were chosen to reproduce structures of 13 corresponding oseltamivir–neuraminidase complexes in the MD simulation. Other parameters are as follows: trials of 100 dockings, population size of 200, random starting position and conformation, translation step range of 2.0 Å, rotation step range of 50 degrees, maximum number of generations of 27000, elitism of 1, mutation rate of 2%, crossover rate of 80%, local search rate of 6%, 8 million energy evaluations, unbound model was "same as bound", and docked conformations were clustered with the tolerance of 2.0 Å RMSD. Docking results were sorted by the lowest binding energy of the most populated cluster in cases of convergence. In the case of no dominant cluster, docking results were visually analyzed using VMD (Humphrey et al., 1996) to choose the best binding pose. Hydrogen bond analysis

utilized a distance and angle cut-off of 3.5 Å and 45 degrees, respectively.

The arithmetic means were calculated directly from the binding energies

The appearance frequency (AF) of important hydrogen bonds is calculated as follows:

*AF =*

13

1 13

Where i is the index number of each ensemble; ni the size of each ensemble; and hi = 1 if hydrogen bond exists, and 0 otherwise. After the dockings, statistical calculations were performed to obtain the final binding energies for compound ranking, using the arithmetic mean (AM) and harmonic mean (HM) binding energies as defined in the previous study.

13

1 13

*AM =*

Where Ei is the binding energy of each ensemble with the standard deviation

*i i i=*

*n E*

*i i=*

*n*

1

*i i i=*

*n h*

*i i=*

*n*

1

**2.3. Molecular docking** 

**Figure 2.** 13 representative ensembles resulting from clustering analysis, accounting for 96.2% of the configuration space, are ordered by the corresponding simulation time. The four largest cluster ensembles account for over 66% of the configuration space from the 20 ns MD simulation. The most dominant cluster is colored in orange, the second one in silver, the third one in blue and the fourth one in cyan.
