*2.2.2. New computational method - computational method of simulated annealing evolutionary computations*

The computational methods used to build the new MM-Models will be simulated annealing evolutionary computations (SAECs), where SAECs were got from the hybrid algorithms of (Abbass et al., 2003) by simply replacing the DG method by the SA algorithm of (Bagirov and Zhang, 2003) and numerical computational results show that SAECs can successfully pass the test of more than 40 well-known benchmark global optimization problems (Zhang, 2011c).

#### *2.2.3. New MM-models*

The atomic-resolution X-ray structure of 3NHD.pdb is a steric zipper, with strong vdw interactions between β-sheets and HBs to maintain the β-strands (Fig. 2).

By observations of the 3rd column of coordinates of 3NHD.pdb and Fig. 2, G(H) chains (i.e. β-sheet 2) of 3NHD.pdb can be calculated from A(B) chains (i.e. β-sheet 1) by Eq. 1 (where T is Transpose of column vector) and other chains can be calculated by Eqs. 2~3:

$$\mathbf{G(H) = ((-1, 0, 0)r, (0, 1, 0)r, (0, 0, -1)r)}\text{ A(B) + (-20.5865, 9.48, 0.0)r}\tag{1}$$

$$\mathbf{K(L) = G(H) + (0.0, 0.0, 9.59)^\gamma, I(f) = G(H) + (0.0, 0.0, -9.59)^\gamma} \tag{2}$$

$$\mathbf{C(D) = A(B) + (0.0, 0., 9.59)^{\mathrm{r}}}, \mathrm{E(F) = A(B) + (0.0, 0.0, -9.59)^{\mathrm{r}}}.\tag{3}$$

Basing on the template 3NHD.pdb from the Protein Data Bank, three prion AGAAAAGA palindrome amyloid fibril models - an AGAAAA model (Model 1), a GAAAAG model

Computational Potential Energy Minimization Studies on the Prion AGAAAAGA Amyloid Fibril Molecular Structures 301

300 Recent Advances in Crystallography

2011a).

2011c).

*2.2.1 New material* 

*evolutionary computations*

*2.2.3. New MM-models* 

SA-SDCG methods as in (Zhang, 2011a).

Gradient (DG (Bagirov et al., 2008)) method. Then the Models were optimized using SDCG-

Zhang et al. (2011, 2011d) used 3NHC.pdb, 3NVF/G/H/E.pdb templates to build several MM-Models (Figs. 9~11 in (Zhang et al., 2011b), and Figs. 5~8 in (Zhang, 2011b)). These Models were built in the use of canonical dual global optimization theory (Gao et al., 2012; Gao and Wu, 2012; Gao, 2000) and then refined by SDCG-SA-SDCG methods as in (Zhang,

This Chapter uses a suitable pdb file template 3NHD.pdb (the GYVLGS segment 127-132 from human prion with V129 (Apostol et al., 2010) from the Protein Data Bank to build MM-

The computational methods used to build the new MM-Models will be simulated annealing evolutionary computations (SAECs), where SAECs were got from the hybrid algorithms of (Abbass et al., 2003) by simply replacing the DG method by the SA algorithm of (Bagirov and Zhang, 2003) and numerical computational results show that SAECs can successfully pass the test of more than 40 well-known benchmark global optimization problems (Zhang,

The atomic-resolution X-ray structure of 3NHD.pdb is a steric zipper, with strong vdw

By observations of the 3rd column of coordinates of 3NHD.pdb and Fig. 2, G(H) chains (i.e. β-sheet 2) of 3NHD.pdb can be calculated from A(B) chains (i.e. β-sheet 1) by Eq. 1 (where T

Basing on the template 3NHD.pdb from the Protein Data Bank, three prion AGAAAAGA palindrome amyloid fibril models - an AGAAAA model (Model 1), a GAAAAG model

G(H) = ((-1, 0, 0)T, (0, 1, 0)T, (0, 0, -1)T) A(B) + (-20.5865, 9.48, 0.0)T, (1)

K(L) = G(H) + (0.0, 0.0, 9.59)T, I(J) = G(H) + (0.0, 0.0, -9.59)T, (2)

C(D) = A(B) + (0.0, 0., 9.59)T, E(F) = A(B) + (0.0, 0.0, -9.59)T. (3)

interactions between β-sheets and HBs to maintain the β-strands (Fig. 2).

is Transpose of column vector) and other chains can be calculated by Eqs. 2~3:

*2.2.2. New computational method - computational method of simulated annealing* 

*2.1.3. Computational method of canonical dual global optimization theory* 

**2.2. New material and method, and new MM-models** 

models of AGAAAAGA amyloid fibrils for prions.

**Figure 2.** Protein fibril structure of human V129 prion GYVLGS (127–132) (PDB ID: 3NHD). The dashed lines denote the hydrogen bonds. A, B ... K, L denote the chains of the fibril.

(Model 2), and an AAAAGA model (Model 3) - will be successfully constructed in this Chapter. Because the template is a segment of 6 residues, the three shorter prion fragments are selected. This Chapter does not perform calculations on the full AGAAAAGA. Chains AB of Models 1~3 were respectively got from AB chains of 3NHD.pdb using the mutate module of the free package Swiss-PdbViewer (SPDBV Version 4.01) (http://spdbv.vital-it.ch). It is pleasant to see that almost all the hydrogen bonds are still kept after the mutations; thus we just need to consider the vdw contacts only. Making mutations for GH chains of 3NHD.pdb, we can get the GH chains of Models 1~3. However, the vdw contacts between Chain A and Chain G, between B chain and H chain are too far at this moment (Figs. 3~5).

**Figure 3.** At initial state, the vdw contacts between AB chains (β-sheet 1) and GH chains (β-sheet 2) of Model 1 are very far.

Seeing Figs. 3~5, we may know that for Model 1 at least 3 vdw interactions B6.ALA.CB-H3.ALA.CB-B4.ALA.CB-H5.ALA.CB should be maintained (their distances in Fig. 3 are 7.82, 8.36, 9.04 angstroms respectively), for Model 2 at least 3 vdw interactions G4.ALA.CB-

**Figure 4.** At initial state, the vdw contacts between AB chains (β-sheet 1) and GH chains (β-sheet 2) of Model 2 are very far.

**Figure 5.** At initial state, the vdw contacts between AB chains (β-sheet 1) and GH chains (β-sheet 2) of Model 3 are very far.

A3.ALA.CB-G2.ALA.CB-A5.ALA.CB should be maintained (their distances in Fig. 4 are 7.16, 7.43, 9.31 angstroms respectively), and for Model 3 at least 3 vdw interactions A1.ALA.CB-G4.ALA.CB-A3.ALA.CB-G2.ALA.CB should be maintained (their distances in Fig. 5 are 3.45, 7.16, 7.43 angstroms respectively). For Model 1, fixing the coordinates of B6.ALA.CB and B4.ALA.CB, letting the coordinates of H3.ALA.CB and H5.ALA.CB be variables, we may get a simple Lennard-Jones (LJ) potential energy minimization problem just with 6 variables (see Eq. 9). Similarly, for Model 2 fixing the coordinates of A3.ALA.CB and A5.ALA.CB, letting the coordinates of G4.ALA.CB and G2.ALA.CB be variables, we may get a simple LJ potential energy minimization problem just with 6 variables (see Eq. 10); for Model 3, fixing the coordinates of A1.ALA.CB and A3.ALA.CB, letting the coordinates of G4.ALA.CB and G2.ALA.CB be variables, we may get a simple LJ potential energy minimization problem with 6 variables (see Eq. 11).

The vdw contacts of atoms are described by the LJ potential energy:

$$\mathbf{V}\_{\rm t}(\mathbf{r}) = 4\varepsilon \left[ (\boldsymbol{\sigma}/\mathbf{r})^{\rm z} - (\boldsymbol{\sigma}/\mathbf{r})^{\rm s} \right] \tag{4}$$

where ε is the depth of the potential well and σ is the atom diameter; these parameters can be fitted to reproduce experimental data or deduced from results of accurate quantum chemistry calculations. The (σ/r)12 term describes repulsion and the -(σ/r)6 term describes attraction. If we introduce the coordinates of the atoms whose number is denoted by N and let ε=σ= 1 be the reduced units, the Eq. 4 becomes into

Computational Potential Energy Minimization Studies on the Prion AGAAAAGA Amyloid Fibril Molecular Structures 303

$$\mathbf{f}(\mathbf{x}) = \mathbf{4} \sum\_{\mathbf{i} \sim \mathbf{1}^{\mathsf{N}}} \sum\_{\mathbf{j} \sim \mathbf{1}^{\mathsf{N}}} \mathbf{1} \left( \mathbf{1}/\mathsf{t}\mathbf{i}^{6} - \mathbf{1}/\mathsf{t}\mathbf{i}^{3} \right),\tag{5}$$

where tij = (x3i−2 – x3j−2)2 + (x3i−1 – x3j−1)2 + (x3i – x3j)2, (x3i−2, x3i−1, x3i) is the coordinates of atom i, N≥2. The minimization of LJ potential f(x) on Rn (where n = 3N) is an optimization problem:

$$\min \text{ f(x)} \text{ subject to} \ge \text{€} \to \text{ $\mathbb{R}^{\mathbb{N}}$ .} \tag{6}$$

Similarly as Eq. 4, i.e. the potential energy for the vdw interactions between β-sheets:

$$\mathbf{V}\iota(\mathbf{r}) = \mathbf{A}/\mathbf{r}^{\text{r}2} - \mathbf{B}/\mathbf{r}^{\epsilon},\tag{7}$$

the potential energy for the HBs between the β-strands has the formula

302 Recent Advances in Crystallography

Model 2 are very far.

Model 3 are very far.

**Figure 4.** At initial state, the vdw contacts between AB chains (β-sheet 1) and GH chains (β-sheet 2) of

**Figure 5.** At initial state, the vdw contacts between AB chains (β-sheet 1) and GH chains (β-sheet 2) of

A3.ALA.CB-G2.ALA.CB-A5.ALA.CB should be maintained (their distances in Fig. 4 are 7.16, 7.43, 9.31 angstroms respectively), and for Model 3 at least 3 vdw interactions A1.ALA.CB-G4.ALA.CB-A3.ALA.CB-G2.ALA.CB should be maintained (their distances in Fig. 5 are 3.45, 7.16, 7.43 angstroms respectively). For Model 1, fixing the coordinates of B6.ALA.CB and B4.ALA.CB, letting the coordinates of H3.ALA.CB and H5.ALA.CB be variables, we may get a simple Lennard-Jones (LJ) potential energy minimization problem just with 6 variables (see Eq. 9). Similarly, for Model 2 fixing the coordinates of A3.ALA.CB and A5.ALA.CB, letting the coordinates of G4.ALA.CB and G2.ALA.CB be variables, we may get a simple LJ potential energy minimization problem just with 6 variables (see Eq. 10); for Model 3, fixing the coordinates of A1.ALA.CB and A3.ALA.CB, letting the coordinates of G4.ALA.CB and G2.ALA.CB be variables, we may get a simple LJ potential

VLJ(r) = 4ε [(σ/r)12 - (σ/r)6], (4)

where ε is the depth of the potential well and σ is the atom diameter; these parameters can be fitted to reproduce experimental data or deduced from results of accurate quantum chemistry calculations. The (σ/r)12 term describes repulsion and the -(σ/r)6 term describes attraction. If we introduce the coordinates of the atoms whose number is denoted by N and

energy minimization problem with 6 variables (see Eq. 11).

let ε=σ= 1 be the reduced units, the Eq. 4 becomes into

The vdw contacts of atoms are described by the LJ potential energy:

$$\mathbf{V}\_{\rm TH}(\mathbf{r}) = \mathbf{C}/\mathbf{r}^{12} - \mathbf{D}/\mathbf{r}^{10},\tag{8}$$

where A, B, C, D are given constants. Thus, the amyloid fibril molecular modeling problem can be deduced into the problem to solve the mathematical optimization problem Eq. 6. Seeing Fig. 6, we may know that the optimization problem Eq. 6 reaches its optimal value at the bottom of the LJ potential well, where the distance between two atoms equals to the sum of vdw radii of the atoms. In this Chapter, the sum of the

**Figure 6.** The Lennard-Jone Potential (Eqs.4 and 7) (This Fig. can be found in website homepage.mac.com/swain/CMC/DDResources/mol\_interactions/molecular\_interactions.html).

vdw radii is the twice of the vdw radius of Carbon atom, i.e. 3.4 angstroms. The optimization problem Eq. 6 is a nonconvex complex optimization problem. By the observation from Fig. 6, we may solve its simple but equal convex-and-smooth least square optimization problem (or the so-called distance geometry problem or sensor network problem) with a slight perturbation if data for three atoms violate the triangle inequality. The following three optimization problems for Models 1~3 respectively are:

$$\begin{array}{ll}\min f(\mathbf{x}) = \mathbb{I}\{ (\mathbf{x}\_{11} + 16.359)^2 + (\mathbf{x}\_{12}, 9.934)^2 + (\mathbf{x}\_{13} + 3.526)^2 - 3.42\}^2 + \mathbb{I}\{ (\mathbf{x}\_{21} + 9.726)^2 + (\mathbf{x}\_{22}, 9.530)^2 \} \\\ (8.530)^2 + (\mathbf{x}\_{23} + 3.613)^2 - 3.42\}^2 + \mathbb{I}\{ (\mathbf{x}\_{11} + 9.726)^2 + (\mathbf{x}\_{12}, 8.530)^2 + (\mathbf{x}\_{13} + 3.613)^2 - 3.42\}^2 \text{with} \\\ \text{initial solution (-12.928, 12.454, 3.034; -6.635, 14.301, 2.628), \end{array} \tag{9}$$

min f(x)= ½{ (x11+8.655)2 + (x12-8.153)2 + (x13-1.770)2 -3.42}2 + ½{ (x21+8.655)2 + (x22- 8.153)2 + (x23-1.770)2 -3.42}2 + ½{ (x21+2.257)2 + (x22-6.095)2 + (x23-3.078)2 -3.42}2 with initial solution (-13.909, 12.227, -0.889; -7.439, 14.419, -2.033), (10)

min f(x)= ½{ (x11+15.632)2 + (x12-9.694)2 + (x13-0.687)2 -3.42}2 + ½{ (x11+8.655)2 + (x12- 8.153)2 + (x13-1.770)2 -3.42}2 + ½{ (x21+8.655)2 + (x22-8.153)2 + (x23-1.770)2 -3.42}2 with initial solution (-13.909, 12.227, -0.889; -7.439, 14.419, -2.033). (11)

We may use any optimization algorithms or packages to easily solve problems Eqs. 9~11 and get their respective global optimal solutions (-13.062, 9.126, -3.336; -12.344, 6.695, -2.457), (-11.275, 6.606, 3.288; -5.461, 7.124, 2.424), (-12.149, 8.924, 1.229; -9.256, 11.007, 3.517), which were got by the SAEC algorithms in this Chapter. Input these global optimal solutions into Eq. 1, take average and tests then we get Eq. 12:

$$\mathbf{G(H) = ((-1, 0, 0)^r, (0, 1, 0)^r, (0, 0, -1)^r)}\text{ A(B) + (-20.2788, -0.0821, 0.5609)^r.} \tag{12}$$

By Eq. 12, we can get close vdw contacts in Figs. 7~9.

**Figure 7.** After LJ potential energy minimization, the vdw contacts of Model 1 become very closer (the distances are illuminated by the overlap of border of CB atoms' surface).

**Figure 8.** Fig. 8: After LJ potential energy minimization, the vdw contacts of Model 2 become very closer (the distances are illuminated by the overlap of border of CB atoms' surface).

From Figs. 3~5 to Figs. 7~9, we may see that the Optimization algorithm works and the computational experiences show us we had better at least define two sensors and two anchors in order to form a zipper between the two β-sheets. Next, in order to remove very close bad contacts, we relax Figs. 7~9 by a slight SDCG-Optimization in the use of Amber 11 (Case et al., 2010) and we get the optimized MM-Models 1~3. The other CDEF and LKJI chains can be got by parallelizing ABGH chains in the use of mathematical Eqs. 2~3. The new amyloid fibril models are useful for the drive to find treatments for prion diseases in the field of medicinal chemistry. The computational algorithms presented in this Chapter and their references therein are useful in materials science, drug design, etc.

**Figure 9.** After LJ potential energy minimization, the vdw contacts of Model 3 become very closer (the distances are illuminated by the overlap of border of CB atoms' surface).

Because Eqs. 9~11 are optimization problems with 6 variables only and these optimization problems are to minimize fourth-order polynomials, the proposed SAEC method and other computational methods can easily get the same optimal solutions to optimize the above three models.

#### *2.2.4. The practical LBFGS quasi-Newtonian method*

304 Recent Advances in Crystallography

Eq. 1, take average and tests then we get Eq. 12:

By Eq. 12, we can get close vdw contacts in Figs. 7~9.

distances are illuminated by the overlap of border of CB atoms' surface).

min f(x)= ½{ (x11+8.655)2 + (x12-8.153)2 + (x13-1.770)2 -3.42}2 + ½{ (x21+8.655)2 + (x22- 8.153)2 + (x23-1.770)2 -3.42}2 + ½{ (x21+2.257)2 + (x22-6.095)2 + (x23-3.078)2 -3.42}2 with initial solution (-13.909, 12.227, -0.889; -7.439, 14.419, -2.033),

min f(x)= ½{ (x11+15.632)2 + (x12-9.694)2 + (x13-0.687)2 -3.42}2 + ½{ (x11+8.655)2 + (x12- 8.153)2 + (x13-1.770)2 -3.42}2 + ½{ (x21+8.655)2 + (x22-8.153)2 + (x23-1.770)2 -3.42}2 with initial solution (-13.909, 12.227, -0.889; -7.439, 14.419, -2.033).

We may use any optimization algorithms or packages to easily solve problems Eqs. 9~11 and get their respective global optimal solutions (-13.062, 9.126, -3.336; -12.344, 6.695, -2.457), (-11.275, 6.606, 3.288; -5.461, 7.124, 2.424), (-12.149, 8.924, 1.229; -9.256, 11.007, 3.517), which were got by the SAEC algorithms in this Chapter. Input these global optimal solutions into

G(H) = ((-1, 0, 0)T, (0, 1, 0)T, (0, 0, -1)T) A(B) +(-20.2788, -0.0821, 0.5609)T. (12)

**Figure 7.** After LJ potential energy minimization, the vdw contacts of Model 1 become very closer (the

**Figure 8.** Fig. 8: After LJ potential energy minimization, the vdw contacts of Model 2 become very

From Figs. 3~5 to Figs. 7~9, we may see that the Optimization algorithm works and the computational experiences show us we had better at least define two sensors and two

closer (the distances are illuminated by the overlap of border of CB atoms' surface).

(10)

(11)

Energy minimization (EM), with the images at the endpoints fixed in space, of the total system energy provides a minimum energy path. EM can be done using SD, CG, and LBFGS (Limited-memory Broyden-Fletcher-Goldfarb-Shanno). SD is robust and easy to implement but it is not most efficient especially when closer to minimum. CG is slower than SD in the early stages but more efficient when closer to minimum. The hybrid of SD-CG will make SD more efficient than SD or CG alone. However, CG cannot be used to find the minimization energy path, for example, when "forces are truncated according to the tangent direction, making it impossible to define a Lagrangian" (Chu et al., 2003). In this case, the powerful and faster quasi-Newtonian method (e.g. the LBFGS quasi-Newtonian minimizer) can be used (Chu et al., 2003; Liu and Nocedal, 1989; Nocedal and Morales, 2000; Byrd et al., 1995; Zhu et al., 1997). We briefly introduce the LBFGS quasi-Newtonian method as follows.

Newton's method in optimization explicitly calculates the Hessian matrix of the secondorder derivatives of the objective function and the reverse of the Hessian matrix (Dennis et al., 1996). The convergence of this method is quadratic, so it is faster than SD or CG. In high dimensions, finding the inverse of the Hessian is very expensive. In some cases, the Hessian is a non-invertible matrix, and furthermore in some cases, the Hessian is symmetric indefinite. Qusi-Newton methods thus appear to overcome all these shortcomings.

Quasi-Newton methods (a special case of variable metric methods) are to approximate the Hessian. Currently, the most common quasi-Newton algorithms are the SR1 formula, the BHHH method, the widespread BFGS method and its limited /low-memory extension LBFGS, and Broyden's methods (http://en.wikipedia.org/wiki/Quasi-Newton\_method). In Amber (Case et al., 2010) and Gromacs (van der Spoel et al., 2010), LBFGS is used, and the hybrid of LBFGS with CG - a Truncated Newton linear CG method with optional LBFGS Preconditioning (Nocedal and Morales, 2000) - is used in Amber (Case et al., 2010).

## **2.3. New thinking about the construction of 3D-structure of a protein**

If a NMR or X-ray structure of a protein has not been determined and stored in PDB bank yet, we still can easily get the 3D-structural frame of the protein. For example, before 2005 when we did not know the NMR structure of rabbit prion protein, we could get its homology model structure using the NMR structure of the human prion protein (PDB id: 1QLX) as the template (Zhang et al., 2006). We may use the homology structure to determine the 3D-structural frame of a protein when its NMR or X-ray structure has not been determined yet. The determination is an optimization problem described as follows.

"Very often in a structural analysis, we want to approximate a secondary structural element with a single straight line" (Burkowski, 2009: page 212). For example, Fig. 10 uses two straight lines that act as the longitudinal axis of β-Strand A (i.e. A chain), β-Strand B (i.e. B chain) respectively. Each straight line should be positioned among the Cα atoms so that it is closest to all these Cα atoms in a least-squares sense, which is to minimize the sum of the squares of the perpendicular

**Figure 10.** The 3D-structural frame of AB chains of Model 1 in Fig. 3 with two β-Strands.

distances (*di*) from the Cα atoms to the strand/helix axis:

$$\mathbf{S}^\* \mathbf{S}^\* = \min \mathbf{S} = \sum\_{l} \mathbf{N}^{\parallel} \parallel \mathbf{d}\_l \parallel \mathbf{2}. \tag{13}$$

Define the vector *w***=(***wx***,** *wy***,** *wz***)T** for the axis. Then *di* represents the perpendicular vector going from Cα atom *a* to the axis:

(16)

$$\begin{array}{rcl} \mid \mid d\_{l} \mid \mid \, ^{2} = \mid \mid \, a^{\otimes} \mid \mid ^{2} \sin \theta \rangle = \mid \mid \, a^{\otimes} \mid \mid \, ^{2} \left( \mathbf{1} \cdot \mathbf{cos} \, \theta \right) = \mid \mid \, a^{\otimes} \mid \mid \, ^{2} \left\{ \mathbf{1} \cdot \left( \mathbf{a}^{\otimes \top} \, \mathbf{w} \right) \neq \left( \mid \mid \, a^{\otimes} \mid \mid \, ^{2} \\\mid \mid \, \mathbf{w} \mid \mid \, ^{2} \right\} \right\} = \left\{ \mathbf{a}^{\otimes} \, ^{2} + \mathbf{a}^{\otimes} \, ^{2} + \mathbf{a}^{\otimes} \, ^{2} \right\} \left\{ \mathbf{1} \cdot \left( \mathbf{a}^{\otimes} \, \mathbf{w} \, \mathbf{z} + \mathbf{a}^{\otimes} \, \mathbf{w} \, \mathbf{y} + \mathbf{a}^{\otimes} \, \_{z} \mathbf{w} \, \mathbf{z} \right) \neq \left\{ \left( \mathbf{a}^{\otimes} \, \mathbf{z}^{\perp} + \mathbf{a}^{\otimes} \right) \neq \left( \mathbf{1} \cdot \mathbf{a}^{\otimes} \right) \right\} \right\}. \\ \mid \, \mathbf{a}^{\otimes} \mid \, ^{2} \left( \left( \mathbf{w} \, \mathbf{z}^{\perp} + \mathbf{w} \, \mathbf{y}^{\perp} + \mathbf{w} \, \mathbf{z}^{\perp} \right) \right) \}. \end{array}$$

According to Eqs. 13~14, for the β-Strand A – β-Strand B of AB chains, we get the following two optimization problems for Model 1 respectively:

min SA = ( (-16.196)2 + 8.3152 + 1.0612 ) { 1- (-16.196*wx* + 8.315*wy* + 1.061*wz* )2 /[( (- 16.196)2 + 8.3152 + 1.0612 ) (*wx* 2 + *wy* 2 + *wz* 2)]} +

306 Recent Advances in Crystallography

squares of the perpendicular

Quasi-Newton methods (a special case of variable metric methods) are to approximate the Hessian. Currently, the most common quasi-Newton algorithms are the SR1 formula, the BHHH method, the widespread BFGS method and its limited /low-memory extension LBFGS, and Broyden's methods (http://en.wikipedia.org/wiki/Quasi-Newton\_method). In Amber (Case et al., 2010) and Gromacs (van der Spoel et al., 2010), LBFGS is used, and the hybrid of LBFGS with CG - a Truncated Newton linear CG method with optional LBFGS

If a NMR or X-ray structure of a protein has not been determined and stored in PDB bank yet, we still can easily get the 3D-structural frame of the protein. For example, before 2005 when we did not know the NMR structure of rabbit prion protein, we could get its homology model structure using the NMR structure of the human prion protein (PDB id: 1QLX) as the template (Zhang et al., 2006). We may use the homology structure to determine the 3D-structural frame of a protein when its NMR or X-ray structure has not been determined yet. The determination is an optimization problem described as follows.

"Very often in a structural analysis, we want to approximate a secondary structural element with a single straight line" (Burkowski, 2009: page 212). For example, Fig. 10 uses two straight lines that act as the longitudinal axis of β-Strand A (i.e. A chain), β-Strand B (i.e. B chain) respectively. Each straight line should be positioned among the Cα atoms so that it is closest to all these Cα atoms in a least-squares sense, which is to minimize the sum of the

Preconditioning (Nocedal and Morales, 2000) - is used in Amber (Case et al., 2010).

**2.3. New thinking about the construction of 3D-structure of a protein** 

**Figure 10.** The 3D-structural frame of AB chains of Model 1 in Fig. 3 with two β-Strands.

Define the vector *w***=(***wx***,** *wy***,** *wz***)T** for the axis. Then *di* represents the perpendicular vector

S\* = min S = ∑i=1 N ||*di*||2. (13)

distances (*di*) from the Cα atoms to the strand/helix axis:

going from Cα atom *a* to the axis:

 ( (-12.977)2 + 6.4602 + 1.9082 ) { 1- (-12.977*wx* + 6.460*wy* + 1.908*wz* )2 /[( (-12.977)2 + 6.4602 + 1.9082 ) (*wx* 2 + *wy* 2 + *wz* 2)]} +

 ( ( -9.178)2 + 6.7452 + 1.4482 ) { 1- ( -9.178*wx* + 6.745*wy* + 1.448*wz* )2 /[( (- 9.178)2 + 6.7452 + 1.4482 ) (*wx* 2 + *wy* 2 + *wz* 2)]} +

 ( ( -6.455)2 + 4.1122 + 1.5582 ) { 1- ( -6.455*wx* + 4.112*wy* + 1.558*wz* )2 /[( (-6.455)2 + 4.1122 + 1.5582 ) (*wx* 2 + *wy* 2 + *wz* 2)]} + (15)

 ( ( -3.006)2 + 5.7502 + 1.7822 ) { 1- ( -3.006*wx* + 5.750*wy* + 1.782*wz* )2 /[( (-3.006)2 + 5.7502 + 1.7822 ) (*wx* 2 + *wy* 2 + *wz* 2)]} +

 ( ( -1.226)2 + 2.7502 + 0.2332 ) { 1- ( -1.226*wx* + 2.750*wy* + 0.233*wz* )2 /[( (-1.226)2 + 2.7502 + 0.2332 ) (*wx* 2 + *wy* 2 + *wz* 2)]},

min SB = ( (-0.959)2 + 2.9502 +(- 4.817)2 ) { 1- (-0.959*wx* + 2.950*wy* -4.817*wz* )2 /[( (-0.959)2 + 2.9502 +(- 4.817)2 ) (*wx* 2 + *wy* 2 + *wz* 2)]} +

 ( (-3.465)2 + 4.9992 +(-2.846)2 ) { 1- (-3.465*wx* + 4.999*wy* -2.846*wz* )2 /[( (-3.465)2 + 4.9992 + (-2.846)2 ) (*wx* 2 + *wy* 2 + *wz* 2)]} +

 ( (-7.213)2 + 4.4122 +(-3.340)2 ) { 1- (-7.213*wx* + 4.412*wy* -3.340*wz* )2 /[( (-7.213)2 + 4.4122 + (-3.340)2 ) (*wx* 2 + *wy* 2 + *wz* 2)]} +

 ( (-9.954)2 + 7.0782 +(-3.168)2 ) { 1- (-9.954*wx* + 7.078*wy* -3.168*wz* )2 /[( (-9.954)2 + 7.0782 + (-3.168)2 ) (*wx* 2 + *wy* 2 + *wz* 2)]} +

 ( (-13.660)2 + 6.2412 +(-3.137)2 ) { 1- (-13.660*wx* + 6.241*wy* -3.137*wz* )2 /[( (-13.660)2 + 6.2412 + (-3.137)2 ) (*wx* 2 + *wy* 2 + *wz* 2)]} +

 ( (-16.702)2 + 8.5072 +(-3.074)2 ) { 1- (-16.702*wx* + 8.507*wy* -3.074*wz* )2 /[( (-16.702)2 + 8.5072 + (-3.074)2 ) (*wx* 2 + *wy* 2 + *wz* 2)]}.

We solve Eqs. 15~16 (taking the average of the coordinates of Cα atoms as initial solutions), getting their optimal solutions **w1** = (-10.751, 6.428, 1.411)T, **w2** = (-7.960, 4.579, -2.256)T respectively (Fig. 10). We may use the vectors **w1, w2** and Eq. 12 to construct Chains GH and then build an optimal Model 1 (Aqvist, 1986; Abagyan and Maiorov, 1988; Orengo et al., 1992; Young et al., 1999; Foote and Raman, 2000). In (Burkowski, 2009: pages 213-216), *wx* 2 + *wy* 2 + *wz* 2 =1 (i.e. *w* is a unit vector) is restrained and Eq. 13 becomes into a problem to seek the smallest eigenvalue (S\*) and its corresponding eigenvector **w** of the following matrix:

((∑i=1N(ay(i))2 + (az(i))2, -∑i=1Nax(i)ay(i) , -∑i=1Naz(i)ax(i))T, (-∑i=1Nax(i)ay(i) , ∑i=1N(az(i))2 + (ax(i))2, - ∑i=1Nay(i)az(i))T, (-∑i=1Naz(i)ax(i), -∑i=1Nay(i)az(i), ∑i=1N(ax(i))2 + (ay(i))2)T).

This matrix is symmetric and positive definite, and its eigenvectors form an orthogonal basis for the set of atoms under consideration. In physics, it is called the inertial tensor involving studies of rotational inertia and its eigenvectors are called the principle axes of inertia. Furthermore, we may also notice that Eq. 13 can be rewritten as

$$\min \left( \sum \mathbb{L}^{\perp\_1 \mathbb{N}} \mid \lvert \mathbf{d} \rvert \mid \mathbb{L} \right) \mathbb{z} \quad \text{subject to} \ \mathbf{w}^{\tau} \mathbf{w} \mathbf{w} \mathbf{1}, \tag{17}$$

where ||*di*||2 = (*a(i)x* 2 + *a(i)y* 2 + *a(i)z* 2) (*wx* 2 + *wy* 2 + *wz* 2) – ( *a(i)x wx* + *a(i)y wy* + *a(i)z wz* )2 . Thus, Eq. 17 can be easily solved by the canonical dual global optimization theory (Gao et al., 2012; Gao and Wu, 2012; Gao, 2000), by the ways of solving the canonical dual of Eq. 17 or solving the quadratic differential equations of the prime-dual Gao-Strang complementary function (Gao et al., 2012; Gao and Wu, 2012; Gao, 2000) through some ordinary or partial differential equation computational strategies.
