*Energy Minimization*

*DOI: http://dx.doi.org/10.5772/intechopen.94809*


*Homology Molecular Modeling - Perspectives and Applications*

using its X-ray structure, and then place it completely in a solvent; a Monte Carlo or molecular dynamics simulation can then generate the atomic (Cartesian) coordinates of the solvent molecules.

**1.2 Derivatives**

Derivative-based minimization methods calculate the derivatives of the energy with respect to the relevant variables, i.e. the Cartesian or internal coordinates, as the case may be. These derivatives can be obtained by either analytical or numerical procedures; analytical derivatives are preferred because they are exact and can be computed more quickly. If derivatives can only be obtained numerically, it is usually better to use a non-derivative minimization procedure, as it is more efficient [3].

In some situations, however, numerically derived derivatives must be used. They can be generated as follows. A small change *δx*i is made in one of the coordinates *x*i and the energy is recalculated; dividing the change in energy (*δE*) by the change in the coordinate gives *δE/δx*i as an estimate of the derivative *∂E/∂x*i. Strictly, this gives the derivative at the midpoint between the two points *x*i and *x*i + *δx*i. A more accurate value of the derivative at the point *x*i itself can be obtained (at the cost of a further energy calculation) by evaluating the energy at the two points *x*i + *δx*i and *x*i − *δx*i; the derivative is then obtained by dividing the difference between the two energies by 2*δx*i.
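The two difference formulas just described can be sketched in a few lines. The quadratic test function and the step size below are illustrative choices, not taken from the text:

```python
def forward_diff(f, x, i, dx=1e-6):
    """Approximate dE/dx_i with one extra energy evaluation.

    Strictly, this is the derivative at the midpoint of x_i and x_i + dx."""
    xp = list(x)
    xp[i] += dx
    return (f(xp) - f(x)) / dx

def central_diff(f, x, i, dx=1e-6):
    """More accurate derivative at x_i itself, at the cost of a further
    energy evaluation: divide the energy difference by 2*dx."""
    xp, xm = list(x), list(x)
    xp[i] += dx
    xm[i] -= dx
    return (f(xp) - f(xm)) / (2 * dx)

# Illustrative energy function E(x, y) = x**2 + 2*y**2, exact gradient (2x, 4y).
energy = lambda r: r[0] ** 2 + 2 * r[1] ** 2
grad = [central_diff(energy, [1.0, 1.0], i) for i in range(2)]
```

For a quadratic function the central difference is exact up to rounding, while the forward difference carries an error of order *δx*.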

**2. Non-derivative minimization methods**

**2.1 The simplex method**

A geometrical figure with *M* + 1 interconnected vertices, where *M* is the dimensionality of the energy function, is called a *simplex*. Thus, for a function of two variables the simplex is triangular, and for a function of three variables it is tetrahedral. For an energy function of 3*N* Cartesian coordinates the simplex therefore has 3*N* + 1 vertices; if internal coordinates are used, it has 3*N* − 5 vertices. An energy value, calculated from the corresponding set of coordinates, is associated with each vertex. For the function *f(x,y)* = *x*<sup>2</sup> + 2*y*<sup>2</sup>, the simplex method would use a triangular simplex [5].

The simplex algorithm locates an energy minimum by moving around on the potential energy surface in a manner that has been likened to the motion of an amoeba. Three basic moves are possible. The most common is a reflection of the vertex with the highest energy through the opposite face of the simplex, in an attempt to generate a new point with a lower value. If the reflected point is lower in energy than any other point in the simplex, a "reflection and expansion" move may be applied. When a "floor of a valley" is reached, reflection fails to generate a better point; in this situation the simplex contracts in one dimension, along the direction defined by the highest vertex. If this also fails to lower the energy, a further move is available in which the simplex contracts in all directions, pulling in towards the lowest point. **Figure 2** illustrates these three moves.

The vertices of the initial simplex must be generated before the simplex algorithm can be applied. The initial configuration of the system corresponds to just one of these vertices; the remaining points can be generated in various ways.


**Figure 2.**

*The three basic moves permitted to the simplex algorithm (reflection, and its close relation reflect-and-expand; contract in one dimension and contract around the lowest point).*

The simplest method is to add a fixed increment to each coordinate in turn. The energy of the entire system is then calculated at each new point, giving the function value for the corresponding vertex.

The simplex method is most useful when the starting configuration of the system is very high in energy, because it rarely fails to find a lower-energy solution. However, it requires a large number of energy evaluations; merely constructing the initial simplex requires 3*N* + 1 energy calculations. For this reason the simplex method is often used in combination with other minimization algorithms: in practice, the starting configuration is refined with a few steps of the simplex method, after which a more efficient method takes over [6].

An obvious question is why the simplex has one more vertex than the number of degrees of freedom. The answer is that a simplex with fewer than *M* + 1 vertices cannot search the whole energy surface. For example, if a simplex with only two vertices (which is simply a straight line) were used to search a two-dimensional quadratic energy surface, the only moves available would be to generate other points lying on that straight line; the energy surface away from the line would never be searched. Likewise, for a function of three variables, if the simplex were just a triangle, only the region of the search space lying in the plane of the triangle would be explored, and the energy minimum may well not lie in that plane [7].

**Figure 3.**

*The sequential univariate search procedure. From the starting point 1, two steps are taken along one of the coordinates to give points 2 and 3. A parabola is fitted to these three points and its minimum located (point 4). The same procedure is then repeated along the next coordinate (points 5, 6 and 7). (Figure adapted from Schlegel H B 1987. Optimization of equilibrium geometries and transition structures. In Lawley K P (editor) Ab Initio Methods in Quantum Chemistry - I. New York, John Wiley, pp. 249–286.)*

### **2.2 The sequential univariate search method**

The simplex method is rarely appropriate for quantum mechanical calculations, where the large number of energy evaluations required would be prohibitively expensive. In such cases a more suitable non-derivative procedure, such as the sequential univariate search method, is advised [8]. This procedure cycles through the coordinates in turn. For each coordinate *x*i, two new structures are generated by changing the current value (to *x*i + *δx*i and *x*i + 2*δx*i), and the energies of these two structures are calculated. A parabola is then fitted through the three points corresponding to the two distorted structures and the original one, and the minimum of this quadratic function is located. The coordinate is then moved to the position of the minimum. The procedure is illustrated in **Figure 3**.

The minimum is deemed to have been reached when the changes in all the coordinates are sufficiently small; otherwise, a new iteration is performed. The sequential univariate method generally requires fewer function evaluations than the simplex method. However, it can converge slowly when two or more of the coordinates are strongly coupled, or when the energy surface resembles a long narrow valley.
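A single sweep of the sequential univariate search can be sketched as follows: each coordinate is displaced twice, a parabola is fitted through the three energies, and the coordinate is moved to the parabola's minimum. The displacement size and test function are illustrative assumptions:

```python
def parabola_min(a, b, c, fa, fb, fc):
    """Abscissa of the minimum of the parabola through (a,fa), (b,fb), (c,fc)."""
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    return b - 0.5 * num / den

def univariate_sweep(f, x, dx=0.1):
    """One pass of the sequential univariate search: minimize along each
    coordinate in turn using a parabolic fit to three points."""
    x = list(x)
    for i in range(len(x)):
        a = x[i]
        xb, xc = list(x), list(x)
        xb[i] = a + dx          # first distorted structure
        xc[i] = a + 2 * dx      # second distorted structure
        x[i] = parabola_min(a, a + dx, a + 2 * dx, f(x), f(xb), f(xc))
    return x

x = univariate_sweep(lambda p: p[0] ** 2 + 2 * p[1] ** 2, [1.0, 1.0])
```

Because the test function here is quadratic and has no cross term, a single sweep lands on the minimum; with coupled coordinates several sweeps would be needed, which is exactly the slow-convergence behavior noted above.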

### **3. Derivative minimization methods**


Most popular minimization procedures use derivatives, since these provide information that is very useful for minimization. The first derivative of the energy (the gradient) indicates the direction in which the minimum


lies, and its magnitude indicates the steepness of the local slope. The energy of the system can be lowered by moving each atom in response to the force acting on it; the force is equal to minus the gradient. Second derivatives describe the curvature of the function, information that can be used to predict where the function will change direction (i.e. pass through a minimum or some other stationary point).

The energy functions used in molecular modeling are rarely truly quadratic, so the Taylor series expansion can only be an approximation. This has two important consequences. First, a given minimization procedure performs much better on a purely quadratic function than on a real molecular mechanics or quantum mechanics energy surface. The Newton–Raphson algorithm, for example, locates the minimum of a purely quadratic function in a single step, but for a typical molecular modeling energy function it must be run for several iterations. Second, although the methods may perform well close to a minimum, where the harmonic approximation is reasonable, the harmonic approximation is very poor far from a minimum, and some of the less robust methods fail there. For this reason the minimization protocol must be chosen carefully: a robust but possibly inefficient method can be used first, followed by a less robust but more efficient procedure.
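The one-step behavior on a purely quadratic function is easy to verify. For the chapter's example *f(x,y)* = *x*² + 2*y*², the gradient is (2*x*, 4*y*) and the Hessian is the constant diagonal matrix diag(2, 4), so a single Newton–Raphson step *x* − **H**⁻¹**g** lands exactly on the minimum. A minimal sketch with that Hessian hard-coded:

```python
def newton_raphson_step(x):
    """One Newton-Raphson step for f(x, y) = x**2 + 2*y**2.

    The gradient is (2x, 4y) and the constant, diagonal Hessian is
    diag(2, 4), so the step is x_new = x - H^-1 g."""
    gx, gy = 2.0 * x[0], 4.0 * x[1]
    return [x[0] - gx / 2.0, x[1] - gy / 4.0]

print(newton_raphson_step([3.7, -1.2]))  # lands on the minimum (0, 0)
```

For a non-quadratic surface the Hessian varies from point to point, which is why the algorithm must then be iterated.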

Derivative methods can be classified by the highest order of derivative they use. Methods based on the first derivatives (the gradients) are called first-order methods; methods that use both first and second derivatives are known as second-order methods. Because the simplex method does not use any derivatives, it can be described as a zeroth-order method.

### **3.1 First-order minimization methods**

The *steepest descents* and *conjugate gradients* methods are two first-order minimization algorithms that are very frequently used in molecular modeling. In these methods the coordinates of the atoms are changed gradually, step by step, as the system moves towards the minimum. The starting point for each iteration (*k*) is the molecular configuration obtained from the previous step, represented by the multidimensional vector *x*k−1. For the first iteration, the starting point is the initial configuration of the system provided by the user, the vector *x*1.

### *3.1.1 The steepest descents method*

The steepest descents method moves in the direction parallel to the net force, which in a geographical analogy corresponds to walking straight downhill. For 3*N* Cartesian coordinates this direction is most conveniently represented by a 3*N*-dimensional unit vector, *s*k. Thus:

$$s_k = -g_k / \left| g_k \right| \tag{2}$$

Once the direction of movement has been defined, it must be decided how far to move along the gradient. Consider the two-dimensional energy surface of **Figure 4**. The gradient direction from the starting point lies along the line shown. In a cross-section through the surface along this line, the function passes through a minimum and then increases, as shown in the figure [9]. The minimum can be located by performing a line search, or a step of arbitrary size can be taken along the direction of the force.

**Figure 4.** *Steepest descents method.*
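Equation (2) with a fixed step size can be sketched as follows; the step size, iteration count, and quadratic test surface are illustrative choices:

```python
import math

def steepest_descent(grad, x, lam=0.02, steps=500):
    """Follow s_k = -g_k / |g_k| (Eq. 2) with a fixed step size lam."""
    for _ in range(steps):
        g = grad(x)
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm < 1e-12:  # already at a stationary point
            break
        x = [xi - lam * gi / norm for xi, gi in zip(x, g)]
    return x

# f(x, y) = x**2 + 2*y**2, with gradient (2x, 4y)
xmin = steepest_descent(lambda p: [2 * p[0], 4 * p[1]], [1.0, 1.0])
```

With a fixed step the iterates can only approach the minimum to within roughly one step length, after which they oscillate around it; the line search and adaptive-step strategies discussed next address exactly this.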

#### *3.1.2 Line search in one dimension*

The goal of a line search is to locate the minimum along a specified direction (i.e. along a line through the multidimensional space) [10]. The first step of a line search is to *bracket* the minimum: that is, to find three points along the line such that the energy of the middle point is lower than the energy of the two outer points. If three such points can be found, then at least one minimum must lie between the two outer points. An iterative algorithm can then be applied that progressively reduces the distance between the three points, confining the minimum to an ever smaller interval. This is conceptually simple, but it may require a large number of function evaluations and is therefore computationally expensive.
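The bracketing stage can be sketched as follows: walk along the line, growing the step while the energy keeps falling, and stop as soon as it rises, which guarantees that a minimum lies between the outer points. The initial step and growth factor are arbitrary illustrative choices:

```python
def bracket_minimum(f, a=0.0, step=0.1, grow=2.0, max_iter=60):
    """Return (a, b, c) along the line with f(b) < f(a) and f(b) < f(c),
    so at least one minimum lies between a and c."""
    b = a + step
    if f(b) > f(a):              # energy rose: walk the other way instead
        a, b, step = b, a, -step
    c = b + step
    while f(c) < f(b) and max_iter > 0:
        step *= grow             # take ever larger steps downhill
        a, b, c = b, c, c + step
        max_iter -= 1
    return a, b, c

# One-dimensional slice with its minimum at t = 2
a, b, c = bracket_minimum(lambda t: (t - 2.0) ** 2)
```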

Alternatively, a quadratic function can be fitted to the three bracketing points, and differentiating this fitted function gives an analytical approximation to the position of the minimum along the line. A new function can then be fitted to obtain a better approximation, as shown in **Figure 5**. Higher-order polynomials may fit the bracketing points more closely, but when they are used with functions that change rapidly in the bracketed region they can give erroneous interpolations. The gradient at the minimum located by the line search is perpendicular to the previous direction; thus, when a line search is used to locate the minimum along the gradient, the next direction in the steepest descents algorithm is orthogonal to the previous one (i.e. *g*k · *g*k−1 = 0) [11].

**Figure 5.**

*A line search is used to locate the minimum in the function in the direction of the gradient.*

### *3.1.3 Arbitrary step approach*

Because a line search can be computationally expensive, the new coordinates may instead be obtained by taking a step of arbitrary size along the gradient unit vector *s*k. The new set of coordinates after step *k* is then given by the equation:

$$x_{k+1} = x_k + \lambda_k s_k \tag{3}$$

where *λ*k is the step size. In most molecular modeling applications of the steepest descents algorithm, the step size is initially given a predetermined default value. If the first iteration lowers the energy, the step size is increased by a multiplicative factor for the second iteration, and this continues for as long as each iteration reduces the energy. When a step produces an increase in energy, the algorithm is assumed to have leapt across the valley containing the minimum and up the slope on the opposite face, and the step size is reduced by a multiplicative factor (e.g. 0.5). Ideally, the step size would be matched to the nature of the energy surface: larger steps are appropriate on a flat surface, whereas smaller steps suit a narrow, twisting valley. Although the arbitrary step approach usually needs more steps to reach the minimum than a line search, each step is cheaper, so it frequently requires fewer function evaluations and less computer time overall than the more rigorous line search [12].

The direction of the gradient is determined by the largest inter-atomic forces, so the steepest descents method is good at relieving the highest-energy features in an initial configuration. The method is robust even when the starting point is far from the minimum, where the harmonic approximation to the energy surface is poor. It can, however, run into trouble moving down a long, narrow valley, where it is forced to take a large number of small steps: because each move must be at right angles to the previous gradient direction, the path continually over-corrects itself and oscillates, with successive steps reintroducing errors that earlier steps had already corrected [13].

### *3.1.4 Conjugate gradients minimization*

The conjugate gradients method produces a set of directions that does not show the oscillatory behavior of the steepest descents method in narrow valleys. In steepest descents, both the gradients and the directions of successive steps are orthogonal; in conjugate gradients, the gradients at successive points are orthogonal, while the directions of successive steps are *conjugate* (indeed, the procedure is more properly called the conjugate directions method) [14]. A set of conjugate directions has the property that, for a quadratic function of *M* variables, the minimum is reached in *M* steps. The conjugate gradients method moves from point *x*k in the direction *v*k, where *v*k is computed from the gradient at that point and the previous direction vector *v*k−1.
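The construction of *v*k from the current gradient and the previous direction can be sketched for a quadratic surface *f(x)* = ½ *x*ᵀ**A***x* with an exact line search. The Fletcher–Reeves formula used below for the mixing coefficient β is one standard choice; the text does not specify a particular update formula:

```python
def conjugate_gradients(A, x, steps):
    """Minimize f(x) = 0.5 * x^T A x (A symmetric positive definite, here 2x2)
    using conjugate directions v_k = -g_k + beta * v_(k-1)."""
    g = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]  # gradient A x
    v = [-gi for gi in g]                                          # first step: steepest descent
    for _ in range(steps):
        Av = [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
        # Exact line search along v for a quadratic surface.
        lam = -sum(gi * vi for gi, vi in zip(g, v)) / sum(vi * avi for vi, avi in zip(v, Av))
        x = [xi + lam * vi for xi, vi in zip(x, v)]
        g_new = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
        beta = sum(gi * gi for gi in g_new) / sum(gi * gi for gi in g)  # Fletcher-Reeves
        v = [-gn + beta * vi for gn, vi in zip(g_new, v)]
        g = g_new
    return x

# f(x, y) = x**2 + 2*y**2 corresponds to A = [[2, 0], [0, 4]]; for a
# quadratic function of M = 2 variables the minimum is reached in 2 steps.
xmin = conjugate_gradients([[2.0, 0.0], [0.0, 4.0]], [1.0, 1.0], steps=2)
```

Unlike the steepest descents run above, which oscillates indefinitely with a fixed step, this reaches the minimum of the two-variable quadratic in exactly two steps, as the conjugacy property predicts.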
