#### *3.1.4 Conjugate gradients minimization*

The oscillatory behaviour of the steepest descents procedure in narrow valleys is absent from the set of directions generated by the conjugate gradients method. In steepest descents, both the gradients and the directions of successive steps are orthogonal [14]. In the conjugate gradients method, by contrast, the gradients at each point are orthogonal while the directions of successive steps are *conjugate*, which is why the method is more properly known as the conjugate directions method. A set of conjugate directions has the useful property that, for a quadratic function of *M* variables, the minimum is reached in *M* steps. The conjugate gradients method moves from point *x*k in a direction *v*k that is computed from the gradient at that point and the direction vector of the previous step, *v*k − 1.
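As a concrete illustration, the widely used Fletcher–Reeves scheme (one standard variant; the text above does not name a specific one) takes the first direction to be the negative gradient and then computes each new direction from the current gradient and the previous direction:

$$\mathbf{v}_k = -\mathbf{g}_k + \gamma_k \mathbf{v}_{k-1}, \qquad \gamma_k = \frac{\mathbf{g}_k \cdot \mathbf{g}_k}{\mathbf{g}_{k-1} \cdot \mathbf{g}_{k-1}}$$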

In the first step, both the conjugate gradients method and the steepest descents method move in the direction of the negative gradient. Ideally, a line search should be used to locate the one-dimensional minimum along each direction; this ensures that each gradient is orthogonal to all preceding gradients and that each direction is conjugate to all preceding directions. An arbitrary step procedure can, however, also be used at this stage [15]. The second point is then found by a line search along the gradient direction through the first point. For a quadratic function of two variables, the conjugate gradients procedure therefore locates the exact minimum in just two moves.
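A minimal sketch of this behaviour, assuming a quadratic model function *f*(*x*) = ½ *x*ᵀ*A* *x* − *b*·*x* with gradient *A* *x* − *b* (the matrix, vector and starting point below are illustrative choices, not taken from the text):

```python
import numpy as np

# Conjugate gradients with exact line searches on a quadratic function.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])          # positive definite "Hessian"
b = np.array([1.0, -1.0])

x = np.array([4.0, 4.0])            # arbitrary starting point
g = A @ x - b                       # gradient at the starting point
v = -g                              # first step: along the negative gradient

for k in range(2):                  # M = 2 variables -> minimum in 2 steps
    alpha = -(g @ v) / (v @ A @ v)  # exact line search for a quadratic
    x = x + alpha * v
    g_new = A @ x - b
    gamma = (g_new @ g_new) / (g @ g)  # Fletcher-Reeves coefficient
    v = -g_new + gamma * v          # new direction, conjugate to the old one
    g = g_new

print(x, np.linalg.solve(A, b))     # the iterate matches the exact minimum
```

With exact line searches the gradient vanishes after the second step, in line with the *M*-steps-for-*M*-variables property noted above.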

#### **3.2 Second order derivative methods**

#### *3.2.1 The Newton-Raphson method*

Second-order methods use both the first derivatives and the second derivatives to locate a minimum: the first derivatives provide the gradient, while the second derivatives supply information about the curvature of the function. The Newton–Raphson method is the simplest second-order method [16]. For a strictly quadratic function the second derivative is the same everywhere, so the minimum can be located directly from the gradient and the curvature. For a multidimensional function, the Hessian matrix of second derivatives must be inverted. This makes the method computationally expensive for larger molecules, because the large number of atoms requires considerably more storage for the Hessian. The Newton–Raphson method is thus most appropriate for small molecules (usually fewer than 100 atoms or so) [17].
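A minimal sketch of the Newton–Raphson step, assuming gradient and Hessian callables are available for the energy function (the names below are illustrative):

```python
import numpy as np

def newton_raphson_step(x, grad, hess):
    """One Newton-Raphson step: x' = x - H(x)^{-1} g(x).

    grad and hess are callables returning the gradient vector and the
    Hessian matrix of the energy at x. For a strictly quadratic surface
    this single step lands exactly on the minimum; on a real energy
    surface it is only an approximation and must be repeated.
    """
    g = grad(x)                        # first derivatives (gradient)
    H = hess(x)                        # second derivatives (Hessian)
    return x - np.linalg.solve(H, g)   # solve H p = g rather than invert H
```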

As stated earlier, for a strictly quadratic function the Newton–Raphson method requires just one step to locate the minimum from any point on the surface. In practice, the surface is quadratic only to a first approximation, so a number of steps are required, and at each step the Hessian matrix of second derivatives must be calculated and inverted. In a Newton–Raphson minimization the Hessian must be 'positive definite': a positive definite matrix is one for which all the eigenvalues are positive. When the Hessian is not positive definite, the Newton–Raphson method moves towards a saddle point, where the energy increases, rather than towards a minimum, where the energy decreases. In addition, the harmonic approximation breaks down at positions far from the minimum, which makes the minimization unstable. These problems can be avoided by first employing a more robust method (prior to applying the Newton–Raphson method) to reach a point close to the minimum, where the Hessian matrix is positive definite [18].
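The positive definite condition is easy to test numerically: compute the eigenvalues of the Hessian and check their signs. A small sketch (the tolerance and example matrix are illustrative choices):

```python
import numpy as np

def is_positive_definite(H, tol=1e-10):
    """True if all eigenvalues of the (symmetric) Hessian are positive."""
    eigenvalues = np.linalg.eigvalsh(H)   # eigvalsh: for symmetric matrices
    return bool(np.all(eigenvalues > tol))

# This Hessian has one negative eigenvalue, so a Newton-Raphson step
# from this point would head towards a saddle point, not a minimum.
H = np.array([[2.0, 3.0],
              [3.0, 1.0]])
print(is_positive_definite(H))  # False
```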

#### *3.2.2 Quasi-Newton method*

Computing the inverse Hessian matrix can be a lengthy procedure, which is an important disadvantage of the 'pure' second-derivative methods such as Newton–Raphson. Moreover, analytical second derivatives may not be available at all. Quasi-Newton methods, also known as variable metric methods, instead build up the inverse Hessian matrix gradually over successive iterations; that is, a sequence of matrices *H*k is constructed that converges to the inverse Hessian.
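One standard way to build up this sequence is the BFGS update (a specific quasi-Newton variant, named here for concreteness; the text does not single one out). A hedged sketch:

```python
import numpy as np

def bfgs_inverse_hessian_update(H, s, y):
    """BFGS update of the approximate inverse Hessian, H_k -> H_{k+1}.

    s = x_{k+1} - x_k  (change in positions)
    y = g_{k+1} - g_k  (change in gradients)
    """
    rho = 1.0 / (y @ s)                # requires the curvature y @ s > 0
    I = np.eye(len(s))
    return ((I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s))
            + rho * np.outer(s, s))
```

A common starting point is *H*0 = *I*, the identity matrix, which makes the first step identical to a steepest descents step.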

At each iteration *k*, the new positions *x*k + 1 are obtained from the current positions *x*k, the gradient *g*k and the current approximation to the inverse Hessian matrix *H*k. For a quadratic function this step is exact, but for a 'real' problem a line search may be desirable; the line search is then performed along the vector (*x*k + 1 − *x*k). It may not be necessary to locate the minimum along this direction very accurately, at the cost of a few more steps of the quasi-Newton algorithm [19]. For quantum mechanics calculations the additional energy evaluations required by the line search may prove more expensive than using the more approximate approach. An effective compromise is to fit a function to the energy and gradient at the current point *x*k and at the point *x*k + 1, and to locate the minimum of the fitted function [20].
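Written compactly in the symbols of the text, each quasi-Newton step is

$$\mathbf{x}_{k+1} = \mathbf{x}_k - \mathbf{H}_k \, \mathbf{g}_k,$$

optionally followed by a line search along $\mathbf{x}_{k+1} - \mathbf{x}_k$.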


#### **4. Which minimization method should I use?**


The choice of minimization algorithm is determined by a number of factors, including the storage and computational requirements, the relative speeds with which the various parts of the calculation can be performed, the availability of analytical derivatives and the robustness of the method. Any method that requires the Hessian matrix to be stored (let alone inverted) may present memory problems when applied to systems containing thousands of atoms. Calculations on systems of this size are invariably performed using molecular mechanics, and so the steepest descents and conjugate gradients methods are very popular here. For molecular mechanics calculations on small molecules, the Newton–Raphson method may be used, although this algorithm can have problems with structures that are far from a minimum. For this reason it is usual to perform a few steps of minimization with a more robust method, such as the simplex or steepest descents, before applying the Newton–Raphson algorithm. Analytical expressions for both first and second derivatives are available for most of the terms found in common force fields. The steepest descents method can actually be superior to conjugate gradients when the starting structure is some way from the minimum; however, conjugate gradients is much better once the initial strain has been removed.

Quantum mechanical calculations are restricted to systems with relatively small numbers of atoms, so storing the Hessian matrix is not a problem. As the energy calculation is often the most time-consuming part of the calculation, it is desirable that the minimization method chosen takes as few steps as possible to reach the minimum. Analytical first derivatives are available for many levels of quantum mechanics theory; analytical second derivatives, however, are available only for a few levels of theory and can be expensive to compute. The quasi-Newton methods are thus particularly popular for quantum mechanical calculations.

When using internal coordinates in a quantum mechanical minimization it can be important to use an appropriate Z-matrix as input. For many systems the Z-matrix can be written in many different ways, as there are many possible combinations of internal coordinates; there should be no strong coupling between the coordinates. Dummy atoms can often help in the construction of an appropriate Z-matrix. A dummy atom is used solely to define the geometry and has no nuclear charge and no basis functions. Strong coupling between coordinates can give long 'valleys' in the energy surface, which may also present problems. Particular care must be taken when defining the Z-matrix for cyclic systems: the natural way to define a cyclic compound would be to number the atoms sequentially around the ring, but the ring-closure bond would then be very strongly coupled to all of the other bonds, angles and torsion angles. Some quantum mechanics programs are able to convert the input coordinates (be they Cartesian or internal) into the most efficient set for minimization, removing from the user the problem of deciding what constitutes an appropriate set of internal coordinates. For energy minimizations, redundant internal coordinates have been shown to give significant improvements in efficiency compared with Cartesian coordinates or non-redundant internal coordinates, especially for flexible and polycyclic systems [21]. The redundant internal coordinates employed generally comprise the bond lengths, angles and torsion angles in the system.
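To make the Z-matrix discussion concrete, a minimal Gaussian-style Z-matrix for water is sketched below (the geometry values are illustrative). Each atom after the first is placed by a distance to one previously defined atom, an angle to a second and, from the fourth atom onwards, a dihedral to a third:

```
O
H 1 0.96
H 1 0.96 2 104.5
```

Here atom 2 (H) lies 0.96 Å from atom 1 (O), and atom 3 (H) lies 0.96 Å from atom 1 with an H–O–H angle of 104.5° measured to atom 2. For a ring, numbering the atoms sequentially in this way couples the ring-closure bond to every other internal coordinate, which is why dummy atoms or redundant coordinates are often preferred.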


