Using Matrix Differential Equations for Solving Systems of Linear Algebraic Equations

*Ioan R. Ciric*

## **Abstract**

Various first-order ordinary differential equations have recently been used by the author for the solution of general, large linear systems of algebraic equations. Exact solutions were derived in terms of a new kind of infinite series of matrices, which are truncated and applied repeatedly to approximate the solution. In these matrix series, each new term is obtained from the preceding one by multiplication with a matrix that becomes better and better conditioned, tending to the identity matrix, which clearly benefits the numerical computations. For a more efficient computation of approximate solutions of the algebraic systems, we consider new differential equations that are solved by simple techniques of numerical integration. The solution procedure makes it easy to control and monitor the magnitude of the residual vector at each step of integration. A related iterative method is also proposed. The solution methods are flexible, permitting the various intervening parameters to be changed whenever necessary in order to increase efficiency. An efficient computation of a rough approximation of the solution, applicable even to poorly conditioned systems, is also performed, based on the alternate application of two different types of minimization of associated functionals. A smaller amount of computation is needed to obtain an approximate solution of large linear systems as compared to existing methods.

**Keywords:** matrix equations, large linear algebraic systems, solution by numerical integration

## **1. Introduction**

Exact analytic expressions in the form of infinite series of matrices for the solution of linear systems of algebraic equations were derived in [1] by integrating associated ordinary differential equations. These differential equations were obtained using a quadratic functional related to the system of algebraic equations and describe the orthogonal trajectories of the family of hypersurfaces representing the functional. Faster-converging matrix series, which can be applied to approximate the solution of the system of equations, were presented in [2]. The solution of linear systems based on the numerical integration of differential equations was originally formulated in [3].

In Section 2 of the present book chapter, we use recently derived highly convergent series formulae for matrix exponentials [3] in order to construct improved iterative methods for solving approximately large systems of algebraic equations. In Section 3, we use novel functionals that allow us to formulate differential equations leading to a substantial increase in the efficiency of the solution process [4]. Independently of the starting point, the paths of integration of these equations all converge to the solution point of the system considered. At each step of the numerical solution, one passes, in fact, from one path to a slightly different one due to computation errors. The procedure does not require finding an entire path accurately, but only the solution point, which is common to all the paths. This is why we apply the simple Euler method [5] to integrate the differential equations. The computation errors are now determined by the magnitude of the second derivative of the position vector with respect to the parameter defining the location along the path. A related iterative method [6] is also described. In Section 4, two different kinds of minimization of the system functionals are applied alternately for quick computation of a rough solution of large linear systems [7].

## **2. Matrix series formulae for the solution of linear algebraic systems**

Consider a system of equations written in matrix form as

$$Ax - b = 0\tag{1}$$

where *A* ∈ *R*<sup>*n*×*n*</sup> is an arbitrary nonsingular matrix, *b* ∈ *R*<sup>*n*</sup> is a given *n*-dimensional vector, and *x* = (*x*<sub>1</sub>, *x*<sub>2</sub>, … , *x<sub>n</sub>*)<sup>*T*</sup> is the unknown *n*-dimensional vector, with *T* indicating the transpose. Assume that *x* is a continuous function of the real variable *v* over a certain interval and associate to (1) the vector differential equation of the first order

$$\frac{d\mathbf{x}}{dv} = f(v)(Ax - b) \tag{2}$$

where *f*(*v*) is a continuous function to be chosen.

### **2.1 Exact analytic expressions for the solution of (2)**

Imposing the condition *x*(*v<sub>o</sub>*) = *x<sub>o</sub>*, *x<sub>o</sub>* being a chosen position vector, (2) has a unique solution over a specified interval [4], namely,

$$\mathbf{x}(v) = \mathbf{x}_o + e^{-A g(v_o)} \sum_{k=0}^{\infty} \frac{A^k}{(k+1)!} \left[ (g(v))^{k+1} - (g(v_o))^{k+1} \right] (A\mathbf{x}_o - \mathbf{b}) \tag{3}$$

where *g*(*v*) ≡ ∫ *f*(*v*)*dv* is a primitive of *f*(*v*), i.e., *f*(*v*) = *dg*/*dv*. If *f*(*v*) is taken to be *f*(*v*) = 1/*v*, then *g*(*v*) = ln *v*. Choosing *v<sub>o</sub>* = 1 gives *g*(*v<sub>o</sub>*) = 0 and (3) now becomes

$$\mathbf{x}(v) = \mathbf{x}_o + (\ln v) \sum_{k=0}^{\infty} \frac{(A \ln v)^k}{(k+1)!} (A\mathbf{x}_o - \mathbf{b}) \tag{4}$$

Over the interval *v* ∈ (0, 2), *x*(*v*) can also be expressed in the form [3]

*Using Matrix Differential Equations for Solving Systems of Linear Algebraic Equations DOI: http://dx.doi.org/10.5772/intechopen.104209*

$$\mathbf{x}(v) = \mathbf{x}_o - (1 - v) \left[ I + \sum_{k=1}^{\infty} \frac{(1 - v)^k}{k + 1} (I - A) \left( I - \frac{A}{2} \right) \cdots \left( I - \frac{A}{k} \right) \right] (A\mathbf{x}_o - \mathbf{b}) \tag{5}$$

*I* denoting the identity matrix of order *n*. The solution of (1) is theoretically obtained for *v* = 0, the series being in general extremely slowly convergent. Since the rate of convergence of the series in (5) is very small for *v* very close to the value corresponding to the solution of (1), i.e., *v* = 0, this formula should be applied repeatedly for a *v* not too close to zero, *v* ∈ (0, 1), until a satisfactorily small distance to the solution *x*(0) is reached.
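To make the repeated application concrete, a minimal sketch in Python follows (not code from the original work: the choice *v* = 0.5, the truncation length *K* = 60, and the small test matrix are illustrative assumptions, and the residual contracts under repetition only when the eigenvalues of *A* have positive real parts):

```python
import numpy as np

def series5_step(A, b, x0, v=0.5, K=60):
    """One application of the series (5), truncated after K terms.

    The product (I - A)(I - A/2)...(I - A/k) is accumulated directly on the
    residual vector, so each new term costs one matrix-vector product."""
    r = A @ x0 - b                 # residual A x_o - b
    p = r.copy()                   # running product applied to r
    total = r.copy()               # the leading I term of the bracket
    for k in range(1, K + 1):
        p = p - (A @ p) / k        # multiply by (I - A/k)
        total += ((1.0 - v) ** k / (k + 1)) * p
    return x0 - (1.0 - v) * total

# illustrative 2x2 system with eigenvalues of A inside (0, 2)
A = np.array([[1.0, 0.3], [0.3, 0.8]])
b = np.array([1.0, -1.0])
x = np.zeros(2)
for _ in range(40):                # repeated application, new x_o each time
    x = series5_step(A, b, x)
```

Each pass multiplies the residual by roughly *v*<sup>*A*</sup>, so many passes are needed when an eigenvalue of *A* is small, in line with the slow convergence noted above.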

### **2.2 Highly convergent series iteration formulae**

Practical formulae for an approximate solution of (1) can be derived by using matrix series that converge much faster than the one in (5). This can be done by writing (4) in the form

$$\mathbf{x}(v) = \mathbf{x}_o + A^{-1}\left( e^{A \ln v} - I \right) (A\mathbf{x}_o - \mathbf{b}) \tag{6}$$

and by expressing the matrix exponential in terms of series of superior convergence given recently in [3]. Very close to the solution *v* = 0, say for *v* = *e*<sup>−*N*</sup>, *N* ≫ 1, we have *e*<sup>*A* ln *v*</sup> = *e*<sup>−*NA*</sup> and, with [3]

$$e^{-NA} - I = -10^{-q} c_q N A \left[ I + \sum_{k=1}^{\infty} \frac{(-1)^k\, 10^{-qk}}{k+1} (I + c_q N A) \left( I + \frac{c_q N A}{2} \right) \cdots \left( I + \frac{c_q N A}{k} \right) \right] \tag{7}$$

where *q* > 0 and *c<sub>q</sub>* ≡ 1/ln(1 + 10<sup>−*q*</sup>), we get

$$\mathbf{x}(e^{-N}) = \mathbf{x}_o - 10^{-q} c_q N \left[ I + \sum_{k=1}^{\infty} \frac{(-1)^k\, 10^{-qk}}{k+1} (I + c_q N A) \left( I + \frac{c_q N A}{2} \right) \cdots \left( I + \frac{c_q N A}{k} \right) \right] (A\mathbf{x}_o - \mathbf{b}) \tag{8}$$

In order to perform numerical computations, the number of terms in the series has to be appropriately chosen. To have a small *c<sub>q</sub>* in (8), one has to take a small *q*. For *q* = 0.5, e.g., *c<sub>q</sub>* = 3.639409. One may start with *N* = 15, which requires retaining about 50 terms in this alternating series to get a rough approximation of *x*(*e*<sup>−15</sup>). The computation is repeated with the new *x<sub>o</sub>* taken to be the preceding *x*(*e*<sup>−*N*</sup>) until an acceptable accuracy of the solution of (1) is reached.
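A minimal sketch of this repetition, under illustrative assumptions (a modest *N* = 2 rather than *N* = 15, a small symmetric test matrix, and *K* = 50 retained terms, chosen so that the truncated alternating series stays well-behaved in floating point; for large *N*·‖*A*‖ the intermediate terms grow very large before the alternating sum cancels):

```python
import numpy as np

def x_step_eq8(A, b, x0, N=2.0, q=0.5, K=50):
    """One application of the truncated series (8), mapping x0 toward
    x(e^{-N}) = x0 + A^{-1}(e^{-N A} - I)(A x0 - b) without forming A^{-1}."""
    cq = 1.0 / np.log(1.0 + 10.0 ** (-q))
    r = A @ x0 - b
    p = r.copy()                              # running product applied to r
    total = r.copy()                          # the leading I term of the bracket
    for k in range(1, K + 1):
        p = p + (cq * N / k) * (A @ p)        # multiply by (I + cq*N*A/k)
        total += ((-1.0) ** k * 10.0 ** (-q * k) / (k + 1)) * p
    return x0 - 10.0 ** (-q) * cq * N * total

A = np.array([[1.0, 0.3], [0.3, 0.8]])
b = np.array([1.0, -1.0])
x = np.zeros(2)
for _ in range(20):                           # new x_o at each repetition
    x = x_step_eq8(A, b, x)
```

Each pass multiplies the residual by approximately *e*<sup>−*NA*</sup>, so a larger *N* gives fewer passes at the price of more retained terms per pass.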

Formulae with much faster convergence are constructed if we apply in (6) the expansion [3]

*Matrix Theory - Classics and Advances*

$$\begin{aligned} e^{-NA} = (1 + 10^{-q})^{-p} & \left\{ I + p! \sum_{k=1}^{p} \frac{10^{-qk}}{k!(p-k)!} \left( I - \frac{c_q N A}{p} \right) \left( I - \frac{c_q N A}{p-1} \right) \cdots \left( I - \frac{c_q N A}{p-k+1} \right) \right. \\ & - 10^{-(p+1)q} c_q N A \, (I - c_q N A) \left( I - \frac{c_q N A}{2} \right) \cdots \left( I - \frac{c_q N A}{p} \right) \left[ \frac{1}{p+1} I \right. \\ & \left. \left. + \, p! \sum_{k=1}^{\infty} \frac{(-1)^k\, 10^{-qk}}{(k+1)(k+2)\cdots(k+p+1)} (I + c_q N A) \left( I + \frac{c_q N A}{2} \right) \cdots \left( I + \frac{c_q N A}{k} \right) \right] \right\} \end{aligned} \tag{9}$$

*p* = 1, 2, …

The multiplication with *A*<sup>−1</sup> in (6) is avoided by arranging the first summation in (9) in terms of powers of *A* and by taking into account that

$$p! \sum_{k=1}^{p} \frac{10^{-qk}}{k!(p-k)!} = (1 + 10^{-q})^p - 1 \tag{10}$$

It is obvious that using (9) leads to computations with a much smaller number of terms retained in the series for a given *N*. With the same amount of computation, we obtain an *x*(*e*<sup>−*N*</sup>) which is much closer to the exact solution *x*(0) of the system of equations (1).
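Relation (10) is simply the binomial expansion of (1 + 10<sup>−*q*</sup>)<sup>*p*</sup> with the *k* = 0 term moved to the left-hand side; a quick numerical check (the value of *q* and the range of *p* are arbitrary):

```python
import math

q = 0.5
t = 10.0 ** (-q)                  # t = 10^{-q}
for p in range(1, 9):
    # left-hand side of (10): p! * sum_{k=1}^{p} t^k / (k! (p-k)!)
    lhs = math.factorial(p) * sum(
        t ** k / (math.factorial(k) * math.factorial(p - k))
        for k in range(1, p + 1)
    )
    # right-hand side: (1 + t)^p - 1, by the binomial theorem
    assert abs(lhs - ((1.0 + t) ** p - 1.0)) < 1e-12
```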

## **3. Methods of numerical integration**

In this Section, we use special kinds of functionals that lead to the construction of differential equations which allow a substantial increase in the efficiency of the solution of large linear algebraic systems.

### **3.1 Vector differential equations and their application to the solution of (1)**

Consider a functional of the form

$$F(\mathbf{x}) = \|A\mathbf{x} - \mathbf{b}\|^{\alpha} \tag{11}$$

associated to (1), where *α* is a positive real number to be chosen. A real variable *v*, *v* > 0, *v* ∈ (*v<sub>o</sub>*, *v<sub>S</sub>*), is now defined by

$$F(\mathbf{x}(v)) = F(\mathbf{x}(v_o)) \, h(v) \tag{12}$$

where *h*(*v*) is an appropriately selected real function with definite first and second derivatives in (*v<sub>o</sub>*, *v<sub>S</sub>*), *v<sub>o</sub>* corresponding to a starting point *x*(*v<sub>o</sub>*) ≡ *x*<sup>(0)</sup>, with *h*(*v<sub>o</sub>*) = 1, and *v<sub>S</sub>* corresponding to the solution vector of (1), *x*(*v<sub>S</sub>*) ≡ *x<sub>S</sub>*, with *h*(*v<sub>S</sub>*) = 0. Denoting *F*(*x*<sup>(0)</sup>) ≡ *F<sub>o</sub>*, we have

$$F_o \frac{dh}{dv} = \alpha \|A\mathbf{x} - \mathbf{b}\|^{\alpha-2} \left( A^T(A\mathbf{x} - \mathbf{b}) \right)^T \frac{d\mathbf{x}}{dv} \tag{13}$$


Thus,

$$\frac{d\mathbf{x}}{dv} = \frac{F_o}{\alpha} \frac{dh}{dv} \frac{A^T(A\mathbf{x} - \mathbf{b})}{\|A\mathbf{x} - \mathbf{b}\|^{\alpha-2} \left\| A^T(A\mathbf{x} - \mathbf{b}) \right\|^2} \tag{14}$$

which is the differential equation to be integrated from *v* = *v<sub>o</sub>* to *v* = *v<sub>S</sub>*. The second derivative of *x* is

$$\begin{aligned} \frac{d^2\mathbf{x}}{dv^2} ={} & \frac{F_o}{\alpha} \frac{d^2h}{dv^2} \frac{A^T(A\mathbf{x} - \mathbf{b})}{\|A\mathbf{x} - \mathbf{b}\|^{\alpha-2} \left\| A^T(A\mathbf{x} - \mathbf{b}) \right\|^2} \\ & + \left( \frac{F_o}{\alpha} \right)^2 \left( \frac{dh}{dv} \right)^2 \frac{1}{\|A\mathbf{x} - \mathbf{b}\|^{2(\alpha-1)} \left\| A^T(A\mathbf{x} - \mathbf{b}) \right\|^2} \\ & \times \left\{ \frac{\|A\mathbf{x} - \mathbf{b}\|^2}{\left\| A^T(A\mathbf{x} - \mathbf{b}) \right\|^2} \left[ A^T A - 2 \frac{\left\| A A^T(A\mathbf{x} - \mathbf{b}) \right\|^2}{\left\| A^T(A\mathbf{x} - \mathbf{b}) \right\|^2} I \right] + (2 - \alpha) I \right\} A^T(A\mathbf{x} - \mathbf{b}) \end{aligned} \tag{15}$$

From (11) and (12) we get a useful relationship,

$$\|A\mathbf{x} - \mathbf{b}\| = \left( F_o \, h(v) \right)^{1/\alpha} \tag{16}$$

that allows one to simply monitor the magnitude of the residual vector of (1) during the computation process.

As explained in the Introduction, we apply the Euler method for the solution of (14) and compute successively

$$\mathbf{x}^{(i+1)} = \mathbf{x}^{(i)} + \eta \left( \frac{d\mathbf{x}}{dv} \right)_{\mathbf{x} = \mathbf{x}^{(i)}}, \quad i = 0, 1, 2, \ldots \tag{17}$$

where *η* is the step size. In the absence of any hint about a good starting point *x*<sup>(0)</sup> corresponding to *v* = *v<sub>o</sub>*, we have used the point along the normal from the origin *x* = 0 to the surface *F*(*x*) = *const* in (11) which is the closest to the solution point *x<sub>S</sub>* [8], i.e.,

$$\mathbf{x}^{(0)} = \frac{\left\|\mathbf{b}\right\|^2}{\left\|\mathbf{A}^T \mathbf{b}\right\|^2} \mathbf{A}^T \mathbf{b} \tag{18}$$

The function *h*(*v*) and the parameter *α* are chosen such that the first and second derivatives of *x* in (14) and (15) remain finite when *v* → *v<sub>S</sub>*, i.e., when the residual ‖*Ax* − *b*‖ → 0, while the errors evaluated with the second derivative are kept reasonably small along the interval of integration *v* ∈ (*v<sub>o</sub>*, *v<sub>S</sub>*). For each *h*(*v*), *α* is determined by imposing the condition that the second derivative in (15) tend to zero as ‖*Ax* − *b*‖ → 0. To decide on the value of *α* for a given *h*(*v*), we require a good rate of decrease of ‖*Ax* − *b*‖ at the beginning of the computation process, for instance (see (11) and (12))

$$\frac{\left\| A\mathbf{x}^{(1)} - \mathbf{b} \right\|}{\left\| A\mathbf{x}^{(0)} - \mathbf{b} \right\|} = \left( h(\eta) \right)^{1/\alpha} \cong 0.8 \tag{19}$$

when *η* = 0.1. Finally, to solve (1), we determine the actual step size to be used for the numerical solution such that the errors at the beginning of the computation process are small, say

$$\frac{\eta^2}{2} \frac{1}{\left\| \mathbf{x}^{(1)} \right\|} \left\| \frac{d^2\mathbf{x}}{dv^2} \right\|_{\mathbf{x} = \mathbf{x}^{(0)}} < 0.01 \tag{20}$$

If the magnitude of the residual vector, which is computed at each step, no longer decreases, the computation is continued with a new cycle of integration by applying (17) with *x*<sup>(0)</sup> replaced by a new starting point, i.e.,

$$\mathbf{x}^{(new)} = \mathbf{x}^{(last)} - \frac{\left\| A\mathbf{x}^{(last)} - \mathbf{b} \right\|^2}{\left\| A^T \left( A\mathbf{x}^{(last)} - \mathbf{b} \right) \right\|^2} A^T \left( A\mathbf{x}^{(last)} - \mathbf{b} \right) \tag{21}$$

where *x*<sup>(*last*)</sup> is the position vector from the preceding step. *x*<sup>(*new*)</sup> is the closest point to *x<sub>S</sub>* along the normal to the surface *F*(*x*) = *const* taken at *x*<sup>(*last*)</sup> [8]. Subsequent cycles of integration are performed in the same way until a satisfactory approximate solution of (1) is obtained. If the difference between *x*<sup>(*new*)</sup> and *x*<sup>(*last*)</sup> is insignificant, we find the point along the normal to *F*(*x*) = *const*, taken at *x*<sup>(*new*)</sup>, where *F* has a minimum and then apply (21) again. It should be noted that as one approaches the solution point, the direction of the gradient *A<sup>T</sup>*(*Ax* − *b*) tends to become more and more perpendicular to the direction of *x<sub>S</sub>* − *x* and, thus, the residual ‖*Ax* − *b*‖ and the relative error ‖*x* − *x<sub>S</sub>*‖/‖*x<sub>S</sub>*‖ no longer decrease as expected. This is why the computation has to be continued by opening a new cycle of integration. For systems with higher condition numbers, this happens sooner.

Numerical experiments using *h*(*v*) = 1 − *v* with *α* = 0.45 and also using *h*(*v*) = (1 − *v*)<sup>2</sup> with *α* = 0.9 show that only two to five integration cycles with a step size of 0.1 are needed in order to get an accuracy of about 1% for the solution of systems with condition numbers up to 100.
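The complete procedure of this subsection can be condensed into a short sketch (an illustrative reading of (14), (17), (18), and (21) with *h*(*v*) = 1 − *v*, *α* = 0.45, and *η* = 0.1; the 2×2 test system and the fixed numbers of steps and cycles are assumptions, not the author's code):

```python
import numpy as np

def euler_cycle(A, b, x0, alpha=0.45, eta=0.1, steps=9):
    """One integration cycle of Eq. (14) with h(v) = 1 - v (so dh/dv = -1),
    advanced by Euler's method, Eq. (17), from v = 0 with step eta."""
    F0 = np.linalg.norm(A @ x0 - b) ** alpha       # F_o, fixed for the cycle
    x = x0.copy()
    for _ in range(steps):
        r = A @ x - b
        g = A.T @ r                                # gradient direction A^T(Ax - b)
        nr, ng = np.linalg.norm(r), np.linalg.norm(g)
        if nr == 0.0:
            break
        x = x - eta * (F0 / alpha) * g / (nr ** (alpha - 2.0) * ng ** 2)
    return x

def new_start(A, b, x):
    """Starting point for the next cycle, Eq. (21)."""
    r = A @ x - b
    g = A.T @ r
    return x - (np.linalg.norm(r) ** 2 / np.linalg.norm(g) ** 2) * g

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 4.0])
x = (np.linalg.norm(b) ** 2 / np.linalg.norm(A.T @ b) ** 2) * (A.T @ b)  # Eq. (18)
r0 = np.linalg.norm(A @ x - b)
for _ in range(8):                                 # cycles joined by Eq. (21)
    x = euler_cycle(A, b, x)
    x = new_start(A, b, x)
```

Note that each Euler step costs two matrix-vector products, and the residual norm can be monitored against the prescribed profile (16) at no extra cost.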

### **3.2 A related iterative method**

The basic idea of this method is to find, starting from a point *x*<sup>(0)</sup>, a point *x*<sup>(1)</sup> along the gradient of a functional (11) associated with the general system (1), such that the magnitude of the new residual vector is an established fraction of its initial value, ‖*Ax*<sup>(1)</sup> − *b*‖ = *τ*‖*Ax*<sup>(0)</sup> − *b*‖, *τ* < 1. Instead of performing an integration as in Section 3.1, one proceeds iteratively, i.e.,

$$\left\|Ax^{(i+1)} - b\right\| = \tau \left\|Ax^{(i)} - b\right\|, \ i = 0, 1, 2, \dots \tag{22}$$

for each iteration the starting point being the point found in the preceding iteration, with *τ* kept as closely as possible at the same value. To do this, we impose that *F*(*x*) in (11) vary from *x*<sup>(*i*)</sup> to *x*<sup>(*i*+1)</sup> in the same way for each iteration, namely,


$$F(\mathbf{x}) = F\left( \mathbf{x}^{(i)} \right) h(v) \tag{23}$$

where *h*(*v*) is now a real function of a real variable, monotone decreasing in the interval *v* ∈ [*v<sub>o</sub>*, *v<sub>o</sub>* + *η*], *v<sub>o</sub>* ≥ 0, *η* > 0, with definite first and second derivatives, and with *h*(*v<sub>o</sub>*) = 1, such that

$$F\left( \mathbf{x}^{(i+1)} \right) = F\left( \mathbf{x}^{(i)} \right) h(v_o + \eta), \quad i = 0, 1, 2, \ldots \tag{24}$$

Then, from (11),

$$\frac{F\left( \mathbf{x}^{(i+1)} \right)}{F\left( \mathbf{x}^{(i)} \right)} = \frac{\left\| A\mathbf{x}^{(i+1)} - \mathbf{b} \right\|^{\alpha}}{\left\| A\mathbf{x}^{(i)} - \mathbf{b} \right\|^{\alpha}} = \tau^{\alpha} \tag{25}$$

and

$$\tau = \left( h(v_o + \eta) \right)^{1/\alpha} \tag{26}$$

*x*<sup>(*i*+1)</sup> is computed as

$$\mathbf{x}^{(i+1)} = \mathbf{x}^{(i)} + \eta \left(\frac{d\mathbf{x}}{dv}\right)\_{v=v\_o} \tag{27}$$

i.e., with (14) where *F<sub>o</sub>* is replaced with *F*(*x*<sup>(*i*)</sup>),

$$\mathbf{x}^{(i+1)} = \mathbf{x}^{(i)} + \frac{\eta}{\alpha} \left( \frac{dh}{dv} \right)_{v=v_o} \frac{\left\| A\mathbf{x}^{(i)} - \mathbf{b} \right\|^2}{\left\| A^T \left( A\mathbf{x}^{(i)} - \mathbf{b} \right) \right\|^2} A^T \left( A\mathbf{x}^{(i)} - \mathbf{b} \right), \quad i = 0, 1, 2, \ldots \tag{28}$$

in which, taking into account (21), one has to have |(*η*/*α*)(*dh*/*dv*)<sub>*v*=*v<sub>o</sub>*</sub>| < 1. This expression corresponds to the first step in the numerical integration of the differential Eq. (14) by Euler's method, starting from *x*<sup>(*i*)</sup> with a step *η*, or to the first two terms of the Taylor series expansion of *x*(*v<sub>o</sub>* + *η*).

The starting value *x*<sup>(0)</sup> and the function *h*(*v*) are chosen in the same way as in Section 3.1. For selected ratios *η*/*α*, the iteration cycle continues as long as the residual ‖*Ax*<sup>(*i*)</sup> − *b*‖ decreases at a proper rate. Theoretically, to make ‖*Ax*<sup>(*i*)</sup> − *b*‖ = *ε*‖*Ax*<sup>(0)</sup> − *b*‖ with *ε* ≪ 1, one would need to conduct ln *ε*/ln *τ* iterations. Since computation errors are introduced at each iteration (as in the Euler method), the initially chosen value of *τ* cannot be maintained as the iterative process continues. An approximate solution of (1) is obtained at the end of the iteration cycle as before, applying (21). Subsequent iteration cycles are performed with the starting point in each cycle being the point determined in the preceding cycle.

Numerical results were generated using, as in the method presented in Section 3.1, *h*(*v*) = 1 − *v* but with *η*/*α* = 0.2, and *h*(*v*) = (1 − *v*)<sup>2</sup> with *η*/*α* = 0.1. Now, in both cases, *v<sub>o</sub>* = 0, (*η*/*α*)(*dh*/*dv*)<sub>*v*=0</sub> = −0.2 in (28), and *τ* ≅ 0.8. A substantial increase in *τ*, approaching *τ* = 1, or oscillations of its value during the first few iterations show that the computation errors evaluated from (15) (with *F<sub>o</sub>* replaced by ‖*Ax* − *b*‖<sup>*α*</sup>) are too big. In such a situation, one has to decrease the factor (*η*/*α*)(*dh*/*dv*)<sub>*v*=0</sub> in (28), i.e., to decrease *η*/*α* for a given *h*(*v*), which leads to an increase of *τ* and a decrease of (1/2)*η*<sup>2</sup>‖*d*<sup>2</sup>*x*/*dv*<sup>2</sup>‖<sub>*v*=*v<sub>o</sub>*</sub>. For an accuracy of 1% for the solution of (1), one needs three to six short iteration cycles, with about eight iterations per cycle, for systems with condition numbers below 100. Of course, as in the previous method, an increased number of iteration cycles is required to reach the same accuracy for poorly conditioned systems.

*Remarks.* One can easily see that *x*<sup>(*i*+1)</sup> in (28) can be expressed in the form

$$\mathbf{x}^{(i+1)} = \mathbf{x}^{(0)} + \sum_{\ell=0}^{i} \mathbf{x}_d^{(\ell)}, \quad i = 0, 1, 2, \ldots \tag{29}$$

where

$$\mathbf{x}_d^{(\ell)} \equiv \mathbf{x}^{(\ell+1)} - \mathbf{x}^{(\ell)} = \frac{\eta}{\alpha} \left( \frac{dh}{dv} \right)_{v=v_o} \frac{\left\| A\mathbf{x}^{(\ell)} - \mathbf{b} \right\|^2}{\left\| A^T \left( A\mathbf{x}^{(\ell)} - \mathbf{b} \right) \right\|^2} A^T \left( A\mathbf{x}^{(\ell)} - \mathbf{b} \right) \tag{30}$$

with the solution of (1) given by the infinite series

$$\mathbf{x}\_{\rm S} = \mathbf{x}^{(0)} + \mathbf{x}\_{d}^{(0)} + \mathbf{x}\_{d}^{(1)} + \mathbf{x}\_{d}^{(2)} + \cdots \tag{31}$$

*x<sub>d</sub>*<sup>(*ℓ*)</sup> in (30) can also be written as

$$\mathbf{x}_d^{(\ell)} = \frac{\eta}{\alpha} \left( \frac{dh}{dv} \right)_{v=v_o} \frac{\left\| \mathbf{b}^{(\ell)} \right\|^2}{\left\| A^T \mathbf{b}^{(\ell)} \right\|^2} A^T \mathbf{b}^{(\ell)} \tag{32}$$

where

$$\begin{aligned} \mathbf{b}^{(\ell)} &= A\mathbf{x}_d^{(\ell-1)} + \mathbf{b}^{(\ell-1)}, \quad \ell = 1, 2, \ldots; \\ \mathbf{b}^{(0)} &= A\mathbf{x}^{(0)} - \mathbf{b} \end{aligned} \tag{33}$$

This shows that the difference *x<sub>d</sub>*<sup>(*ℓ*)</sup> is a "rough approximation" of the solution of a system (1) whose right-hand side is, at each iteration, just the residual vector of the system in the preceding iteration, which decreases in magnitude from one iteration to the next. Thus, the method presented in this section represents a practical, concrete implementation of the well-known idea of successive approximations/perturbations [9].

To search for a possible increase in efficiency, more general functionals of the form

$$F(\mathbf{x}) = \mathcal{F}(\|A\mathbf{x} - b\|)\tag{34}$$


could be tested, where ℱ and its first derivative are finite and continuous at all points within the interval of integration. Now the equation corresponding to (14) is

$$\frac{d\mathbf{x}}{dv} = F_o \frac{dh}{dv} \left( \frac{d\mathcal{F}}{d\|A\mathbf{x} - \mathbf{b}\|} \right)^{-1} \frac{\|A\mathbf{x} - \mathbf{b}\|}{\left\| A^T(A\mathbf{x} - \mathbf{b}) \right\|^2} \, A^T(A\mathbf{x} - \mathbf{b}) \tag{35}$$

*Note.* The paths of integration in the methods presented can be looked at as being the field lines of a Poissonian electrostatic field in a homogeneous region bounded by a surface of constant potential *F*(*x*) = ‖*Ax<sub>o</sub>* − *b*‖<sup>*α*</sup> and with a zero potential at the solution point, *F*(*x<sub>S</sub>*) = 0. In the particular case of *α* = 2, the ratio of the volume density of charge within the region to the material permittivity is constant, namely, −2∑<sub>*i*=1</sub><sup>*n*</sup>∑<sub>*k*=1</sub><sup>*n*</sup> *a<sub>ik</sub>*<sup>2</sup>, where *a<sub>ik</sub>* are the entries of *A*. By altering this electrostatic field one could eventually speed up the approach to the solution point along the integration path.

## **4. Method of alternate minimizations**

This simple method is based on the property of a functional of the form (34), associated with a general system of equations (1), that it allows one to minimize not only the value of the functional but also the distance to the solution point of (1). Using only minimizations along ordinary gradients of the functional is not efficient unless the system is very well-conditioned.

For computing efficiently an approximate solution of general, large linear systems of algebraic equations, we propose in this Section the alternate application of minimizations of a functional and of the distance to the solution point, along the direction of the gradient of the functional.

Consider the functional in (34), where ℱ is a real function defined for all *x* from a chosen starting point *x*<sup>(*s*)</sup> to the solution point *x<sub>S</sub>* of (1), monotone decreasing with ‖*Ax* − *b*‖, with ℱ(0) = *F*(*x<sub>S</sub>*). The gradient of *F*(*x*) is

$$\nabla F(\mathbf{x}) = \frac{d\mathcal{F}(||A\mathbf{x} - b||)}{d||A\mathbf{x} - b||} \frac{A^T(A\mathbf{x} - b)}{||A\mathbf{x} - b||}\tag{36}$$

i.e., it is in the direction of the vector *A<sup>T</sup>*(*Ax* − *b*).

The minimum of *F*(*x*) along a straight line through *x*<sup>(*s*)</sup> in the direction defined by an *n*-dimensional vector *d* is found by writing *x* − *x*<sup>(*s*)</sup> = *λd*, with *λ* a scalar to be determined, and imposing that *d* and *A<sup>T</sup>*(*Ax* − *b*) be perpendicular. This gives *λ* and *x* for the minimum of *F*(*x*), namely,

$$\mathbf{x}_{\min F}^{(d)} = \mathbf{x}^{(s)} - \frac{(A\mathbf{d})^T \left( A\mathbf{x}^{(s)} - \mathbf{b} \right)}{\|A\mathbf{d}\|^2} \, \mathbf{d} \tag{37}$$

The point at which *F*(*x*) is minimum along the normal taken at *x*<sup>(*s*)</sup> is determined by replacing *d* with *A<sup>T</sup>*(*Ax*<sup>(*s*)</sup> − *b*) in (37),

$$\mathbf{x}_{\min F} = \mathbf{x}^{(s)} - \frac{\left\| A^T \left( A\mathbf{x}^{(s)} - \mathbf{b} \right) \right\|^2}{\left\| A A^T \left( A\mathbf{x}^{(s)} - \mathbf{b} \right) \right\|^2} A^T \left( A\mathbf{x}^{(s)} - \mathbf{b} \right) \tag{38}$$

The minimum distance between the solution point *x<sub>S</sub>* and a point *x* along the direction of the gradient of *F*(*x*) taken at *x*<sup>(*s*)</sup> is attained at

$$\mathbf{x} = \mathbf{x}^{(s)} + \mu A^T \left( A\mathbf{x}^{(s)} - \mathbf{b} \right) \tag{39}$$

where the scalar *μ* is determined by requiring the distance k k *x* � *xS* to be minimum. This gives *μ* and *x* for this minimum, namely,

$$\mathbf{x}_{\min D} = \mathbf{x}^{(s)} - \frac{\left\| A\mathbf{x}^{(s)} - \mathbf{b} \right\|^2}{\left\| A^T \left( A\mathbf{x}^{(s)} - \mathbf{b} \right) \right\|^2} A^T \left( A\mathbf{x}^{(s)} - \mathbf{b} \right) \tag{40}$$

which depends only on the residual vector and on *A<sup>T</sup>*(*Ax* − *b*) at the point *x*<sup>(*s*)</sup>, independently of the form of ℱ.

As already mentioned, repeated minimizations using only (38) are not efficient for solving a general system (1). The same is true when using only (40). To obtain an approximate solution of (1), we apply the formulae (40) and (38) alternately, the starting point *x*<sup>(*s*)</sup> being each time the point determined in the preceding minimization. As in Section 3, when there is no indication about a convenient first starting point *x*<sup>(*s*)</sup>, one can use the origin, *x*<sup>(*s*)</sup> = 0. Only a few iterations, up to ten, are needed for a solution with a relative error ‖*x̃* − *x<sub>S</sub>*‖/‖*x<sub>S</sub>*‖ of about 1% for systems with condition numbers of up to 100.

The procedure is surprisingly efficient for a rough solution even for very ill-conditioned systems. For example, for the system *Hx* = *b*, where *H* is the Hilbert matrix of order eight and whose solution is *x<sub>S</sub>* = [1, 1, … , 1]<sup>*T*</sup> [10], a solution of 6% accuracy is obtained with a starting point *x*<sup>(*s*)</sup> = 0 by performing only seven alternate minimizations (40), (38), with no equilibration or regularization applied beforehand to the system. By comparison [10], for the Gauss elimination method the accuracy is only 40.6%, for Gauss elimination with equilibration 9.15%, and for the Householder method with equilibration 5.6%.
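This Hilbert-matrix experiment is easy to reproduce in outline (a sketch, not the original computation; in particular, the exact ordering of the seven alternate steps, here (40), (38), (40), …, is our assumption, so the error obtained may differ somewhat from the 6% reported above):

```python
import numpy as np

def min_dist_step(A, b, x):
    """Eq. (40): step to the point nearest to x_S along the gradient."""
    r = A @ x - b
    g = A.T @ r
    return x - (np.linalg.norm(r) ** 2 / np.linalg.norm(g) ** 2) * g

def min_F_step(A, b, x):
    """Eq. (38): step to the minimum of F along the same direction."""
    r = A @ x - b
    g = A.T @ r
    return x - (np.linalg.norm(g) ** 2 / np.linalg.norm(A @ g) ** 2) * g

n = 8
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
x_true = np.ones(n)
b = H @ x_true                      # right-hand side with known solution
x = np.zeros(n)                     # starting point x^(s) = 0
for k in range(7):                  # seven alternate minimizations (40), (38), ...
    x = min_dist_step(H, b, x) if k % 2 == 0 else min_F_step(H, b, x)
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

Note that neither step requires forming *H*<sup>−1</sup> or factorizing *H*; each costs a few matrix-vector products.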

Experimental results show that, as the new point *x* given by (40), (38) becomes closer to the solution point *x<sub>S</sub>*, the direction *A<sup>T</sup>*(*Ax* − *b*) of the gradient in (36), i.e., of the correction terms in (40) and (38), becomes more and more orthogonal to the direction of *x* − *x<sub>S</sub>*. This causes the magnitude ‖*Ax* − *b*‖ of the residual vector of (1) and the relative difference ‖*x* − *x<sub>S</sub>*‖/‖*x<sub>S</sub>*‖ to no longer decrease significantly. To progress with the computation, one can intercalate minimizations of *F*(*x*) along directions different from that of *A<sup>T</sup>*(*Ax* − *b*). By numerical testing, it is observed that *x* − *x*<sup>(0)</sup>, where *x*<sup>(0)</sup> is the original starting point, has at this stage, in most cases, a significant component along the direction of *x* − *x<sub>S</sub>*. Thus, one can use such a minimization direction to try to improve the solution accuracy.

## **5. Conclusions**

Highly convergent iteration formulae for solving general, large linear systems of algebraic equations are derived from exact analytic solutions of particular differential equations, based on new, accurate series representations of the matrix exponential recently published in [3]. Specialized differential equations which make it possible to monitor and control the computation errors and the decrease of the magnitude of the residual vector ‖*Ax* − *b*‖ of (1) at each stage of the computation procedure are constructed and integrated numerically to approximate the solution of these systems. Two solution methods have been presented in this book chapter, and the simplest Euler method is applied for the numerical integration of the vector differential equations. In the first method, cycles of integration are used, each cycle starting from a convenient value of the unknown and continuing until the rate of decrease of ‖*Ax* − *b*‖ becomes too small. The second method is an iterative method where a fixed rate of decrease of ‖*Ax* − *b*‖ is imposed at the beginning of the iteration cycles. In this method, only the first step in the Euler method is computed at each iteration, and each cycle of iteration is conducted until there is no significant change in the magnitude of the residual vector. These two methods are highly efficient for large systems with condition numbers below 100, since only up to six cycles with fewer than ten steps per cycle are necessary to get a solution accuracy of 1%, each step within a cycle requiring two matrix-vector multiplications. The method in Section 3.2 seems to be more efficient for systems with bigger condition numbers. The number of cycles of integration/iteration increases with the condition number, and preconditioning should be done for ill-conditioned systems before attempting to apply the methods presented in this work.

The iterative method of alternate minimizations presented in Section 4 is intended for computing quickly a rough approximation of the solution of linear systems of equations. In this method, preequilibration or preregularization/preconditioning are not required to obtain useful results even for systems with poorly conditioned matrices.

The present book chapter is intended as a review of the work done so far on the subject matter. It describes the proposed new methods for an approximate solution of large linear algebraic systems using appropriately chosen matrix functionals and shows the procedures for constructing the concrete solution algorithms. At this stage, the results presented are validated only by preliminary numerical experiments, which indicate the efficiency of the proposed procedures for deriving approximate/rough solutions of large systems. More numerical experiments involving systems with higher condition numbers, as well as theoretical results, will be presented in future work. It is my hope that other researchers will be attracted to this new area and that rigorous theoretical results (theorems, etc.) will also be established.

*Matrix Theory - Classics and Advances*

## **Author details**

Ioan R. Ciric The University of Manitoba, Winnipeg, Canada

\*Address all correspondence to: ioan.ciric@umanitoba.ca

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

*Using Matrix Differential Equations for Solving Systems of Linear Algebraic Equations DOI: http://dx.doi.org/10.5772/intechopen.104209*

## **References**

[1] Ciric IR. Series solutions of vector differential equations related to linear systems of algebraic equations. In: Proceedings of the 12th International Symposium on Antenna Technology and Applied Electromagnetics and Canadian Radio Sciences Conference (ANTEM/URSI). Montréal, Canada: IEEE; 2006. pp. 317-321. ISBN: 978-0-9638425-1-7

[2] Ciric IR. Rapidly convergent matrix series for solving linear systems of equations. In: Proceedings of the 17th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Montréal, Canada: IEEE; 2016. DOI: 10.1109/ANTEM.2016.755022. ISBN: 978-1-4673-8478-0

[3] Ciric IR. New matrix series formulae for matrix exponentials and for the solution of linear systems of algebraic equations. In: Shah K, Okutmustur B, editors. Functional Calculus. London: IntechOpen; 2020. DOI: 10.5772/intechopen.2022.77599. ISBN: 978-1-83880-007-9

[4] Ciric IR. Approximate solution of large linear algebraic systems using differential equations. In: Proceedings of the 19th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1

[5] Burden RL, Faires JD. Numerical Analysis. 5th ed. Boston: PWS-KENT; 1993. ISBN: 0-534-93219-3

[6] Ciric IR. Using selected rates of residual vector decrease for solving iteratively large linear systems. In: Proceedings of the 19th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1

[7] Ciric IR. Alternate minimizations for an approximate solution of general systems of linear algebraic equations. In: Proceedings of the 19th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1

[8] Ciric IR. A geometric property of the functional associated with general systems of algebraic equations. In: Proceedings of the 12th International Symposium on Antenna Technology and Applied Electromagnetics and Canadian Radio Sciences Conference (ANTEM/URSI). Montréal, Canada: IEEE; 2006. pp. 311-315. ISBN: 978-0-9638425-1-7

[9] Lanczos C. Applied Analysis. New York: Dover Publications, Inc.; 1988. ISBN: 0-486-65656-X (Paperback)

[10] Hämmerlin G, Hoffmann K-H. Numerical Mathematics. New York: Springer-Verlag; 1991
