Applied Linear Algebra in Action

Arranging the interpolation points in the order indicated above allows us to reduce the number of FFTs needed. Using the relation from the Lanczos iteration,

$$A X\_K = X\_K T\_K + \mathbf{r}\_K \mathbf{e}\_K^T,$$

the coefficients *C<sub>j,ω</sub>*, for *j* = 0, 1, …, *K* − 1, are the coefficients of *p̃*<sub>*K*−1,*ω*</sub> in power form, which can easily be computed by repeated application of nested multiplication to the last *K* terms of the Newton form of *p*<sub>2*K*−1,*ω*</sub>. Finally, using F to denote the discrete Fourier transform, it can easily be seen that the solution at each time step requires *K* FFTs and one inverse FFT. It is worth noting that once the Krylov subspace vectors *A<sup>j</sup>***w**, *j* = 0, 1, …, *K* − 1, are computed, the *K* FFTs can be performed in parallel.

**6. Numerical results**

In this section, we compare several versions of EPI methods, as applied to two test problems; additional test problems are featured in [6, 7]. The versions differ in the way in which they compute matrix function-vector products of the form *φ*(*Aτ*)**b**: KSS-EPI (with and without denoising), Krylov-EPI, adaptive Krylov projection (AKP), and Leja-point interpolation (LEJA).


All of these approaches are used in the context of two EPI methods. The first is a third-order, two-stage EPI method [1]

$$\begin{aligned} Y\_1 &= \mathbf{y}\_n + \frac{1}{3} h a\_{11} \varphi\_1 \Big( \frac{1}{3} hA \Big) F(\mathbf{y}\_n), \\ \mathbf{y}\_{n+1} &= \mathbf{y}\_n + h \varphi\_1 (hA) F(\mathbf{y}\_n) + 3 h b\_1 \varphi\_2 (hA) [F(Y\_1) - F(\mathbf{y}\_n) - A(Y\_1 - \mathbf{y}\_n)], \end{aligned} \tag{29}$$

where *a*11 = 9/4 and *b*1 = 32/81, and

$$R(Y\_1) = F(Y\_1) - F(\mathbf{y}\_n) - A(Y\_1 - \mathbf{y}\_n).$$

For this method,

$$\varphi\_1(\lambda) = \frac{e^{\lambda} - 1}{\lambda}, \quad \varphi\_2(\lambda) = \frac{e^{\lambda} - \lambda - 1}{\lambda^2}, \quad \varphi\_3(\lambda) = \frac{e^{\lambda}(6 - \lambda) - (6 + 5\lambda + 2\lambda^2)}{\lambda^3}.$$
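Evaluating these functions directly loses accuracy to cancellation when |*λ*| is small, so a practical implementation switches to a truncated Taylor series near zero. The following is a minimal sketch of elementwise evaluation; the function names and the switching threshold are our own choices, not from the chapter.

```python
import numpy as np

def _piecewise(lam, direct, series, tol=1e-4):
    """Evaluate `direct` away from 0 and the Taylor `series` near 0,
    to avoid catastrophic cancellation in the direct formulas."""
    lam = np.asarray(lam, dtype=float)
    small = np.abs(lam) < tol
    out = np.empty_like(lam)
    out[small] = series(lam[small])
    out[~small] = direct(lam[~small])
    return out

def phi1(lam):
    """phi_1(lam) = (exp(lam) - 1)/lam."""
    return _piecewise(lam,
                      lambda l: (np.exp(l) - 1) / l,
                      lambda l: 1 + l/2 + l**2/6 + l**3/24)

def phi2(lam):
    """phi_2(lam) = (exp(lam) - lam - 1)/lam**2."""
    return _piecewise(lam,
                      lambda l: (np.exp(l) - l - 1) / l**2,
                      lambda l: 0.5 + l/6 + l**2/24 + l**3/120)

def phi3(lam):
    """phi_3(lam) = (exp(lam)*(6 - lam) - (6 + 5 lam + 2 lam^2))/lam**3,
    the method-specific third function displayed above."""
    return _piecewise(lam,
                      lambda l: (np.exp(l)*(6 - l) - (6 + 5*l + 2*l**2)) / l**3,
                      lambda l: 0.5 + l/12 + l**2/120)
```

In a KSS or Krylov setting these would be applied to the eigenvalues of the small projected matrix, not to *A* itself.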

The second is a fifth-order, three-stage EPI method [25]

$$\begin{aligned} Y\_1 &= \mathbf{y}\_n + h a\_{11} \psi\_1(g\_{11} h A) F(\mathbf{y}\_n), \\ Y\_2 &= \mathbf{y}\_n + h a\_{21} \psi\_1(g\_{21} h A) F(\mathbf{y}\_n) + h a\_{22} \psi\_2(g\_{22} h A) R(Y\_1), \\ \mathbf{y}\_{n+1} &= \mathbf{y}\_n + h b\_1 \psi\_1(g\_{31} h A) F(\mathbf{y}\_n) + h b\_2 \psi\_2(g\_{32} h A) R(Y\_1) \\ &\quad + h b\_3 \psi\_3(g\_{33} h A) [-2 R(Y\_1) + R(Y\_2)], \end{aligned} \tag{30}$$

where

$$\psi\_i(z) = \sum\_{j=1}^{3} p\_{ij} \varphi\_j(z), \quad i = 1, 2, 3,$$

and the coefficients *gij*, *aij*, *bj*, and *pij* are obtained from the description of the EPIRK5s3 method in [25].
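Since each *ψ<sub>i</sub>* is just a fixed linear combination of the *φ<sub>j</sub>*, it can be assembled generically from a coefficient table. The sketch below assumes scalar arguments and the standard *φ*-function recurrence; the placeholder table used in the usage note is the identity, not the actual EPIRK5s3 coefficients, which come from [25].

```python
import math
import numpy as np

def phi(j, lam):
    """Standard phi-functions via the recurrence
    phi_j(lam) = (phi_{j-1}(lam) - 1/(j-1)!)/lam, with phi_0 = exp.
    (Adequate for moderate |lam|; a series fallback is needed near 0.)"""
    if j == 0:
        return np.exp(lam)
    return (phi(j - 1, lam) - 1.0 / math.factorial(j - 1)) / lam

def psi(i, lam, p):
    """psi_i(lam) = sum_{j=1}^{3} p[i-1, j-1] * phi_j(lam),
    following the construction above; p is the method's 3x3
    coefficient table (a placeholder here)."""
    return sum(p[i - 1, j - 1] * phi(j, lam) for j in range(1, 4))
```

With `p = np.eye(3)`, `psi(i, lam, p)` reduces to `phi(i, lam)`, which makes the construction easy to sanity-check.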

For all methods used to compute matrix function-vector products, efficiency will be measured in two ways: (1) total execution time required to integrate over the entire time interval, and (2) the average number of matrix-vector products (plus the average number of FFTs, in the case of KSS-EPI) performed for each evaluation of a matrix function-vector product of the form *φ*(*Aτ*)**b**. The phrase "number of iterations" is used throughout this section to refer to this quantity. Throughout this section, for the purpose of discussing performance as a function of spatial resolution, *N* refers to the number of grid points per dimension.

#### **6.1. Diffusive problem**

For our first test problem, we choose a diffusion-dominated PDE: the 2-D Allen-Cahn equation,

$$u\_t = \alpha \nabla^2 u + u - u^3, \quad x, y \in [0, 1], \quad t \in [0, 0.2] \tag{31}$$

with *α* = 0.1. We impose homogeneous Neumann boundary conditions, and the initial condition

**Figure 4.** Relative error plotted against execution time for solving the Allen-Cahn equation (31) using the third-order EPI method (29). Matrix function-vector products are computed using KSS-EPI with denoising (solid curves), Krylov-EPI (dashed curves), AKP (dashed-dotted curves), and LEJA (dotted curves), on grids with *N* = 50 ('+' markers), 150 ('x' markers), and 300 ('o' markers) points per dimension. Time steps used are *Δt* = (0.2)2<sup>−*p*</sup>, for *p* = 0, 1, 2, 3, 4.

$$
u\_0(x, y) = 0.4 + 0.1 \cos(2\pi x) \cos(2\pi y).
$$
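As a concrete sketch of this test problem's semi-discretization, the following sets up the grid, the initial condition, and the right-hand side of the resulting ODE system. The grid size, helper names, and the ghost-point treatment of the Neumann conditions are our own choices where the text does not specify implementation details.

```python
import numpy as np

# Semi-discretization sketch of the 2-D Allen-Cahn equation (31):
# u_t = alpha * Lap(u) + u - u**3 on [0,1]^2, homogeneous Neumann BCs.
N = 50                                   # grid points per dimension
h = 1.0 / (N - 1)
alpha = 0.1
x = np.linspace(0.0, 1.0, N)
X, Y = np.meshgrid(x, x, indexing="ij")
u0 = 0.4 + 0.1 * np.cos(2*np.pi*X) * np.cos(2*np.pi*Y)

def laplacian_neumann(u):
    """Centered 5-point Laplacian; homogeneous Neumann conditions are
    imposed by reflecting interior values across the boundary."""
    up = np.pad(u, 1, mode="reflect")    # ghost layer via reflection
    return (up[2:, 1:-1] + up[:-2, 1:-1]
            + up[1:-1, 2:] + up[1:-1, :-2] - 4.0*u) / h**2

def allen_cahn_rhs(u):
    """Right-hand side F(u) of the stiff ODE system fed to the EPI methods."""
    return alpha * laplacian_neumann(u) + u - u**3
```

The EPI methods (29) and (30) would then be applied to `allen_cahn_rhs`, with *A* the Jacobian of this right-hand side at the current solution.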


The Laplacian is discretized using a centered finite difference. For KSS-EPI, the low-frequency portion **b***L* consists of all components with wave numbers *ω<sup>i</sup>* ≤ 7, *i* = 1, 2. That is, for this problem, the value of *Nc*, as defined in the previous section, is 7. The low value of this threshold is due to the smoothness of the initial data.
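The splitting of a vector into its low-frequency portion **b***L* can be sketched with a 2-D FFT and a wave-number mask. The helper below is hypothetical (the chapter's actual implementation may differ); for this problem the threshold would be *Nc* = 7.

```python
import numpy as np

def low_frequency_part(b, Nc):
    """Return the portion of the 2-D grid function b whose Fourier
    components have wave numbers |omega_i| <= Nc in each dimension;
    all other components are zeroed out."""
    B = np.fft.fft2(b)
    n0, n1 = b.shape
    w0 = np.abs(np.fft.fftfreq(n0, d=1.0/n0))   # integer wave numbers
    w1 = np.abs(np.fft.fftfreq(n1, d=1.0/n1))
    mask = (w0[:, None] <= Nc) & (w1[None, :] <= Nc)
    return np.real(np.fft.ifft2(B * mask))
```

The high-frequency remainder is then simply `b - low_frequency_part(b, Nc)`.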

**Figure 5.** Average number of matrix-vector products, shown on a logarithmic scale, per matrix function-vector product evaluation for each method when solving the Allen-Cahn equation (31) using the third-order EPI method (29). For KSS and KSS denoised, FFTs are also included. For each method, bars correspond to grid sizes of *N* = 50, 150, 300 points per dimension, from left to right. Left plot: *Δt* = 0.2. Right plot: *Δt* = 0.0125.

**Figure 6.** Relative error plotted against execution time for solving the Allen-Cahn equation (31) using the fifth-order EPI method (30). Matrix function-vector products are computed using KSS-EPI with denoising (solid curves), Krylov-EPI (dashed curves), AKP (dashed-dotted curves), and LEJA (dotted curves), on grids with *N* = 50 ('+' markers), 150 ('x' markers), and 300 ('o' markers) points per dimension. Time steps used are *Δt* = (0.2)2<sup>−*p*</sup>, for *p* = 0, 1, 2, 3, 4.

The results are shown in **Figures 4** and **5** for the third-order EPI method (29), and in **Figures 6** and **7** for the fifth-order EPI method (30). It can be seen from **Figures 4** and **6** that for both EPI methods, the accuracy of all four approaches to computing matrix function-vector products is comparable. However, the efficiency and scalability of these four approaches are very different. In particular, these figures show that for KSS-EPI methods, the growth in execution time as a function of the number of grid points per dimension is much slower. For example, note that for the coarsest grid, with *N* = 50, the speed of KSS-EPI is similar to that of Krylov-EPI, but for *N* = 150 and *N* = 300, KSS-EPI is much faster. Furthermore, as *N* increases, this gap in performance grows. This is because, while both methods use Krylov projection, KSS-EPI applies it to an initial vector with only low-frequency components, thus significantly reducing the number of iterations needed for convergence.

**Figure 7.** Average number of matrix-vector products, shown on a logarithmic scale, per matrix function-vector product evaluation for each method when solving the Allen-Cahn equation (31) using the fifth-order EPI method (30). For KSS and KSS denoised, FFTs are also included. For each method, bars correspond to grid sizes of *N* = 50, 150, 300 points per dimension, from left to right. Left plot: *Δt* = 0.2. Right plot: *Δt* = 0.0125.

The difference in scalability is more clearly illustrated in **Figures 5** and **7**. It can be seen that for KSS-EPI, the number of overall iterations (matrix-vector multiplications + FFTs) shows almost no sensitivity to the grid size, compared to Krylov-EPI, AKP and Leja interpolation, all of which exhibit substantial growth as the number of grid points increases. It can also be seen that denoising results in a slightly higher number of overall iterations, so it is not beneficial for this problem. This is not surprising, as this is a diffusive problem that has a smooth solution and is therefore less susceptible to high-frequency oscillations during Lanczos iteration. Although AKP is slightly more accurate than KSS-EPI for fifth order, this is more than offset by the superior efficiency of KSS-EPI with denoising, particularly at larger grid sizes and larger time steps.

#### **6.2. Advective problem**

For our second test problem, we choose an advection-dominated PDE: a 1-D Burgers' equation

$$
u\_t + u u\_x = \nu u\_{xx}, \quad x \in [0, 1], \quad t \in [0, 1] \tag{32}
$$

with *ν* = 0.03. We impose homogeneous Dirichlet boundary conditions, and the initial condition

$$
u\_0(x) = \sin^3(3\pi x)\,(1 - x)^{3/2}.
$$
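A corresponding semi-discretization sketch for this problem follows; as before, the grid size, names, and the centered-difference treatment of the convection term are our own illustrative choices.

```python
import numpy as np

# Semi-discretization sketch of Burgers' equation (32):
# u_t + u u_x = nu * u_xx on [0,1], homogeneous Dirichlet BCs.
N = 500                                  # interior grid points
h = 1.0 / (N + 1)
nu = 0.03
x = np.linspace(h, 1.0 - h, N)           # interior nodes x_i = i*h
u0 = np.sin(3*np.pi*x)**3 * (1.0 - x)**1.5

def burgers_rhs(u):
    """F(u) = -u u_x + nu u_xx with centered differences; the
    Dirichlet conditions supply zero values at both endpoints."""
    up = np.concatenate(([0.0], u, [0.0]))
    ux = (up[2:] - up[:-2]) / (2.0*h)
    uxx = (up[2:] - 2.0*u + up[:-2]) / h**2
    return -u*ux + nu*uxx
```

Because this problem is advection-dominated, the Jacobian of `burgers_rhs` is nonsymmetric, which is what makes the choice of matrix function-vector product algorithm more delicate here.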

Matrices, Moments and Quadrature: Applications to Time-Dependent Partial Differential Equations http://dx.doi.org/10.5772/62247 21


**Figure 8.** Relative error plotted against execution time for solving Burgers' equation (32) using the third-order EPI method (29). Matrix function-vector products are computed using KSS-EPI with denoising (solid curves), Krylov-EPI (dashed curves), AKP (dashed-dotted curves), and LEJA (dotted curves), on grids with *N* = 500 ('+' markers), 1500 ('x' markers), and 3000 ('o' markers) points. Time steps used are *Δt* = (0.01)2<sup>−*p*</sup>, for *p* = 0, 1, 2, 3, 4.

**Figure 9.** Average number of matrix-vector products, shown on a logarithmic scale, per matrix function-vector product evaluation for each method when solving Burgers' equation (32) using the third-order EPI method (29). For KSS and KSS denoised, FFTs are also included. For each method, bars correspond to grid sizes of *N* = 500, 1500, 3000 points, from left to right. Left plot: *Δt* = 0.01. Right plot: *Δt* = 0.000625.

For KSS-EPI, the low-frequency portion **b***L* consists of all components with wave numbers *ω* ≤ *Nc* = 40. The higher threshold for this problem, compared to the Allen-Cahn equation (31), is due to the fact that the initial data is less smooth.

The results are shown in **Figures 8** and **9** for the third-order EPI method (29), and in **Figures 10** and **11** for the fifth-order EPI method (30). From **Figures 8** and **10**, it can be seen that Krylov-EPI and KSS-EPI have comparable accuracy, but even for the coarsest grid with *N* = 500 points, KSS-EPI is much faster, and as with the Allen-Cahn equation, this gap in performance only grows with *N*. As shown in **Figures 9** and **11**, this is again due to the increasing number of Krylov projection steps needed for Krylov-EPI. While Leja interpolation is more accurate than KSS-EPI for both the third- and fifth-order EPI methods, and AKP is also more accurate in the fifth-order case, both of these methods are still significantly slower than KSS-EPI, and a similar scalability gap can also be observed.

**Figure 10.** Relative error plotted against execution time for solving Burgers' equation (32) using the fifth-order EPI method (30). Matrix function-vector products are computed using KSS-EPI with denoising (solid curves), Krylov-EPI (dashed curves), AKP (dashed-dotted curves), and LEJA (dotted curves), on grids with *N* = 500 ('+' markers), 1500 ('x' markers), and 3000 ('o' markers) points. Time steps used are *Δt* = (0.01)2<sup>−*p*</sup>, for *p* = 0, 1, 2, 3, 4.

**Figure 11.** Average number of matrix-vector products, shown on a logarithmic scale, per matrix function-vector product evaluation for each method when solving Burgers' equation (32) using the fifth-order EPI method (30). For KSS and KSS denoised, FFTs are also included. For each method, bars correspond to grid sizes of *N* = 500, 1500, 3000 points, from left to right. Left plot: *Δt* = 0.01. Right plot: *Δt* = 0.000625.

The difference in scalability among the four approaches to computing matrix function-vector products is more apparent in **Figures 9** and **11**. Denoising applied to KSS-EPI is advantageous for this problem, unlike with the Allen-Cahn equation. As can be seen in these figures, for KSS-EPI without denoising, the number of iterations increases with the number of grid points, though not nearly as rapidly as with the other methods. With denoising included, however, the insensitivity of the number of iterations to the grid size is restored.
