**3. Asymptotic analysis of block Lanczos iteration**

For each Fourier coefficient of the solution, KSS methods use a different quadrature rule, because the measure *α*(*λ*) in (8) is defined in terms of the frequency. It follows that if *S*(*L*; *Δt*) is the solution operator of the PDE (e.g. *S*(*L*; *Δt*) = *e*<sup>−*LΔt*</sup> for the PDE (3)), then the function *S*(*λ*; *Δt*) is approximated by a polynomial of degree 2*K* − 1 that interpolates *S*(*λ*; *Δt*) at different points for each frequency, namely the block Gaussian quadrature nodes *λ*<sub>*j*</sub> in (11). Taking all Fourier coefficients together, the computed solution at time *t*<sub>*n*+1</sub>, **u**<sup>*n*+1</sup>, can be expressed as

$$\mathbf{u}^{n+1} = \sum\_{j=0}^{2K-1} D\_j(\Delta t) L\_N^j \mathbf{u}^{n},$$

where *L*<sub>*N*</sub> is a discretization of *L* on an *N*-point grid and *D*<sub>*j*</sub>(*Δt*) is a matrix that is diagonal in discrete Fourier space. The diagonal entries are the coefficients of these frequency-dependent interpolating polynomials in power form, with each row corresponding to a different frequency.
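This representation can be applied in a few lines. The sketch below is illustrative only: `coeff[j]` is assumed to hold the length-*N* array of power-form coefficients of *λ*<sup>*j*</sup> (one entry per frequency, i.e. the diagonal of *D*<sub>*j*</sub>), and `L_apply` is assumed to apply *L*<sub>*N*</sub> to a grid vector; both names are hypothetical.

```python
import numpy as np

def kss_apply(coeff, L_apply, u):
    """Evaluate u^{n+1} = sum_j D_j(dt) L_N^j u^n, where each D_j is
    diagonal in discrete Fourier space (its diagonal stored in coeff[j])."""
    v = u.astype(complex)                  # running power L_N^j u^n, starting with j = 0
    u_hat_new = np.zeros(len(u), dtype=complex)
    for c in coeff:
        u_hat_new += c * np.fft.fft(v)     # D_j acts on the Fourier coefficients
        v = L_apply(v)                     # advance to the next power of L_N
    return np.fft.ifft(u_hat_new)
```

With frequency-independent coefficients this reduces to an ordinary polynomial of *L*<sub>*N*</sub> applied to **u**<sup>*n*</sup>; the frequency dependence of `coeff[j]` is what distinguishes a KSS step from a standard Krylov projection step.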

In the block KSS method described in [3, 4] and reviewed in the previous section, these interpolation points are the eigenvalues of the block tridiagonal matrix T*K* from (12) that is produced by block Lanczos iteration. Although each such matrix is small, computing the eigenvalues and eigenvectors for *each* frequency is computationally expensive. Therefore, in this section, we describe a much faster approach to obtaining estimates of these nodes, at least for high frequencies. This approach was first presented in [11] and generalized in [6, 7]. The basic idea is to examine the entries of T*K* as |*ω*| → ∞, where *ω* is the wave number.

#### **3.1. The block case**

As in the previous section, let **u**<sup>*n*</sup> be a discretization of the approximate solution *u*(*x*, *t*) at time *t*<sub>*n*</sub> = *nΔt* on a uniform *N*-point grid. Then, KSS methods use the initial block *R*<sub>0</sub> = [ **ê**<sub>*ω*</sub>  **u**<sup>*n*</sup> ], for each *ω* = −*N*/2 + 1, …, *N*/2. We start the first iteration of the block Lanczos algorithm by finding the *QR*-factorization of *R*<sub>0</sub>:

$$R\_0 = X\_1 B\_0,$$

where the second column of *X*<sub>1</sub> is parallel to the vector

$$\mathbf{u}\_\omega^n = \mathbf{u}^n - \hat{\mathbf{e}}\_\omega \hat{\mathbf{e}}\_\omega^H \mathbf{u}^n = \mathbf{u}^n - \hat{u}(\omega, t\_n)\,\hat{\mathbf{e}}\_\omega. \tag{16}$$

Then we apply the block Lanczos algorithm [15] to the matrix *L*<sub>*N*</sub>, which comes from the discretization of *L*, with initial block *X*<sub>1</sub>(*ω*). This produces a block tridiagonal matrix T*K* of the form (12), where every entry of T*K* is a function of *ω*. Then, at time *t*<sub>*n*+1</sub>, each Fourier coefficient of the solution is

$$\left[\hat{\mathbf{u}}^{n+1}\right]\_\omega = \left[B\_0(\omega)^H E\_{12}^H \exp[-\mathcal{T}\_K(\omega)\,\Delta t] E\_{12} B\_0(\omega)\right]\_{12}, \qquad E\_{12} = \begin{bmatrix} \mathbf{e}\_1 & \mathbf{e}\_2 \end{bmatrix}. \tag{17}$$

An inverse FFT applied to the vector of Fourier coefficients **û**<sup>*n*+1</sup> yields the vector **u**<sup>*n*+1</sup>, which consists of the values of the solution *u*(*x*, *t*<sub>*n*+1</sub>) at the grid points.

This algorithm has temporal accuracy *O*(*Δt*<sup>2*K*−1</sup>) for parabolic problems [3]. Even higher-order accuracy, *O*(*Δt*<sup>4*K*−2</sup>), is obtained for the second-order wave equation [4], due to the second-order time derivative. Furthermore, under appropriate assumptions on the coefficients of the PDE, the 1-node KSS method is unconditionally stable [3, 4]. More generally, the temporal order of accuracy is *O*(*Δt*<sup>(2*K*−1)*d*</sup>), where *d* is the highest order of a time derivative in the PDE; this order of accuracy has been observed with the Schrödinger equation [16] and Maxwell's equations [18].

8 Applied Linear Algebra in Action


$$\mathbf{X}\_1 = \begin{bmatrix} \hat{\mathbf{e}}\_{o} & \frac{\mathbf{u}\_{o}^{\prime\prime}}{\mathbf{P}\mathbf{u}\_{o}^{\prime\prime}\mathbf{P}\_2} \end{bmatrix} \text{and } B\_0 = \begin{bmatrix} 1 & \hat{u}(o\boldsymbol{\sigma}, t\_n) \\ 0 & \mathbf{P}\mathbf{u}\_{o}^{\prime\prime}\mathbf{P}\_2 \end{bmatrix}. \tag{18}$$

where **u**<sub>*ω*</sub><sup>*n*</sup> is defined as in (16). We note that if the solution *u* is continuous, then as |*ω*| → ∞, |*û*<sup>*n*</sup>(*ω*)| → 0, so that in the limit *B*<sub>0</sub> is diagonal.
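The decay of |*û*<sup>*n*</sup>(*ω*)| is easy to observe numerically. The sketch below uses an assumed smooth grid function in place of the solution; the names are illustrative.

```python
import numpy as np

N = 64
x = 2*np.pi*np.arange(N)/N
u = np.exp(np.sin(x))        # smooth (analytic) stand-in for the solution u^n
uhat = np.fft.fft(u)/N       # Fourier coefficients u-hat(w), w = 0, 1, ..., -1
# Since |u-hat(w)| -> 0 rapidly as |w| grows, the (1,2) entry of
# B_0 = [[1, u-hat(w)], [0, ||u_w||_2]] vanishes at high frequencies.
```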

The next step is to compute

$$M\_1 = X\_1^H L\_N X\_1,\tag{19}$$

where the matrix *L*<sub>*N*</sub> is a spectral discretization of the operator *L* defined by *Lu* = −*pu*<sub>*xx*</sub> + *q*(*x*)*u*, with *p* being a constant. Substituting the value of *X*<sub>1</sub> from (18) into (19) yields

$$M\_1 = \begin{bmatrix} \omega^2 p + \overline{q} & \dfrac{\widehat{L\_N \mathbf{u}\_\omega^n}(\omega)}{\|\mathbf{u}\_\omega^n\|\_2} \\ \dfrac{\overline{\widehat{L\_N \mathbf{u}\_\omega^n}(\omega)}}{\|\mathbf{u}\_\omega^n\|\_2} & R(L\_N, \mathbf{u}\_\omega^n) \end{bmatrix},$$

where $\overline{q}$ is the mean of *q*(*x*) on (0, 2*π*), $\widehat{L\_N \mathbf{u}\_\omega^n}(\omega) = \hat{\mathbf{e}}\_\omega^H L\_N \mathbf{u}\_\omega^n$ is the Fourier coefficient of the grid function $L\_N \mathbf{u}\_\omega^n$ corresponding to the wave number *ω*, and $R(L\_N, \mathbf{u}\_\omega^n) = \langle \mathbf{u}\_\omega^n, L\_N \mathbf{u}\_\omega^n \rangle / \langle \mathbf{u}\_\omega^n, \mathbf{u}\_\omega^n \rangle$ is the Rayleigh quotient of *L*<sub>*N*</sub> and $\mathbf{u}\_\omega^n$. As |*ω*| increases, the Fourier coefficients of a continuous function decay to zero; therefore, as long as the solution is sufficiently regular, the non-diagonal entries of *M*<sub>1</sub> become negligible, that is,

$$M\_1 \approx \begin{bmatrix} \omega^2 p + \overline{q} & 0 \\ 0 & R(L\_N, \mathbf{u}\_\omega^n) \end{bmatrix}.$$
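This structure can be confirmed numerically. The sketch below forms *M*<sub>1</sub> = *X*<sub>1</sub><sup>*H*</sup>*L*<sub>*N*</sub>*X*<sub>1</sub> for a Fourier spectral discretization of *Lu* = −*pu*<sub>*xx*</sub> + *q*(*x*)*u* at a high wave number; the grid size, coefficients, and smooth stand-in for the solution are all illustrative choices, not values from the text.

```python
import numpy as np

N = 64
p, w = 1.0, 16                        # constant leading coefficient, high wave number
x = 2*np.pi*np.arange(N)/N
q = 1.0 + 0.5*np.cos(x)               # smooth variable coefficient q(x)
u = np.exp(np.sin(x))                 # smooth stand-in for the solution u^n
wn = np.fft.fftfreq(N, d=1.0/N)       # integer wave numbers of the grid

def L_apply(v):
    """Spectral discretization of Lu = -p u_xx + q(x) u."""
    return np.fft.ifft(p*wn**2*np.fft.fft(v)) + q*v

e = np.exp(1j*w*x)/np.sqrt(N)         # normalized Fourier basis vector e-hat_w
u_w = u - e*np.vdot(e, u)             # u^n with the frequency-w component removed
X1 = np.column_stack([e, u_w/np.linalg.norm(u_w)])
M1 = X1.conj().T @ np.column_stack([L_apply(X1[:, 0]), L_apply(X1[:, 1])])
```

Because `u` is analytic, its Fourier coefficients near the wave number 16 are far below machine precision, so the off-diagonal entries of `M1` are negligible while the (1, 1) entry matches *ω*<sup>2</sup>*p* + *q̄*.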

Proceeding with the iteration, and neglecting any terms that are Fourier coefficients or are of lower order in *ω*, we obtain

$$R\_1 = L\_N X\_1 - X\_1 M\_1 \approx \begin{bmatrix} \tilde{\mathbf{q}}\,\hat{\mathbf{e}}\_\omega & \dfrac{L\_N \mathbf{u}\_\omega^n}{\|\mathbf{u}\_\omega^n\|\_2} - R(L\_N, \mathbf{u}\_\omega^n)\,\dfrac{\mathbf{u}\_\omega^n}{\|\mathbf{u}\_\omega^n\|\_2} \end{bmatrix},$$

where **q** is a vector consisting of the values of *q*(*x*) at the grid points, **q**˜ = **q** − *q*¯, and multiplication of vectors is component-wise.

To obtain *X*<sub>2</sub>, we perform the *QR*-factorization *R*<sub>1</sub> = *X*<sub>2</sub>*B*<sub>1</sub>. We note that the (1, 2) entry of *B*<sub>1</sub>, modulo lower-order terms, is the Fourier coefficient $\hat{v}\_1(\omega)$, where

$$\mathbf{v}\_1 = \tilde{\mathbf{q}}\left( \dfrac{L\_N \mathbf{u}\_\omega^n}{\|\mathbf{u}\_\omega^n\|\_2} - R(L\_N, \mathbf{u}\_\omega^n)\,\dfrac{\mathbf{u}\_\omega^n}{\|\mathbf{u}\_\omega^n\|\_2} \right).$$

It follows that if the coefficient *q*(*x*) and solution *u*(*x*, *t*) are sufficiently regular, then *B*<sup>1</sup> approaches a diagonal matrix as |*ω*| → ∞, just as *B*0 does. Continuing this process, it can be shown that given sufficient regularity of the solution and coefficients of *L*, each block *Mj* or *Bj* of T*K* from (12) becomes approximately diagonal at high frequencies.

Because [T*K*]<sub>*ij*</sub> ≈ 0 when *i* + *j* is odd in the high-frequency case, it follows that if the rows and columns of T*K* are permuted so that odd-numbered and even-numbered rows and columns are grouped together and in order, then the eigenvalue problem for T*K* approximately *decouples*. Therefore, approximate eigenvalues of T*K* can be obtained by computing the eigenvalues of the tridiagonal matrices obtained by performing "non-block" Lanczos on *L*<sub>*N*</sub> with initial vectors equal to the two columns of *R*<sub>0</sub> separately, rather than applying block Lanczos with these two columns together in the initial block. In [11], it is shown that this decoupling also takes place if the leading coefficient *p*(*x*) of *L* is *not* constant.
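The effect of the permutation can be illustrated directly. The sketch below uses a random symmetric matrix with the high-frequency zero pattern ([T*K*]<sub>*ij*</sub> ≈ 0 for *i* + *j* odd) as a stand-in for T*K*; after grouping odd-numbered and even-numbered indices, the matrix is block diagonal and its eigenvalue problem splits in two.

```python
import numpy as np

rng = np.random.default_rng(0)
n, h = 6, 3                           # a 2K x 2K stand-in with K = 3
T = rng.standard_normal((n, n))
T = (T + T.T)/2                       # symmetric, like the block tridiagonal T_K
for i in range(n):
    for j in range(n):
        if (i + j) % 2 == 1:          # zero pattern: entries with odd index sum vanish
            T[i, j] = 0.0
perm = list(range(0, n, 2)) + list(range(1, n, 2))  # odd-numbered (1-based) first
Tp = T[np.ix_(perm, perm)]            # permuted matrix: two decoupled h x h blocks
```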

#### **3.2. The non-block case**


The decoupling observed in the preceding discussion reveals that we can obtain approximations of half of the block Gaussian quadrature nodes for all Fourier coefficients by applying "non-block" Lanczos iteration to the matrix *L*<sub>*N*</sub> with initial vector **u**<sup>*n*</sup>, the computed solution, as is done in Krylov projection methods such as those described in [8–10]. These nodes, denoted by *λ*<sub>1</sub>, …, *λ*<sub>*m*</sub>, where *m* is the number of iterations, will be referred to as *frequency-independent nodes*. Because this iteration does not depend on the wave number *ω*, the frequency-independent nodes need only be computed once for each vector **u**<sup>*n*</sup> for which an expression of the form *φ*(*L*<sub>*N*</sub>*τ*)**u**<sup>*n*</sup> is required. The other half of the nodes can be estimated through an asymptotic analysis of Lanczos iteration applied to *L*<sub>*N*</sub> with initial vector **ê**<sub>*ω*</sub> [11]; these are called *frequency-dependent nodes* and will be denoted by *λ*<sub>1,*ω*</sub>, …, *λ*<sub>*m*,*ω*</sub>.
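A sketch of this non-block iteration, assuming a symmetric *L*<sub>*N*</sub> available as a matrix-vector product (function and variable names are illustrative):

```python
import numpy as np

def lanczos_nodes(L_apply, b, m):
    """Run m steps of (non-block) Lanczos on L_N with initial vector b;
    the eigenvalues of the tridiagonal matrix are the quadrature nodes."""
    alpha, beta = np.zeros(m), np.zeros(max(m - 1, 0))
    q = b/np.linalg.norm(b)
    q_prev, beta_prev = np.zeros_like(q), 0.0
    for j in range(m):
        v = L_apply(q) - beta_prev*q_prev   # three-term recurrence
        alpha[j] = np.dot(q, v)
        v -= alpha[j]*q
        if j < m - 1:
            beta_prev = np.linalg.norm(v)
            beta[j] = beta_prev
            q_prev, q = q, v/beta_prev
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return np.sort(np.linalg.eigvalsh(T))
```

As Gaussian quadrature nodes, the returned values always lie within the spectrum of the operator restricted to the Krylov subspace.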

As an example, we consider the case where the matrix *A* comes from a spectral discretization of the operator *Lu* = −*pu*<sub>*xx*</sub> + *q*(*x*)*u*, where *p* is a constant, assuming periodic boundary conditions [11]. Carrying out three iterations, which corresponds to a fifth-order accurate KSS method for a parabolic PDE, yields the following recursion coefficients as functions of the wave number *ω*, after neglecting lower-order terms:

$$\begin{bmatrix} \alpha\_1 & \overline{\beta\_1} & 0 \\ \beta\_1 & \alpha\_2 & \overline{\beta\_2} \\ 0 & \beta\_2 & \alpha\_3 \end{bmatrix} \approx \begin{bmatrix} p\omega^2 & \|\tilde{\mathbf{q}}\|\_2 & 0 \\ \|\tilde{\mathbf{q}}\|\_2 & p\omega^2 & 2p|\omega|\,\|\mathbf{q}\_x\|\_2/\|\tilde{\mathbf{q}}\|\_2 \\ 0 & 2p|\omega|\,\|\mathbf{q}\_x\|\_2/\|\tilde{\mathbf{q}}\|\_2 & p\omega^2 \end{bmatrix}.$$

It follows that the frequency-dependent nodes can easily be estimated as

$$\lambda\_{1,\omega} = p\omega^2, \qquad \lambda\_{i,\omega} = p\omega^2 \pm \sqrt{\beta\_1^2 + \beta\_2^2}, \quad i = 2, 3, \tag{20}$$

whereas for a third-order KSS method, the frequency-dependent nodes are

$$\lambda\_{1,\omega} = p\omega^2 + \beta\_1, \qquad \lambda\_{2,\omega} = p\omega^2 - \beta\_1.$$
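As a quick sanity check, these node formulas are exactly the eigenvalues of the asymptotic tridiagonal matrices above. The numbers below are illustrative stand-ins for *β*<sub>1</sub> and *β*<sub>2</sub>, not values derived from any particular coefficient *q*(*x*).

```python
import numpy as np

p, w = 1.0, 32                    # assumed constant p and wave number
b1, b2 = 2.5, 0.7                 # stand-ins for beta_1 and beta_2
T2 = np.array([[p*w**2, b1],
               [b1, p*w**2]])     # third-order method: two Lanczos iterations
T3 = np.array([[p*w**2, b1, 0.0],
               [b1, p*w**2, b2],
               [0.0, b2, p*w**2]])  # fifth-order method: three iterations
```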

In [6, 7], similar formulas for the nodes are derived for a PDE with homogeneous Neumann boundary conditions, and for a 2-D PDE with periodic boundary conditions.

When the matrix *A* is a finite-difference representation of the underlying differential operator, the block Gaussian quadrature nodes can be represented more accurately if formulas for the eigenvalues of symmetric Toeplitz matrices are used for the leading-order terms in the nodes. For example, in (20), *pω*<sup>2</sup> is replaced by 2*p*(*N*/*π*)<sup>2</sup>(1 − cos(*πω*/*N*)), where *N* is the number of grid points.

#### **3.3. Non-self-adjoint operators**

The theory developed in [12] applies to symmetric positive definite matrices, but this property is not essential [17, 18]. That said, care must be taken with nonsymmetric matrices, as a straightforward use of unsymmetric Lanczos to treat bilinear forms as Riemann-Stieltjes integrals, as described in [19], can suffer from "serious breakdown" [20]. Therefore, a more robust approach for applying KSS methods to linear PDEs with non-self-adjoint spatial differential operators is to use Arnoldi iteration to approximate *φ*(*Aτ*)**b**. The algorithm for Arnoldi iteration, applied to a matrix *A* and initial vector **z**<sub>0</sub>, is as follows:
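In outline, a minimal Python sketch of the standard Arnoldi process with modified Gram-Schmidt orthogonalization (the names `A`, `z0`, and `m` follow the surrounding text; the function name is illustrative):

```python
import numpy as np

def arnoldi(A, z0, m):
    """m steps of Arnoldi iteration: returns V (n x (m+1), orthonormal
    columns) and H ((m+1) x m, upper Hessenberg) with A V_m = V_{m+1} H."""
    n = z0.shape[0]
    V = np.zeros((n, m + 1), dtype=complex)
    H = np.zeros((m + 1, m), dtype=complex)
    V[:, 0] = z0/np.linalg.norm(z0)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):            # modified Gram-Schmidt
            H[i, j] = np.vdot(V[:, i], w)
            w = w - H[i, j]*V[:, i]
        H[j + 1, j] = np.linalg.norm(w)   # breakdown check omitted in this sketch
        V[:, j + 1] = w/H[j + 1, j]
    return V, H
```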


The output of Arnoldi iteration is an upper Hessenberg matrix *Hm* and a matrix *Vm* with orthonormal columns, such that

$$AV\_m = V\_m H\_m + h\_{m+1,m} \mathbf{z}\_{m+1} \mathbf{e}\_m^H. \tag{21}$$

By analogy with (17), to approximate **u**<sup>*n*+1</sup> = *φ*(*Aτ*)**u**<sup>*n*</sup>, we can compute each Fourier coefficient [**û**<sup>*n*+1</sup>]<sub>*ω*</sub> of **u**<sup>*n*+1</sup> by applying *block* Arnoldi iteration [21] to *A*, with initial block *R*<sub>0</sub>(*ω*) as defined in (13). After *m* iterations, this produces a block upper Hessenberg matrix *Hm*(*ω*), from which we obtain

$$\left[\hat{\mathbf{u}}^{n+1}\right]\_\omega = \left[B\_0(\omega)^H E\_{12}^H \varphi(H\_m(\omega)\tau) E\_{12} B\_0(\omega)\right]\_{12}, \tag{22}$$

where *B*<sub>0</sub>(*ω*) and *E*<sub>12</sub> are as defined in (15) and (17), respectively. As shown in [6, 7], as with block Lanczos, the eigenvalue problem for *Hm*(*ω*) approximately decouples at high frequencies, due to the decay of the Fourier coefficients of **u**<sup>*n*</sup>. It follows that we can easily estimate the frequency-dependent eigenvalues of *Hm*(*ω*), which are used as interpolation points for a polynomial approximation of *φ*(*λ*), by applying "non-block" Arnoldi iteration, as described in the above algorithm, with an initial vector **z**<sub>0</sub> chosen to be a discretization of an appropriate Fourier basis function. Additional details can be found in [6, 7].
