62 Perturbation Methods with Applications in Science and Engineering

This chapter is devoted to the role of perturbations as an efficient tool for studying the predictability of dynamical systems, for ensemble forecasting, and for overcoming the difficulties in the design of optimal data assimilation algorithms (like the Kalman filter (KF) [2], for example) for operational forecasting systems (OFS), in particular, of optimal adaptive filtering for extremely high-dimensional systems (HdS). In [3, 4], Lorenz studied the problem of the predictability of the atmosphere. He found that the atmosphere is a chaotic system and that the predictability limit of a numerical forecast is about 2 weeks. This barrier of predictability has to be overcome in order to extend the forecast period further. The fact that estimates of the current state are inaccurate and that numerical models have inadequacies leads to forecast errors that grow with increasing forecast lead time. Ensemble forecasting aims at quantifying this flow-dependent forecast uncertainty. Today, a medium-range forecast has become a standard product. In the 1990s, the ensemble forecasting (EnF) technique was introduced in operational centers such as the European Centre for Medium-range Weather Forecasts (ECMWF) (see Palmer et al. [5]) and the NCEP (US National Centers for Environmental Prediction) (Toth and Kalnay [6]). It was found that a single forecast can depart rapidly from the real atmosphere. The idea of ensemble forecasting is to add perturbations around the control forecast to produce a collection of forecasts that better simulate the possible uncertainties in a numerical forecast. The ensemble mean can then act as a nonlinear filter such that its skill is higher than that of the individual members in a statistical sense (Toth and Kalnay [6]).

The chapter is organized as follows. Section 2 first outlines optimal perturbation (OP) theory and the important role the OP plays in seeking the fastest-growing direction of the prediction error (PE). The predictability theory of the dynamical system, as well as the stability of the filtering algorithm, are all developed on the basis of the OP. The definition of the optimal deterministic perturbation (ODP) and some theoretical results on the ODP are introduced. It is found that the ODP is associated with the right singular vector (SV) of the system dynamics. In Section 3, two other classes of ODPs are presented: the leading eigenvector (EV) and the real Schur vector (SchV) of the system dynamics. Note that the first EV is the ODP in the eigen invariant subspace (EI-InS) of the system dynamics. As to the leading SchV, it is the ODP in the Schur invariant subspace (Sch-InS), which is closely related to the EI-InS in the sense that the subspace of the leading SchVs, generated by the sampling procedure (Sampling-P, Section 3), converges to the EI-InS. In Section 4, we present the other type of OP, called the optimal stochastic perturbation (OSP). Note that the OSP is a natural extension of the ODP which gives insight into what represents the fastest-growing PE and how one can produce it by stochastically perturbing the initial state. One important class of perturbations (known as the simultaneous stochastic perturbation, SSP) is presented in Section 5. It will be shown that the SSP is very efficient for solving optimization problems in a high-dimensional (Hd) setting. The different algorithms for estimating, decomposing … Hd matrices are also presented here. Numerical examples are presented in Section 6 to illustrate the theoretical results and the efficiency of the OPs in solving data assimilation problems. An experiment on data assimilation in the Hd ocean model MICOM by filters constructed on the basis of the Schur ODPs and SSPs is presented in Section 7. The concluding remarks are presented in Section 8.

#### 2.1. Stability of filter

The behavior of the atmosphere or ocean is recognized as highly sensitive to initial conditions. It means that a small change in an initial condition can strongly alter the trajectory of the system. It is therefore important to be able to identify the directions of rapid growth of the system state. The research on OPs is aimed precisely at finding methods to better capture these rapidly growing directions of the system dynamics, in order to optimize the predictability of the physical process under consideration.

To explain this phenomenon more clearly, consider a standard linear filtering problem

$$\mathbf{x}(k+1) = \Phi \mathbf{x}(k) + w(k+1), \mathbf{z}(k+1) = H \mathbf{x}(k+1) + v(k+1). \tag{1}$$

where $\Phi \in R^{n \times n}$ is the state transition matrix and $H \in R^{p \times n}$ is an observation matrix. Under standard conditions on the model and observation noises $w(k)$, $v(k)$, the minimum mean squared (MMS) estimate $\hat{x}(k)$ can be obtained by the well-known KF [2].

$$
\hat{x}(k+1) = \hat{x}(k+1/k) + K\,\zeta(k+1), \quad \hat{x}(k+1/k) = \Phi\,\hat{x}(k) \tag{2}
$$

where $\zeta(k+1) = z(k+1) - H\hat{x}(k+1/k)$ is the innovation vector, $\hat{x}(k+1)$ is the filtered (or analysis) estimate, and $\hat{x}(k+1/k)$ is the one-step-ahead prediction of $x(k+1)$. The KF gain $K$ is given by

$$K = MH^T \left[ HMH^T + R \right]^{-1} \tag{3}$$

where $M$ is the prediction ECM and $R$ is the covariance of the observation noise $v$. From Eq. (2), it can be shown that the transition matrix of the filtered-estimate equation is $L = (I - KH)\Phi$.
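As a small numerical sketch of Eqs. (2) and (3) (the matrices below are invented for illustration, not taken from the chapter), one can iterate the Riccati recursion to a steady-state gain and check the stability of $L = (I - KH)\Phi$ for a hypothetical 2-state system that is detectable but not fully observable:

```python
import numpy as np

# Hypothetical detectable system: the unstable mode (1.05) is observed,
# while the unobserved mode (0.7) is already stable.
Phi = np.array([[1.05, 0.0],
                [0.10, 0.7]])
H = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)          # model-noise covariance
R = np.array([[0.2]])        # observation-noise covariance

M = np.eye(2)                # prediction ECM
for _ in range(200):         # iterate the Riccati recursion to steady state
    K = M @ H.T @ np.linalg.inv(H @ M @ H.T + R)   # KF gain (3)
    P = (np.eye(2) - K @ H) @ M                    # analysis ECM
    M = Phi @ P @ Phi.T + Q                        # next prediction ECM

L = (np.eye(2) - K @ H) @ Phi    # transition matrix of the filtered estimate
print(max(abs(np.linalg.eigvals(Phi))) > 1.0)      # the model alone is unstable
print(max(abs(np.linalg.eigvals(L))) < 1.0)        # the steady-state filter is stable
```

The unobserved state is coupled only through a stable mode, so detectability holds and the steady-state filter transition matrix has spectral radius below one even though Φ itself is unstable.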

For HdS, the KF gain (3) is impossible to compute. In a study by Hoang et al. [7], it is suggested to find the gain with the structure

$$K = P_r K_e \tag{4}$$

with $P_r \in R^{n \times n_e}$ an operator projecting a vector from the reduced space $R^{n_e}$ into the full system space $R^n$, and $K_e \in R^{n_e \times p}$ the gain of the reduced filter. One very important question arising here is how one can choose the subspace of projection and the structure of $K_e$ so as to make $L$ stable. It is found in the work by Hoang et al. [8] that detectability of the input-output system (1) is sufficient for the existence of a stabilizing gain $K$, and that this gain can be constructed with $P_r$ consisting of all unstable EVs (or unstable SVs, SchVs; see Section 3) of the system dynamics.
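A minimal sketch of the structure (4), with all matrices and the reduced gain value invented for illustration: for a symmetric toy Φ (so that EVs and SVs coincide), projecting onto the single unstable direction and applying a hand-tuned scalar reduced gain already stabilizes $L$:

```python
import numpy as np

rng = np.random.default_rng(1)
Qm, _ = np.linalg.qr(rng.standard_normal((3, 3)))
# Symmetric toy dynamics with exactly one unstable direction (eigenvalue 1.2).
Phi = Qm @ np.diag([1.2, 0.5, 0.3]) @ Qm.T

u1 = Qm[:, 0]                 # the unstable EV/SV of Phi
H = u1[None, :]               # observe the unstable direction (p = 1)
Pr = u1[:, None]              # projection from the reduced space (n_e = 1)
Ke = np.array([[0.5]])        # reduced gain, hand-tuned for this sketch

K = Pr @ Ke                   # full gain with the structure (4)
L = (np.eye(3) - K @ H) @ Phi
# In the eigenbasis, L has spectrum {1.2*(1 - 0.5), 0.5, 0.3} = {0.6, 0.5, 0.3}.
print(max(abs(np.linalg.eigvals(L))) < 1.0)
```

Only the single unstable mode needs correction; the stable modes are left untouched by the rank-one gain, which is the point of the reduced structure (4).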

#### 2.2. Singular value decomposition and optimal perturbations

Consider the singular value decomposition (SVD) of Φ [9],

$$\begin{aligned} \Phi &= U D V^T, \quad D = \mathrm{diag}\left[\sigma_1, \sigma_2, \dots, \sigma_n\right], \quad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n, \\ U &= [U_1, U_2], \quad D = \mathrm{block\ diag}\,[D_1, D_2], \quad V = [V_1, V_2] \end{aligned} \tag{5}$$

On Optimal and Simultaneous Stochastic Perturbations with Application to Estimation of High-Dimensional…

http://dx.doi.org/10.5772/intechopen.77273

where $U_1, V_1 \in R^{n \times n_1}$, $D_1 \in R^{n_1 \times n_1}$, and $n_1$ is the number of all unstable and neutral SVs of Φ. In what follows, for simplicity, unless otherwise stated, the set of all unstable SVs is understood to include all unstable and neutral SVs.

Suppose the system is s-detectable (detectability of all the columns of $U_1$ or $V_1$). From the research by Hoang et al. [10], there exists a stabilizing gain $K_s$ (a sufficient but not necessary condition) with $P_r = U_1$ (see Eq. (4)) such that the transition matrix $L = (I - K_sH)\Phi$ is stable. It signifies that the filter is stable and the estimation error is bounded. The columns of $U_1$, i.e., the left unstable SVs of Φ, serve as a basis for seeking an appropriate correction in the filtering algorithm.

On the other hand, in practice, for extreme HdS, one cannot compute all elements of $U_1$ but only some subset $U_1' \subset U_1$. Using $U_1'$ instead of $U_1$ cannot guarantee filter stability. Ensemble forecasting has been proposed as an approach to prevent a possibly large error in the forecast; it requires knowledge of the rapidly growing directions of the PE. In this context, the OPs are important because they allow one to find the rapidly growing directions of the PE by model integration of the OPs. The OPs are in fact the unstable right SVs (RSVs).
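The partition (5) can be sketched numerically (the random matrix is purely illustrative); the threshold $\sigma \ge 1$ separates the unstable and neutral SVs from the stable ones:

```python
import numpy as np

rng = np.random.default_rng(2)
Phi = rng.standard_normal((5, 5))          # toy dynamics, for illustration only
U, s, Vt = np.linalg.svd(Phi)              # Phi = U @ diag(s) @ Vt, s descending

n1 = int(np.sum(s >= 1.0))                 # number of unstable and neutral SVs
U1, V1 = U[:, :n1], Vt.T[:, :n1]           # partition as in (5)
D1 = np.diag(s[:n1])

# Columns of V1 are the rapidly growing perturbation directions:
# a unit perturbation along V1[:, 0] grows by exactly sigma_1 in one step.
print(np.isclose(np.linalg.norm(Phi @ V1[:, 0]), s[0]))
```

The columns of `U1` play the role of the left unstable SVs used for the filter correction, while the columns of `V1` are the right SVs that will reappear below as the optimal perturbations.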

#### 2.3. Optimal perturbation

Let δx be a given perturbation representing an error (uncertainty, deterministic or stochastic) around the true system state $x^*$, i.e., $\hat{x}_f = x^* + \delta x$ (at some instant $k$). The prediction $\hat{x}_p$ of the system state can be obtained by forwarding the numerical model (1) from the filtered estimate $\hat{x}_f$, i.e., $\hat{x}_p = \Phi \hat{x}_f$. We have then

$$
\widehat{\mathbf{x}}\_p = \Phi \widehat{\mathbf{x}}\_f = \Phi[\mathbf{x}^\* + \delta \mathbf{x}] = \Phi \mathbf{x}^\* + \Phi \delta \mathbf{x},\tag{6}
$$

One sees that the perturbation δx in the initial system state "grows" into Φδx, which represents the uncertainty in the forecast.

In general, the perturbation δx may be any element of the n-dimensional space $R^n$, i.e., $\delta x \in R^n$. For $e_f^{(l)} := \delta x_f^{(l)}$, a sample of the filtered error (FE) $e_f$, integrating the model from $e_f^{(l)}$ results in $e_p^{(l)} := \Phi \delta x_f^{(l)}$, a sample of the PE $e_p$. By generating the ensemble of perturbations $E_f(L) := \{ e_f^{(l)};\ l = 1, \dots, L \}$ according to the distribution of $e_f$, one can produce the ensemble of PE samples $E_p(L) := \{ e_p^{(l)};\ l = 1, \dots, L \}$ and use them to estimate the distribution of the PE $e_p$. This serves as a basis for particle filtering [11].

The ensemble-based filtering (EnBF) algorithm is simplified for the standard linear filtering problem, with $e_f$ being of zero mean and with the error covariance matrix (ECM) P. This technique aims to approximate the ECM without solving the matrix Riccati equation (ensemble KF, EnKF [12]). Note that, at present, one can generate only about O(100) samples at each assimilation instant. This ensemble size is very small compared to the system dimension. That is why it is important to have a good strategy for selecting the "optimal" samples (perturbations) to better approximate the ECM in the filtering algorithm.
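The EnKF-style approximation of the prediction ECM can be sketched as follows (the transition matrix and dimensions are invented for illustration; a real HdS would have $n \gg L$):

```python
import numpy as np

rng = np.random.default_rng(3)
n, L_ens = 3, 100                  # about O(100) samples, as noted in the text
Phi = np.array([[0.9, 0.1, 0.0],   # toy transition matrix, for illustration
                [0.0, 0.8, 0.2],
                [0.1, 0.0, 0.7]])
P = np.eye(n)                      # ECM of the filtered error e_f

# Ensemble of filtered-error samples e_f^(l) ~ N(0, P), stored as columns.
Ef = rng.multivariate_normal(np.zeros(n), P, size=L_ens).T
Ep = Phi @ Ef                      # propagated PE samples e_p^(l) = Phi e_f^(l)

# Sample estimate of the prediction ECM versus the exact Phi P Phi^T.
P_hat = Ep @ Ep.T / (L_ens - 1)
P_exact = Phi @ P @ Phi.T
rel_err = np.linalg.norm(P_hat - P_exact) / np.linalg.norm(P_exact)
print(rel_err)                     # O(1/sqrt(L)) sampling error
```

The sampling error decays only like $1/\sqrt{L}$, which is why, for $n$ in the millions and $L \approx 100$, the choice of "optimal" perturbations matters so much.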

### 2.3.1. Optimal deterministic perturbation

Introduce


$$S(\delta \mathbf{x}) := \left\{ \delta \mathbf{x} : \|\delta \mathbf{x}\|_2^2 = \langle \delta \mathbf{x}, \delta \mathbf{x} \rangle = 1 \right\} \tag{7}$$

where $\|\cdot\|_2$ denotes the Euclidean vector norm (here $\langle \cdot, \cdot \rangle$ denotes the dot product). Let Φ have the SVD (5).

Definition 2.1. The ODP δx<sup>o</sup> is the solution of the extremal problem

$$J(\delta \mathbf{x}) = \|\Phi \delta \mathbf{x}\|\_2 \to \mathbf{max}\_{\delta \mathbf{x}} \tag{8}$$

under the constraint (7). One can prove

Lemma 2.1. The optimal perturbation in the sense (8) and (7) is

$$
\delta x^o = \pm v_1,
$$

where $v_1$ is the first right SV of Φ.
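Lemma 2.1 is easy to verify numerically (the matrix is invented for the sketch): the first right SV attains the growth $\sigma_1$, and no random unit perturbation grows faster.

```python
import numpy as np

rng = np.random.default_rng(4)
Phi = rng.standard_normal((4, 4))     # toy dynamics, for illustration only
U, s, Vt = np.linalg.svd(Phi)
v1 = Vt[0]                            # first right SV: the ODP of Lemma 2.1

J_opt = np.linalg.norm(Phi @ v1)      # J(v1) = sigma_1, the attainable maximum

# Many random unit perturbations, none of which can beat the ODP.
trials = rng.standard_normal((1000, 4))
trials /= np.linalg.norm(trials, axis=1, keepdims=True)
J_rand = np.linalg.norm(trials @ Phi.T, axis=1).max()
print(np.isclose(J_opt, s[0]) and J_rand <= J_opt + 1e-12)
```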

### 2.3.2. Subspaces of ODPs

Introduce

$$\Phi\_1 \coloneqq \Phi - \sigma\_1 u\_1 v\_1^T$$

Consider the optimization problem

$$J\_1(\delta \mathbf{x}) = \|\Phi\_1 \delta \mathbf{x}\|\_2 \to \max\_{\delta \mathbf{x}} \tag{9}$$

under the constraint (7). Similar to the proof of Lemma 2.1, one can prove

Lemma 2.2. The optimal perturbation in the sense (9) and (7) is

$$
\delta x^o = \pm v_2,
$$

where $v_2$ is the second right SV of Φ.

By iteration, for

$$\Phi\_{\mathbf{i}} \coloneqq \Phi\_{\mathbf{i}-1} - \sigma\_{\mathbf{i}} u\_{\mathbf{i}} v\_{\mathbf{i}}^T, \mathbf{i} = 1, \dots, n - 1; \Phi\_0 = \Phi. \tag{10}$$

applying Lemma 2.2 with slight modifications, one finds that the OPs for $\Phi_{i-1}$, $i = 1, \dots, n$, are $\pm v_i$.

Theorem 2.1. The optimal perturbation for the matrix $\Phi_{i-1}$, $i = 1, \dots, n$, is $\pm v_i$, where $v_i$ is the leading right SV of $\Phi_{i-1}$ (equivalently, the $i$th right SV of Φ).
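The deflation (10) behind Theorem 2.1 can be checked directly (the matrix is constructed with a known spectrum purely for the sketch): the leading right SV of each $\Phi_{i-1}$ coincides, up to sign, with the $i$th right SV of Φ.

```python
import numpy as np

rng = np.random.default_rng(5)
# Toy dynamics with known, well-separated singular values 3, 2, 1, 0.5.
U0, _ = np.linalg.qr(rng.standard_normal((4, 4)))
V0, _ = np.linalg.qr(rng.standard_normal((4, 4)))
Phi = U0 @ np.diag([3.0, 2.0, 1.0, 0.5]) @ V0.T
U, s, Vt = np.linalg.svd(Phi)

Phi_i = Phi.copy()                     # Phi_0 = Phi
for i in range(4):
    Ui, si, Vti = np.linalg.svd(Phi_i)
    # Leading right SV of Phi_{i-1} is (up to sign) the (i+1)-th right SV of Phi,
    # with the matching singular value.
    assert np.isclose(abs(Vti[0] @ Vt[i]), 1.0)
    assert np.isclose(si[0], s[i])
    Phi_i = Phi_i - si[0] * np.outer(Ui[:, 0], Vti[0])   # deflation step (10)
print("deflation recovers the SVs of Phi in order")
```

After the last step the deflated matrix is (numerically) zero, confirming that the iteration (10) sweeps out the whole SVD one optimal perturbation at a time.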

The OP for $\Phi_{i-1}$ will be called the $i$th OP for Φ (or the $i$th SOP, the singular OP).

Comment 2.2. The OPs presented above are optimal in the sense of the Euclidean norm $\|\cdot\|_2$. In practice, there is a need to normalize the state vector (using the inverse of the covariance matrix $M$). The normalization is done by the change of variables $\delta x' = M^{-1/2}\delta x$, and all the results presented above remain valid with respect to the new $\delta x'$,

$$\|\delta \mathbf{x}'\|_2^2 = \langle M^{-1/2}\delta \mathbf{x}, M^{-1/2}\delta \mathbf{x}\rangle = \langle \delta \mathbf{x}, M^{-1}\delta \mathbf{x}\rangle = \|\delta \mathbf{x}\|_{M^{-1}}^2$$

The weighted norm $\|\delta x\|_{M^{-1}}$ is known as the Mahalanobis norm.

As to the PE, a normalization is also applied in order to be able to compare different variables such as density, temperature, velocity, etc. In this situation, the norm for $y = \Phi\delta x$ may be a seminorm [5].

i. Instability: As the filter gain is estimated stochastically during the optimization process, the filter may become unstable due to the stochastic character of the filter gain.

ii. Reduction of tuning parameters: For extreme HdS, the number of elements in the filter gain is still very high. Reduction of the number of tuning gain elements is necessary.

Interest in the stability of the adaptive filter (AF) arose soon after the AF was introduced. The study of filter stability shows that it is possible to ensure filter stability when the system is detectable [8]. For different parameterized stabilizing gain structures based on a subspace of unstable and neutral EVs, see [8]. As the EVs may be complex and their computation is unstable (Lanczos [14]), it is proved in [8] that one can also ensure stability of the filter if the space of projection is constructed from a set of unstable and neutral SchVs of the system dynamics. The unstable and neutral real SchVs are the SchVs associated with the unstable and neutral eigenvalues of the system dynamics. The advantage of the real SchVs is that they are real and orthonormal, and their computation is stable. Moreover, the algorithm for estimating dominant SchVs is simple, being based on the power iteration procedure (Sampling-P, see [15]). As to the unstable SVs, although they are real and orthonormal, their computation requires an adjoint operator (the transpose matrix $\Phi^T$). Construction of adjoint code (AC) is a time-consuming and tedious process. Approximating leading SVs without the AC can be done on the basis of Algorithm 5.2.

#### 3.2. Leading EVs and SchVs as optimal perturbations

Let Φ be diagonalizable. Introduce the set

$$EV_1(\mathbf{x}, \lambda) := \left\{ (\mathbf{x}, \lambda) : \mathbf{x} \in C^n, \|\mathbf{x}\|_2 = 1, \lambda \in C^1 : \Phi\mathbf{x} = \lambda\mathbf{x} \right\}. \tag{11}$$

The subspace of $x \in C^n$ satisfying $\Phi x = \lambda x$ for some $\lambda \in C^1$ is known as an invariant subspace of Φ: the matrix Φ acts on x to stretch the vector x but conserves the direction of x.

#### 3.3. EVs as optimal perturbations in the invariant subspace

Consider the optimization problem

$$J(\delta \mathbf{x}) = \|\Phi\delta \mathbf{x}\|_2 \to \max_{\delta \mathbf{x}},$$

i.e., the problem (8), s.t.

$$(\delta \mathbf{x}, \lambda) \in EV_1(\delta \mathbf{x}, \lambda). \tag{12}$$

It is seen that the optimal solution is the first EV $x_{ei}^{(1)}$ of Φ, with the largest eigenvalue magnitude equal to $|\lambda_1|$. We will call $x_{ei}^{(1)}$ a first optimal EV perturbation (denoted as EI-OP).

For a symmetric matrix, the EI-OP coincides with the SOP. The EI-OP is not unique.

In a similar way, by defining

$$EV_2(\mathbf{x}, \lambda) := \left\{ (\mathbf{x}, \lambda) : \mathbf{x} \in C^n, \|\mathbf{x}\|_2 = 1 : \Phi\mathbf{x} = \lambda\mathbf{x}, \lambda \in C^1, |\lambda| < |\lambda_1| \right\}, \tag{13}$$

one finds the second EI-OP $x_{ei}^{(2)}$.
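The power-iteration idea behind the estimation of dominant Schur vectors (the Sampling-P mentioned above) can be sketched as a generic block (orthogonal) iteration; the matrix and its spectrum below are invented, and this is only a textbook orthogonal iteration, not the chapter's exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 6, 2                            # seek the m dominant Schur vectors
V = rng.standard_normal((n, n))
# Toy non-normal dynamics with known eigenvalues 3, 2, 0.9, ... (illustrative).
A = V @ np.diag([3.0, 2.0, 0.9, 0.5, 0.3, 0.1]) @ np.linalg.inv(V)

# Orthogonal (block power) iteration: repeatedly "integrate the model" on an
# orthonormal block and re-orthonormalize; Q converges to a basis of the
# dominant invariant subspace, i.e., to leading Schur vectors of A.
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))
for _ in range(300):
    Q, _ = np.linalg.qr(A @ Q)

# The m x m projected matrix carries the m dominant eigenvalues of A.
dom = sorted(np.linalg.eigvals(Q.T @ A @ Q).real, reverse=True)
print(np.allclose(dom, [3.0, 2.0], atol=1e-6))
```

Each step only requires products $A \cdot Q$ (model integrations of perturbations) plus a QR re-orthonormalization, which is what makes this approach attractive for HdS where $A = \Phi$ is available only as an operator.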
