
**Problem 5.** Suppose that $\mathcal{G}_0$ is a system parameterised by {$A(t)$, $I$, $I$, 0}. Show that

$$P(t)A^{T}(t) - A(t)P(t) = \mathcal{G}_{0}^{-H}P(t) + \mathcal{G}_{0}^{-1}P(t).$$

**Problem 6.** The optimum minimum-variance smoother was developed by finding the solution that minimises $\lVert \tilde{y} \rVert_2^2 = \tilde{y}^{H}\tilde{y}$. Use the same completing-the-square approach to find the optimum minimum-variance filter. (Hint: Find the solution that minimises $\lVert \tilde{y} \rVert_2^2 = \tilde{y}^{T}\tilde{y}$.)

**Problem 7 [9].** Let $a$, $b = 1$, $c$ and $d = 0$ denote the time-invariant state-space parameters of the plant. Denote the error covariance, gain of the Kalman filter and gain of the maximum-likelihood smoother by $p$, $k$ and $g$, respectively. Show that

$$H_1(s) = k(s - a + kc)^{-1},$$

$$H_2(s) = cgk(-s - a + g)^{-1}(s - a + kc)^{-1},$$

$$H_3(s) = kc(-a + kc)(s - a + kc)^{-1}(-s - a + kc)^{-1},$$

$$H_4(s) = \left((-a + kc)^2 - (-a + kc - k)^2\right)(s - a + kc)^{-1}(-s - a + kc)^{-1}$$

are the transfer functions of the Kalman filter, maximum-likelihood smoother, Fraser-Potter smoother and minimum-variance smoother, respectively.

#### **Problem 8.**

(i) Develop a state-space formulation of an approximate Wiener-Hopf factor for the case when the plant includes a nonzero direct feedthrough matrix (that is, $D(t) \neq 0$).

(ii) Use the matrix inversion lemma to obtain the inverse of the approximate Wiener-Hopf factor for the minimum-variance smoother.


#### **6.8 Glossary**


"Once a new technology rolls over you, if you're not part of the steamroller, you're part of the road." *Stewart Brand*

$p(x(t))$ Probability density function of a continuous random variable $x(t)$.

$f(x(t))$ Cumulative distribution function or likelihood function of $x(t)$.

$x(t) \sim \mathcal{N}(\mu, R_{xx})$ The random variable $x(t)$ has a normal distribution with mean $\mu$ and covariance $R_{xx}$.

$\hat{x}(t\,|\,t)$ Estimate of $x(t)$ at time $t$ given data up to time $t$.

$\hat{x}(t\,|\,t+\tau)$ Estimate of $x(t)$ at time $t$ given data at fixed time lag $\tau$.

$\hat{x}(t\,|\,T)$ Estimate of $x(t)$ at time $t$ given data over a fixed interval $T$.

$\hat{w}(t\,|\,T)$ Estimate of $w(t)$ at time $t$ given data over a fixed interval $T$.

$C^{\dagger}(t)$ Moore-Penrose pseudoinverse of $C(t)$.

$\hat{\Delta}$ Approximate Wiener-Hopf factor.

$\hat{\Delta}^{H}$ The adjoint of $\hat{\Delta}$.

$\hat{\Delta}^{-H}$ The inverse of $\hat{\Delta}^{H}$.

$\{\hat{\Delta}^{-H}\}_{+}$ The causal part of $\hat{\Delta}^{-H}$.


#### **6.9 References**


"I don't want to be left behind. In fact, I want to be here before the action starts." *Kerry Francis Bullmore Packer*


"In times of profound change, the learners inherit the earth, while the learned find themselves beautifully equipped to deal with a world that no longer exists." *Al Rogers*


[11] D. J. N. Limebeer, B. D. O. Anderson, P. Khargonekar and M. Green, "A Game Theoretic Approach to H∞ Control for Time-varying Systems", *SIAM Journal on Control and Optimization*, vol. 30, no. 2, pp. 262 – 283, 1992.

[12] G. Freiling, G. Jank and H. Abou-Kandil, "Generalized Riccati Difference and Differential Equations", *Linear Algebra and its Applications*, pp. 291 – 303, 1996.

[13] L. E. Zachrisson, "On Optimal Smoothing of Continuous Time Kalman Processes", *Information Sciences*, vol. 1, pp. 143 – 172, 1969.

[14] U. Shaked and Y. Theodor, "H∞-Optimal Estimation: A Tutorial", *Proc. 31st IEEE Conference on Decision and Control*, Tucson, Arizona, pp. 2278 – 2286, Dec. 1992.

[15] J. S. Meditch, "On Optimal Linear Smoothing Theory", *Information and Control*, vol. 10, pp. 598 – 615, 1967.

[16] P. S. Maybeck, *Stochastic Models, Estimation and Control*, Vol. 2, Academic Press, New York, 1982.

[17] T. Kailath, A. H. Sayed and B. Hassibi, *Linear Estimation*, Pearson Prentice Hall, New Jersey, 2000.

[18] B. D. O. Anderson, "Properties of Optimal Linear Smoothing", *IEEE Transactions on Automatic Control*, vol. 14, no. 1, pp. 114 – 115, Feb. 1969.

[19] S. Chirarattananon and B. D. O. Anderson, "Stable Fixed-Lag Smoothing of Continuous-time Processes", *IEEE Transactions on Information Theory*, vol. 20, no. 1, pp. 25 – 36, Jan. 1974.

[20] A. Gelb, *Applied Optimal Estimation*, The Analytic Sciences Corporation, USA, 1974.

[21] L. Ljung and T. Kailath, "A Unified Approach to Smoothing Formulas", *Automatica*, vol. 12, pp. 147 – 157, 1976.

[22] T. Kailath and P. Frost, "An Innovations Approach to Least-Squares Estimation Part II: Linear Smoothing in Additive White Noise", *IEEE Transactions on Automatic Control*, vol. 13, no. 6, pp. 655 – 660, 1968.

[23] F. L. Lewis, L. Xie and D. Popa, *Optimal and Robust Estimation With an Introduction to Stochastic Control Theory*, Second Edition, CRC Press, Taylor & Francis Group, 2008.

[24] F. A. Badawi, A. Lindquist and M. Pavon, "A Stochastic Realization Approach to the Smoothing Problem", *IEEE Transactions on Automatic Control*, vol. 24, no. 6, Dec. 1979.

[25] J. A. Rice, *Mathematical Statistics and Data Analysis*, Second Edition, Duxbury Press, Belmont, California, 1995.

[26] R. G. Brown and P. Y. C. Hwang, *Introduction to Random Signals and Applied Kalman Filtering*, Second Edition, John Wiley & Sons, Inc., New York, 1992.

[27] M. Green and D. J. N. Limebeer, *Linear Robust Control*, Prentice-Hall Inc., Englewood Cliffs, New Jersey, 1995.

## **Discrete-Time Smoothing**

#### **7.1. Introduction**

Observations are invariably accompanied by measurement noise and optimal filters are the usual solution of choice. Filter performances that fall short of user expectations motivate the pursuit of smoother solutions. Smoothers promise useful mean-square-error improvement at mid-range signal-to-noise ratios, provided that the assumed model parameters and noise statistics are correct.


In general, discrete-time filters and smoothers are more practical than their continuous-time counterparts. Often a designer may be able to add value by assuming low-order discrete-time models which bear little or no resemblance to the underlying processes. Continuous-time approaches may be warranted only when application-specific performance considerations outweigh the higher overheads.

This chapter canvasses the main discrete-time fixed-point, fixed-lag and fixed-interval smoothing results [1] – [9]. Fixed-point smoothers [1] calculate an improved estimate at a prescribed past instant in time. Fixed-lag smoothers [2] – [3] find application where small end-to-end delays are tolerable, for example, in press-to-talk communications or receiving public broadcasts. Fixed-interval smoothers [4] – [9] dispense with the need to fine-tune the time of interest or the smoothing lags. They are suited to applications where processes are staggered, such as delayed control or off-line data analysis. For example, in underground coal mining, smoothed position estimates and control signals can be calculated while a longwall shearer is momentarily stationary at each end of the face [9]. Similarly, in exploration drilling, analyses are typically carried out post-data acquisition.

The smoother descriptions are organised as follows. Section 7.2 sets out two prerequisites: time-varying adjoint systems and Riccati difference equation comparison theorems. Fixed-point, fixed-lag and fixed-interval smoothers are discussed in Sections 7.3, 7.4 and 7.5, respectively. It turns out that the structures of the discrete-time smoothers are essentially the same as those of the previously-described continuous-time versions. Differences arise in the calculation of Riccati equation solutions and the gain matrices. Consequently, the treatment is somewhat condensed. It is reaffirmed that the above-mentioned smoothers outperform the Kalman filter and that the minimum-variance smoother provides the best performance.

"An inventor is simply a person who doesn't take his education too seriously. You see, from the time a person is six years old until he graduates from college he has to take three or four examinations a year. If he flunks once, he is out. But an inventor is almost always failing. He tries and fails maybe a thousand times. If he succeeds once then he's in. These two things are diametrically opposite. We often say that the biggest job we have is to teach a newly hired employee how to fail intelligently. We have to train him to experiment over and over and to keep on trying and failing until he learns what will work." *Charles Franklin Kettering*


#### **7.2. Prerequisites**

#### **7.2.1 Time-varying Adjoint Systems**

Consider a linear time-varying system, $\mathcal{G}$, operating on an input, $w$, namely, $y = \mathcal{G}w$. Here, $w$ denotes the set of $w_k$ over an interval $k \in [0, N]$. It is assumed that $\mathcal{G}: \mathbb{R}^{p} \to \mathbb{R}^{q}$ has the state-space realisation

$$x_{k+1} = A_kx_k + B_kw_k, \tag{1}$$

$$\mathbf{y}\_k = \mathbf{C}\_k \mathbf{x}\_k + D\_k \mathbf{w}\_k \,. \tag{2}$$

As before, the adjoint system, $\mathcal{G}^{H}$, satisfies

$$\langle y, \mathcal{G}w \rangle = \langle \mathcal{G}^{H}y, w \rangle \tag{3}$$

for all $y \in \mathbb{R}^{q}$ and $w \in \mathbb{R}^{p}$.

*Lemma 1: In respect of the system $\mathcal{G}$ described by (1) – (2), with $x_0 = 0$, the adjoint system $\mathcal{G}^{H}$ having the realisation*

$$\zeta_{k-1} = A_k^{T}\zeta_k - C_k^{T}u_k, \tag{4}$$

$$z_k = -B_k^{T}\zeta_k + D_k^{T}u_k, \tag{5}$$

*with $\zeta_N = 0$, satisfies (3).*

A proof appears in [7] and proceeds similarly to that within Lemma 1 of Chapter 2. The simplification *Dk* = 0 is assumed below unless stated otherwise.
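To make Lemma 1 concrete, the short script below (an illustrative check, not part of the original text; the horizon, dimensions and random parameters are arbitrary choices) simulates (1) – (2) and the realisation (4) – (5), and verifies the inner-product identity (3) numerically.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, p, q = 8, 3, 2, 2
A = [rng.standard_normal((n, n)) for _ in range(N + 1)]
B = [rng.standard_normal((n, p)) for _ in range(N + 1)]
C = [rng.standard_normal((q, n)) for _ in range(N + 1)]

def apply_G(w):
    """y = G w via (1) - (2), with x_0 = 0 and D_k = 0."""
    x, y = np.zeros(n), []
    for k in range(N + 1):
        y.append(C[k] @ x)
        x = A[k] @ x + B[k] @ w[k]
    return y

def apply_GH(u):
    """z = G^H u via (4) - (5), run backwards in time with zeta_N = 0."""
    zeta, z = np.zeros(n), [None] * (N + 1)
    for k in range(N, -1, -1):
        z[k] = -B[k].T @ zeta
        zeta = A[k].T @ zeta - C[k].T @ u[k]
    return z

w = [rng.standard_normal(p) for _ in range(N + 1)]
u = [rng.standard_normal(q) for _ in range(N + 1)]
lhs = sum(uk @ yk for uk, yk in zip(u, apply_G(w)))   # <u, G w>
rhs = sum(zk @ wk for zk, wk in zip(apply_GH(u), w))  # <G^H u, w>
assert np.isclose(lhs, rhs)
```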

#### **7.2.2 Riccati Equation Comparison**

The ensuing performance comparisons of filters and smoothers require methods for comparing the solutions of Riccati difference equations which are developed below. Simplified Riccati difference equations which do not involve the *Bk* and measurement noise covariance matrices are considered initially. A change of variables for the more general case is stated subsequently.

Suppose there exist $A_{t+k-1} \in \mathbb{R}^{n \times n}$, $\tilde{C}_{t+k-1} \in \mathbb{R}^{p \times n}$, $\tilde{Q}_{t+k-1} = \tilde{Q}_{t+k-1}^{T} \in \mathbb{R}^{n \times n}$ and $P_{t+k-1} = P_{t+k-1}^{T} \in \mathbb{R}^{n \times n}$ for a $t$ ≥ 0 and $k$ ≥ 0. Following the approach of Wimmer [10], define the Riccati operator

$$\begin{split} \Phi(P\_{t+k-1}, A\_{t+k-1}, \tilde{\mathbf{C}}\_{t+k-1}, \tilde{\mathbf{Q}}\_{t+k-1}) &= A\_{t+k-1} P\_{t+k-1} A\_{t+k-1}^T + \tilde{\mathbf{Q}}\_{t+k-1} \\ &- A\_{t+k-1} P\_{t+k-1} \tilde{\mathbf{C}}\_{t+k-1}^T (I + \tilde{\mathbf{C}}\_{t+k-1} P\_{t+k-1} \tilde{\mathbf{C}}\_{t+k-1}^T)^{-1} \tilde{\mathbf{C}}\_{t+k-1} P\_{t+k-1} A\_{t+k-1}^T. \end{split} \tag{6}$$
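A direct transcription of (6) into code is brief; the helper below is an illustrative sketch (the function name is ours, not from the text). With the change of variables stated after Theorem 2, namely $\tilde{C}_k = R_k^{-1/2}C_k$ and $\tilde{Q}_k = B_kQ_kB_k^{T}$, this operator reproduces the standard covariance prediction of the Kalman filter.

```python
import numpy as np

def riccati_phi(P, A, C, Q):
    """The Riccati operator Phi(P, A, C, Q) of equation (6)."""
    S = np.eye(C.shape[0]) + C @ P @ C.T          # I + C P C^T
    return A @ P @ A.T + Q - A @ P @ C.T @ np.linalg.inv(S) @ C @ P @ A.T
```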

"If you're not failing every now and again, it's a sign you're not doing anything very innovative." *(Woody) Allen Stewart Konigsberg*


Let

$$\Lambda_{t+k-1} = \begin{bmatrix} A_{t+k-1} & -\tilde{C}_{t+k-1}^{T}\tilde{C}_{t+k-1} \\ \tilde{Q}_{t+k-1} & A_{t+k-1}^{T} \end{bmatrix}$$

denote the Hamiltonian matrix corresponding to $\Phi(P_{t+k-1}, A_{t+k-1}, \tilde{C}_{t+k-1}, \tilde{Q}_{t+k-1})$ and define $J = \begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix}$, in which $I$ is an identity matrix of appropriate dimensions. It is known that solutions of (6) are monotonically dependent on

$$J\Lambda_{t+k-1} = \begin{bmatrix} \tilde{Q}_{t+k-1} & A_{t+k-1}^{T} \\ A_{t+k-1} & -\tilde{C}_{t+k-1}^{T}\tilde{C}_{t+k-1} \end{bmatrix}.$$

Consider a second Riccati operator employing the same initial solution $P_{t+k-1}$ but different state-space parameters

$$\begin{split} \Phi(P\_{t+k-1}, A\_{t+k}, \tilde{\mathbf{C}}\_{t+k}, \tilde{\mathbf{Q}}\_{t+k}) &= A\_{t+k} P\_{t+k-1} A\_{t+k}^T + \tilde{\mathbf{Q}}\_{t+k} \\ &- A\_{t+k} P\_{t+k-1} \tilde{\mathbf{C}}\_{t+k}^T (I + \tilde{\mathbf{C}}\_{t+k} P\_{t+k-1} \tilde{\mathbf{C}}\_{t+k}^T)^{-1} \tilde{\mathbf{C}}\_{t+k} P\_{t+k-1} A\_{t+k}^T . \end{split} \tag{7}$$

The following theorem, which is due to Wimmer [10], compares the above two Riccati operators.

*Theorem 1: [10]: Suppose that* 

$$\begin{bmatrix} \tilde{Q}_{t+k-1} & A_{t+k-1}^{T} \\ A_{t+k-1} & -\tilde{C}_{t+k-1}^{T}\tilde{C}_{t+k-1} \end{bmatrix} \geq \begin{bmatrix} \tilde{Q}_{t+k} & A_{t+k}^{T} \\ A_{t+k} & -\tilde{C}_{t+k}^{T}\tilde{C}_{t+k} \end{bmatrix}$$

*for a t ≥ 0 and for all k ≥ 0. Then* 

$$\Phi(P\_{t+k-1}, A\_{t+k-1}, \tilde{\mathbf{C}}\_{t+k-1}, \tilde{Q}\_{t+k-1}) \ge \Phi(P\_{t+k-1}, A\_{t+k}, \tilde{\mathbf{C}}\_{t+k}, \tilde{Q}\_{t+k}) \tag{8}$$

*for all k ≥ 0.* 

The above result underpins the following more general Riccati difference equation comparison theorem.

*Theorem 2: [11], [8]: With the above definitions, suppose for a t ≥ 0 and for all k ≥ 0 that:* 

*(i) there exists a* $P_t \geq P_{t+1}$; *and*

*(ii)*

$$\begin{bmatrix} \tilde{Q}_{t+k-1} & A_{t+k-1}^{T} \\ A_{t+k-1} & -\tilde{C}_{t+k-1}^{T}\tilde{C}_{t+k-1} \end{bmatrix} \geq \begin{bmatrix} \tilde{Q}_{t+k} & A_{t+k}^{T} \\ A_{t+k} & -\tilde{C}_{t+k}^{T}\tilde{C}_{t+k} \end{bmatrix}.$$

*Then* $P_{t+k} \geq P_{t+k+1}$ *for all k ≥ 0.*

*Proof: Assumption (i) is the k = 0 case for an induction argument. For the inductive step, denote* $P_{t+k} = \Phi(P_{t+k-1}, A_{t+k-1}, \tilde{C}_{t+k-1}, \tilde{Q}_{t+k-1})$ *and* $P_{t+k+1} = \Phi(P_{t+k}, A_{t+k}, \tilde{C}_{t+k}, \tilde{Q}_{t+k})$. *Then*

"Although personally I am quite content with existing explosives, I feel we must not stand in the path of improvement." *Winston Leonard Spencer-Churchill*


$$P\_{t+k} - P\_{t+k+1} = (\Phi(P\_{t+k-1}, A\_{t+k-1}, \tilde{\mathbf{C}}\_{t+k-1}, \tilde{\mathbf{Q}}\_{t+k-1}) - \Phi(P\_{t+k-1}, A\_{t+k}, \tilde{\mathbf{C}}\_{t+k}, \tilde{\mathbf{Q}}\_{t+k})) \tag{9}$$

$$+ (\Phi(P\_{t+k-1}, A\_{t+k}, \tilde{\mathbf{C}}\_{t+k}, \tilde{\mathbf{Q}}\_{t+k}) - \Phi(P\_{t+k}, A\_{t+k}, \tilde{\mathbf{C}}\_{t+k}, \tilde{\mathbf{Q}}\_{t+k}))$$

*The first term on the right-hand-side of (9) is non-negative by virtue of Assumption (ii) and Theorem 1. By appealing to Theorem 2 of Chapter 5, the second term on the right-hand-side of (9) is non-negative and thus* $P_{t+k} - P_{t+k+1} \geq 0$. □

A change of variables [8], $\tilde{C}_k = R_k^{-1/2}C_k$ and $\tilde{Q}_k = B_kQ_kB_k^{T}$, allows the application of Theorem 2 to the more general forms of Riccati difference equations.
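The stated change of variables can be checked numerically. The snippet below (an illustrative verification with arbitrary random parameters; all names are ours) confirms that the operator (6) evaluated at $\tilde{C}$, $\tilde{Q}$ coincides with the standard Kalman covariance prediction.

```python
import numpy as np
from numpy.linalg import inv

rng = np.random.default_rng(1)
n, p = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, p))
C = rng.standard_normal((p, n))
Q, R = np.eye(p), 0.5 * np.eye(p)
P = np.eye(n)

Ct = C / np.sqrt(0.5)                  # C~ = R^(-1/2) C for R = 0.5 I
Qt = B @ Q @ B.T                       # Q~ = B Q B^T
phi = A @ P @ A.T + Qt - A @ P @ Ct.T @ inv(np.eye(p) + Ct @ P @ Ct.T) @ Ct @ P @ A.T
kalman = A @ P @ A.T + Qt - A @ P @ C.T @ inv(C @ P @ C.T + R) @ C @ P @ A.T
assert np.allclose(phi, kalman)
```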

#### **7.3 Fixed-Point Smoothing**

#### **7.3.1 Solution Derivation**

The development of a discrete-time fixed-point smoother follows the continuous-time case. An innovation by Zachrisson [12] involves transforming the smoothing problem into a filtering problem that possesses an augmented state. Following the approach in [1], consider

an augmented state vector $x_k^{(a)} = \begin{bmatrix} x_k \\ \xi_k \end{bmatrix}$ for the signal model

$$x_{k+1}^{(a)} = A_k^{(a)}x_k^{(a)} + B_k^{(a)}w_k, \tag{10}$$

$$z_k = C_k^{(a)}x_k^{(a)} + v_k, \tag{11}$$

where $A_k^{(a)} = \begin{bmatrix} A_k & 0 \\ 0 & I \end{bmatrix}$, $B_k^{(a)} = \begin{bmatrix} B_k \\ 0 \end{bmatrix}$ and $C_k^{(a)} = \begin{bmatrix} C_k & 0 \end{bmatrix}$. It can be seen that the first component of $x_k^{(a)}$ is $x_k$, the state of the system $x_{k+1} = A_kx_k + B_kw_k$, $y_k = C_kx_k + v_k$. The second component, $\xi_k$, equals $x_k$ at time $k = \tau$, that is, $\xi_\tau = x_\tau$. The objective is to calculate an estimate $\hat{\xi}_k$ of $\xi_k$ at time $k = \tau$ from measurements $z_k$ over $k \in [0, N]$. A solution that minimises the variance of the estimation error is obtained by employing the standard Kalman filter recursions for the signal model (10) – (11). The predicted and corrected states are respectively obtained from

$$\hat{x}_{k/k}^{(a)} = (I - L_k^{(a)}C_k^{(a)})\hat{x}_{k/k-1}^{(a)} + L_k^{(a)}z_k, \tag{12}$$

$$\hat{x}_{k+1/k}^{(a)} = A_k^{(a)}\hat{x}_{k/k}^{(a)} \tag{13}$$

$$= (A_k^{(a)} - K_k^{(a)}C_k^{(a)})\hat{x}_{k/k-1}^{(a)} + K_k^{(a)}z_k, \tag{14}$$

where $K_k^{(a)} = A_k^{(a)}L_k^{(a)}$ is the predictor gain and $L_k^{(a)} = P_{k/k-1}^{(a)}(C_k^{(a)})^{T}\left(C_k^{(a)}P_{k/k-1}^{(a)}(C_k^{(a)})^{T} + R_k\right)^{-1}$ is the filter gain,

"Never before in history has innovation offered promise of so much to so many in so short a time." *William Henry (Bill) Gates III*


$$P_{k/k}^{(a)} = P_{k/k-1}^{(a)} - P_{k/k-1}^{(a)}(C_k^{(a)})^{T}(L_k^{(a)})^{T} \tag{15}$$

is the corrected error covariance and

$$P_{k+1/k}^{(a)} = A_k^{(a)}P_{k/k}^{(a)}(A_k^{(a)})^{T} + B_k^{(a)}Q_k(B_k^{(a)})^{T} \tag{16}$$

$$= A_k^{(a)}P_{k/k-1}^{(a)}\left((A_k^{(a)})^{T} - (C_k^{(a)})^{T}(K_k^{(a)})^{T}\right) + B_k^{(a)}Q_k(B_k^{(a)})^{T} \tag{17}$$

is the predicted error covariance. The above Riccati difference equation is written in the partitioned form

$$\begin{aligned} P_{k+1/k}^{(a)} &= \begin{bmatrix} P_{k+1/k} & \Sigma_{k+1/k}^{T} \\ \Sigma_{k+1/k} & \Omega_{k+1/k} \end{bmatrix} \\ &= \begin{bmatrix} A_{k} & 0 \\ 0 & I \end{bmatrix}\begin{bmatrix} P_{k/k-1} & \Sigma_{k/k-1}^{T} \\ \Sigma_{k/k-1} & \Omega_{k/k-1} \end{bmatrix}\left(\begin{bmatrix} A_{k}^{T} & 0 \\ 0 & I \end{bmatrix} - \begin{bmatrix} C_{k}^{T} \\ 0 \end{bmatrix}\begin{bmatrix} K_{k}^{T} & \underline{L}_{k}^{T} \end{bmatrix}\right) + \begin{bmatrix} B_{k}Q_{k}B_{k}^{T} & 0 \\ 0 & 0 \end{bmatrix} \end{aligned} \tag{18}$$

in which the gains are given by

$$K_{k}^{(a)} = \begin{bmatrix} K_{k} \\ \underline{L}_{k} \end{bmatrix} = \begin{bmatrix} A_{k} & 0 \\ 0 & I \end{bmatrix}\begin{bmatrix} P_{k/k-1} & \Sigma_{k/k-1}^{T} \\ \Sigma_{k/k-1} & \Omega_{k/k-1} \end{bmatrix}\begin{bmatrix} C_{k}^{T} \\ 0 \end{bmatrix}\left(C_{k}P_{k/k-1}C_{k}^{T} + R_{k}\right)^{-1} = \begin{bmatrix} A_{k}P_{k/k-1}C_{k}^{T} \\ \Sigma_{k/k-1}C_{k}^{T} \end{bmatrix}\left(C_{k}P_{k/k-1}C_{k}^{T} + R_{k}\right)^{-1}, \tag{19}$$

see also [1]. The predicted error covariance components can be found from (18), *viz.,*

$$P_{k+1/k} = A_kP_{k/k}A_k^{T} + B_kQ_kB_k^{T}, \tag{20}$$

$$\Sigma_{k+1/k} = \Sigma_{k/k-1}(A_k^{T} - C_k^{T}K_k^{T}), \tag{21}$$

$$\Omega_{k+1/k} = \Omega_{k/k-1} - \Sigma_{k/k-1}C_k^{T}\underline{L}_k^{T}. \tag{22}$$

The sequences (21) – (22) can be initialised with $\Sigma_{\tau/\tau} = P_{\tau/\tau}$ and $\Omega_{\tau/\tau} = P_{\tau/\tau}$. The state corrections are obtained from (12), namely,

$$\hat{x}_{k/k} = \hat{x}_{k/k-1} + L_k(z_k - C_k\hat{x}_{k/k-1}), \tag{23}$$

$$\hat{\xi}_{k/k} = \hat{\xi}_{k/k-1} + \underline{L}_k(z_k - C_k\hat{x}_{k/k-1}). \tag{24}$$

Similarly, the state predictions follow from (13),

"You can't just ask customers what they want and then try to give that to them. By the time you get it built, they'll want something new." *Steven Paul Jobs*


$$\hat{x}_{k+1/k} = A_k\hat{x}_{k/k}, \tag{25}$$

$$\hat{\xi}_{k+1/k} = \hat{\xi}_{k/k}. \tag{26}$$

In summary, the fixed-point smoother estimates for $k \geq \tau$ are given by (24), which is initialised by $\hat{\xi}_{\tau/\tau} = \hat{x}_{\tau/\tau}$. The smoother gain is calculated as $\underline{L}_k = \Sigma_{k/k-1}C_k^{T}(C_kP_{k/k-1}C_k^{T} + R_k)^{-1}$, where $\Sigma_{k/k-1}$ is given by (21).
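The pieces (20) – (26) assemble into a short implementation. The following sketch is illustrative only (the function name, argument shapes and return values are our choices): it runs the underlying Kalman filter and carries the fixed-point quantities $\hat{\xi}$, $\Sigma$ and $\Omega$ alongside it.

```python
import numpy as np

def fixed_point_smoother(z, A, B, C, Q, R, tau, P0):
    """Fixed-point smoother of Section 7.3: estimate x_tau from z_0..z_N.

    Implements the filter steps (20), (23), (25) together with the
    recursions (21) - (22) and the correction (24). Shapes: A (n,n),
    B (n,m), C (p,n); z is a sequence of (p,) measurement vectors.
    """
    x = np.zeros(A.shape[0])        # x_hat_{k/k-1}
    P = P0.copy()                   # P_{k/k-1}
    xi = Sigma = Omega = None
    for k, zk in enumerate(z):
        S_inv = np.linalg.inv(C @ P @ C.T + R)   # innovation covariance inverse
        L = P @ C.T @ S_inv                      # filter gain L_k
        K = A @ L                                # predictor gain K_k
        innov = zk - C @ x
        if k == tau:                 # augmentation starts: xi_tau = x_tau
            xi, Sigma, Omega = x.copy(), P.copy(), P.copy()
        if k >= tau:
            Lbar = Sigma @ C.T @ S_inv           # smoother gain (19), second row
            xi = xi + Lbar @ innov               # correction (24)
            Omega = Omega - Sigma @ C.T @ Lbar.T # (22)
            Sigma = Sigma @ (A.T - C.T @ K.T)    # (21)
        x = A @ (x + L @ innov)                  # (23) then (25)
        P = A @ (P - L @ C @ P) @ A.T + B @ Q @ B.T   # (20)
    return xi, Omega    # estimate of x_tau and its error variance sequence value
```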

#### **7.3.2 Performance**

It follows from the above that $\Omega_{k+1/k} = \Omega_{k/k}$ and so

$$\Omega_{k+1/k+1} = \Omega_{k/k} - \Sigma_{k/k-1}C_k^{T}\underline{L}_k^{T}. \tag{27}$$

Next, it is argued that the discrete-time fixed-point smoother provides a performance improvement over the filter.

*Lemma 2 [1]: In respect of the fixed-point smoother (24),*

$$P_{\tau/\tau} \geq \Omega_{k/k}. \tag{28}$$

*Proof: The recursion (22) may be written as the sum*

$$\Omega_{k+1/k+1} = \Omega_{\tau/\tau} - \sum_{i=\tau}^{k}\Sigma_{i/i-1}C_i^{T}\left(C_iP_{i/i-1}C_i^{T} + R\right)^{-1}C_i\Sigma_{i/i-1}^{T}, \tag{29}$$

*where* $\Omega_{\tau/\tau} = P_{\tau/\tau}$. *Hence,* $P_{\tau/\tau} - \Omega_{k+1/k+1} = \sum_{i=\tau}^{k}\Sigma_{i/i-1}C_i^{T}(C_iP_{i/i-1}C_i^{T} + R)^{-1}C_i\Sigma_{i/i-1}^{T} \geq 0$. □

*Example 1.* Consider a first-order time-invariant plant, in which $A$ = 0.9, $B$ = 1, $C$ = 0.1 and $Q$ = 1. An understanding of a fixed-point smoother's performance can be gleaned by examining the plots of the $\Sigma_{k/k}$ and $\Omega_{k/k}$ sequences shown in Fig. 1(a) and (b), respectively. The bottom lines of the figures correspond to a measurement noise covariance of $R$ = 0.01 and the top lines correspond to $R$ = 5. It can be seen for this example that the $\Sigma_{k/k}$ have diminishing impact after about 15 samples beyond the point of interest. From Fig. 1(b), it can be seen that smoothing appears most beneficial at mid-range measurement noise power, such as $R$ = 0.2, since the plots of $\Omega_{k/k}$ become flatter for $R$ ≥ 1 and $R$ ≤ 0.05.
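Example 1 can be reproduced with a few lines. The scalar script below is an illustration under the stated parameters (the 16-sample horizon and printing format are arbitrary choices of ours); it evaluates the smoothed error variance for each measurement noise covariance.

```python
import numpy as np

# Scalar model of Example 1: A = 0.9, B = 1, C = 0.1, Q = 1, tau = 0
A, B, C, Q, tau = 0.9, 1.0, 0.1, 1.0, 0
for R in (0.01, 0.05, 0.1, 0.2, 0.5, 1.0, 5.0):
    P = 1.0                                      # P_{0/-1}
    Sigma = Omega = None
    for k in range(16):
        S = C * P * C + R                        # innovation variance
        K = A * P * C / S                        # predictor gain
        if k == tau:
            Sigma = Omega = P
        Omega -= Sigma * C * (Sigma * C / S)     # (22), smoother gain folded in
        Sigma *= (A - K * C)                     # (21), scalar form
        P = A * (P - P * C * C * P / S) * A + B * Q * B   # (20)
    print(f"R = {R}: smoothed error variance after 16 steps = {Omega:.4f}")
```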

"There are no big problems, there are just a lot of little problems." *Henry Ford*


Figure 1(a). Smoother estimation variances $\Sigma_{k/k}$ versus $k$ for Example 1 ($R$ = 0.01, 0.05, 0.1, 0.2, 0.5, 1, 5).

Figure 1(b). Smoother estimation variances $\Omega_{k+1/k+1}$ versus $k$ for Example 1 ($R$ = 0.01, 0.05, 0.1, 0.2, 0.5, 1, 5).

#### **7.4 Fixed-Lag Smoothing**

#### **7.4.1 High-order Solution**

Discrete-time fixed-lag smoothers calculate state estimates, $\hat{x}_{k-N/k}$, at time $k$ given a delay of $N$ steps. The objective is to minimise $E\{(x_{k-N} - \hat{x}_{k-N/k})(x_{k-N} - \hat{x}_{k-N/k})^{T}\}$. A common solution approach is to construct an augmented signal model that includes delayed states and then apply the standard Kalman filter recursions, see [1] – [3] and the references therein. Consider the signal model

$$\begin{bmatrix} x_{k+1} \\ x_{k} \\ x_{k-1} \\ \vdots \\ x_{k+1-N} \end{bmatrix} = \begin{bmatrix} A_{k} & 0 & \cdots & 0 & 0 \\ I & 0 & \cdots & 0 & 0 \\ 0 & I & & \vdots & \vdots \\ \vdots & & \ddots & & \\ 0 & 0 & \cdots & I & 0 \end{bmatrix}\begin{bmatrix} x_{k} \\ x_{k-1} \\ x_{k-2} \\ \vdots \\ x_{k-N} \end{bmatrix} + \begin{bmatrix} B_{k} \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}w_{k} \tag{30}$$

and

$$z_{k} = \begin{bmatrix} C_{k} & 0 & 0 & \cdots & 0 \end{bmatrix}\begin{bmatrix} x_{k} \\ x_{k-1} \\ x_{k-2} \\ \vdots \\ x_{k-N} \end{bmatrix} + v_{k} \tag{31}$$

"If the only tool you have is a hammer, you tend to see every problem as a nail." *Abraham Maslow*


By applying the Kalman filter recursions to the above signal model, the predicted states are obtained as

$$\begin{bmatrix} \hat{x}_{k+1} \\ \hat{x}_{k} \\ \hat{x}_{k-1} \\ \vdots \\ \hat{x}_{k+1-N} \end{bmatrix} = \begin{bmatrix} A_{k} & 0 & \cdots & 0 & 0 \\ I & 0 & \cdots & 0 & 0 \\ 0 & I & & \vdots & \vdots \\ \vdots & & \ddots & & \\ 0 & 0 & \cdots & I & 0 \end{bmatrix}\begin{bmatrix} \hat{x}_{k} \\ \hat{x}_{k-1} \\ \hat{x}_{k-2} \\ \vdots \\ \hat{x}_{k-N} \end{bmatrix} + \begin{bmatrix} K_{0,k} \\ K_{1,k} \\ K_{2,k} \\ \vdots \\ K_{N,k} \end{bmatrix}(z_{k} - C_{k}\hat{x}_{k/k-1}) \tag{32}$$

where $K_{0,k}$, $K_{1,k}$, $K_{2,k}$, …, $K_{N,k}$ denote the submatrices of the predictor gain. Two important observations follow from the above equation. First, the desired smoothed estimates $\hat{x}_{k-1/k}$, …, $\hat{x}_{k-N/k}$ are contained within the one-step-ahead prediction (32). Second, the fixed-lag smoother (32) inherits the stability properties of the original Kalman filter.
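The augmentation in (30) – (31) is mechanical to construct. The helper below (an illustrative construction; the function name is ours) builds the stacked matrices so that any standard Kalman filter delivers the smoothed estimates as sub-vectors of its prediction, as noted above.

```python
import numpy as np

def augment_fixed_lag(A, B, C, N):
    """Build the augmented model of (30) - (31) for smoothing lag N."""
    n, m = B.shape
    p = C.shape[0]
    Aa = np.zeros(((N + 1) * n, (N + 1) * n))
    Aa[:n, :n] = A                               # top block row: A_k
    for i in range(N):                           # sub-diagonal identities: delay chain
        Aa[(i + 1) * n:(i + 2) * n, i * n:(i + 1) * n] = np.eye(n)
    Ba = np.zeros(((N + 1) * n, m))
    Ba[:n, :] = B                                # process noise enters the head state
    Ca = np.zeros((p, (N + 1) * n))
    Ca[:, :n] = C                                # only the current state is measured
    return Aa, Ba, Ca
```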

#### **7.4.2 Reduced-order Solution**

Equation (32) is termed a high-order solution because the dimension of the above augmented state matrix is $(N + 2)n \times (N + 2)n$. Moore [1] – [3] simplified (32) to obtain elegant reduced-order solution structures as follows. Let

$$\begin{bmatrix} P_{k+1/k}^{(0,0)} & P_{k+1/k}^{(0,1)} & \cdots & P_{k+1/k}^{(0,N)} \\ P_{k+1/k}^{(1,0)} & P_{k+1/k}^{(1,1)} & & \vdots \\ \vdots & & \ddots & \\ P_{k+1/k}^{(N,0)} & \cdots & & P_{k+1/k}^{(N,N)} \end{bmatrix}, \qquad P_{k+1/k}^{(i,j)} = \left(P_{k+1/k}^{(j,i)}\right)^{T},$$

denote the predicted error covariance matrix. For 0 ≤ *i* ≤ *N*, the smoothed states within (32) are given by

$$
\hat{\mathbf{x}}\_{k-i/k} = \hat{\mathbf{x}}\_{k-i/k-1} + \mathbf{K}\_{i+1,k} (\mathbf{z}\_k - \mathbf{C}\_k \hat{\mathbf{x}}\_{k/k-1}) \, \tag{33}
$$

where

$$K_{i+1,k} = P_{k/k-1}^{(i,0)}C_k^{T}\left(C_kP_{k/k-1}^{(0,0)}C_k^{T} + R_k\right)^{-1}. \tag{34}$$

Recursions for the error covariance submatrices of interest are

$$P_{k+1/k}^{(i+1,0)} = P_{k/k-1}^{(i,0)}(A_k - K_{0,k}C_k)^{T}, \tag{35}$$

$$P_{k+1/k}^{(i+1,i+1)} = P_{k/k-1}^{(i,i)} - P_{k/k-1}^{(i,0)}C_k^{T}K_{i+1,k}^{T}. \tag{36}$$

Another rearrangement of (33) − (34) to reduce the calculation cost further is described in [1].
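A compact realisation of (33) – (36) is sketched below. This is our own illustrative arrangement, not the rearrangement of [1]; the $P^{(i,i)}$ blocks are omitted because only the $P^{(i,0)}$ blocks enter the gains.

```python
import numpy as np

def fixed_lag_smoother(zs, A, B, C, Q, R, N, P0):
    """Reduced-order fixed-lag smoother of (33) - (36) with lag N (sketch).

    Returns the lag-N estimates x_hat_{k-N/k}; the first N outputs are
    start-up values while the delay line fills.
    """
    x = np.zeros(A.shape[0])                     # x_hat_{k/k-1}
    lagged = [np.zeros(A.shape[0]) for _ in range(N)]  # x_hat_{k-i/k-1}, i = 1..N
    P = P0.copy()                                # P^{(0,0)}_{k/k-1}
    Pi0 = [P0.copy() for _ in range(N + 1)]      # blocks P^{(i,0)}_{k/k-1}
    out = []
    for z in zs:
        Sinv = np.linalg.inv(C @ P @ C.T + R)
        innov = z - C @ x
        K0 = A @ P @ C.T @ Sinv                              # K_{0,k}
        gains = [Pi0[i] @ C.T @ Sinv for i in range(N + 1)]  # K_{i+1,k} of (34)
        estimates = [x] + lagged                             # x_hat_{k-i/k-1}, i = 0..N
        smoothed = [e + gains[i] @ innov for i, e in enumerate(estimates)]  # (33)
        out.append(smoothed[N])                              # x_hat_{k-N/k}
        # time updates
        x = A @ x + K0 @ innov
        lagged = smoothed[:N]
        P_new = A @ P @ (A - K0 @ C).T + B @ Q @ B.T
        Pi0 = [P_new] + [Pi0[i] @ (A - K0 @ C).T for i in range(N)]   # (35)
        P = P_new
    return out
```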

"You have to seek the simplest implementation of a problem solution in order to know when you've reached your limit in that regard. Then it's easy to make tradeoffs, to back off a little, for performance reasons." *Stephen Gary Wozniak*

#### **7.4.3 Performance**

Two facts that stem from (36) are stated below.

*Lemma 3: In respect of the fixed-lag smoother (33) – (36), the following applies.*

*(i) The error-performance improves with increasing smoothing lag.*

*(ii) The fixed-lag smoothers outperform the Kalman filter.*

*Proof: (i) The claim follows by inspection of (34) and (36). (ii) The observation follows by recognising that* $P_{k+1/k}^{(1,1)} = E\{(x_k - \hat{x}_{k/k})(x_k - \hat{x}_{k/k})^{T}\}$ *within (i).* □



It can also be seen from the term $P_{k/k-1}^{(i,0)}C_k^{T}(C_kP_{k/k-1}^{(0,0)}C_k^{T} + R_k)^{-1}C_kP_{k/k-1}^{(i,0)T}$ within (36) that the benefit of smoothing diminishes as $R_k$ becomes large.

#### **7.5 Fixed-Interval Smoothing**

#### **7.5.1 The Maximum-Likelihood Smoother**

#### **7.5.1.1 Solution Derivation**

The most commonly used fixed-interval smoother is undoubtedly the solution reported by Rauch [5] in 1963 and two years later with Tung and Striebel [6]. Although this smoother does not minimise the error variance, it has two desirable attributes. First, it is a low-complexity state estimator. Second, it can provide close to optimal performance whenever the accompanying assumptions are reasonable.

The smoother involves two passes. In the first (forward) pass, filtered state estimates, $\hat{x}_{k/k}$, are calculated from

$$\hat{x}_{k/k} = \hat{x}_{k/k-1} + L_k(z_k - C_k\hat{x}_{k/k-1}), \tag{37}$$

$$\hat{x}_{k+1/k} = A_k\hat{x}_{k/k}, \tag{38}$$

where $L_k = P_{k/k-1}C_k^{T}(C_kP_{k/k-1}C_k^{T} + R_k)^{-1}$ is the filter gain and $K_k = A_kL_k$ is the predictor gain, in which $P_{k/k} = P_{k/k-1} - P_{k/k-1}C_k^{T}(C_kP_{k/k-1}C_k^{T} + R_k)^{-1}C_kP_{k/k-1}$ and $P_{k+1/k} = A_kP_{k/k}A_k^{T} + B_kQ_kB_k^{T}$. In the second (backward) pass, Rauch, Tung and Striebel calculate smoothed state estimates, $\hat{x}_{k/N}$, from the beautiful one-line recursion

$$\hat{x}_{k/N} = \hat{x}_{k/k} + G_k(\hat{x}_{k+1/N} - \hat{x}_{k+1/k}), \tag{39}$$

where

$$G_k = P_{k/k}A_k^{T}P_{k+1/k}^{-1} \tag{40}$$

is the smoother gain. The above sequence is initialised by $\hat{x}_{k/N} = \hat{x}_{k/k}$ at $k = N$. In the first public domain appearance of (39), Rauch [5] referred to a Lockheed Missile and Space Company Technical Report co-authored with Tung and Striebel. Consequently, (39) is commonly known as the Rauch-Tung-Striebel smoother. This smoother was derived in [6] using the maximum-likelihood method and an outline is provided below.
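The two-pass structure of (37) – (40) translates directly into code. The sketch below is illustrative (time-invariant parameters and ad hoc naming, our choices) rather than a definitive implementation.

```python
import numpy as np

def rts_smoother(zs, A, B, C, Q, R, P0):
    """Rauch-Tung-Striebel fixed-interval smoother, (37) - (40)."""
    n = A.shape[0]
    x_pred, P_pred, x_filt, P_filt = [np.zeros(n)], [P0.copy()], [], []
    # forward pass: Kalman filter (37) - (38)
    for z in zs:
        x, P = x_pred[-1], P_pred[-1]
        L = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)   # filter gain L_k
        x_filt.append(x + L @ (z - C @ x))
        P_filt.append(P - L @ C @ P)
        x_pred.append(A @ x_filt[-1])                  # (38)
        P_pred.append(A @ P_filt[-1] @ A.T + B @ Q @ B.T)
    # backward pass: one-line recursion (39) with gain (40)
    xs = list(x_filt)                                  # xs[N] = x_hat_{N/N}
    for k in range(len(zs) - 2, -1, -1):
        G = P_filt[k] @ A.T @ np.linalg.inv(P_pred[k + 1])    # G_k of (40)
        xs[k] = x_filt[k] + G @ (xs[k + 1] - x_pred[k + 1])   # (39)
    return xs
```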

"For every problem there is a solution which is simple, clean and wrong." *Henry Louis Mencken*


The notation $x_k \sim \mathcal{N}(\mu, R_{xx})$ means that a discrete-time random variable $x_k$ with mean $\mu$ and covariance $R_{xx}$ has the normal (or Gaussian) probability density function

$$p(x_k) = \frac{1}{(2\pi)^{n/2}\left|R_{xx}\right|^{1/2}}\exp\left\{-0.5(x_k - \mu)^{T}R_{xx}^{-1}(x_k - \mu)\right\}. \tag{41}$$

Rauch, Tung and Striebel assumed that [6]:

$$\hat{x}_{k+1/N} \sim \mathcal{N}(A_k\hat{x}_{k/N},\, B_kQ_kB_k^{T}), \tag{42}$$

$$\hat{x}_{k/N} \sim \mathcal{N}(\hat{x}_{k/k},\, P_{k/k}). \tag{43}$$

From the approach of [6], setting the partial derivative of the logarithm of the joint density function to zero results in

$$0 = \frac{\partial(\hat{x}_{k+1/N} - A_k\hat{x}_{k/N})^{T}}{\partial\hat{x}_{k/N}}(B_kQ_kB_k^{T})^{-1}(\hat{x}_{k+1/N} - A_k\hat{x}_{k/N}) + \frac{\partial(\hat{x}_{k/N} - \hat{x}_{k/k})^{T}}{\partial\hat{x}_{k/N}}P_{k/k}^{-1}(\hat{x}_{k/N} - \hat{x}_{k/k}).$$

Rearranging the above equation leads to

$$\hat{x}_{k/N} = \left(I + P_{k/k}A_k^{T}(B_kQ_kB_k^{T})^{-1}A_k\right)^{-1}\left(\hat{x}_{k/k} + P_{k/k}A_k^{T}(B_kQ_kB_k^{T})^{-1}\hat{x}_{k+1/N}\right). \tag{44}$$

From the Matrix Inversion Lemma

$$\left(I + P\_{k/k}A\_k^T(B\_kQ\_kB\_k^T)^{-1}A\_k\right)^{-1} = I - G\_kA\_k\,.\tag{45}$$

The solution (39) is found by substituting (45) into (44). Some further details of Rauch, Tung and Striebel's derivation appear in [13].

#### **7.5.1.2 Alternative Forms**

The smoother gain (40) can be calculated in different ways. Assuming that $A_k$ is non-singular, it follows from $P_{k+1/k} = A_kP_{k/k}A_k^{T} + B_kQ_kB_k^{T}$ that $P_{k/k}A_k^{T}P_{k+1/k}^{-1} = A_k^{-1}(I - B_kQ_kB_k^{T}P_{k+1/k}^{-1})$ and

$$G_k = A_k^{-1}(I - B_kQ_kB_k^{T}P_{k+1/k}^{-1}). \tag{46}$$

In applications where difficulties exist with inverting $P_{k+1/k}$, it may be preferable to calculate

$$P\_{k/k-1}^{-1} = P\_{k/k}^{-1} - \mathbf{C}\_k^T \mathbf{R}\_k^{-1} \mathbf{C}\_k \ . \tag{47}$$

It is shown in [15] that the filter (37) – (38) and the smoother (39) can be written in the following Hamiltonian form

"Error is the discipline through which we advance." *William Ellery Channing*


$$\hat{x}_{k+1/N} = A_k\hat{x}_{k/N} + B_kQ_kB_k^{T}\lambda_{k+1/N}, \tag{48}$$

$$\lambda_{k/N} = A_k^{T}\lambda_{k+1/N} + C_k^{T}R_k^{-1}(z_k - C_k\hat{x}_{k/N}), \tag{49}$$

where $\lambda_{k/N} \in \mathbb{R}^{n}$ is an auxiliary variable that proceeds backwards in time $k$. The form (48) – (49) avoids potential numerical difficulties that may be associated with calculating $P_{k/k-1}^{-1}$.

To confirm the equivalence of (39) and (48) – (49), use the Bryson-Frazier formula [15]

$$\hat{x}_{k+1/N} = \hat{x}_{k+1/k} + P_{k+1/k}\lambda_{k+1/N}, \tag{50}$$

and (46) within (48) to obtain

$$\hat{x}_{k/N} = G_k\hat{x}_{k+1/N} + A_k^{-1}B_kQ_kB_k^{T}P_{k+1/k}^{-1}\hat{x}_{k+1/k}. \tag{51}$$

Employing (46) within (51) and rearranging leads to (39).

In time-invariant problems, steady-state solutions for $P_{k/k}$ and $P_{k+1/k}$ can be used to pre-calculate the gain (40) before running the smoother. For example, the application of a time-invariant version of the Rauch-Tung-Striebel smoother to the restoration of blurred images is described in [14].
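For the time-invariant case just mentioned, the steady-state covariances follow from the discrete algebraic Riccati equation, so the gain (40) can be computed once up front. A brief sketch, assuming SciPy is available:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def steady_state_rts_gain(A, B, C, Q, R):
    """Steady-state smoother gain G = P_{k/k} A^T P_{k+1/k}^{-1} of (40)."""
    P_pred = solve_discrete_are(A.T, C.T, B @ Q @ B.T, R)   # steady P_{k+1/k}
    L = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)  # steady filter gain
    P_filt = P_pred - L @ C @ P_pred                        # steady P_{k/k}
    return P_filt @ A.T @ np.linalg.inv(P_pred)
```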

#### **7.5.1.3 Performance**

An expression for the smoother error covariance is developed below following the approach of [6], [13]. Define the smoother and filter error states as $\tilde{x}_{k/N} = x_k - \hat{x}_{k/N}$ and $\tilde{x}_{k/k} = x_k - \hat{x}_{k/k}$, respectively. It is assumed that

$$E\{\tilde{\mathbf{x}}\_{k/k}\hat{\mathbf{x}}\_{k/k}^T\} = \mathbf{0},\tag{52}$$

$$E\{\tilde{\mathbf{x}}\_{k+1/N}\hat{\mathbf{x}}\_{k+1/N}^T\} = \mathbf{0} \tag{53}$$

$$E\{\tilde{\mathbf{x}}_{k+1/N}\hat{\mathbf{x}}_{k/N}^T\} = \mathbf{0}\,. \tag{54}$$

It is straightforward to show that (52) implies

$$E\{\hat{\mathbf{x}}\_{k+1/k}\hat{\mathbf{x}}\_{k+1/k}^T\} = E\{\mathbf{x}\_{k+1}\mathbf{x}\_{k+1}^T\} - P\_{k+1/k}\,. \tag{55}$$

Denote $\Sigma_{k/N} = E\{\tilde{x}_{k/N}\tilde{x}_{k/N}^T\}$. The assumption (53) implies

$$E\{\hat{\mathbf{x}}_{k+1/N}\hat{\mathbf{x}}_{k+1/N}^{T}\} = E\{\mathbf{x}_{k+1}\mathbf{x}_{k+1}^{T}\} - \Sigma_{k+1/N}\,. \tag{56}$$

Subtracting *xk* from both sides of (39) gives

$$
\tilde{x}_{k/N} - G_k\hat{x}_{k+1/N} = \tilde{x}_{k/k} - G_kA_k\hat{x}_{k/k}\,. \tag{57}
$$

By simplifying

<sup>&</sup>quot;Great thinkers think inductively, that is, they create solutions and then seek out the problems that the solutions might solve; most companies think deductively, that is, defining a problem and then investigating different solutions." *Joey Reiman*

$$E\{ (\tilde{\mathbf{x}}\_{k/N} - \mathbf{G}\_k \hat{\mathbf{x}}\_{k+1/N})(\tilde{\mathbf{x}}\_{k/N} - \mathbf{G}\_k \hat{\mathbf{x}}\_{k+1/N})^T \} = E\{ (\tilde{\mathbf{x}}\_{k/k} - \mathbf{G}\_k A\_k \hat{\mathbf{x}}\_{k/k})(\tilde{\mathbf{x}}\_{k/k} - \mathbf{G}\_k A\_k \hat{\mathbf{x}}\_{k/k})^T \} \tag{58}$$

and using (52), (54) – (56) yields

$$
\Sigma_{k/N} = P_{k/k} - G_k(P_{k+1/k} - \Sigma_{k+1/N})G_k^T\,. \tag{59}
$$

It can now be shown that the smoother performs better than the Kalman filter.

*Lemma 4: Suppose that the sequence (59) is initialised with* 

$$
\Sigma_{N+1/N} = P_{N+1/N}\,. \tag{60}
$$

*Then $\Sigma_{k/N} \leq P_{k/k}$ for $1 \leq k \leq N$.*

*Proof: The condition (60) implies $\Sigma_{N/N} = P_{N/N}$, which is the initial step for an induction argument. For the induction step, (59) is written as*

$$\Sigma_{k/N} = P_{k/k-1} - P_{k/k-1}C_k^T(C_kP_{k/k-1}C_k^T + R_k)^{-1}C_kP_{k/k-1} - G_k(P_{k+1/k} - \Sigma_{k+1/N})G_k^T \tag{61}$$

*and thus $\Sigma_{k+1/N} \leq P_{k+1/k}$ implies $\Sigma_{k/N} \leq P_{k/k-1}$ and $\Sigma_{k/N} \leq P_{k/k}$. □*
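A short numerical illustration of Lemma 4: given the corrected covariances $P_{k/k}$ and predicted covariances $P_{k+1/k}$ from a Kalman pass, the recursion (59), initialised per (60), can be iterated backwards and compared against $P_{k/k}$. This is a sketch under the assumption of a time-invariant $A$; the function name is illustrative.

```python
import numpy as np

def smoother_error_covariance(Pf, Pp, A):
    """Backward recursion (59):
    Sigma_{k/N} = P_{k/k} - G_k (P_{k+1/k} - Sigma_{k+1/N}) G_k^T,
    initialised with Sigma_{N+1/N} = P_{N+1/N} as in (60).
    Lemma 4 guarantees Sigma_{k/N} <= P_{k/k} for every k."""
    N = len(Pf)
    Sigma = [None] * N
    S_next = Pp[-1]                      # Sigma_{N+1/N} = P_{N+1/N}
    for k in range(N - 1, -1, -1):
        G = Pf[k] @ A.T @ np.linalg.inv(Pp[k])
        Sigma[k] = Pf[k] - G @ (Pp[k] - S_next) @ G.T
        S_next = Sigma[k]
    return Sigma
```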

#### **7.5.2 The Fraser-Potter Smoother**

Forward and backward estimates may be merged using the data fusion formula described in Lemma 7 of Chapter 6. A variation of the Fraser-Potter discrete-time fixed-interval smoother [4] derived by Monzingo [16] is advocated below.

In the first pass, a Kalman filter produces corrected state estimates $\hat{x}_{k/k}$ and corrected error covariances $P_{k/k}$ from the measurements. In the second pass, a Kalman filter is employed to calculate predicted "backward" state estimates $\hat{\varsigma}_{k/k-1}$ and predicted "backward" error covariances $\Sigma_{k/k-1}$ from the time-reversed measurements. The smoothed estimate is given by [16]

$$\hat{x}_{k/N} = (P_{k/k}^{-1} + \Sigma_{k/k-1}^{-1})^{-1}(P_{k/k}^{-1}\hat{x}_{k/k} + \Sigma_{k/k-1}^{-1}\hat{\varsigma}_{k/k-1})\,. \tag{62}$$

Alternatively, Kalman filters could be used to derive predicted quantities, $\hat{x}_{k/k-1}$ and $P_{k/k-1}$, from the measurements, and backward corrected quantities $\hat{\varsigma}_{k/k}$ and $\Sigma_{k/k}$. Smoothed estimates may then be obtained from the linear combination

$$\hat{x}_{k/N} = (P_{k/k-1}^{-1} + \Sigma_{k/k}^{-1})^{-1}(P_{k/k-1}^{-1}\hat{x}_{k/k-1} + \Sigma_{k/k}^{-1}\hat{\varsigma}_{k/k})\,. \tag{63}$$

It is observed that the fixed-point smoother (24), the fixed-lag smoother (32), the maximum-likelihood smoother (39), the smoothed estimates (62) − (63) and the minimum-variance smoother (which is described subsequently) all use each measurement *zk* once.

Note that Fraser and Potter's original smoother solution [4] and Monzingo's variation [16] are *ad hoc* and no claims are made about attaining a prescribed level of performance.
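The combination (62) is simple to apply once the two filter passes have been run. The sketch below fuses one forward corrected pair with one backward predicted pair in information form; the variable names are assumptions made for illustration.

```python
import numpy as np

def fraser_potter_fuse(x_fwd, P_fwd, x_bwd, S_bwd):
    """Data-fusion combination (62): merge the forward corrected
    estimate (x_fwd, P_fwd) with the backward predicted estimate
    (x_bwd, S_bwd) using inverse-covariance weighting."""
    Pi = np.linalg.inv(P_fwd)
    Si = np.linalg.inv(S_bwd)
    P_sm = np.linalg.inv(Pi + Si)        # fused covariance
    return P_sm @ (Pi @ x_fwd + Si @ x_bwd)
```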

<sup>&</sup>quot;No great discovery was ever made without a bold guess." *Isaac Newton*


#### **7.5.3 Minimum-Variance Smoothers**

#### **7.5.3.1 Optimal Unrealisable Solutions**

Consider again the estimation problem depicted in Fig. 1 of Chapter 6, where *w* and *v* are now discrete-time inputs. As in continuous-time, it is desired to construct a solution $\mathcal{H}$ that produces output estimates $\hat{y}_1$ of a reference system $y_1 = \mathcal{G}_1 w$ from observations $z = y_2 + v$, where $y_2 = \mathcal{G}_2 w$. The objective is to minimise the energy of the output estimation error $e = y_1 - \hat{y}_1$.

The following discussion is perfunctory since it is a regathering of the results from Chapter 6. Recall that the output estimation error is generated by $e = \mathcal{R}_{ei}\,i$, where $\mathcal{R}_{ei} = \begin{bmatrix} -\mathcal{H} & \mathcal{G}_1 - \mathcal{H}\mathcal{G}_2 \end{bmatrix}$ and $i = \begin{bmatrix} v \\ w \end{bmatrix}$. It has been shown previously that $\mathcal{R}_{ei}\mathcal{R}_{ei}^H = \mathcal{R}_{ei1}\mathcal{R}_{ei1}^H + \mathcal{R}_{ei2}\mathcal{R}_{ei2}^H$, where

$$\mathcal{R}_{ei2} = \mathcal{H}\Delta - \mathcal{G}_1Q\mathcal{G}_2^H\Delta^{-H}\,, \tag{64}$$

in which $\Delta: \mathbb{R}^p \to \mathbb{R}^p$ is known as the Wiener-Hopf factor, which satisfies $\Delta\Delta^H = \mathcal{G}_2Q\mathcal{G}_2^H + R$. From Lemma 8 of Chapter 6, the smoother solution $\mathcal{H} = \mathcal{G}_1Q\mathcal{G}_2^H(\Delta\Delta^H)^{-1}$ achieves the best-possible performance, namely, it minimises $\|\mathcal{R}_{ei}\mathcal{R}_{ei}^H\|_2$. For example, in output estimation problems $\mathcal{G}_1 = \mathcal{G}_2$ and the optimal smoother simplifies to $\mathcal{H}_{OE} = I - R(\Delta\Delta^H)^{-1}$. From Lemma 9 of Chapter 6, the (causal) filter solution $\mathcal{H}_{OE} = \{\mathcal{G}_1Q\mathcal{G}_2^H\Delta^{-H}\}_+\Delta^{-1}$ achieves the best-possible filter performance, that is, it minimises $\|\{\mathcal{R}_{ei}\}_+\{\mathcal{R}_{ei}\}_+^H\|_2$. The optimal smoother outperforms the optimal filter since $\|\{\mathcal{R}_{ei}\}_+\{\mathcal{R}_{ei}\}_+^H\|_2 \geq \|\mathcal{R}_{ei}\mathcal{R}_{ei}^H\|_2$. The above solutions are termed unrealisable because of the difficulty in obtaining $\Delta$ when $\mathcal{G}_1$ and $\mathcal{G}_2$ are time-varying systems. Realisable solutions that use an approximate Wiener-Hopf factor $\hat{\Delta}$ in place of $\Delta$ are presented below.
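For a time-invariant state-space model, the factorisation $\Delta\Delta^H = \mathcal{G}_2Q\mathcal{G}_2^H + R$ can be carried out by iterating the filtering Riccati equation to steady state, which yields the gain $K$ and innovation covariance $\Omega$ that appear in the realisation (65) below. The sketch assumes $D = 0$ and numpy; it is offered as an illustration rather than the text's own construction.

```python
import numpy as np

def spectral_factor(A, B, C, Q, R, iters=2000):
    """Iterate the Riccati difference equation to steady state and
    return the gain K and innovation covariance Omega of the factor
    x_{k+1} = A x_k + K Omega^{1/2} u_k, delta_k = C x_k + Omega^{1/2} u_k."""
    P = np.eye(A.shape[0])
    for _ in range(iters):
        Omega = C @ P @ C.T + R
        K = A @ P @ C.T @ np.linalg.inv(Omega)       # predictor gain
        P = A @ P @ A.T - K @ Omega @ K.T + B @ Q @ B.T
    return K, C @ P @ C.T + R
```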

#### **7.5.3.2 Non-causal Output Estimation**

Suppose that the time-varying linear system $\mathcal{G}_2$ has the realisation (1) – (2). An approximate Wiener-Hopf factor $\hat{\Delta}: \mathbb{R}^p \to \mathbb{R}^p$ is introduced in [7], [13] and defined by

$$\begin{bmatrix} x_{k+1} \\ \delta_k \end{bmatrix} = \begin{bmatrix} A_k & K_k\Omega_k^{1/2} \\ C_k & \Omega_k^{1/2} \end{bmatrix}\begin{bmatrix} x_k \\ z_k \end{bmatrix} \tag{65}$$

<sup>&</sup>quot;Life is pretty simple. You do stuff. Most fails. Some works. You do more of what works. If it works big, others quickly copy it. Then you do something else. The trick is the doing something else." *Leonardo da Vinci*

where $K_k = (A_kP_{k/k-1}C_k^T + B_kQ_kD_k^T)\Omega_k^{-1}$ is the predictor gain, $\Omega_k = C_kP_{k/k-1}C_k^T + D_kQ_kD_k^T + R_k$ and $P_{k/k-1}$ evolves from the Riccati difference equation $P_{k+1/k} = A_kP_{k/k-1}A_k^T - (A_kP_{k/k-1}C_k^T + B_kQ_kD_k^T)(C_kP_{k/k-1}C_k^T + D_kQ_kD_k^T + R_k)^{-1}(C_kP_{k/k-1}A_k^T + D_kQ_kB_k^T) + B_kQ_kB_k^T$. The inverse of the system (65), denoted by $\hat{\Delta}^{-1}$, is obtained using the Matrix Inversion Lemma

$$\begin{bmatrix} \hat{x}_{k+1/k} \\ \alpha_k \end{bmatrix} = \begin{bmatrix} A_k - K_kC_k & K_k \\ -\Omega_k^{-1/2}C_k & \Omega_k^{-1/2} \end{bmatrix}\begin{bmatrix} \hat{x}_{k/k-1} \\ z_k \end{bmatrix}. \tag{66}$$
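One step of the above recursions can be coded directly. The sketch below propagates the predictor (66) together with the quoted Riccati difference equation and returns the normalised innovation $\alpha_k$; the eigendecomposition used for $\Omega_k^{-1/2}$ is an implementation choice, not from the source.

```python
import numpy as np

def predictor_step(x, P, z, A, B, C, D, Q, R):
    """One step of (66) and the Riccati difference equation.
    Returns x_{k+1/k}, P_{k+1/k} and alpha_k = Omega^{-1/2}(z_k - C x_{k/k-1})."""
    Omega = C @ P @ C.T + D @ Q @ D.T + R
    K = (A @ P @ C.T + B @ Q @ D.T) @ np.linalg.inv(Omega)  # gain K_k
    w, V = np.linalg.eigh(Omega)                  # Omega^{-1/2} via eigenvalues
    alpha = V @ np.diag(w ** -0.5) @ V.T @ (z - C @ x)
    x_next = (A - K @ C) @ x + K @ z              # first row of (66)
    P_next = A @ P @ A.T - K @ Omega @ K.T + B @ Q @ B.T
    return x_next, P_next, alpha
```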

The optimal output estimation smoother can be approximated as

$$\begin{aligned} \mathcal{H}_{OE} &= I - R(\hat{\Delta}\hat{\Delta}^H)^{-1} \\ &= I - R\hat{\Delta}^{-H}\hat{\Delta}^{-1}. \end{aligned} \tag{67}$$

A state-space realisation of (67) is given by (66),

$$\begin{bmatrix} \zeta_{k-1} \\ \beta_k \end{bmatrix} = \begin{bmatrix} A_k^T - C_k^TK_k^T & C_k^T\Omega_k^{-1/2} \\ -K_k^T & \Omega_k^{-1/2} \end{bmatrix}\begin{bmatrix} \zeta_k \\ \alpha_k \end{bmatrix}, \quad \zeta_N = 0 \tag{68}$$

and

$$\hat{y}_{k/N} = z_k - R_k\beta_k\,. \tag{69}$$

Note that Lemma 1 is used to obtain the realisation (68) of $\hat{\Delta}^{-H} = (\hat{\Delta}^{-1})^H$ from (66). A block diagram of this smoother is provided in Fig. 2. The states $\hat{x}_{k/k-1}$ within (66) are immediately recognisable as belonging to the one-step-ahead predictor. Thus, the optimum realisable solution involves a cascade of familiar building blocks, namely, a Kalman predictor and its adjoint.

*Procedure 1.* The above output estimation smoother can be implemented via the following three-step procedure.

Step 1. Operate $\hat{\Delta}^{-1}$ on $z_k$ using (66) to obtain $\alpha_k$.

Step 2. In lieu of the adjoint system (68), operate (66) on the time-reversed transpose of $\alpha_k$. Then take the time-reversed transpose of the result to obtain $\beta_k$.
Step 3. Calculate the smoothed output estimate from (69).
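A minimal sketch of Procedure 1 for a time-invariant model with $D_k = 0$ appears below. For clarity, Step 2 here runs the adjoint recursion (68) backwards directly, which is equivalent to the time-reversed-transpose operation described above; the function name and numpy usage are assumptions made for this example.

```python
import numpy as np
from numpy.linalg import inv

def mv_smoother(zs, A, B, C, Q, R):
    """Procedure 1: minimum-variance output-estimation smoother.
    Step 1 runs the inverse factor (66) forward to produce alpha_k;
    Step 2 runs the adjoint recursion (68) backward to produce beta_k;
    Step 3 forms y_{k/N} = z_k - R beta_k, equation (69)."""
    N, n = len(zs), A.shape[0]
    x = np.zeros(n)
    P = np.eye(n)
    alphas, Ks, Oms = [], [], []
    for k in range(N):                       # Step 1: forward pass (66)
        Om = C @ P @ C.T + R                 # Omega_k (D_k = 0)
        K = A @ P @ C.T @ inv(Om)            # predictor gain K_k
        w, V = np.linalg.eigh(Om)
        Om_isqrt = V @ np.diag(w ** -0.5) @ V.T
        alphas.append(Om_isqrt @ (zs[k] - C @ x))
        Ks.append(K); Oms.append(Om_isqrt)
        x = (A - K @ C) @ x + K @ zs[k]
        P = A @ P @ A.T - K @ Om @ K.T + B @ Q @ B.T
    zeta = np.zeros(n)                       # Step 2: backward pass (68)
    ys = [None] * N
    for k in range(N - 1, -1, -1):
        beta = -Ks[k].T @ zeta + Oms[k] @ alphas[k]
        ys[k] = zs[k] - R @ beta             # Step 3: output estimate (69)
        zeta = (A.T - C.T @ Ks[k].T) @ zeta + C.T @ Oms[k] @ alphas[k]
    return np.array(ys)
```

Since the backward pass only requires $\alpha_k$, $K_k$ and $\Omega_k^{-1/2}$ stored from the forward pass, the smoother costs roughly twice a Kalman filter.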

It is shown below that $\hat{y}_{k/N}$ is an unbiased estimate of $y_k$.

<sup>&</sup>quot;When I am working on a problem, I never think about beauty but when I have finished, if the solution is not beautiful, I know it is wrong." *Richard Buckminster Fuller*

Figure 2. Block diagram of the output estimation smoother

*Lemma 5:* $E\{\hat{y}_{k/N}\} = E\{y_k\}$.

*Proof: Denote the one-step-ahead prediction error by $\tilde{x}_{k/k-1} = x_k - \hat{x}_{k/k-1}$. The output of (66) may be written as $\alpha_k = \Omega_k^{-1/2}C_k\tilde{x}_{k/k-1} + \Omega_k^{-1/2}v_k$ and therefore*

$$E\{\alpha_k\} = \Omega_k^{-1/2}C_kE\{\tilde{x}_{k/k-1}\} + \Omega_k^{-1/2}E\{v_k\}\,. \tag{70}$$

*The first term on the right-hand-side of (70) is zero since it pertains to the prediction error of the Kalman filter. The second term is zero since it is assumed that $E\{v_k\} = 0$. Thus $E\{\alpha_k\} = 0$. Since the recursion (68) is initialised with $\zeta_N = 0$, it follows that $E\{\zeta_k\} = 0$, which implies $E\{\beta_k\} = -K_k^TE\{\zeta_k\} + \Omega_k^{-1/2}E\{\alpha_k\} = 0$. Thus, from (69), $E\{\hat{y}_{k/N}\} = E\{z_k\} = E\{y_k\}$, since it is assumed that $E\{v_k\} = 0$. □*

#### **7.5.3.3 Causal Output Estimation**

The minimum-variance (Kalman) filter is obtained by taking the causal part of the optimum minimum-variance smoother (67)

$$\begin{aligned} \{\mathcal{H}_{OE}\}_+ &= I - R_k\{\hat{\Delta}^{-H}\}_+\hat{\Delta}^{-1} \\ &= I - R_k\Omega_k^{-1/2}\hat{\Delta}^{-1}. \end{aligned} \tag{71}$$

<sup>&</sup>quot;The practical success of an idea, irrespective of its inherent merit, is dependent on the attitude of the contemporaries. If timely it is quickly adopted; if not, it is apt to fare like a sprout lured out of the ground by warm sunshine, only to be injured and retarded in its growth by the succeeding frost." *Nicola Tesla*

To confirm this linkage between the smoother and filter, denote $L_k = (C_kP_{k/k-1}C_k^T + D_kQ_kD_k^T)\Omega_k^{-1}$ and use (71) to obtain

$$\begin{aligned} \hat{y}_{k/k} &= z_k - R_k\Omega_k^{-1/2}\alpha_k \\ &= R_k\Omega_k^{-1}C_k\hat{x}_{k/k-1} + (I - R_k\Omega_k^{-1})z_k \\ &= (C_k - L_kC_k)\hat{x}_{k/k-1} + L_kz_k\,, \end{aligned} \tag{72}$$

which is identical to (34) of Chapter 4.
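The last line of (72) is the familiar filter output update; a short sketch, with assumed variable names, is:

```python
import numpy as np

def filtered_output(x_pred, z, P_pred, C, D, Q, R):
    """Causal output estimate from (72):
    y_{k/k} = (C - L C) x_{k/k-1} + L z_k,
    with L = (C P C^T + D Q D^T) Omega^{-1}."""
    CPC = C @ P_pred @ C.T + D @ Q @ D.T
    L = CPC @ np.linalg.inv(CPC + R)             # output gain L_k
    return (C - L @ C) @ x_pred + L @ z
```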

#### **7.5.3.4 Input Estimation**

As discussed in Chapter 6, the optimal realisable smoother for input estimation is

$$\mathcal{H}_{IE} = Q\mathcal{G}_2^H\hat{\Delta}^{-H}\hat{\Delta}^{-1}\,. \tag{73}$$

The development of a state-space realisation for $\hat{w}_{k/N} = Q\mathcal{G}_2^H\hat{\Delta}^{-H}\alpha_k$ makes use of the formula for the cascade of two systems described in Chapter 6. The smoothed input estimate is realised by

$$\begin{bmatrix} \xi_{k-1} \\ \gamma_{k-1} \\ \hat{w}_{k-1/N} \end{bmatrix} = \begin{bmatrix} A_k^T - C_k^TK_k^T & 0 & C_k^T\Omega_k^{-1/2} \\ C_k^TK_k^T & A_k^T & -C_k^T\Omega_k^{-1/2} \\ Q_kD_k^TK_k^T & Q_kB_k^T & -Q_kD_k^T\Omega_k^{-1/2} \end{bmatrix}\begin{bmatrix} \xi_k \\ \gamma_k \\ \alpha_k \end{bmatrix} \tag{74}$$

in which $\gamma_k \in \mathbb{R}^n$ is an auxiliary state.

*Procedure 2.* The above input estimator can be implemented via the following three steps.

Step 1. Operate $\hat{\Delta}^{-1}$ on the measurements $z_k$ using (66) to obtain $\alpha_k$.

Step 2. Operate the adjoint of (74) on the time-reversed transpose of $\alpha_k$.

Step 3. Take the time-reversed transpose of the result to obtain the smoothed input estimate $\hat{w}_{k/N}$.
#### **7.5.3.5 State Estimation**

Smoothed state estimates can be obtained by defining the reference system $\mathcal{G}_1 = I$, which yields

$$\begin{aligned} \hat{x}_{k+1/N} &= A_k\hat{x}_{k/N} + B_k\hat{w}_{k/N} \\ &= A_k\hat{x}_{k/N} + B_kQ\mathcal{G}_2^H\hat{\Delta}^{-H}\alpha_k\,. \end{aligned} \tag{75}$$

<sup>&</sup>quot;Doubt is the father of invention." *Galileo Galilei*

Thus, the minimum-variance smoother for state estimation is given by (66) and (74) − (75). As remarked in Chapter 6, some numerical model order reduction may be required. In the special case of *Ck* being of rank *n* and *Dk* = 0, state estimates can be calculated from (69) and

$$\hat{x}_{k/N} = C_k^{\#}\hat{y}_{k/N}\,. \tag{76}$$

where $C_k^{\#} = (C_k^TC_k)^{-1}C_k^T$ is the Moore-Penrose pseudo-inverse.
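In code, (76) is a single least-squares projection; numpy's `pinv` computes the same pseudoinverse via the singular value decomposition. A sketch:

```python
import numpy as np

def state_from_output(y_smoothed, C):
    """State estimate (76): x_{k/N} = C^# y_{k/N}, with
    C^# = (C^T C)^{-1} C^T for C of full column rank and D_k = 0."""
    return np.linalg.pinv(C) @ y_smoothed
```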

#### **7.5.3.6 Performance**

The characterisation of smoother performance requires the following additional notation. Let $\gamma = \mathcal{G}_0w$ denote the output of the linear time-varying system having the realisation

$$x_{k+1} = A_kx_k + w_k\,, \tag{77}$$

$$\gamma_k = x_k\,, \tag{78}$$

where $A_k \in \mathbb{R}^{n \times n}$. By inspection of (77) – (78), the output of the inverse system $w = \mathcal{G}_0^{-1}\gamma$ is given by

$$w_k = \gamma_{k+1} - A_k\gamma_k\,. \tag{79}$$

Similarly, let $\varepsilon = \mathcal{G}_0^Hu$ denote the output of the adjoint system $\mathcal{G}_0^H$, which from Lemma 1 has the realisation

$$\zeta_{k-1} = A_k^T\zeta_k - u_k\,, \tag{80}$$

$$\varepsilon_k = -\zeta_k\,. \tag{81}$$

It follows that the output of the inverse system $u = \mathcal{G}_0^{-H}\varepsilon$ is given by

$$
u_k = \varepsilon_{k-1} - A_k^T\varepsilon_k\,. \tag{82}
$$

The exact Wiener-Hopf factor may now be written as

$$
\Delta\Delta^H = C_k\mathcal{G}_0B_kQ_kB_k^T\mathcal{G}_0^HC_k^T + R_k\,. \tag{83}
$$

The subsequent lemma, which relates the exact and approximate Wiener-Hopf factors, requires the identity

$$P\_k - A\_k P\_k A\_k^T = A\_k P\_k \mathcal{G}\_0^{-H} + \mathcal{G}\_0^{-1} P\_k A\_k^T + \mathcal{G}\_0^{-1} P\_k \mathcal{G}\_0^{-H} \tag{84}$$

in which *Pk* is an arbitrary matrix of appropriate dimensions. A verification of (84) is requested in the problems.

<sup>&</sup>quot;The theory of our modern technique shows that nothing is as practical as the theory." *Julius Robert Oppenheimer*

**Lemma 6 [7]:** *In respect of the signal model (1) – (2) with $D_k = 0$, $E\{w_k\} = E\{v_k\} = 0$, $E\{w_jw_k^T\} = Q_k\delta_{jk}$, $E\{v_jv_k^T\} = R_k\delta_{jk}$, $E\{w_kv_k^T\} = 0$ and the quantities defined above,*

$$
\hat{\Delta}\hat{\boldsymbol{\Delta}}^{\boldsymbol{H}} = \boldsymbol{\Delta}\boldsymbol{\Delta}^{\boldsymbol{H}} + \mathbb{C}\_{k}\mathcal{G}\_{0} \left(\boldsymbol{P}\_{k/k-1} - \boldsymbol{P}\_{k+1/k}\right)\mathcal{G}\_{0}^{\boldsymbol{H}}\boldsymbol{C}\_{k}^{\boldsymbol{T}}.\tag{85}
$$

*Proof: The approximate Wiener-Hopf factor may be written as $\hat{\Delta} = C_k\mathcal{G}_0K_k\Omega_k^{1/2} + \Omega_k^{1/2}$. Using $P_{k/k-1} - A_kP_{k/k-1}A_k^T = -A_kP_{k/k-1}C_k^T(C_kP_{k/k-1}C_k^T + R_k)^{-1}C_kP_{k/k-1}A_k^T + B_kQ_kB_k^T + P_{k/k-1} - P_{k+1/k}$ within (84) and simplifying leads to (85). □*

It can be seen from (85) that the approximate Wiener-Hopf factor approaches the exact Wiener-Hopf factor whenever the estimation problem is locally stationary, that is, when the model and noise parameters vary sufficiently slowly so that $P_{k/k-1} \approx P_{k+1/k}$. Under these conditions, the smoother (69) achieves the best-possible performance, as is shown below.
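The locally-stationary condition is easy to probe numerically for a time-invariant model: the increments $P_{k/k-1} - P_{k+1/k}$ shrink to zero as the Riccati equation converges. A sketch, with assumed names:

```python
import numpy as np

def riccati_increments(A, B, C, Q, R, steps=50):
    """Track ||P_{k/k-1} - P_{k+1/k}||; decay to zero indicates the
    regime in which the approximate factor in (85) approaches the
    exact factorisation."""
    P = np.zeros_like(A)
    norms = []
    for _ in range(steps):
        Omega = C @ P @ C.T + R
        K = A @ P @ C.T @ np.linalg.inv(Omega)
        P_next = A @ P @ A.T - K @ Omega @ K.T + B @ Q @ B.T
        norms.append(np.linalg.norm(P_next - P))
        P = P_next
    return norms
```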

*Lemma 7 [7]: The smoother (67) satisfies* 

$$\mathcal{R}_{ei2} = R_k[(\Delta\Delta^H)^{-1} - (\Delta\Delta^H + C_k\mathcal{G}_0(P_{k/k-1} - P_{k+1/k})\mathcal{G}_0^HC_k^T)^{-1}]\Delta\,. \tag{86}$$

*Proof: Substituting (67) into (64) yields* 

$$\mathcal{R}_{ei2} = R_k[(\Delta\Delta^H)^{-1} - (\hat{\Delta}\hat{\Delta}^H)^{-1}]\Delta\,. \tag{87}$$

*The result (86) is now immediate from (85) and (87). □*

Some conditions under which $P_{k+1/k}$ asymptotically approaches $P_{k/k-1}$, and under which the smoother (67) attains optimal performance, are set out below.

*Lemma 8 [8]: Suppose* 

*(i) for a $t$ > 0 that there exist solutions $P_t \geq P_{t+1} \geq 0$ of the Riccati difference equation $P_{k+1} = A_kP_kA_k^T - A_kP_kC_k^T(C_kP_kC_k^T + R_k)^{-1}C_kP_kA_k^T + B_kQ_kB_k^T$;*

*(ii) $P_{k+1/k} = A_kP_{k/k-1}A_k^T - A_kP_{k/k-1}C_k^T(C_kP_{k/k-1}C_k^T + R_k)^{-1}C_kP_{k/k-1}A_k^T + B_kQ_kB_k^T$; and*
$$\text{(iii)}\quad \begin{bmatrix} B_{t+k-1}Q_{t+k-1}B_{t+k-1}^T & A_{t+k-1}^T \\ A_{t+k-1} & -C_{t+k-1}^TR_{t+k-1}^{-1}C_{t+k-1} \end{bmatrix} \geq \begin{bmatrix} B_{t+k}Q_{t+k}B_{t+k}^T & A_{t+k}^T \\ A_{t+k} & -C_{t+k}^TR_{t+k}^{-1}C_{t+k} \end{bmatrix} \text{ for all } k \geq 0.$$

*Then the smoother (67) achieves* 

$$\lim_{t \to \infty}\left\|\mathcal{R}_{ei2}\mathcal{R}_{ei2}^H\right\|_2 = 0\,. \tag{88}$$

*Proof: Conditions i) and ii) together with Theorem 1 imply $P_{k/k-1} \geq P_{k+1/k}$ for all $k \geq 0$ and $\lim_{k \to \infty}(P_{k/k-1} - P_{k+1/k}) = 0$. The claim (88) is now immediate from Theorem 2. □*

An example that illustrates the performance benefit of the minimum-variance smoother (67) is described below.

*Example 2 [9]*. The nominal drift rate of high quality inertial navigation systems is around one nautical mile per hour, which corresponds to position errors of about 617 m over a twenty-minute period. Thus, inertial navigation systems alone cannot be used to control underground mining equipment. An approach which has been found to be successful in underground mines is called dead reckoning, where the Euler angles, $\theta_k$, $\phi_k$ and $\psi_k$, reported by an inertial navigation system are combined with external odometer measurements, $d_k$. The dead reckoning position estimates in the x-y-z plane are calculated as

$$\begin{bmatrix} x_{k+1} \\ y_{k+1} \\ z_{k+1} \end{bmatrix} = \begin{bmatrix} x_k \\ y_k \\ z_k \end{bmatrix} + (d_k - d_{k-1})\begin{bmatrix} \sin(\theta_k) \\ \sin(\phi_k) \\ \sin(\psi_k) \end{bmatrix}. \tag{89}$$

A filter or a smoother may then be employed to improve the noisy position estimates calculated from (89). Euler angles were generated using

$$\begin{bmatrix} \theta_{k+1} \\ \phi_{k+1} \\ \psi_{k+1} \end{bmatrix} = A\begin{bmatrix} \theta_k \\ \phi_k \\ \psi_k \end{bmatrix} + \begin{bmatrix} w_k^{(1)} \\ w_k^{(2)} \\ w_k^{(3)} \end{bmatrix},$$

with $A = \begin{bmatrix} 0.95 & 0 & 0 \\ 0 & 0.95 & 0 \\ 0 & 0 & 0.95 \end{bmatrix}$ and $w_k^{(i)} \sim \mathcal{N}(0,\ 0.01)$, *i* = 1…3. Simulations were conducted with

1000 realisations of Gaussian measurement noise added to position estimates calculated from (89). The mean-square error exhibited by the minimum-variance filter and smoother operating on the noisy dead reckoning estimates are shown in Fig. 3. It can be seen that filtering the noisy dead reckoning positions can provide a significant mean-square-error improvement. The figure also demonstrates that the smoother can offer a few dB of further improvement at mid-range signal-to-noise ratios.
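A sketch of the data generation in this example is given below. The odometer increments, sample count and position-noise variance are assumptions made for illustration; only the angle model $A = 0.95I$ and $w_k^{(i)} \sim \mathcal{N}(0, 0.01)$ come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1200
A = 0.95 * np.eye(3)
ang = np.zeros((N, 3))                    # Euler angles theta, phi, psi
for k in range(N - 1):                    # AR(1) angle model of the example
    ang[k + 1] = A @ ang[k] + rng.normal(0.0, 0.1, 3)   # variance 0.01
d = np.arange(N, dtype=float)             # odometer, unit increments (assumed)
pos = np.zeros((N, 3))
for k in range(1, N):                     # dead reckoning update (89)
    pos[k] = pos[k - 1] + (d[k] - d[k - 1]) * np.sin(ang[k])
z = pos + rng.normal(0.0, 1.0, pos.shape) # noisy dead reckoning positions
# A Kalman filter or the minimum-variance smoother then operates on z.
```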

Figure 3. Mean-square-error of the position estimate versus input signal to noise ratio for Example 2: (i) noisy dead reckoning data, (ii) Kalman filter, and (iii) minimum-variance smoother (69).

<sup>&</sup>quot;I do not think that the wireless waves that I have discovered will have any practical application." *Heinrich Rudolf Hertz*

#### **7.6 Performance Comparison**

It has been demonstrated by the previous examples that the optimal fixed-interval smoother provides a performance improvement over the maximum-likelihood smoother. The remaining example of this section compares the behaviour of the fixed-lag and the optimum fixed-interval smoother.

Figure 4. Mean-square-error versus measurement noise covariance for Example 3: (i) Kalman filter, (ii) fixed-lag smoothers, and (iii) optimal minimum-variance smoother (67).

*Example 3*. Simulations were conducted for a first-order output estimation problem, in which *A* = 0.95, *B* = 1, *C* = 0.1, *Q* = 1, *R* = {0.01, 0.02, 0.5, 1, 1.5, 2} and *N* = 20,000. The mean-square-errors exhibited by the Kalman filter and the optimum fixed-interval smoother (69) are indicated by the top and bottom solid lines of Fig. 4, respectively. Fourteen fixed-lag smoother output error covariances, $CP_{k+1/k}^{(i+1,\,i+1)}C^T$, *i* = 2 … 15, were calculated from (35) – (36) and are indicated by the dotted lines of Fig. 4. The figure illustrates that the fixed-lag smoother mean-square errors are bounded above and below by those of the Kalman filter and optimal fixed-interval smoother, respectively. Thus, an option for asymptotically attaining optimal performance is to employ Moore's reduced-order fixed-lag solution [1] – [3] with a sufficiently long lag.
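The gap between the two solid lines of Fig. 4 can be reproduced with a short Monte-Carlo run. The sketch below simulates the scalar plant of this example at one value of *R* and compares the causal output estimate (72) with the smoothed estimate (69); the seed, horizon and steady-state initialisation are assumptions made here.

```python
import numpy as np

rng = np.random.default_rng(1)
A, B, C, Q, R, N = 0.95, 1.0, 0.1, 1.0, 0.5, 5000
x = np.zeros(N)
for k in range(N - 1):                     # scalar plant of Example 3
    x[k + 1] = A * x[k] + B * rng.normal(0, np.sqrt(Q))
y = C * x
z = y + rng.normal(0, np.sqrt(R), N)       # measurements

P = 1.0                                    # steady-state predictor quantities
for _ in range(500):
    Om = C * P * C + R
    K = A * P * C / Om
    P = A * P * A - K * Om * K + B * Q * B
Om = C * P * C + R; K = A * P * C / Om

L = C * P * C / Om                         # filter output, last line of (72)
xp = 0.0; yf = np.zeros(N); al = np.zeros(N)
for k in range(N):
    yf[k] = (C - L * C) * xp + L * z[k]
    al[k] = (z[k] - C * xp) / np.sqrt(Om)  # normalised innovation alpha_k
    xp = (A - K * C) * xp + K * z[k]

ze = 0.0; ys = np.zeros(N)                 # backward adjoint pass (68)
for k in range(N - 1, -1, -1):
    beta = -K * ze + al[k] / np.sqrt(Om)
    ys[k] = z[k] - R * beta                # smoothed output estimate (69)
    ze = (A - C * K) * ze + C * al[k] / np.sqrt(Om)

print("filter MSE  ", np.mean((yf - y) ** 2))
print("smoother MSE", np.mean((ys - y) ** 2))
```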

<sup>&</sup>quot;You see, wire telegraph is a kind of a very, very long cat. You pull his tail in New York and his head is meowing in Los Angeles. Do you understand this? And radio operates exactly the same way: you send signals here, they receive them there. The only difference is that there is no cat." *Albert Einstein*

| | ASSUMPTIONS | MAIN RESULTS |
|---|---|---|
| Signals and system | $E\{w_k\} = E\{v_k\} = 0$. $E\{w_kw_k^T\} = Q_k > 0$ and $E\{v_kv_k^T\} = R_k > 0$ are known. $A_k$, $B_k$, $C_{1,k}$ and $C_{2,k}$ are known. | $x_{k+1} = A_kx_k + B_kw_k$, $y_{1,k} = C_{1,k}x_k$, $y_{2,k} = C_{2,k}x_k$, $z_k = y_{2,k} + v_k$ |
| RTS smoother | Assumes that the filtered and smoothed states are normally distributed. $\hat{x}_{k/k}$ and $\hat{x}_{k+1/k}$ previously calculated by a Kalman filter. | $\hat{x}_{k/N} = \hat{x}_{k/k} + G_k(\hat{x}_{k+1/N} - \hat{x}_{k+1/k})$, $\hat{y}_{1,k/N} = C_{1,k}\hat{x}_{k/N}$ |
| FP smoother | $\hat{x}_{k/k}$ and $\hat{\varsigma}_{k/k-1}$ previously calculated by forward and backward Kalman filters. | $\hat{x}_{k/N} = (P_{k/k}^{-1} + \Sigma_{k/k-1}^{-1})^{-1}(P_{k/k}^{-1}\hat{x}_{k/k} + \Sigma_{k/k-1}^{-1}\hat{\varsigma}_{k/k-1})$, $\hat{y}_{1,k/N} = C_{1,k}\hat{x}_{k/N}$ |
| Optimal minimum-variance smoother | | $\hat{x}_{k+1/k} = (A_k - K_kC_{2,k})\hat{x}_{k/k-1} + K_kz_k$, $\alpha_k = \Omega_k^{-1/2}(z_k - C_{2,k}\hat{x}_{k/k-1})$; $\beta_k$ from the adjoint recursion (68); $\hat{y}_{2,k/N} = z_k - R_k\beta_k$ |

Table 1. Main results for discrete-time fixed-interval smoothing.

### **7.7 Conclusion**

Solutions to the fixed-point and fixed-lag smoothing problems can be found by applying the standard Kalman filter recursions to augmented systems. Where possible, it is shown that the smoother error covariances are less than the filter error covariance, namely, the fixedpoint and fixed-lag smoothers provide a performance improvement over the filter.

Table 1 summarises three common fixed-interval smoothers that operate on measurements $z_k = y_{2,k} + v_k$ of a system $\mathcal{G}_2$ realised by $x_{k+1} = A_kx_k + B_kw_k$ and $y_{2,k} = C_{2,k}x_k$. Monzingo modified the Fraser-Potter smoother solution so that each measurement is only used once. Rauch, Tung and Striebel employed the maximum-likelihood method to derive their fixed-interval smoother.

<sup>&</sup>quot;To invent, you need a good imagination and a pile of junk." *Thomas Alva Edison*

#### **7.9 Glossary**

$x_k \sim \mathcal{N}(\bar{x}, R_{xx})$ The random variable $x_k$ is normally distributed with mean $\bar{x}$ and covariance $R_{xx}$.

$\hat{x}_{k/N}$ Estimate of $x_k$ given data over a fixed interval [0, *N*].

$\Sigma_{k/k}$ Error covariance of the fixed-point smoother.

$P_{k/k}^{(ii)}$ Error covariance of the fixed-lag smoother.

$\Delta$ Wiener-Hopf factor.
