Due to the multiplicative noise component, even if the additive noises are Gaussian, systems with uncertain observations are always non-Gaussian; hence, as occurs in other kinds of non-Gaussian linear systems, the least-squares estimator is not a linear function of the observations and, in general, it cannot be easily obtained by a recursive algorithm. For this reason, research on this kind of system has paid special attention to the search for suboptimal estimators of the signal (mainly linear ones).

In some cases, the variables modeling the uncertainty in the observations can be assumed to be independent and, then, the distribution of the multiplicative noise is fully determined by the probability that each particular observation contains the signal. As shown by Nahi [17] (who was the first to analyze the least-squares linear filtering problem in this kind of system, assuming that the state and observation additive noises are uncorrelated), knowledge of the aforementioned probabilities allows one to derive estimation algorithms with a recursive structure similar to the Kalman filter. Later on, Monzingo [16] completed these results by analyzing the least-squares smoothing problem and, subsequently, [3] and [4] generalized the least-squares linear filtering and smoothing algorithms considering that the additive noises of the state and the observation are correlated.

However, there exist many real situations where this independence assumption on the Bernoulli variables modeling the uncertainty is not satisfied; for example, in signal transmission models with stand-by sensors in which any failure in the transmission is detected immediately and the old sensor is then replaced, thus avoiding the possibility of the signal being missing in two successive observations. This different situation was considered in [9] by assuming that the variables modeling the uncertainty are correlated at consecutive time instants, and the proposed least-squares linear filtering algorithm provides the signal estimator at any time from those in the two previous instants. Later on, the state estimation problem in discrete-time systems with uncertain observations has been widely studied under different hypotheses on the additive noises involved in the state and observation equations and, also, under several hypotheses on the multiplicative noise modeling the uncertainty in the observations (see e.g. [22] - [13], among others).

On the other hand, there are many engineering application fields (for example, communication systems) where sensor networks are used to obtain all the available information on the system state, and its estimation must be carried out from the observations provided by all the sensors (see [6] and references therein). Most papers concerning systems with uncertain observations transmitted by multiple sensors assume that all the sensors have the same uncertainty characteristics. In recent years, this situation has been generalized by several authors by considering uncertain observations whose statistical properties are not assumed to be the same for all the sensors. This is a realistic assumption in several application fields, for instance, in networked communication systems involving heterogeneous measurement devices (see e.g. [14] and [8], among others). In [7] it is assumed that the uncertainty in each sensor is modeled by a sequence of independent Bernoulli random variables, whose statistical properties are not necessarily the same for all the sensors. Later on, in [10] and [1] the independence restriction is weakened; specifically, different sequences of Bernoulli random variables correlated at consecutive sampling times are considered to model the uncertainty at each sensor. This form of correlation covers practical situations where the signal cannot be missing in two successive observations. In [2] the least-squares linear and quadratic problems are addressed when the Bernoulli variables describing the uncertainty in the observations of each sensor are correlated at sampling times that differ *m* units of time.

## **2. Model description**

Consider linear discrete-time stochastic systems with uncertain observations coming from multiple sensors, whose mathematical modeling is accomplished by the following equations.

The state equation is given by

$$x_k = F_{k-1} x_{k-1} + w_{k-1}, \quad k \ge 1, \tag{1}$$

where {*xk*; *k* ≥ 0} is an *n*-dimensional stochastic process representing the system state, {*wk*; *k* ≥ 0} is a white noise process and *Fk*, for *k* ≥ 0, are known deterministic matrices.


We consider scalar uncertain observations $\{y_k^i;\ k \ge 1\}$, $i = 1, \ldots, r$, coming from $r$ sensors and perturbed by noises whose statistical properties are not necessarily the same for all the sensors. Specifically, we assume that, in each sensor and at any time $k$, the observation $y_k^i$, perturbed by an additive noise, can have no information about the state (thus being only noise) with a known probability. That is,

$$y_k^i = \begin{cases} H_k^i x_k + v_k^i, & \text{with probability } \overline{\theta}_k^i,\\ v_k^i, & \text{with probability } 1 - \overline{\theta}_k^i,\end{cases}$$

where, for $i = 1, \ldots, r$, $\{v_k^i;\ k \ge 1\}$ is the observation additive noise process of the $i$-th sensor and $H_k^i$, for $k \ge 1$, are known deterministic matrices of compatible dimensions. If we introduce $\{\theta_k^i;\ k \ge 1\}$, $i = 1, \ldots, r$, sequences of Bernoulli random variables with $P[\theta_k^i = 1] = \overline{\theta}_k^i$, the observations of the state can be rewritten as

$$y_k^i = \theta_k^i H_k^i x_k + v_k^i, \quad k \ge 1, \quad i = 1, \dots, r. \tag{2}$$


**Remark 1.** If $\theta_k^i = 1$, which occurs with known probability $\overline{\theta}_k^i$, the state $x_k$ is present in the observation $y_k^i$ coming from the $i$-th sensor at time $k$, whereas if $\theta_k^i = 0$ such observation only contains additive noise, $v_k^i$, with probability $1 - \overline{\theta}_k^i$. This probability is called the *false alarm probability* and it represents the probability that only noise is observed or, equivalently, that $y_k^i$ does not contain the state.

The aim is to address the state estimation problem considering all the available observations coming from the $r$ sensors. For convenience, denoting $y_k = (y_k^1, \ldots, y_k^r)^T$, $v_k = (v_k^1, \ldots, v_k^r)^T$, $H_k = (H_k^{1T}, \ldots, H_k^{rT})^T$ and $\Theta_k = \mathrm{Diag}(\theta_k^1, \ldots, \theta_k^r)$, Equation (2) is equivalent to the following stacked observation equation

$$y_k = \Theta_k H_k x_k + v_k, \quad k \ge 1. \tag{3}$$
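
To fix ideas, the following sketch simulates a state trajectory from (1) and the corresponding stacked uncertain observations (3); all dimensions, matrices and probabilities are illustrative assumptions, and the uncertainty variables are drawn independently here (a construction with the lag-$m$ correlation allowed by Hypothesis 4 below is sketched after the hypotheses).

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, N = 2, 3, 100                     # state dimension, number of sensors, horizon
F = 0.95 * np.eye(n)                    # F_k (time-invariant here for simplicity)
H = rng.standard_normal((r, n))         # stacked H_k = (H_k^{1T}, ..., H_k^{rT})^T
Q, R = 0.1 * np.eye(n), 0.2 * np.eye(r)
theta_bar = np.array([0.9, 0.8, 0.7])   # P[theta_k^i = 1], one value per sensor

x = rng.multivariate_normal(np.zeros(n), np.eye(n))      # initial state x_0
ys = []
for k in range(1, N + 1):
    x = F @ x + rng.multivariate_normal(np.zeros(n), Q)  # state equation (1)
    theta = (rng.random(r) < theta_bar).astype(float)    # Bernoulli uncertainty
    Theta = np.diag(theta)                               # Theta_k = Diag(theta_k^1, ..., theta_k^r)
    ys.append(Theta @ H @ x + rng.multivariate_normal(np.zeros(r), R))  # observation (3)
```

The list `ys` collects the stacked observations that the estimation algorithms below operate on.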

#### **2.1. Model hypotheses**

In order to analyze the least-squares linear estimation problem of the state *xk* from the observations *y*1,..., *yL*, with *L* ≥ *k*, some considerations must be taken into account. On the one hand, it is known that the linear estimator of *xk* is the orthogonal projection of *xk* onto the space of *n*-dimensional random variables obtained as linear transformations of the observations *y*1,..., *yL*, which requires the existence of the second-order moments of such observations. On the other hand, we consider that the variables describing the uncertainty in the observations are correlated at instants that differ by *m* units of time, to cover many practical situations where the independence assumption on such variables is not realistic. Specifically, the following hypotheses are assumed:

**Hypothesis 1.** The initial state *x*<sup>0</sup> is a random vector with *E*[*x*0] = *x*<sup>0</sup> and *Cov*[*x*0] = *P*0.

**Hypothesis 2.** The state noise {*wk*; *k* ≥ 0} is a zero-mean white sequence with *Cov*[*wk*] = *Qk*, ∀*k* ≥ 0.

**Hypothesis 3.** The observation additive noise {*vk*; *k* ≥ 1} is a zero-mean white process with *Cov*[*vk*] = *Rk*, ∀*k* ≥ 1.

**Hypothesis 4.** For $i = 1, \ldots, r$, $\{\theta_k^i;\ k \ge 1\}$ is a sequence of Bernoulli random variables with $P[\theta_k^i = 1] = \overline{\theta}_k^i$. For $i, j = 1, \ldots, r$, the variables $\theta_k^i$ and $\theta_s^j$ are independent for $|k - s| \neq 0, m$, and $Cov[\theta_k^i, \theta_s^j]$ are known for $|k - s| = 0, m$. Defining $\theta_k = (\theta_k^1, \ldots, \theta_k^r)^T$, the covariance matrices of $\theta_k$ and $\theta_s$ will be denoted by $K_{k,s}^{\theta}$ (a simple construction satisfying this hypothesis is sketched after Hypothesis 5).

Finally, we assume the following hypothesis on the independence of the initial state and noises:

**Hypothesis 5.** The initial state *x*<sup>0</sup> and the noise processes {*wk*; *k* ≥ 0}, {*vk*; *k* ≥ 1} and {*θk*; *k* ≥ 1} are mutually independent.
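
A simple construction satisfying Hypothesis 4 (an illustrative assumption, not something prescribed by the model) is $\theta_k^i = \max(\gamma_k^i, 1 - \gamma_{k-m}^i)$ with $\{\gamma_k^i;\ k \ge 1\}$ independent Bernoulli variables: each $\theta_k^i$ depends only on $(\gamma_{k-m}^i, \gamma_k^i)$, so $\theta_k^i$ and $\theta_s^i$ are independent unless $|k - s| = 0, m$, and $\theta_k^i = \theta_{k+m}^i = 0$ is impossible, so the signal cannot be missing in $m + 1$ consecutive observations (cf. Remark 3 below).

```python
import numpy as np

def lag_m_bernoulli(p, m, N, rng):
    """theta_k = max(gamma_k, 1 - gamma_{k-m}) with gamma_k iid Bernoulli(p).

    theta_k depends only on (gamma_{k-m}, gamma_k), so theta_k and theta_s
    are independent unless |k - s| = 0, m; moreover theta_k = theta_{k+m} = 0
    is impossible, i.e. the signal cannot be missing at two times m apart."""
    gamma = (rng.random(N + m) < p).astype(float)
    return np.maximum(gamma[m:], 1.0 - gamma[:-m])   # theta_1, ..., theta_N

rng = np.random.default_rng(1)
p, m, N = 0.7, 2, 500_000
theta = lag_m_bernoulli(p, m, N, rng)
q = 1.0 - p
print(theta.mean(), 1 - p * q)                             # theta_bar = 1 - p(1-p)
print(np.cov(theta[:-m], theta[m:])[0, 1], -(p * q) ** 2)  # lag-m covariance
print(np.cov(theta[:-1], theta[1:])[0, 1])                 # lag-1 covariance ~ 0 (m = 2)
```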

**Remark 2.** For the derivation of the estimation algorithms, a matrix product called the *Hadamard product*, which is simpler than the conventional product, will be used. Let $A, B \in M_{m \times n}$; the Hadamard product (denoted by $\circ$) of $A$ and $B$ is defined as $[A \circ B]_{ij} = A_{ij}B_{ij}$. From this definition, the following property, which will be needed later, is easily deduced (see [7]).

*For any random matrix $G_{m \times m}$ independent of $\{\Theta_k;\ k \ge 1\}$, the following equality is satisfied*

$$E\left[\Theta\_k G\_{m \times m} \Theta\_s\right] = E\left[\theta\_k \theta\_s^T\right] \circ E\left[G\_{m \times m}\right].$$

Particularly, denoting Θ*<sup>k</sup>* = *E*[Θ*k*], it is immediately clear that


$$E\left[\left(\Theta\_k - \overline{\Theta}\_k\right)\mathcal{G}\_{m \times m}(\Theta\_s - \overline{\Theta}\_s)\right] = K\_{k,s}^{\theta} \circ E\left[\mathcal{G}\_{m \times m}\right].\tag{4}$$
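
As a quick numerical sanity check of (4) in the simplest case $k = s$, one can compare Monte Carlo estimates of both sides; the sizes, probabilities and the distribution of $G$ below are arbitrary assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
r, trials = 3, 200_000
p = np.array([0.9, 0.6, 0.4])                 # P[theta^i = 1] (assumed marginals)

lhs = np.zeros((r, r)); K = np.zeros((r, r)); EG = np.zeros((r, r))
for _ in range(trials):
    th = (rng.random(r) < p).astype(float)
    c = th - p                                # theta - bar(theta)
    G = 1.0 + rng.standard_normal((r, r))     # random G independent of the Theta's
    lhs += np.diag(c) @ G @ np.diag(c)        # (Theta - bar)G(Theta - bar) for k = s
    K += np.outer(c, c)                       # empirical K^theta_{k,k}
    EG += G

lhs /= trials
rhs = (K / trials) * (EG / trials)            # Hadamard product is elementwise '*'
print(np.abs(lhs - rhs).max())                # ~ 0 up to Monte Carlo error
```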

**Remark 3.** Several authors assume that the observations available for the estimation come either from multiple sensors with identical uncertainty characteristics or from a single sensor (see [20] for the case when the uncertainty is modeled by independent variables, and [19] for the case when such variables are correlated at consecutive sampling times). Nevertheless, in recent years, this situation has been generalized by some authors considering multiple sensors featuring different uncertainty characteristics (see e.g. [7] for the case of independent uncertainty, and [1] for situations where the uncertainty in each sensor is modeled by variables correlated at consecutive sampling times). We analyze the state estimation problem for the class of linear discrete-time systems with uncertain observations (3), which, as established in Hypothesis 4, are characterized by the fact that the uncertainty at any sampling time *k* depends only on the uncertainty at the previous time *k* − *m*; this form of correlation allows us to consider certain models in which the signal cannot be absent in *m* + 1 consecutive observations.

## **3. Least-squares linear estimation problem**

As mentioned above, our aim in this chapter is to obtain the least-squares linear estimator, $\hat{x}_{k/L}$, of the signal $x_k$ based on the observations $\{y_1, \ldots, y_L\}$, with $L \ge k$, by recursive formulas. Specifically, the problem is to derive recursive algorithms for the least-squares linear filter ($L = k$) and fixed-point smoother (fixed $k$ and $L > k$) of the state using uncertain observations (3). For this purpose, we use an innovation approach as described in [11].


Since the observations are generally nonorthogonal vectors, we use the *Gram-Schmidt orthogonalization procedure* to transform the set of observations {*y*1,..., *yL*} into an equivalent set of orthogonal vectors {*ν*1,..., *νL*}; equivalent in the sense that they both generate the same linear subspace; that is,

$$
\mathcal{L}(y\_1, \ldots, y\_L) = \mathcal{L}(\nu\_1, \ldots, \nu\_L) = \mathcal{L}\_L.
$$

Let $\{\nu_1, \ldots, \nu_{k-1}\}$ be the set of orthogonal vectors satisfying $\mathcal{L}(\nu_1, \ldots, \nu_{k-1}) = \mathcal{L}(y_1, \ldots, y_{k-1})$; the next orthogonal vector, $\nu_k$, corresponding to the new observation $y_k$, is obtained by projecting $y_k$ onto $\mathcal{L}_{k-1}$; specifically

$$\nu_k = y_k - \text{Proj}\{y_k \text{ onto } \mathcal{L}_{k-1}\},$$

and, because of the orthogonality of {*ν*1,..., *<sup>ν</sup>k*−1} the above projection can be found by projecting *yk* along each of the previously found orthogonal vectors *νi*, for *i* ≤ *k* − 1,

$$\operatorname{Proj}\{y\_k \text{ onto } \mathcal{L}\_{k-1}\} = \sum\_{i=1}^{k-1} \operatorname{Proj}\{y\_k \text{ along } \nu\_i\} = \sum\_{i=1}^{k-1} \operatorname{E}[y\_k \nu\_i^T] \left(\operatorname{E}[\nu\_i \nu\_i^T]\right)^{-1} \nu\_i.$$

Since the projection of $y_k$ onto $\mathcal{L}_{k-1}$ is $\hat{y}_{k/k-1}$, the one-stage least-squares linear predictor of $y_k$, we have that

$$\hat{y}_{k/k-1} = \sum_{i=1}^{k-1} T_{k,i} \Pi_i^{-1} \nu_i, \quad k \ge 2, \tag{5}$$


where $T_{k,i} = E[y_k \nu_i^T]$ and $\Pi_i = E[\nu_i \nu_i^T]$ is the covariance matrix of $\nu_i$.

Consequently, by starting with $\nu_1 = y_1 - E[y_1]$, the orthogonal vectors $\nu_k$ are determined by $\nu_k = y_k - \hat{y}_{k/k-1}$, for $k \ge 2$. Hence, $\nu_k$ can be considered as the "new information" or the "innovation" in $y_k$ given $\{y_1, \ldots, y_{k-1}\}$.

In summary, the observation process {*yk*; *k* ≥ 1} has been transformed into an equivalent white noise {*νk*; *k* ≥ 1} known as the innovation process. Taking into account that both processes satisfy

$$\nu\_i \in \mathcal{L}(y\_1, \dots, y\_i) \quad \text{and} \quad y\_i \in \mathcal{L}(\nu\_1, \dots, \nu\_i), \quad \forall i \ge 1,$$

we conclude that such processes are related to each other by a causal and causally invertible linear transformation; thus, the innovation process is uniquely determined by the observations.

This consideration allows us to state that the least-squares linear estimator of the state based on the observations, $\hat{x}_{k/L}$, is equal to the least-squares linear estimator of the state based on the innovations $\{\nu_1, \ldots, \nu_L\}$. Thus, projecting $x_k$ separately onto each $\nu_i$, $i \le L$, the following general expression for the estimator $\hat{x}_{k/L}$ is obtained

$$\hat{x}_{k/L} = \sum_{i=1}^{L} S_{k,i} \Pi_i^{-1} \nu_i, \quad k \ge 1, \tag{6}$$

where $S_{k,i} = E[x_k \nu_i^T]$. This expression is the starting point to derive the recursive filtering and fixed-point smoothing algorithms in the next section.
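
Before turning to the recursive algorithms, the projection formulas (5) and (6) can be made concrete by brute force: simulate an ensemble of runs of the model and replace the expectations $T_{k,i}$, $S_{k,i}$, $\Pi_i$ by ensemble averages. The sketch below is illustrative only (array shapes, 0-based time indices and invertibility of the empirical $\Pi_i$ are assumptions of the sketch); it costs $O(L^2)$ projections per estimate, which is precisely what the recursive algorithms of the next section avoid.

```python
import numpy as np

def innovations(Y):
    """Gram-Schmidt transformation of observations into innovations via (5),
    with T_{k,i} = E[y_k nu_i^T] and Pi_i = E[nu_i nu_i^T] estimated as
    ensemble averages. Y has shape (runs, N, r)."""
    runs, N, r = Y.shape
    Yc = Y - Y.mean(axis=0)                       # so that nu_1 = y_1 - E[y_1]
    nu = np.zeros_like(Yc)
    for k in range(N):
        proj = np.zeros((runs, r))
        for i in range(k):
            T = Yc[:, k].T @ nu[:, i] / runs      # T_{k,i}
            Pi = nu[:, i].T @ nu[:, i] / runs     # Pi_i
            proj += nu[:, i] @ np.linalg.solve(Pi, T.T)  # T Pi^{-1} nu_i, per run
        nu[:, k] = Yc[:, k] - proj                # y_k minus its projection onto L_{k-1}
    return nu

def ls_estimate(X, nu, k, L):
    """x_hat_{k/L} from (6), with S_{k,i} = E[x_k nu_i^T] as ensemble averages;
    the ensemble mean of x_k is added back since (6) acts on centered variables."""
    runs = X.shape[0]
    xm = X[:, k].mean(axis=0)
    Xc = X[:, k] - xm
    xhat = np.tile(xm, (runs, 1))
    for i in range(L):
        S = Xc.T @ nu[:, i] / runs                # S_{k,i}
        Pi = nu[:, i].T @ nu[:, i] / runs
        xhat += nu[:, i] @ np.linalg.solve(Pi, S.T)
    return xhat
```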

## **4. Least-squares linear estimation recursive algorithms**

In this section, using an innovation approach, recursive algorithms are proposed for the filter, $\hat{x}_{k/k}$, and the fixed-point smoother, $\hat{x}_{k/L}$, for fixed $k$ and $L > k$.

#### **4.1. Linear filtering algorithm**


In view of the general expression (6) for $L = k$, it is clear that the state filter, $\hat{x}_{k/k}$, is obtained from the one-stage state predictor, $\hat{x}_{k/k-1}$, by

$$\hat{x}_{k/k} = \hat{x}_{k/k-1} + S_{k,k} \Pi_k^{-1} \nu_k, \quad k \ge 1; \quad \hat{x}_{0/0} = \overline{x}_0. \tag{7}$$

Hence, an equation for the predictor $\hat{x}_{k/k-1}$ in terms of the filter $\hat{x}_{k-1/k-1}$ and expressions for the innovation $\nu_k$, its covariance matrix $\Pi_k$ and the matrix $S_{k,k}$ are required.

*State predictor* $\hat{x}_{k/k-1}$. From hypotheses 2 and 5, it is immediately clear that the filter of the noise $w_{k-1}$ is $\hat{w}_{k-1/k-1} = E[w_{k-1}] = 0$ and hence, taking into account Equation (1), we have

$$\hat{x}_{k/k-1} = F_{k-1} \hat{x}_{k-1/k-1}, \quad k \ge 1. \tag{8}$$

*Innovation* $\nu_k$. We will now derive an explicit formula for the innovation, $\nu_k = y_k - \hat{y}_{k/k-1}$, or equivalently for the one-stage predictor of the observation, $\hat{y}_{k/k-1}$. For this purpose, taking into account (5), we start by calculating $T_{k,i} = E[y_k \nu_i^T]$, for $i \le k - 1$.

From the observation equation (3) and hypotheses 3 and 5, it is clear that

$$T\_{k,i} = E\left[\Theta\_k H\_k \mathbf{x}\_k \boldsymbol{\nu}\_i^T\right], \quad i \le k - 1. \tag{9}$$

Now, for $k \le m$, or $k > m$ and $i < k - m$, hypotheses 4 and 5 guarantee that $\Theta_k$ is independent of the innovations $\nu_i$, and then we have that $T_{k,i} = \overline{\Theta}_k H_k E[x_k \nu_i^T] = \overline{\Theta}_k H_k S_{k,i}$. So, after some manipulations, we obtain

*I*. For $k \le m$, it is satisfied that $\hat{y}_{k/k-1} = \overline{\Theta}_k H_k \sum_{i=1}^{k-1} S_{k,i} \Pi_i^{-1} \nu_i$, and using (6) for $L = k - 1$ it is obvious that

$$\hat{y}_{k/k-1} = \overline{\Theta}_k H_k \hat{x}_{k/k-1}, \quad k \le m. \tag{10}$$

*II*. For $k > m$, we have that $\hat{y}_{k/k-1} = \overline{\Theta}_k H_k \sum_{i=1}^{k-(m+1)} S_{k,i} \Pi_i^{-1} \nu_i + \sum_{i=1}^{m} T_{k,k-i} \Pi_{k-i}^{-1} \nu_{k-i}$ and, adding and subtracting $\sum_{i=1}^{m} \overline{\Theta}_k H_k S_{k,k-i} \Pi_{k-i}^{-1} \nu_{k-i}$, the following equality holds

$$\hat{y}_{k/k-1} = \overline{\Theta}_k H_k \sum_{i=1}^{k-1} S_{k,i} \Pi_i^{-1} \nu_i + \sum_{i=1}^m (T_{k,k-i} - \overline{\Theta}_k H_k S_{k,k-i}) \Pi_{k-i}^{-1} \nu_{k-i}, \quad k > m. \tag{11}$$

Next, we determine an expression for $T_{k,k-i} - \overline{\Theta}_k H_k S_{k,k-i}$, for $1 \le i \le m$.


Taking into account (9), it follows that

$$T\_{k,k-i} - \overline{\Theta}\_k H\_k \mathbf{S}\_{k,k-i} = E\left[ (\Theta\_k - \overline{\Theta}\_k) H\_k \mathbf{x}\_k \boldsymbol{\nu}\_{k-i}^T \right], \quad 1 \le i \le m. \tag{12}$$


or equivalently,

$$T\_{k,k-i} - \overline{\Theta}\_k H\_k \mathbf{S}\_{k,k-i} = E\left[ (\Theta\_k - \overline{\Theta}\_k) H\_k \mathbf{x}\_k y\_{k-i}^T \right] - E\left[ (\Theta\_k - \overline{\Theta}\_k) H\_k \mathbf{x}\_k \hat{y}\_{k-i/k-(i+1)}^T \right].$$

To calculate the first expectation, we use again (3) for $y_{k-i}$ and, from hypotheses 3 and 5, we have that

$$E\left[\left(\Theta\_k - \overline{\Theta}\_k\right)H\_k\mathbf{x}\_k\mathbf{y}\_{k-i}^T\right] = E\left[\left(\Theta\_k - \overline{\Theta}\_k\right)H\_k\mathbf{x}\_k\mathbf{x}\_{k-i}^T H\_{k-i}^T \Theta\_{k-i}\right]$$

which, using Property (4), yields

$$E\left[ (\Theta\_k - \overline{\Theta}\_k) H\_k \mathbf{x}\_k \mathbf{y}\_{k-i}^T \right] = K\_{k,k-i}^{\theta} \circ \left( H\_k E\left[ \mathbf{x}\_k \mathbf{x}\_{k-i}^T \right] H\_{k-i}^T \right) .$$

Now, denoting $D_k = E[x_k x_k^T]$ and $\mathbb{F}_{k,i} = F_{k-1} \cdots F_i$, from Equation (1) it is clear that $E[x_k x_{k-i}^T] = \mathbb{F}_{k,k-i} D_{k-i}$, and hence

$$E\left[ (\Theta_k - \overline{\Theta}_k) H_k x_k y_{k-i}^T \right] = K_{k,k-i}^{\theta} \circ \left( H_k \mathbb{F}_{k,k-i} D_{k-i} H_{k-i}^T \right),$$

where *Dk* can be recursively obtained by

$$\begin{aligned} D\_k &= F\_{k-1} D\_{k-1} F\_{k-1}^T + Q\_{k-1}, \quad k \ge 1; \\ D\_0 &= P\_0 + \overline{\mathbf{x}}\_0 \overline{\mathbf{x}}\_0^T. \end{aligned} \tag{13}$$
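
Since (13) involves only the model matrices and the initial moments, the sequence $D_k$ can be tabulated offline; a minimal sketch (time-invariant $F$ and $Q$ assumed for brevity):

```python
import numpy as np

def moments_D(F, Q, P0, x0_bar, N):
    """D_k = E[x_k x_k^T] via recursion (13), for k = 0, ..., N."""
    D = [P0 + np.outer(x0_bar, x0_bar)]   # D_0 = P_0 + x0_bar x0_bar^T
    for _ in range(N):
        D.append(F @ D[-1] @ F.T + Q)     # D_k = F D_{k-1} F^T + Q
    return D
```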

Summarizing, we have that

$$\begin{split} T\_{k,k-i} - \overline{\Theta}\_{k} H\_{k} S\_{k,k-i} &= \mathcal{K}\_{k,k-i}^{\theta} \circ \left( H\_{k} \mathbb{F}\_{k,k-i} D\_{k-i} H\_{k-i}^{T} \right) \\ &- E \left[ (\Theta\_{k} - \overline{\Theta}\_{k}) H\_{k} \mathbf{x}\_{k} \widehat{\boldsymbol{y}}\_{k-i/k-(i+1)}^{T} \right], \quad 1 \le i \le m. \end{split} \tag{14}$$

Taking into account the correlation hypothesis of the variables describing the uncertainty, the right-hand side of this equation is calculated differently for *i* = *m* or *i* < *m*, as shown below.

(a) For $i = m$, since $\Theta_k$ is independent of the innovations $\nu_i$, for $i < k - m$, we have that $E\left[(\Theta_k - \overline{\Theta}_k) H_k x_k \hat{y}_{k-m/k-(m+1)}^T\right] = 0$, and from (14)

$$T\_{k,k-m} - \overline{\Theta}\_k H\_k S\_{k,k-m} = K\_{k,k-m}^{\theta} \circ \left( H\_k \mathbb{F}\_{k,k-m} D\_{k-m} H\_{k-m}^T \right). \tag{15}$$

(b) For $i < m$, from Hypothesis 4, $K_{k,k-i}^{\theta} = 0$ and, hence, from (14)

$$T_{k,k-i} - \overline{\Theta}_k H_k S_{k,k-i} = -E\left[ (\Theta_k - \overline{\Theta}_k) H_k x_k \hat{y}_{k-i/k-(i+1)}^T \right].$$

Now, from expression (5),


$$\hat{y}_{k-i/k-(i+1)} = \sum_{j=1}^{k-(i+1)} T_{k-i,j} \Pi_j^{-1} \nu_j,$$

and using again that $\Theta_k$ is independent of $\nu_i$, for $i \neq k - m$, it is deduced that

$$T_{k,k-i} - \overline{\Theta}_k H_k S_{k,k-i} = -E\left[\left(\Theta_k - \overline{\Theta}_k\right) H_k x_k \nu_{k-m}^T\right] \Pi_{k-m}^{-1} T_{k-i,k-m}^T$$

or, equivalently, from (12) for *i* = *m*, (15) and noting

$$\Psi_{k,k-m} = K_{k,k-m}^{\theta} \circ \left( H_k \mathbb{F}_{k,k-m} D_{k-m} H_{k-m}^T \right) \Pi_{k-m}^{-1},$$

we have that

$$T_{k,k-i} - \overline{\Theta}_k H_k S_{k,k-i} = -\Psi_{k,k-m} T_{k-i,k-m}^T, \quad i < m. \tag{16}$$

Next, substituting (15) and (16) into (11) and using (6) for $\hat{x}_{k/k-1}$, it is concluded that

$$\hat{y}_{k/k-1} = \overline{\Theta}_k H_k \hat{x}_{k/k-1} + \Psi_{k,k-m} \left[ \nu_{k-m} - \sum_{i=1}^{m-1} T_{k-i,k-m}^T \Pi_{k-i}^{-1} \nu_{k-i} \right], \quad k > m. \tag{17}$$

Finally, using (3) and (16) and taking into account that, from (1), $S_{k,k-i} = \mathbb{F}_{k,k-i} S_{k-i,k-i}$, the matrices $T_{k,k-i}$ in (17) are obtained by

$$T_{k,k-i} = \overline{\Theta}_k H_k \mathbb{F}_{k,k-i} S_{k-i,k-i}, \quad 2 \le k \le m, \quad 1 \le i \le k - 1,$$

$$T_{k,k-i} = \overline{\Theta}_k H_k \mathbb{F}_{k,k-i} S_{k-i,k-i} - \Psi_{k,k-m} T_{k-i,k-m}^T, \quad k > m, \quad 1 \le i \le m - 1.$$

*Matrix* $S_{k,k}$. Since $\nu_k = y_k - \hat{y}_{k/k-1}$, we have that $S_{k,k} = E[x_k \nu_k^T] = E[x_k y_k^T] - E[x_k \hat{y}_{k/k-1}^T]$. Next, we calculate these expectations.

*I*. From Equation (3) and the independence hypothesis, it is clear that $E[x_k y_k^T] = D_k H_k^T \overline{\Theta}_k$, $\forall k \ge 1$, where $D_k = E[x_k x_k^T]$ is given by (13).

*II*. To calculate $E[x_k \hat{y}_{k/k-1}^T]$, the correlation hypothesis of the random variables $\theta_k$ must be taken into account and two cases must be considered:

(a) For $k \le m$, from (10) we obtain

$$E[x_k \hat{y}_{k/k-1}^T] = E[x_k \hat{x}_{k/k-1}^T] H_k^T \overline{\Theta}_k.$$

By using the orthogonal projection lemma, which assures that $E[x_k \hat{x}_{k/k-1}^T] = D_k - P_{k/k-1}$, where $P_{k/k-1} = E[(x_k - \hat{x}_{k/k-1})(x_k - \hat{x}_{k/k-1})^T]$ is the prediction error covariance matrix, we get

$$E[x_k \hat{y}_{k/k-1}^T] = \left(D_k - P_{k/k-1}\right) H_k^T \overline{\Theta}_k, \quad k \le m.$$


(b) For *k* > *m*, from (17) it follows that

$$\begin{aligned} E[x_k \hat{y}_{k/k-1}^T] &= E[x_k \hat{x}_{k/k-1}^T] H_k^T \overline{\Theta}_k + E[x_k \nu_{k-m}^T] \Psi_{k,k-m}^T \\ &- E\left[x_k \left(\sum_{i=1}^{m-1} T_{k-i,k-m}^T \Pi_{k-i}^{-1} \nu_{k-i}\right)^T\right] \Psi_{k,k-m}^T; \end{aligned}$$


hence, using again the orthogonal projection lemma and taking into account that $S_{k,k-i} = E[x_k \nu_{k-i}^T]$, for $1 \le i \le m$, it follows that

$$\begin{aligned} E[x_k \hat{y}_{k/k-1}^T] &= \left(D_k - P_{k/k-1}\right) H_k^T \overline{\Theta}_k + S_{k,k-m} \Psi_{k,k-m}^T \\ &- \sum_{i=1}^{m-1} S_{k,k-i} \Pi_{k-i}^{-1} T_{k-i,k-m} \Psi_{k,k-m}^T, \quad k > m. \end{aligned}$$

Then, substituting these expectations in the expression of $S_{k,k}$ and simplifying, it is clear that

$$\begin{split} S_{k,k} &= P_{k/k-1} H_k^T \overline{\Theta}_k, \quad 1 \le k \le m, \\ S_{k,k} &= P_{k/k-1} H_k^T \overline{\Theta}_k - \left( S_{k,k-m} - \sum_{i=1}^{m-1} S_{k,k-i} \Pi_{k-i}^{-1} T_{k-i,k-m} \right) \Psi_{k,k-m}^T, \quad k > m. \end{split} \tag{18}$$

Now, an expression for the prediction error covariance matrix, *Pk*/*k*−1, is necessary. From Equation (1), it is immediately clear that

$$P_{k/k-1} = F_{k-1} P_{k-1/k-1} F_{k-1}^T + Q_{k-1}, \quad k \ge 1,$$

where $P_{k/k} = E[(x_k - \hat{x}_{k/k})(x_k - \hat{x}_{k/k})^T]$ is the filtering error covariance matrix. From Equation (7), it is concluded that

$$P_{k/k} = P_{k/k-1} - S_{k,k} \Pi_k^{-1} S_{k,k}^T, \quad k \ge 1; \quad P_{0/0} = P_0.$$

*Covariance matrix of the innovation* $\Pi_k = E[\nu_k \nu_k^T]$. From the orthogonal projection lemma, the covariance matrix of the innovation is obtained as $\Pi_k = E[y_k y_k^T] - E[\hat{y}_{k/k-1} \hat{y}_{k/k-1}^T]$.

From (3) and using Property (4), we have that

$$E[y_k y_k^T] = E[\theta_k \theta_k^T] \circ \left(H_k D_k H_k^T\right) + R_k, \quad k \ge 1.$$

To obtain $E[\hat{y}_{k/k-1} \hat{y}_{k/k-1}^T]$ two cases must be distinguished again, due to the correlation hypothesis of the Bernoulli variables $\theta_k$:

*I*. For *k* ≤ *m*, Equation (10) and Property (4) yield

$$E[\hat{y}\_{k/k-1}\hat{y}\_{k/k-1}^T] = \left(\overline{\theta}\_k \overline{\theta}\_k^T\right) \circ \left(H\_k E[\hat{\mathbf{x}}\_{k/k-1}\hat{\mathbf{x}}\_{k/k-1}^T] H\_k^T\right),$$

and in view of the orthogonal projection lemma,

$$E[\widehat{y}\_{k/k-1}\widehat{y}\_{k/k-1}^T] = \left(\overline{\theta}\_k \overline{\theta}\_k^T\right) \circ \left(H\_k (D\_k - P\_{k/k-1}) H\_k^T\right), \quad k \le m.$$

*II*. For $k > m$, an analogous reasoning, but using now Equation (17), yields

$$\begin{split} E[\hat{y}_{k/k-1}\hat{y}_{k/k-1}^T] &= \left(\overline{\theta}_k \overline{\theta}_k^T\right) \circ \left(H_k(D_k - P_{k/k-1})H_k^T\right) + \Psi_{k,k-m} \Pi_{k-m} \Psi_{k,k-m}^T \\ &+ \Psi_{k,k-m} \sum_{i=1}^{m-1} T_{k-i,k-m}^T \Pi_{k-i}^{-1} \Pi_{k-i} \Pi_{k-i}^{-1} T_{k-i,k-m} \Psi_{k,k-m}^T \\ &+ \overline{\Theta}_k H_k E[\hat{x}_{k/k-1} \nu_{k-m}^T] \Psi_{k,k-m}^T + \Psi_{k,k-m} E[\nu_{k-m} \hat{x}_{k/k-1}^T] H_k^T \overline{\Theta}_k \\ &- \overline{\Theta}_k H_k E\left[\hat{x}_{k/k-1} \sum_{i=1}^{m-1} \nu_{k-i}^T \Pi_{k-i}^{-1} T_{k-i,k-m}\right] \Psi_{k,k-m}^T \\ &- \Psi_{k,k-m} \sum_{i=1}^{m-1} T_{k-i,k-m}^T \Pi_{k-i}^{-1} E[\nu_{k-i} \hat{x}_{k/k-1}^T] H_k^T \overline{\Theta}_k. \end{split}$$

Next, again from the orthogonal projection lemma, $E[\hat{x}_{k/k-1} \nu_{k-i}^T] = E[x_k \nu_{k-i}^T] = S_{k,k-i}$, for $1 \le i \le m$, and therefore

$$\begin{split} E[\hat{y}_{k/k-1}\hat{y}_{k/k-1}^T] &= \left(\overline{\theta}_k \overline{\theta}_k^T\right) \circ \left(H_k (D_k - P_{k/k-1}) H_k^T\right) + \Psi_{k,k-m} \Pi_{k-m} \Psi_{k,k-m}^T \\ &+ \Psi_{k,k-m} \sum_{i=1}^{m-1} T_{k-i,k-m}^T \Pi_{k-i}^{-1} T_{k-i,k-m} \Psi_{k,k-m}^T \\ &+ \overline{\Theta}_k H_k \left(S_{k,k-m} \Psi_{k,k-m}^T - \sum_{i=1}^{m-1} S_{k,k-i} \Pi_{k-i}^{-1} T_{k-i,k-m} \Psi_{k,k-m}^T\right) \\ &+ \left(\Psi_{k,k-m} S_{k,k-m}^T - \Psi_{k,k-m} \sum_{i=1}^{m-1} T_{k-i,k-m}^T \Pi_{k-i}^{-1} S_{k,k-i}^T\right) H_k^T \overline{\Theta}_k. \end{split}$$

Finally, from Equation (18), we have

$$S_{k,k-m} \Psi_{k,k-m}^T - \sum_{i=1}^{m-1} S_{k,k-i} \Pi_{k-i}^{-1} T_{k-i,k-m} \Psi_{k,k-m}^T = -(S_{k,k} - P_{k/k-1} H_k^T \overline{\Theta}_k),$$

and hence,


$$\begin{split} E[\widehat{y}\_{k/k-1}\widehat{y}\_{k/k-1}^T] &= \left(\overline{\theta}\_k\overline{\theta}\_k^T\right) \circ \left(H\_k(D\_k - P\_{k/k-1})H\_k^T\right) + \Psi\_{k,k-m}\Pi\_{k-m}\Psi\_{k,k-m}^T \\ &+ \Psi\_{k,k-m}\sum\_{i=1}^{m-1} T\_{k-i,k-m}^T \Pi\_{k-i}^{-1} T\_{k-i,k-m}\Psi\_{k,k-m}^T \\ &- \overline{\Theta}\_k H\_k \left(S\_{k,k} - P\_{k/k-1}H\_k^T\overline{\Theta}\_k\right) - \left(S\_{k,k}^T - \overline{\Theta}\_k H\_k P\_{k/k-1}\right) H\_k^T \overline{\Theta}\_k, \quad k > m. \end{split}$$

Finally, since $K\_{k,k}^{\theta} = E[\theta\_k\theta\_k^T] - \overline{\theta}\_k\overline{\theta}\_k^T$, the above expectations lead to the following expression for the innovation covariance matrices

$$\Pi\_k = K\_{k,k}^{\theta} \circ \left( H\_k D\_k H\_k^T \right) + R\_k + \overline{\Theta}\_k H\_k S\_{k,k}, \quad k \le m,$$


$$\begin{split} \Pi\_{k} &= K\_{k,k}^{\theta} \circ \left( H\_{k} D\_{k} H\_{k}^{T} \right) + R\_{k} - \Psi\_{k,k-m} \left( \Pi\_{k-m} + \sum\_{i=1}^{m-1} T\_{k-i,k-m}^{T} \Pi\_{k-i}^{-1} T\_{k-i,k-m} \right) \Psi\_{k,k-m}^{T} \\ &+ \overline{\Theta}\_{k} H\_{k} S\_{k,k} + S\_{k,k}^{T} H\_{k}^{T} \overline{\Theta}\_{k} - \overline{\Theta}\_{k} H\_{k} P\_{k/k-1} H\_{k}^{T} \overline{\Theta}\_{k}, \quad k > m. \end{split}$$

Now, an expression for the prediction error covariance matrix, $P\_{k/k-1}$, is necessary. From Equation (1), it is immediately clear that

$$P\_{k/k-1} = F\_{k-1}P\_{k-1/k-1}F\_{k-1}^T + Q\_{k-1}, \quad k \ge 1,$$

where $P\_{k/k} = E[(x\_k - \widehat{x}\_{k/k})(x\_k - \widehat{x}\_{k/k})^T]$ is the filtering error covariance matrix. From Equation (7), it is concluded that

$$P\_{k/k} = P\_{k/k-1} - S\_{k,k}\Pi\_k^{-1}S\_{k,k}^T, \quad k \ge 1; \qquad P\_{0/0} = P\_0.$$


All these results are summarized in the following theorem.

**Theorem 1.** The linear filter, $\widehat{x}\_{k/k}$, of the state $x\_k$ is obtained as

$$\widehat{x}\_{k/k} = \widehat{x}\_{k/k-1} + S\_{k,k} \Pi\_k^{-1} \nu\_k, \quad k \ge 1; \qquad \widehat{x}\_{0/0} = \overline{x}\_0,$$

where the state predictor, $\widehat{x}\_{k/k-1}$, is given by

$$\widehat{x}\_{k/k-1} = F\_{k-1} \widehat{x}\_{k-1/k-1}, \quad k \ge 1.$$

The innovation process satisfies

$$\begin{aligned} \nu\_k &= y\_k - \overline{\Theta}\_k H\_k \widehat{x}\_{k/k-1}, \quad k \le m, \\ \nu\_k &= y\_k - \overline{\Theta}\_k H\_k \widehat{x}\_{k/k-1} + \Psi\_{k,k-m} \left[ \nu\_{k-m} - \sum\_{i=1}^{m-1} T\_{k-i,k-m}^T \Pi\_{k-i}^{-1} \nu\_{k-i} \right], \quad k > m, \end{aligned}$$

where $\overline{\Theta}\_k = E[\Theta\_k]$ and $\Psi\_{k,k-m} = \left[K\_{k,k-m}^{\theta} \circ \left(H\_k \mathbf{F}\_{k,k-m} D\_{k-m} H\_{k-m}^T\right)\right] \Pi\_{k-m}^{-1}$, with $\circ$ the Hadamard product, $\mathbf{F}\_{k,i} = F\_{k-1} \cdots F\_i$ and $D\_k = E[x\_k x\_k^T]$ recursively obtained by

$$D\_k = F\_{k-1} D\_{k-1} F\_{k-1}^T + Q\_{k-1}, \quad k \ge 1; \qquad D\_0 = P\_0 + \overline{x}\_0 \overline{x}\_0^T.$$

The matrices $T\_{k,k-i}$ are given by

$$T\_{k,k-i} = \overline{\Theta}\_k H\_k \mathbf{F}\_{k,k-i} S\_{k-i,k-i}, \quad 2 \le k \le m, \quad 1 \le i \le k - 1,$$

$$T\_{k,k-i} = \overline{\Theta}\_k H\_k \mathbf{F}\_{k,k-i} S\_{k-i,k-i} - \Psi\_{k,k-m} T\_{k-i,k-m}^T, \quad k > m, \quad 1 \le i \le m - 1.$$

The covariance matrix of the innovation, $\Pi\_k = E[\nu\_k\nu\_k^T]$, satisfies

$$\begin{split} \Pi\_{k} &= K\_{k,k}^{\theta} \circ \left( H\_{k} D\_{k} H\_{k}^{T} \right) + R\_{k} + \overline{\Theta}\_{k} H\_{k} S\_{k,k}, \quad k \le m, \\ \Pi\_{k} &= K\_{k,k}^{\theta} \circ \left( H\_{k} D\_{k} H\_{k}^{T} \right) + R\_{k} - \Psi\_{k,k-m} \left( \Pi\_{k-m} + \sum\_{i=1}^{m-1} T\_{k-i,k-m}^{T} \Pi\_{k-i}^{-1} T\_{k-i,k-m} \right) \Psi\_{k,k-m}^{T} \\ &+ \overline{\Theta}\_{k} H\_{k} S\_{k,k} + S\_{k,k}^{T} H\_{k}^{T} \overline{\Theta}\_{k} - \overline{\Theta}\_{k} H\_{k} P\_{k/k-1} H\_{k}^{T} \overline{\Theta}\_{k}, \quad k > m. \end{split}$$

The matrix $S\_{k,k}$ is determined by the following expression

$$\begin{aligned} S\_{k,k} &= P\_{k/k-1} H\_k^T \overline{\Theta}\_k, \quad k \le m, \\ S\_{k,k} &= P\_{k/k-1} H\_k^T \overline{\Theta}\_k - \left( S\_{k,k-m} - \sum\_{i=1}^{m-1} S\_{k,k-i} \Pi\_{k-i}^{-1} T\_{k-i,k-m} \right) \Psi\_{k,k-m}^T, \quad k > m, \end{aligned}$$

where *Pk*/*k*−1, the prediction error covariance matrix, is obtained by

$$P\_{k/k-1} = F\_{k-1} P\_{k-1/k-1} F\_{k-1}^T + Q\_{k-1}, \quad k \ge 1,$$

with *Pk*/*k*, the filtering error covariance matrix, satisfying

$$P\_{k/k} = P\_{k/k-1} - S\_{k,k} \Pi\_k^{-1} S\_{k,k}^T, \quad k \ge 1; \qquad P\_{0/0} = P\_0. \tag{19}$$
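To make the recursions of Theorem 1 concrete, the following is a minimal NumPy sketch of the filter, written under simplifying assumptions that are not made in the chapter: the matrices $F\_k$, $H\_k$, $Q\_k$ and $R\_k$ are taken time-invariant (so the products $\mathbf{F}\_{k,k-i}$ reduce to powers $F^i$), and the uncertainty covariances are supplied by a user-defined function `K_theta(k, s)`. All identifiers (`uo_filter`, `h`, `Tb`, ...) are illustrative, not taken from the chapter or any library; the history dictionary returned here feeds the smoother sketch given after Theorem 2.

```python
import numpy as np

def uo_filter(ys, F, H, Q, R, x0_bar, P0, theta_bar, K_theta, m):
    """Sketch of the Theorem 1 filter.  Assumes time-invariant F, H, Q, R.
    K_theta(k, s) must return the covariance matrix of (theta_k, theta_s),
    which is nonzero only for |k - s| = 0 or m."""
    inv = np.linalg.inv
    Tb = np.diag(theta_bar)                            # \bar{Theta}
    h = {'nu': {}, 'Pi': {}, 'S': {}, 'T': {}, 'Psi': {}, 'x': {}, 'm': m,
         'D': {0: P0 + np.outer(x0_bar, x0_bar)}, 'Fi': {0: np.eye(len(P0))}}
    nu, Pi, S, T, Psi, D, Fi = (h['nu'], h['Pi'], h['S'], h['T'],
                                h['Psi'], h['D'], h['Fi'])
    x, Pf = np.asarray(x0_bar, float), np.asarray(P0, float)
    for k in range(1, len(ys) + 1):
        Fi[k] = F @ Fi[k - 1]                          # powers F^k
        xp, Pp = F @ x, F @ Pf @ F.T + Q               # predictor and its covariance
        D[k] = F @ D[k - 1] @ F.T + Q                  # D_k = E[x_k x_k^T]
        HDH = H @ D[k] @ H.T
        if k <= m:
            for i in range(1, k):                      # T_{k,k-i}, 1 <= i <= k-1
                T[k, k - i] = Tb @ H @ Fi[i] @ S[k - i]
            nu[k] = ys[k - 1] - Tb @ H @ xp            # innovation, k <= m
            S[k] = Pp @ H.T @ Tb                       # S_{k,k}
            Pi[k] = K_theta(k, k) * HDH + R + Tb @ H @ S[k]
        else:
            Psi[k] = (K_theta(k, k - m)
                      * (H @ Fi[m] @ D[k - m] @ H.T)) @ inv(Pi[k - m])
            for i in range(1, m):                      # T_{k,k-i}, 1 <= i <= m-1
                T[k, k - i] = Tb @ H @ Fi[i] @ S[k - i] - Psi[k] @ T[k - i, k - m].T
            corr = nu[k - m]                           # bracket in the innovation
            Ssum = Fi[m] @ S[k - m]                    # S_{k,k-m} - sum_i ...
            Qsum = Pi[k - m]                           # Pi_{k-m} + sum_i ...
            for i in range(1, m):
                G = inv(Pi[k - i])
                corr = corr - T[k - i, k - m].T @ G @ nu[k - i]
                Ssum = Ssum - Fi[i] @ S[k - i] @ G @ T[k - i, k - m]
                Qsum = Qsum + T[k - i, k - m].T @ G @ T[k - i, k - m]
            nu[k] = ys[k - 1] - Tb @ H @ xp + Psi[k] @ corr
            S[k] = Pp @ H.T @ Tb - Ssum @ Psi[k].T
            Pi[k] = (K_theta(k, k) * HDH + R - Psi[k] @ Qsum @ Psi[k].T
                     + Tb @ H @ S[k] + S[k].T @ H.T @ Tb
                     - Tb @ H @ Pp @ H.T @ Tb)
        G = S[k] @ inv(Pi[k])
        x, Pf = xp + G @ nu[k], Pp - G @ S[k].T        # x_{k/k} and P_{k/k}
        h['x'][k] = (x, Pf)
    return h
```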

#### **4.2. Linear fixed-point smoothing algorithm**


The following theorem provides a recursive fixed-point smoothing algorithm to obtain the least-squares linear estimator, $\widehat{x}\_{k/k+N}$, of the state $x\_k$ based on the observations $\{y\_1, \ldots, y\_{k+N}\}$, for $k \ge 1$ fixed and $N \ge 1$. Moreover, to measure the estimation accuracy, a recursive formula for the error covariance matrices, $P\_{k/k+N} = E\left[(x\_k - \widehat{x}\_{k/k+N})(x\_k - \widehat{x}\_{k/k+N})^T\right]$, is derived.

**Theorem 2.** For each fixed $k \ge 1$, the fixed-point smoothers, $\widehat{x}\_{k/k+N}$, $N \ge 1$, are calculated by

$$\widehat{x}\_{k/k+N} = \widehat{x}\_{k/k+N-1} + S\_{k,k+N} \Pi\_{k+N}^{-1} \nu\_{k+N}, \quad N \ge 1, \tag{20}$$

whose initial condition is the filter, $\widehat{x}\_{k/k}$, given in (7).

The matrices $S\_{k,k+N}$ are calculated from

$$\begin{split} S\_{k,k+N} &= \left( D\_k \mathbf{F}\_{k+N,k}^T - M\_{k,k+N-1} F\_{k+N-1}^T \right) H\_{k+N}^T \overline{\Theta}\_{k+N}, \quad k \le m - N, \quad N \ge 1, \\ S\_{k,k+N} &= \left( D\_k \mathbf{F}\_{k+N,k}^T - M\_{k,k+N-1} F\_{k+N-1}^T \right) H\_{k+N}^T \overline{\Theta}\_{k+N} \\ &- \left( S\_{k,k+N-m} - \sum\_{i=1}^{m-1} S\_{k,k+N-i} \Pi\_{k+N-i}^{-1} T\_{k+N-i,k+N-m} \right) \Psi\_{k+N,k+N-m}^T, \quad k > m - N, \quad N \ge 1, \end{split} \tag{21}$$

where the matrices *Mk*,*k*+*<sup>N</sup>* satisfy the following recursive formula:

$$\begin{aligned} M\_{k,k+N} &= M\_{k,k+N-1} F\_{k+N-1}^T + S\_{k,k+N} \Pi\_{k+N}^{-1} S\_{k+N,k+N}^T, \quad N \ge 1, \\ M\_{k,k} &= D\_k - P\_{k/k}. \end{aligned} \tag{22}$$

The innovations $\nu\_{k+N}$, their covariance matrices $\Pi\_{k+N}$, the matrices $T\_{k+N,k+N-i}$, $\Psi\_{k+N,k+N-m}$, $D\_k$ and $P\_{k/k}$ are given in Theorem 1.

Finally, the fixed-point smoothing error covariance matrix, $P\_{k/k+N}$, verifies

$$P\_{k/k+N} = P\_{k/k+N-1} - S\_{k,k+N} \Pi\_{k+N}^{-1} S\_{k,k+N}^T, \quad N \ge 1, \tag{23}$$

with initial condition the filtering error covariance matrix, *Pk*/*k*, given by (19).

**Proof.** From the general expression (6), for each fixed *k* ≥ 1, the recursive relation (20) is immediately clear.


Now, we need to prove (21) for $S\_{k,k+N} = E[x\_k \nu\_{k+N}^T] = E[x\_k y\_{k+N}^T] - E[x\_k \widehat{y}\_{k+N/k+N-1}^T]$, so both expectations must be calculated.

*I*. From Equation (3), taking into account that $E[x\_k x\_{k+N}^T] = D\_k \mathbf{F}\_{k+N,k}^T$ and using that $\Theta\_{k+N}$ and $v\_{k+N}$ are independent of $x\_k$, we obtain

$$E[x\_k y\_{k+N}^T] = D\_k \mathbf{F}\_{k+N,k}^T H\_{k+N}^T \overline{\Theta}\_{k+N}, \quad N \ge 1.$$

*II*. Based on expressions (10) and (17) for $\widehat{y}\_{k+N/k+N-1}$, which are different depending on $k + N \le m$ or $k + N > m$, two options must be considered:

(a) For $k \le m - N$, using (10) for $\widehat{y}\_{k+N/k+N-1}$ with (8) for $\widehat{x}\_{k+N/k+N-1}$, we have that

$$E\left[x\_k \widehat{y}\_{k+N/k+N-1}^T\right] = M\_{k,k+N-1} F\_{k+N-1}^T H\_{k+N}^T \overline{\Theta}\_{k+N},$$

where $M\_{k,k+N-1} = E\left[x\_k \widehat{x}\_{k+N-1/k+N-1}^T\right]$.

(b) For *k* > *m* − *N*, by following a similar reasoning to the previous one but starting from (17), we get

$$E[x\_k \widehat{y}\_{k+N/k+N-1}^T] = M\_{k,k+N-1} F\_{k+N-1}^T H\_{k+N}^T \overline{\Theta}\_{k+N} + \left( S\_{k,k+N-m} - \sum\_{i=1}^{m-1} S\_{k,k+N-i} \Pi\_{k+N-i}^{-1} T\_{k+N-i,k+N-m} \right) \Psi\_{k+N,k+N-m}^T.$$

Then, the replacement of the above expectations in *Sk*,*k*+*<sup>N</sup>* leads to expression (21).

The recursive relation (22) for $M\_{k,k+N} = E\left[x\_k \widehat{x}\_{k+N/k+N}^T\right]$ is immediately clear from (7) for $\widehat{x}\_{k+N/k+N}$, and its initial condition $M\_{k,k} = E[x\_k \widehat{x}\_{k/k}^T]$ is calculated taking into account that, from the orthogonality, $E[x\_k \widehat{x}\_{k/k}^T] = E[\widehat{x}\_{k/k} \widehat{x}\_{k/k}^T] = D\_k - P\_{k/k}$.

Finally, since $P\_{k/k+N} = E\left[x\_k x\_k^T\right] - E\left[\widehat{x}\_{k/k+N} \widehat{x}\_{k/k+N}^T\right]$, using (20) and taking into account that $\widehat{x}\_{k/k+N-1}$ is uncorrelated with $\nu\_{k+N}$, we have

$$P\_{k/k+N} = E\left[x\_k x\_k^T\right] - E\left[\widehat{x}\_{k/k+N-1} \widehat{x}\_{k/k+N-1}^T\right] - S\_{k,k+N} \Pi\_{k+N}^{-1} S\_{k,k+N}^T, \quad N \ge 1,$$

and, consequently, expression (23) holds.
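Under the same simplifying assumptions as before (time-invariant system matrices), the fixed-point smoother of Theorem 2 can be sketched on top of the filter history produced by the `uo_filter` sketch above; the filter must have been run at least up to time $k + N\_{\max}$. Again, the names are ours and this is an illustrative sketch, not the authors' implementation.

```python
import numpy as np

def fixed_point_smoother(k, N_max, h, F, H, theta_bar):
    """Sketch of the Theorem 2 fixed-point smoother x_{k/k+N}, N = 1..N_max,
    fed with the history dict h returned by uo_filter above."""
    inv = np.linalg.inv
    nu, Pi, Sf, T, Psi = h['nu'], h['Pi'], h['S'], h['T'], h['Psi']
    D, Fi, m = h['D'], h['Fi'], h['m']
    Tb = np.diag(theta_bar)
    x, P = h['x'][k]                         # initial conditions: filter (7), (19)
    M = D[k] - P                             # M_{k,k} = D_k - P_{k/k}, Eq. (22)
    Ssm = {}                                 # smoother gains S_{k,k+N}, N >= 1

    def Sgain(t):                            # S_{k,t} = E[x_k nu_t^T]
        if t > k:
            return Ssm[t]                    # computed at a previous N
        if t == k:
            return Sf[k]                     # the filter gain S_{k,k}
        return Fi[k - t] @ Sf[t]             # S_{k,t} = F^{k-t} S_{t,t}, t < k

    out = []
    for N in range(1, N_max + 1):
        j = k + N
        Skj = (D[k] @ Fi[N].T - M @ F.T) @ H.T @ Tb      # Eq. (21), first term
        if j > m:                                        # extra term, k + N > m
            corr = Sgain(j - m)
            for i in range(1, m):
                corr = corr - Sgain(j - i) @ inv(Pi[j - i]) @ T[j - i, j - m]
            Skj = Skj - corr @ Psi[j].T
        Ssm[j] = Skj
        G = Skj @ inv(Pi[j])
        x = x + G @ nu[j]                    # Eq. (20)
        P = P - G @ Skj.T                    # Eq. (23)
        M = M @ F.T + G @ Sf[j].T            # Eq. (22)
        out.append((x, P))
    return out
```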

#### **5. Numerical simulation example**

In this section, we present a numerical example to show the performance of the recursive algorithms proposed in this chapter. To illustrate the effectiveness of the proposed estimators, we ran a program in MATLAB which, at each iteration, simulates the state and the observed values and provides the filtering and fixed-point smoothing estimates, as well as the corresponding error covariance matrices, which provide a measure of the estimators' accuracy. Consider a two-dimensional state process, {*xk*; *k* ≥ 0}, generated by the following first-order autoregressive model

$$\mathbf{x}\_{k} = \left(1 + 0.2\sin\left(\frac{(k-1)\pi}{50}\right)\right)\begin{pmatrix} 0.8 & 0\\ 0.9 & 0.2 \end{pmatrix} \mathbf{x}\_{k-1} + w\_{k-1}, \quad k \ge 1,$$

with the following hypotheses:

• The initial state, $x\_0$, is a zero-mean Gaussian vector with covariance matrix given by $P\_0 = \begin{pmatrix} 0.36 & 0.3 \\ 0.3 & 0.25 \end{pmatrix}$.

• The process $\{w\_k;\ k \ge 0\}$ is a zero-mean white Gaussian noise with covariance matrices $Q\_k = \begin{pmatrix} 0.1 & 0 \\ 0 & 0.1 \end{pmatrix}$, $\forall k \ge 0$.
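As a hedged illustration (the chapter's MATLAB program is not reproduced here), a state trajectory satisfying this model and the two hypotheses above can be simulated, for instance, with NumPy; the seed and the horizon $K = 200$ are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 200                                          # number of time instants
P0 = np.array([[0.36, 0.30], [0.30, 0.25]])
Q  = np.array([[0.10, 0.00], [0.00, 0.10]])
x = np.zeros((K + 1, 2))
x[0] = rng.multivariate_normal(np.zeros(2), P0)  # x_0 ~ N(0, P_0)
for k in range(1, K + 1):
    Fk = (1 + 0.2 * np.sin((k - 1) * np.pi / 50)) * np.array([[0.8, 0.0],
                                                              [0.9, 0.2]])
    x[k] = Fk @ x[k - 1] + rng.multivariate_normal(np.zeros(2), Q)  # + w_{k-1}
```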



Suppose that the scalar observations come from two sensors according to the following observation equations:

$$y\_k^i = \theta\_k^i x\_k + v\_k^i, \quad k \ge 1, \quad i = 1, 2,$$

where $\{v\_k^i;\ k \ge 1\}$, $i = 1, 2$, are zero-mean independent white Gaussian processes with variances $R\_k^1 = 0.5$ and $R\_k^2 = 0.9$, $\forall k \ge 1$, respectively.

According to our theoretical model, it is assumed that, for each sensor, the uncertainty at time $k$ depends only on the uncertainty at the previous time $k - m$. The variables $\theta\_k^i$, $i = 1, 2$, describing this type of correlated uncertainty in the observation process are built from two independent sequences of independent Bernoulli random variables, $\{\gamma\_k^i;\ k \ge 1\}$, $i = 1, 2$, with constant probabilities $P[\gamma\_k^i = 1] = \gamma^i$. Specifically, the variables $\theta\_k^i$ are defined as follows:

$$\theta\_k^i = 1 - \gamma\_{k+m}^i \left(1 - \gamma\_k^i\right), \quad i = 1, 2.$$

So, if $\theta\_k^i = 0$, then $\gamma\_{k+m}^i = 1$ and $\gamma\_k^i = 0$, and hence, $\theta\_{k+m}^i = 1$; this fact guarantees that, if the state is absent at time $k$, the observation at time $k + m$ necessarily contains the state. Therefore, there cannot be more than $m$ consecutive observations consisting of noise only.

Moreover, since the variables $\gamma\_k^i$ and $\gamma\_s^i$ are independent, $\theta\_k^i$ and $\theta\_s^i$ are also independent for $|k - s| \neq 0, m$. The common mean of these variables is $\overline{\theta}^i = 1 - \gamma^i(1 - \gamma^i)$ and their covariance function is given by

$$K\_{k,s}^{\theta} = E[(\theta\_k^i - \overline{\theta}^i)(\theta\_s^i - \overline{\theta}^i)] = \begin{cases} 0 & \text{if } |k-s| \neq 0, m, \\ -(1 - \overline{\theta}^i)^2 & \text{if } |k-s| = m, \\ \overline{\theta}^i (1 - \overline{\theta}^i) & \text{if } |k-s| = 0. \end{cases}$$
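A small sketch can be used to check these properties empirically: it builds the variables $\theta\_k^i$ from an independent Bernoulli sequence as defined above and verifies the stated mean, the lag-$m$ covariance, and the fact that at most $m$ consecutive observations consist of noise only. The sample size and the seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
K, m, gam = 100_000, 3, 0.5                  # long run, just to check the moments
g = (rng.random(K + m) < gam).astype(float)  # g[j] plays the role of gamma_{j+1}
th = 1.0 - g[m:m + K] * (1.0 - g[:K])        # theta_k = 1 - gamma_{k+m}(1 - gamma_k)

tbar = 1.0 - gam * (1.0 - gam)               # common mean, 0.75 for gam = 0.5
print(th.mean(), tbar)
print(np.cov(th[:-m], th[m:])[0, 1], -(1.0 - tbar) ** 2)   # lag-m covariance

run = longest = 0                            # at most m consecutive zeros
for v in th:
    run = run + 1 if v == 0.0 else 0
    longest = max(longest, run)
print(longest, "<=", m)
```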

To illustrate the effectiveness of the respective estimators, two hundred iterations of the proposed algorithms have been performed and the results obtained for different values of the uncertainty probability and several values of *m* have been analyzed.


Let us observe that the mean function of the variables $\theta\_k^i$, for $i = 1, 2$, is the same if $1 - \gamma^i$ is used instead of $\gamma^i$; for this reason, only the case $\gamma^i \le 0.5$ will be considered here. Note that, in such case, the false alarm probability at the $i$-th sensor, $1 - \overline{\theta}^i$, is an increasing function of $\gamma^i$.
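Since $\overline{\theta}^i = 1 - \gamma^i(1 - \gamma^i)$, the false alarm probability is simply $1 - \overline{\theta}^i = \gamma^i(1 - \gamma^i)$; as a quick sketch, the values used later in the example can be tabulated directly:

```python
import numpy as np

gammas = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
false_alarm = gammas * (1 - gammas)      # 1 - theta_bar = gamma (1 - gamma)
for g, fa in zip(gammas, false_alarm):
    print(f"gamma = {g:.1f}  ->  false alarm = {fa:.2f}")
# gamma = 0.1 -> 0.09, 0.2 -> 0.16, 0.3 -> 0.21, 0.4 -> 0.24, 0.5 -> 0.25
```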

Firstly, the values of the first component of a simulated state together with the filtering and the fixed-point smoothing estimates for *N* = 2, obtained from simulated observations of the state for *m* = 3 and *γ*<sup>1</sup> = *γ*<sup>2</sup> = 0.5 are displayed in Fig. 1. This graph shows that the fixed-point smoothing estimates follow the state evolution better than the filtering ones.


**Figure 1.** First component of the simulated state, filtering and fixed-point smoothing estimates for *N* = 2, when *m* = 3 and *γ*<sup>1</sup> = *γ*<sup>2</sup> = 0.5.

Next, assuming again that the Bernoulli variables of the observations are correlated at sampling times that differ three units of time (*m* = 3), we compare the effectiveness of the proposed filtering and fixed-point smoothing estimators considering different values of the probabilities *γ*<sup>1</sup> and *γ*<sup>2</sup>, which provide different values of the false alarm probabilities $1 - \overline{\theta}^i$, *i* = 1, 2; specifically, *γ*<sup>1</sup> = 0.2, *γ*<sup>2</sup> = 0.4 and *γ*<sup>1</sup> = 0.1, *γ*<sup>2</sup> = 0.3. For these values, Fig. 2 shows the filtering and fixed-point smoothing error variances, when *N* = 2 and *N* = 5, for the first state component. From this figure it is observed that:


*i*) As both *γ*<sup>1</sup> and *γ*<sup>2</sup> decrease (which means that the false alarm probability decreases), the error variances are smaller and, consequently, better estimations are obtained.

*ii*) The error variances corresponding to the fixed-point smoothers are less than those of the filters and, consequently, agreeing with the comments on the previous figure, the fixed-point smoothing estimates are more accurate.

*iii*) The accuracy of the smoothers at each fixed-point *k* is better as the number of available observations increases.

**Figure 2.** Filtering and smoothing error variances for the first state component for *γ*<sup>1</sup> = 0.2, *γ*<sup>2</sup> = 0.4 and *γ*<sup>1</sup> = 0.1, *γ*<sup>2</sup> = 0.3, when *m* = 3.

On the other hand, in order to show more precisely the dependence of the error variance on the values *γ*<sup>1</sup> and *γ*<sup>2</sup>, Fig. 3 displays the filtering and fixed-point smoothing error variances of the first state component, at a fixed iteration (namely, *k* = 200) for *m* = 3, when both *γ*<sup>1</sup> and *γ*<sup>2</sup> are varied from 0.1 to 0.5, which provide different values of the probabilities $\overline{\theta}^1$ and $\overline{\theta}^2$. More specifically, we have considered the values $\gamma^i$ = 0.1, 0.2, 0.3, 0.4, 0.5, which lead to the false alarm probabilities $1 - \overline{\theta}^i$ = 0.09, 0.16, 0.21, 0.24, 0.25, respectively.

In this figure, both graphs (corresponding to the filtering and fixed-point smoothing error variances) corroborate the previous results, showing again that, as the false alarm probability increases, the filtering and fixed-point smoothing error variances (*N* = 2) become greater and, consequently, worse estimations are obtained. Also, it is concluded that the smoothing error variances are smaller than the filtering ones.

Analogous results to those of Fig. 1-3 are obtained for the second component of the state. As an example, Fig. 4 shows the filtering and fixed-point smoothing error variances of the second state component, at *k* = 200, versus *γ*<sup>1</sup> for constant values of *γ*<sup>2</sup>, when *m* = 3; similar comments to those made on Fig. 3 can be deduced.

Finally, for *γ*<sup>1</sup> = 0.2, *γ*<sup>2</sup> = 0.4 the performance of the estimators is compared for different values of *m*; specifically, for *m* = 1, 3, 6, the filtering error variances of the first state component are displayed in Fig. 5. From this figure it can be seen that the estimators are more accurate for lower values of *m*. In other words, a greater distance between the instants at which the variables are correlated (which means that more consecutive observations may not contain state information) yields worse estimations.

**Figure 3.** Filtering error variances and smoothing error variances for *N* = 2 of the first state component at *k* = 200 versus *γ*<sup>1</sup> with *γ*<sup>2</sup> varying from 0.1 to 0.5 when *m* = 3.

**Figure 4.** Filtering error variances and smoothing error variances for *N* = 2 of the second state component at *k* = 200 versus *γ*<sup>1</sup> with *γ*<sup>2</sup> varying from 0.1 to 0.5 when *m* = 3.

**Figure 5.** Filtering error variances for *γ*<sup>1</sup> = 0.2, *γ*<sup>2</sup> = 0.4 and *m* = 1, 3, 6.

#### **6. Conclusions and future research**

In this chapter, the least-squares linear filtering and fixed-point smoothing problems have been addressed for linear discrete-time stochastic systems with uncertain observations coming from multiple sensors. The uncertainty in the observations is modeled by a binary variable taking the values one or zero (Bernoulli variable), depending on whether the signal is present or absent in the corresponding observation, and it has been supposed that the uncertainty at any sampling time *k* depends only on the uncertainty at the previous time *k* − *m*. This situation covers, in particular, those signal transmission models in which any failure in the transmission is detected and the old sensor is replaced after *m* instants of time, thus avoiding the possibility of missing signal in *m* + 1 consecutive observations.

By applying an innovation technique, recursive algorithms for the linear filtering and fixed-point smoothing estimators have been obtained. This technique consists of obtaining the estimators as a linear combination of the innovations, simplifying the derivation of these estimators, due to the fact that the innovations constitute a white process.

Finally, the feasibility of the theoretical results has been illustrated by the estimation of a two-dimensional signal from uncertain observations coming from two sensors, for different uncertainty probabilities and different values of *m*. The results obtained confirm the greater effectiveness of the fixed-point smoothing estimators in contrast to the filtering ones and show that more accurate estimations are obtained for lower values of *m*.

In recent years, several problems of signal processing, such as signal prediction, detection and control, as well as image restoration problems, have been treated using quadratic estimators and, generally, polynomial estimators of arbitrary degree. Hence, it must be noticed that the current chapter can be extended by considering the least-squares polynomial estimation problem.