**8.2 Scaled mixture models**

Scaled Gaussian Mixture (SGM) models have been used in many applications to model rare events, thanks to their heavier tails compared to the Gaussian. They are also used in sparse signal modeling. A general SGM is defined as follows:

$$\mathcal{S}(f) = \int \mathcal{N}(f|\mathbf{0}, v) \, p\_m(v|\boldsymbol{\theta}) \, \mathrm{d}v \tag{58}$$

where the variance $v$ of the Gaussian $\mathcal{N}(f|\mathbf{0}, v)$ is assumed to follow the mixing probability law $p_m(v|\boldsymbol{\theta})$. Among the many possible choices for this mixing pdf, the Inverse-Gamma results in the Student-t distribution:

$$\mathcal{S}(f|\nu) = \int \mathcal{N}(f|\mathbf{0}, v) \, \mathcal{IG}(v|\nu/2, \nu/2) \, \mathrm{d}v \tag{59}$$

which has been extended to the more general case:

$$\mathcal{S}(f|a,\beta) = \int \mathcal{N}(f|\mathbf{0}, v) \, \mathcal{IG}(v|a, \beta) \, \mathrm{d}v \tag{60}$$
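
In fact, the integral in (60) can be carried out in closed form (a standard computation, spelled out here for completeness):

$$\mathcal{S}(f|a,\beta) = \frac{\beta^{a}}{\Gamma(a)\sqrt{2\pi}} \int_0^{\infty} v^{-\left(a+\frac{3}{2}\right)} \exp\left(-\frac{\beta + f^2/2}{v}\right) \mathrm{d}v = \frac{\Gamma\left(a+\frac{1}{2}\right)}{\Gamma(a)\sqrt{2\pi\beta}} \left(1+\frac{f^2}{2\beta}\right)^{-\left(a+\frac{1}{2}\right)},$$

a generalized Student-t density; setting $a = \beta = \nu/2$ recovers the standard Student-t in (59).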

These pdf models have been used with success in many developments of the Bayesian approach to inverse problems, either component-wise:

$$p(\mathbf{f}|a,\beta) = \prod_{j} \int \mathcal{N}\left(f_{j}|0, v_{j}\right) \mathcal{IG}(v_{j}|a,\beta) \, \mathrm{d}v_{j} \tag{61}$$

or

$$p(\mathbf{f}|a,\beta) = \int \mathcal{N}(\mathbf{f}|\mathbf{0}, v\boldsymbol{\Sigma}) \, \mathcal{IG}(v|a,\beta) \, \mathrm{d}v \tag{62}$$
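
As a quick numerical sanity check of this representation (an illustration added here, with $\nu = 4$ and the sample size chosen arbitrarily), one can sample from the mixture in (59) and test the result against a Student-t:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
nu, n = 4.0, 200_000

# Sample the mixture (59): v ~ IG(nu/2, nu/2), then f | v ~ N(0, v).
v = stats.invgamma.rvs(a=nu / 2, scale=nu / 2, size=n, random_state=rng)
f = rng.normal(0.0, np.sqrt(v))

# The marginal of f should be Student-t with nu degrees of freedom.
ks = stats.kstest(f, "t", args=(nu,))
print(f"KS statistic = {ks.statistic:.4f}, p-value = {ks.pvalue:.3f}")
```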

Scaled Gaussian mixture models have been used extensively for modeling sparse signals. However, it very often happens that the signals or images are not sparse themselves, but their gradients are, or more generally their coefficients in a transformed domain such as the Fourier or wavelet domain. We have used these models extensively in a hierarchical way:

$$\begin{cases} \mathbf{g} = \mathbf{H}\mathbf{f} + \boldsymbol{\epsilon}, \\ \mathbf{f} = \mathbf{D}\mathbf{z} + \boldsymbol{\xi}, \quad \mathbf{z} \text{ sparse Student-t} \rightarrow \begin{cases} p\left(z_{j}|v_{z_{j}}\right) = \mathcal{N}\left(z_{j}|0, v_{z_{j}}\right), \\ p\left(v_{z_{j}}\right) = \mathcal{IG}\left(v_{z_{j}}|a_{z_{0}}, \beta_{z_{0}}\right) \end{cases} \end{cases} \tag{63}$$


where $\mathbf{D}$ represents any linear transformation whose inverse $\mathbf{D}^{-1}$, applied to $\mathbf{f}$, transforms it into a sparse vector $\mathbf{z}$.
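
To make this hierarchy concrete, here is a minimal synthesis sketch of (63), assuming for illustration that $\mathbf{D}$ is an inverse DCT (so that $\mathbf{D}^{-1}\mathbf{f}$ is sparse), that $\mathbf{H}$ is a random measurement matrix, and arbitrary sizes and noise levels; none of these choices come from the text:

```python
import numpy as np
from scipy.fft import idct
from scipy import stats

rng = np.random.default_rng(1)
n, m, k = 256, 128, 10                   # signal size, measurements, nonzeros

# Sparse Student-t coefficients z: a few heavy-tailed entries, the rest zero.
z = np.zeros(n)
idx = rng.choice(n, size=k, replace=False)
z[idx] = stats.t.rvs(df=3, size=k, random_state=rng)

# f = D z with D the inverse DCT (so D^{-1} f = z is sparse), plus a small xi.
f = idct(z, norm="ortho") + 0.01 * rng.normal(size=n)

# g = H f + eps with a random measurement matrix H and Gaussian noise.
H = rng.normal(size=(m, n)) / np.sqrt(m)
g = H @ f + 0.01 * rng.normal(size=m)
```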

All the likelihood and prior relations are summarized below:

$$\begin{cases} p(\mathbf{g}|\mathbf{f}) = \mathcal{N}(\mathbf{g}|\mathbf{H}\mathbf{f}, v_{\epsilon}\mathbf{I}) \\ p(\mathbf{f}|\mathbf{z}) = \mathcal{N}(\mathbf{f}|\mathbf{D}\mathbf{z}, v_{\xi}\mathbf{I}) \\ p(\mathbf{z}|\mathbf{v}_{\mathbf{z}}) = \mathcal{N}(\mathbf{z}|\mathbf{0}, \mathbf{V}_{\mathbf{z}}), \quad \mathbf{V}_{\mathbf{z}} = \operatorname{diag}(\mathbf{v}_{\mathbf{z}}) \\ p(\mathbf{v}_{\mathbf{z}}) = \prod_{j} \mathcal{IG}(v_{z_{j}}|a_{z_{0}}, \beta_{z_{0}}) \\ p(v_{\epsilon}) = \mathcal{IG}(v_{\epsilon}|a_{\epsilon_{0}}, \beta_{\epsilon_{0}}) \\ p(v_{\xi}) = \mathcal{IG}(v_{\xi}|a_{\xi_{0}}, \beta_{\xi_{0}}) \end{cases} \tag{64}$$

and the corresponding joint posterior of all the unknowns can be written as:

$$\begin{cases}\begin{aligned} &p(\mathbf{f},\mathbf{z},\mathbf{v}_{\mathbf{z}},v_{\epsilon},v_{\xi}|\mathbf{g}) \propto \exp\left[-J(\mathbf{f},\mathbf{z},\mathbf{v}_{\mathbf{z}},v_{\epsilon},v_{\xi})\right] \\ &J(\mathbf{f},\mathbf{z},\mathbf{v}_{\mathbf{z}},v_{\epsilon},v_{\xi}) = \frac{1}{2v_{\epsilon}}\left\|\mathbf{g}-\mathbf{H}\mathbf{f}\right\|_{2}^{2} + \frac{1}{2v_{\xi}}\left\|\mathbf{f}-\mathbf{D}\mathbf{z}\right\|_{2}^{2} + \frac{1}{2}\left\|\mathbf{V}_{\mathbf{z}}^{-\frac{1}{2}}\mathbf{z}\right\|_{2}^{2} \\ &\quad + \sum_{j}\left[(a_{z_{0}}+1)\ln v_{z_{j}} + \beta_{z_{0}}/v_{z_{j}}\right] \\ &\quad + (a_{\epsilon_{0}}+m/2)\ln v_{\epsilon} + \beta_{\epsilon_{0}}/v_{\epsilon} + \left(a_{\xi_{0}}+n/2\right)\ln v_{\xi} + \beta_{\xi_{0}}/v_{\xi} \end{aligned}\end{cases} \tag{65}$$

Looking at this expression, we see that $J$ is quadratic in $\mathbf{f}$ for fixed $\mathbf{z}$ and the variances, quadratic in $\mathbf{z}$ for fixed $\mathbf{f}$ and the variances, and separable in each variance for fixed $\mathbf{f}$ and $\mathbf{z}$. It can therefore be minimized by alternating closed-form updates, or explored by a Gibbs sampler, since every conditional is either Gaussian or Inverse-Gamma; a sketch of the alternating updates follows.
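
The following is a minimal sketch of these alternating updates for a small dense problem, assuming generic matrices `H` and `D` and illustrative hyperparameter values (`az0`, `bz0`, etc., none of which come from the text); each variance update is the coordinate-wise minimizer of $J$ in (65):

```python
import numpy as np

def alternate_map(g, H, D, n_iter=50,
                  az0=1.0, bz0=1e-3, ae0=1.0, be0=1e-3,
                  axi0=1.0, bxi0=1e-3):
    """Alternating minimization of J in Eq. (65); hyperparameters are illustrative."""
    m, n = H.shape
    f = H.T @ g                          # rough initialization
    z = D.T @ f                          # D^T as a crude inverse transform
    vz, ve, vxi = np.ones(n), 1.0, 1.0
    for _ in range(n_iter):
        # f-step: J is quadratic in f -> solve (H^T H / ve + I / vxi) f = rhs.
        f = np.linalg.solve(H.T @ H / ve + np.eye(n) / vxi,
                            H.T @ g / ve + D @ z / vxi)
        # z-step: J is quadratic in z -> solve (D^T D / vxi + V_z^{-1}) z = rhs.
        z = np.linalg.solve(D.T @ D / vxi + np.diag(1.0 / vz),
                            D.T @ f / vxi)
        # Variance steps: coordinate-wise minimizers of J in (65).
        vz = (bz0 + 0.5 * z**2) / (az0 + 1.0)
        ve = (be0 + 0.5 * np.sum((g - H @ f) ** 2)) / (ae0 + m / 2)
        vxi = (bxi0 + 0.5 * np.sum((f - D @ z) ** 2)) / (axi0 + n / 2)
    return f, z, vz, ve, vxi
```

For large-scale problems, the two dense solves would typically be replaced by a few conjugate-gradient iterations; a Gibbs sampler would instead draw each block from its Gaussian or Inverse-Gamma conditional.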


A final case we consider combines non-stationary noise and a sparsity-enforcing prior in the same framework:

$$\begin{cases} \mathbf{g} = \mathbf{H}\mathbf{f} + \boldsymbol{\epsilon}, \quad \boldsymbol{\epsilon} \text{ non-stationary} \rightarrow \begin{cases} p(\epsilon_{i}|v_{\epsilon_{i}}) = \mathcal{N}(\epsilon_{i}|0, v_{\epsilon_{i}}), \\ p(v_{\epsilon_{i}}) = \mathcal{IG}(v_{\epsilon_{i}}|a_{\epsilon_{0}}, \beta_{\epsilon_{0}}) \end{cases} \\ \mathbf{f} = \mathbf{D}\mathbf{z} + \boldsymbol{\xi}, \quad \mathbf{z} \text{ sparse Student-t} \rightarrow \begin{cases} p\left(z_{j}|v_{z_{j}}\right) = \mathcal{N}\left(z_{j}|0, v_{z_{j}}\right), \\ p\left(v_{z_{j}}\right) = \mathcal{IG}\left(v_{z_{j}}|a_{z_{0}}, \beta_{z_{0}}\right) \end{cases} \end{cases} \tag{66}$$

Again, all the likelihood and prior expressions can be summarized as follows:

$$\begin{cases} p(\mathbf{g}|\mathbf{f}) = \mathcal{N}(\mathbf{g}|\mathbf{H}\mathbf{f}, \mathbf{V}_{\epsilon}), \quad \mathbf{V}_{\epsilon} = \operatorname{diag}(\mathbf{v}_{\epsilon}) \\ p(\mathbf{f}|\mathbf{z}) = \mathcal{N}(\mathbf{f}|\mathbf{D}\mathbf{z}, v_{\xi}\mathbf{I}) \\ p(\mathbf{z}|\mathbf{v}_{\mathbf{z}}) = \mathcal{N}(\mathbf{z}|\mathbf{0}, \mathbf{V}_{\mathbf{z}}) \\ p(\mathbf{v}_{\mathbf{z}}) = \prod_{j} \mathcal{IG}\left(v_{z_{j}}|a_{z_{0}}, \beta_{z_{0}}\right) \\ p(\mathbf{v}_{\epsilon}) = \prod_{i} \mathcal{IG}\left(v_{\epsilon_{i}}|a_{\epsilon_{0}}, \beta_{\epsilon_{0}}\right) \\ p(v_{\xi}) = \mathcal{IG}\left(v_{\xi}|a_{\xi_{0}}, \beta_{\xi_{0}}\right) \end{cases} \tag{67}$$

and the joint posterior of all the unknowns becomes:

$$\begin{cases}\begin{aligned} &p(\mathbf{f},\mathbf{z},\mathbf{v}_{\mathbf{z}},\mathbf{v}_{\epsilon},v_{\xi}|\mathbf{g}) \propto \exp\left[-J(\mathbf{f},\mathbf{z},\mathbf{v}_{\mathbf{z}},\mathbf{v}_{\epsilon},v_{\xi})\right] \\ &J(\mathbf{f},\mathbf{z},\mathbf{v}_{\mathbf{z}},\mathbf{v}_{\epsilon},v_{\xi}) = \frac{1}{2}\left\|\mathbf{V}_{\epsilon}^{-\frac{1}{2}}(\mathbf{g}-\mathbf{H}\mathbf{f})\right\|_{2}^{2} + \frac{1}{2v_{\xi}}\left\|\mathbf{f}-\mathbf{D}\mathbf{z}\right\|_{2}^{2} + \frac{1}{2}\left\|\mathbf{V}_{\mathbf{z}}^{-\frac{1}{2}}\mathbf{z}\right\|_{2}^{2} \\ &\quad + \sum_{j}\left[(a_{z_{0}}+1)\ln v_{z_{j}} + \beta_{z_{0}}/v_{z_{j}}\right] + \sum_{i}\left[(a_{\epsilon_{0}}+1)\ln v_{\epsilon_{i}} + \beta_{\epsilon_{0}}/v_{\epsilon_{i}}\right] \\ &\quad + \left(a_{\xi_{0}}+n/2\right)\ln v_{\xi} + \beta_{\xi_{0}}/v_{\xi} \end{aligned}\end{cases} \tag{68}$$
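
Compared with the previous sketch, only two updates change: the $\mathbf{f}$-step becomes a weighted least-squares solve involving $\mathbf{V}_{\epsilon}^{-1}$, and the noise variance is now updated per sample. A minimal sketch of just these two steps, under the same illustrative assumptions (`Dz` denotes the precomputed product $\mathbf{D}\mathbf{z}$):

```python
import numpy as np

def nonstationary_steps(g, H, Dz, ve, vxi, ae0=1.0, be0=1e-3):
    """f-step and per-sample noise-variance step implied by Eq. (68)."""
    n = H.shape[1]
    # Weighted least squares: (H^T V_e^{-1} H + I/vxi) f = H^T V_e^{-1} g + Dz/vxi.
    W = H.T * (1.0 / ve)                 # H^T V_e^{-1} via column scaling
    f = np.linalg.solve(W @ H + np.eye(n) / vxi, W @ g + Dz / vxi)
    # Per-sample variances: coordinate-wise minimizers of J in (68).
    ve = (be0 + 0.5 * (g - H @ f) ** 2) / (ae0 + 1.0)
    return f, ve
```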

The following scheme shows this case graphically.
