**5.3 Marginalization over** *f*

The main idea here is to first find a good estimate for the parameters *θ* and then use it for the inference on *f*. So, first obtain:

$$p(\boldsymbol{\theta}|\mathbf{g}) = \int p(\boldsymbol{f}, \boldsymbol{\theta}|\mathbf{g}) \, \mathrm{d}\mathbf{f} \tag{28}$$

which can be used to first estimate *θ* and then carry out the inference on *f*. For example, the method related to *Maximum Likelihood of the second type* first estimates $\hat{\boldsymbol{\theta}}$ by

$$\hat{\boldsymbol{\theta}} = \arg\max_{\boldsymbol{\theta}} \left\{ p\left(\boldsymbol{\theta}|\mathbf{g}\right) \right\} \tag{29}$$

and then use it with $p\left(\boldsymbol{f}|\hat{\boldsymbol{\theta}},\mathbf{g}\right)$ to infer $\hat{\boldsymbol{f}}$. For a flat prior model, $p\left(\boldsymbol{\theta}|\mathbf{g}\right) \propto p\left(\mathbf{g}|\boldsymbol{\theta}\right)$, which is called the *likelihood*, and the estimator

$$\hat{\boldsymbol{\theta}} = \underset{\boldsymbol{\theta}}{\text{arg}\, \max} \left\{ p\left(\boldsymbol{\theta}|\mathbf{g}\right) \right\} = \underset{\boldsymbol{\theta}}{\text{arg}\, \max} \left\{ p\left(\mathbf{g}|\boldsymbol{\theta}\right) \right\} \tag{30}$$

is called *Maximum Likelihood (ML)* and the whole approach is called *ML of second type*. This method can be summarized as follows:

$$\underbrace{p(\boldsymbol{f},\boldsymbol{\theta}|\mathbf{g})}_{\text{Joint Posterior}} \xrightarrow{\text{Marginalize over } \boldsymbol{f}} p(\boldsymbol{\theta}|\mathbf{g}) \longrightarrow \hat{\boldsymbol{\theta}} \longrightarrow p(\boldsymbol{f}|\hat{\boldsymbol{\theta}},\mathbf{g}) \longrightarrow \hat{\boldsymbol{f}}$$
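This chain can be carried out numerically for a deliberately simple scalar model. The sketch below is an illustration under assumed choices (a scalar *f* with prior $\mathcal{N}(0,1)$, observation $g = f + \text{noise}$ with unknown noise variance *θ*, a flat prior on *θ*, and all grid sizes and the observed value made up), not the text's general setting:

```python
import numpy as np

# Assumed toy model (illustration only): f ~ N(0, 1),
# g = f + noise with noise ~ N(0, theta), flat prior on theta.
g_obs = 1.2
f_grid = np.linspace(-5.0, 5.0, 401)
theta_grid = np.linspace(0.1, 3.0, 300)
df = f_grid[1] - f_grid[0]

def gauss(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

# Joint posterior p(f, theta | g) up to a constant: likelihood times prior on f.
F, T = np.meshgrid(f_grid, theta_grid, indexing="ij")
joint = gauss(g_obs, F, T) * gauss(F, 0.0, 1.0)

# Eq. (28): marginalize over f by numerical integration along the f axis.
p_theta = joint.sum(axis=0) * df

# Eqs. (29)-(30): theta_hat maximizes p(theta | g) on the grid.
idx = np.argmax(p_theta)
theta_hat = theta_grid[idx]

# Final step of the chain: infer f from p(f | theta_hat, g) (here, its mode).
f_hat = f_grid[np.argmax(joint[:, idx])]
```

For this Gaussian toy model the marginal likelihood is $\mathcal{N}(g; 0, 1+\theta)$, so the grid maximizer should sit near $\theta = g^2 - 1$, and the conditional posterior mode near $g/(1+\hat{\theta})$.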

The main difficulty in this approach is that we can rarely obtain an analytical expression for the first marginalization. To overcome this difficulty, many algorithms have been proposed to compute $\hat{\boldsymbol{\theta}}$. One of them is the Expectation-Maximization (EM) algorithm, together with its generalization (GEM). The main ideas of these algorithms are summarized in the following subsections:
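As a minimal preview of the EM idea, the sketch below alternates an E-step over the hidden $\boldsymbol{f}$ with a closed-form M-step for $\boldsymbol{\theta}$, for an *assumed* linear-Gaussian model (all dimensions, priors, and values are made-up choices for illustration): $\mathbf{g} = \mathbf{H}\boldsymbol{f} + \boldsymbol{\epsilon}$ with $\boldsymbol{f} \sim \mathcal{N}(0, \tau \mathbf{I})$, $\boldsymbol{\epsilon} \sim \mathcal{N}(0, \sigma^2 \mathbf{I})$, and $\boldsymbol{\theta} = (\tau, \sigma^2)$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed linear-Gaussian model (illustration, not the text's general case):
# g = H f + eps, f ~ N(0, tau I), eps ~ N(0, sigma2 I); f is hidden.
n, m = 40, 20
H = rng.standard_normal((n, m))
g = H @ (rng.standard_normal(m) * np.sqrt(2.0)) + 0.5 * rng.standard_normal(n)

tau, sigma2 = 1.0, 1.0  # initial guess for theta = (tau, sigma2)
for _ in range(50):
    # E-step: Gaussian posterior p(f | theta, g) -> mean mu, covariance Sigma.
    Sigma = np.linalg.inv(H.T @ H / sigma2 + np.eye(m) / tau)
    mu = Sigma @ H.T @ g / sigma2
    # M-step: re-estimate theta from the expected sufficient statistics.
    tau = (mu @ mu + np.trace(Sigma)) / m
    r = g - H @ mu
    sigma2 = (r @ r + np.trace(H @ Sigma @ H.T)) / n
```

For this model each iteration is guaranteed not to decrease the marginal likelihood $p(\mathbf{g}|\boldsymbol{\theta})$ (here Gaussian, $\mathcal{N}(0, \tau\mathbf{H}\mathbf{H}^T + \sigma^2\mathbf{I})$), which is the defining property of EM.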
