**5.1 Bayesian estimation**

**Likelihood:** Probing depth was assumed to follow a gamma distribution.

**Prior distribution:** The prior was defined as the product of marginal prior distributions for each component of *β* in Model 23. *β* is composed of the overall mean, main effects, and interactions: *ξ*<sub>000</sub>, *ξ*<sub>*q*00</sub>, *ξ*<sub>0*m*0</sub>, *ξ*<sub>00*l*</sub>, *ξ*<sub>0*ml*</sub>, *ξ*<sub>*q*0*l*</sub>, *ξ*<sub>*qm*0</sub>, and *ξ*<sub>*qml*</sub>; each of them had an *N*(0, 10<sup>2</sup>) prior. The *brms* package uses a special parameterization for the matrices *D* and *G* in Eq. (21): *G* = *F*(*σ<sub>k</sub>*) Ω<sub>*k*</sub> *F*(*σ<sub>k</sub>*), where *F*(*σ<sub>k</sub>*) is a diagonal matrix with diagonal elements *σ<sub>k</sub>* [6]. Priors for *D* and *G* therefore only require priors for *σ<sub>k</sub>* and Ω<sub>*k*</sub>, which were *σ<sub>k</sub>* ~ *HalfCauchy*(10) and Ω<sub>*k*</sub> ~ *CorrLKJ*(1). Finally, the prior for the shape hyper-parameter was *shape* ~ *Gamma*(0.01, 0.01).
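The decomposition of a covariance matrix into standard deviations and a correlation matrix can be illustrated with a minimal sketch. It is plain Python rather than *brms* code, and the standard deviations and the correlation below are hypothetical values chosen only for illustration.

```python
def covariance_from_sd_and_corr(sds, corr):
    """Return G = F(sigma) * Omega * F(sigma), where F(sigma) = diag(sds)."""
    n = len(sds)
    # Multiplying by a diagonal matrix on both sides scales entry (i, j)
    # of the correlation matrix by sds[i] * sds[j].
    return [[sds[i] * corr[i][j] * sds[j] for j in range(n)] for i in range(n)]

# Hypothetical values: SDs of a random intercept and a random slope,
# with an assumed correlation of -0.3 between them.
sds = [0.5, 0.2]
corr = [[1.0, -0.3],
        [-0.3, 1.0]]

G = covariance_from_sd_and_corr(sds, corr)
for row in G:
    print(row)
```

The diagonal of the result holds the variances and the off-diagonal entries hold the covariances, which is why separate priors on the standard deviations and on the correlation matrix are enough to define a prior on the full covariance matrix.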

The model was fitted with the *brms* package [6, 11], which uses the probabilistic programming language *Stan* [12], in *R* version 4.0.5.

**Simulation:** All MCMC runs used four chains. The number of iterations and the burn-in length differed across the models studied, but every run yielded a final sample of 4000 draws.

The MCMC runs for Models 45, 46, and 47 (the null model, the model with the level-one variable, and the model with the level-two variable) used 4000 iterations with a burn-in of 3000. Model 48 used 5000 iterations with a burn-in of 4000, Model 49 used 7000 iterations with a burn-in of 6000, and Model 50 used 8000 iterations with a burn-in of 7000.
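The final sample size follows from simple arithmetic if the iteration and burn-in counts are read as per-chain values, which is an assumption of this sketch rather than something the text states explicitly.

```python
def post_warmup_draws(chains, iterations, burn_in):
    """Total retained draws, assuming iterations and burn-in are counted per chain."""
    return chains * (iterations - burn_in)

CHAINS = 4
# (iterations, burn-in) pairs quoted in the text.
settings = [(4000, 3000), (5000, 4000), (7000, 6000), (8000, 7000)]

totals = [post_warmup_draws(CHAINS, it, burn) for it, burn in settings]
for (it, burn), total in zip(settings, totals):
    print(f"iterations={it}, burn-in={burn}: {total} retained draws")
```

Under that reading, each configuration keeps 1000 draws per chain, so the longer runs only extend the burn-in period.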

**Bayesian estimators:** The mean of the posterior distribution was used as the Bayesian estimator; this is the estimator that minimizes the posterior expected squared-error loss.
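The link between the posterior mean and squared-error loss can be checked numerically. The sketch below uses simulated stand-in draws (a skewed gamma sample, not the chapter's posterior) and compares the average squared loss at the mean with the loss at the median.

```python
import random
import statistics

random.seed(1)
# Hypothetical stand-in for posterior draws of a single parameter.
draws = [random.gammavariate(2.0, 1.5) for _ in range(4000)]

def expected_squared_loss(estimate, draws):
    """Monte Carlo estimate of the posterior expected squared-error loss."""
    return sum((d - estimate) ** 2 for d in draws) / len(draws)

post_mean = statistics.fmean(draws)
post_median = statistics.median(draws)

loss_mean = expected_squared_loss(post_mean, draws)
loss_median = expected_squared_loss(post_median, draws)
print(f"loss at mean:   {loss_mean:.4f}")
print(f"loss at median: {loss_median:.4f}")
```

Any estimate other than the mean of the draws gives a larger average squared loss; the excess equals the squared distance between that estimate and the mean.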

**Models studied:** We studied a three-level multilevel generalized linear model, where *i* indexes the level-one units, *j* the level-two units, and *k* the level-three units. Although the Rhat values are not shown, the MCMC chains of all the models studied converged, since every Rhat was at most 1.01.
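To make the convergence criterion concrete, the following sketch computes the basic potential scale reduction factor from several chains. It is the classical between/within variance version, not the rank-normalized split-Rhat that recent *Stan* releases report, and the chains are simulated for illustration only.

```python
import random
import statistics

def basic_rhat(chains):
    """Classical Rhat from equal-length chains (no splitting, no rank normalization)."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5

random.seed(0)
# Four simulated chains that sample the same distribution.
mixed = [[random.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Same chains, but the last one is shifted to mimic a chain stuck elsewhere.
stuck = mixed[:3] + [[x + 3.0 for x in mixed[3]]]

rhat_mixed = basic_rhat(mixed)
rhat_stuck = basic_rhat(stuck)
print(f"well-mixed chains: Rhat = {rhat_mixed:.3f}")
print(f"one shifted chain: Rhat = {rhat_stuck:.3f}")
```

Values close to 1 indicate that the chains agree with one another, which is the sense in which a threshold such as 1.01 is used.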

**Step 1.** The null model is

$$\log\left(\mu\_{ijk}\right) = \xi\_{000} + \left(\nu\_{00k} + u\_{0jk}\right) \tag{45}$$

Columns 2 and 3 of **Table 2** show the Bayesian estimates for the null model. The credible intervals did not contain zero, so the variances at the tooth level and at the patient level were significant, which supports the use of an MGLM.
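A small simulation can show what the null model implies for the data. The sketch assumes a gamma likelihood parameterized by its mean and a shape parameter, treats *k* as patients and *j* as teeth, and uses hypothetical values for the intercept, the random-effect standard deviations, and the shape.

```python
import math
import random

random.seed(42)

# Hypothetical parameter values, chosen only for illustration.
xi_000 = math.log(3.0)   # overall mean on the log scale
sd_patient = 0.20        # SD of the patient-level intercept nu_00k
sd_tooth = 0.10          # SD of the tooth-level intercept u_0jk
shape = 10.0             # gamma shape parameter

def simulate_null_model(n_patients, n_teeth, n_units):
    """Draw outcomes from log(mu_ijk) = xi_000 + nu_00k + u_0jk with a gamma likelihood."""
    data = []
    for k in range(n_patients):
        nu_00k = random.gauss(0.0, sd_patient)
        for j in range(n_teeth):
            u_0jk = random.gauss(0.0, sd_tooth)
            mu = math.exp(xi_000 + nu_00k + u_0jk)
            for i in range(n_units):
                # gammavariate(shape, scale) has mean shape * scale = mu.
                y = random.gammavariate(shape, mu / shape)
                data.append((k, j, i, y))
    return data

data = simulate_null_model(n_patients=5, n_teeth=4, n_units=6)
print(len(data), "simulated observations")
```

Observations that share a tooth or a patient share a random intercept, which is the source of the within-cluster correlation that the variance components capture.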

**Step 2.** The model with level-one variable, *bleeding*, is

$$\log\left(\mu\_{ijk}\right) = \xi\_{000} + \alpha\_1 \text{bleeding}\_{1ijk} + \left(\nu\_{00k} + u\_{0jk}\right) \tag{46}$$

The Bayesian estimates for this model showed that the bleeding coefficient was significant (columns 4 and 5 of **Table 2**). The comparison of Models 45 and 46 using LOO-CV indicates that the model including the level-one variable was better (penultimate row, column 5 of **Table 2**).
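The LOO-CV comparisons rest on differences in expected log predictive density (elpd). The sketch below shows how such a difference and its standard error are commonly computed from pointwise elpd contributions. The pointwise values are hypothetical, and the step that estimates them from the posterior draws is not shown.

```python
import math
import statistics

def elpd_difference(pointwise_a, pointwise_b):
    """Return (elpd_a - elpd_b, standard error) from pointwise elpd values."""
    diffs = [a - b for a, b in zip(pointwise_a, pointwise_b)]
    n = len(diffs)
    total = sum(diffs)
    se = math.sqrt(n) * statistics.stdev(diffs)
    return total, se

# Hypothetical pointwise elpd values for two models on the same four observations.
model_a = [-1.0, -1.2, -0.8, -1.1]
model_b = [-1.5, -1.3, -1.0, -1.6]

diff, se = elpd_difference(model_a, model_b)
print(f"elpd difference = {diff:.2f} (SE {se:.2f})")
```

A positive difference favors the first model, and the standard error indicates how much weight the difference deserves.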

**Step 3.** The model with level-two variable, *mobility*, is

$$\log\left(\mu\_{ijk}\right) = \xi\_{000} + \alpha\_1 \text{bleeding}\_{1ijk} + \xi\_{010} \text{mobility}\_{1jk} + \left(\nu\_{00k} + u\_{0jk}\right) \tag{47}$$


*Bayesian Multilevel Modeling in Dental Research DOI: http://dx.doi.org/10.5772/intechopen.108442*

*\*95% CrI: 95% credible interval. Comparisons: †Model 45 vs. 46 vs. 47. Model 47 vs. 48. #Model 48 vs.*

**Table 2.**

*Bayesian estimates of Models 45–50.*

The fixed effect of mobility was not significant (columns 6 and 7 of **Table 2**); however, it was retained in the model because it was the only level-two variable and because retaining it yields estimates of the effects of the level-three independent variables adjusted for the level-two variable. The LOO-CV criterion indicated that this model was slightly better (penultimate row, column 7 of **Table 2**).

There are seven level-three contextual variables (**Table 1**). Before specifying the model containing only the significant level-three variables, a forward selection of variables was performed to avoid an overparameterized model. **Table 3** shows the variable selection procedure, in which each model contains the level-one variable, *bleeding*, and the level-two variable, *mobility*. The LOO-CV model comparison indicates that the model including the variables *calculus* and *smoking* is the best model.
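The forward selection can be sketched as a greedy search. In the sketch, the `score` function stands in for a model-fit criterion where larger is better (for example an elpd estimate). The candidate names `x1` to `x7`, the toy gains, and the rule of stopping when no candidate improves the score are all assumptions made for illustration.

```python
def forward_selection(candidates, score):
    """Greedy forward selection: add the best-scoring variable until none improves the fit."""
    selected = []
    best_score = score(selected)
    remaining = list(candidates)
    while remaining:
        # Score every model that adds one more variable to the current set.
        trial_scores = {v: score(selected + [v]) for v in remaining}
        best_var = max(trial_scores, key=trial_scores.get)
        if trial_scores[best_var] <= best_score:
            break  # no candidate improves the criterion
        selected.append(best_var)
        remaining.remove(best_var)
        best_score = trial_scores[best_var]
    return selected, best_score

# Toy criterion: each hypothetical variable adds a fixed gain to the score.
toy_gains = {"x1": 4.0, "x2": 1.5, "x3": -0.5, "x4": -1.0,
             "x5": -0.2, "x6": -2.0, "x7": -0.1}

def toy_score(variables):
    return sum(toy_gains[v] for v in variables)

selected, best_score = forward_selection(toy_gains.keys(), toy_score)
print(selected, best_score)
```

With seven candidates, the first pass fits seven one-variable models, the second pass fits six two-variable models that share the first choice, and so on.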

**Step 4.** The model with level-three variables, *calculus* and *smoking*, is

$$\begin{split} \log \left( \mu\_{ijk} \right) &= \xi\_{000} + \alpha\_1 \text{bleeding}\_{1ijk} + \xi\_{010} \text{mobility}\_{1jk} \\ &+ \xi\_{001} \text{calculus}\_{1k} + \xi\_{002} \text{smoking}\_{2k} + \left( \nu\_{00k} + u\_{0jk} \right) \end{split} \tag{48}$$

Columns 8 and 9 of **Table 2** show that the variable *smoking* was not significant; however, the model containing *smoking* was better than the other candidate models. Model 48 was better than Model 47 (penultimate row, column 9 of **Table 2**).

**Step 5.** The model with a random slope for the variable bleeding.

Eq. (49) adds a random slope for the variable bleeding that varies at the patient and tooth levels; that is, the relationship between probing depth and bleeding is allowed to vary between patients and between teeth.

$$\begin{split} \log \left( \mu\_{ijk} \right) &= \xi\_{000} + \alpha\_{1} \text{bleeding}\_{1ijk} + \xi\_{010} \text{mobility}\_{1jk} \\ &+ \xi\_{001} \text{calculus}\_{1k} + \xi\_{002} \text{smoking}\_{2k} \\ &+ \nu\_{10k} \text{bleeding}\_{1ijk} + u\_{1jk} \text{bleeding}\_{1ijk} + \left( \nu\_{00k} + u\_{0jk} \right) \end{split} \tag{49}$$
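To see how the random slopes in Eq. (49) act, the sketch below evaluates the linear predictor for one observation. All coefficient and random-effect values are hypothetical. They are not estimates from **Table 2**.

```python
import math

def mean_probing_depth(bleeding, mobility, calculus, smoking,
                       fixed, nu_00k, u_0jk, nu_10k, u_1jk):
    """Evaluate mu_ijk from the log-link linear predictor of Eq. (49)."""
    eta = (fixed["xi_000"]
           + fixed["alpha_1"] * bleeding
           + fixed["xi_010"] * mobility
           + fixed["xi_001"] * calculus
           + fixed["xi_002"] * smoking
           + nu_10k * bleeding + u_1jk * bleeding   # random slopes of bleeding
           + nu_00k + u_0jk)                        # random intercepts
    return math.exp(eta)

# Hypothetical fixed effects and random effects for one patient and one tooth.
fixed = {"xi_000": 1.0, "alpha_1": 0.2, "xi_010": 0.05,
         "xi_001": 0.1, "xi_002": 0.03}
effects = dict(nu_00k=0.1, u_0jk=-0.05, nu_10k=0.04, u_1jk=-0.02)

mu_no_bleeding = mean_probing_depth(0, 1, 1, 0, fixed, **effects)
mu_bleeding = mean_probing_depth(1, 1, 1, 0, fixed, **effects)
print(f"ratio of means, bleeding vs. no bleeding: {mu_bleeding / mu_no_bleeding:.3f}")
```

Because of the log link, the ratio equals the exponential of the fixed slope plus the two random slopes, so the multiplicative effect of bleeding differs from patient to patient and from tooth to tooth.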

Columns 10 and 11 of **Table 2** show that the random slope of bleeding was significant at the patient and tooth levels. This model was again compared with Model 48 using the LOO-CV criterion, and the better model was Model 49, which contains the random slopes (penultimate row, column 11 of **Table 2**).

Finally, in the next model, interaction terms were added based on signs that occur in periodontal disease.

**Step 6.** The model with cross-level interactions is

$$\begin{split} \log \left( \mu\_{ijk} \right) &= \xi\_{000} + \alpha\_1 \text{bleeding}\_{1ijk} + \xi\_{010} \text{mobility}\_{1jk} \\ &+ \xi\_{001} \text{calculus}\_{1k} + \xi\_{002} \text{smoking}\_{2k} + \xi\_{101} \text{bleeding}\_{1ijk} \text{calculus}\_{1k} \\ &+ \nu\_{10k} \text{bleeding}\_{1ijk} + u\_{1jk} \text{bleeding}\_{1ijk} + \left( \nu\_{00k} + u\_{0jk} \right) \end{split} \tag{50}$$

Eq. (50) includes an interaction between the level-three variable calculus and the level-one variable bleeding. Columns 12 and 13 of **Table 2** show that the interaction was not significant (its credible interval contained zero). Finally, the model comparison indicated that the best model was Model 49, the bleeding random-slope model (last row, column 13 of **Table 2**). Model 49 is therefore the model that is interpreted.
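The significance rule used throughout, namely whether a 95% credible interval contains zero, can be sketched from posterior draws. The sketch assumes an equal-tailed interval and uses simulated draws in place of the actual posterior samples.

```python
import random
import statistics

def interval_95(draws):
    """Equal-tailed 95% interval: the 2.5th and 97.5th percentiles of the draws."""
    cuts = statistics.quantiles(draws, n=40)  # cut points every 2.5%
    return cuts[0], cuts[-1]

def excludes_zero(draws):
    lower, upper = interval_95(draws)
    return lower > 0 or upper < 0

random.seed(7)
# Hypothetical draws: one coefficient centered near zero, one clearly positive.
weak_effect = [random.gauss(0.02, 0.05) for _ in range(4000)]
clear_effect = [random.gauss(1.0, 0.1) for _ in range(4000)]

print("weak effect significant: ", excludes_zero(weak_effect))
print("clear effect significant:", excludes_zero(clear_effect))
```

A coefficient is labeled significant under this rule only when both interval limits fall on the same side of zero.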

**Figure 2** shows the posterior predictive fit of Model 49 to the data. The replicated data are plotted in a light color, and the observed data are plotted in black. As both


*\* All the models have the structure*

$$\log\left(\mu\_{ijk}\right) = \xi\_{000} + \alpha\_1 \text{bleeding}\_{1ijk} + \xi\_{010} \text{mobility}\_{1jk} + \text{var1} + \text{var2} + \text{var3} + \nu\_{00k} + u\_{0jk}$$

*where* var1 *is the independent variable that produces the best fit among the seven models with one independent variable. Similarly,* var2 *is the second independent variable, the one that produces the best fit among the six models with two independent variables that have* var1 *in common, and so on for* var3*.*
