Incorporating Model Uncertainty in Market Response Models with Multiple Endogenous Variables by Bayesian Model Averaging

*Jonathan Lee and Alex Lenkoski*

#### **Abstract**

We develop a method to incorporate model uncertainty by model averaging in generalized linear models subject to multiple endogeneity and instrumentation. Our approach builds on a Gibbs sampler for the instrumental variable framework that incorporates model uncertainty in both outcome and instrumentation stages. Direct evaluation of model probabilities is intractable in this setting. However, we show that by nesting model moves inside the Gibbs sampler, a model comparison can be performed via conditional Bayes factors, leading to straightforward calculations. This new Gibbs sampler is slightly more involved than the original algorithm and exhibits no evidence of mixing difficulties. We further show how the same principle may be employed to evaluate the validity of instrumentation choices. We conclude with an empirical marketing study: estimating opening box office by three endogenous regressors (prerelease advertising, opening screens, and production budget).

**Keywords:** multiple endogeneity, instrumental variables, Bayesian model averaging, conditional Bayes factors, box office forecasting

#### **1. Introduction**

Market response modeling focuses on estimating the effects of marketing activities on performance. However, marketing managers are often strategic in their use of marketing activities and adapt them in response to factors unobserved by the researcher [1–3]. Endogeneity arises, for example, when a firm's marketing strategies such as advertising spending, channel selection, and pricing are nonrandom and influenced by the firm- and industry-level factors [4–6]. Strategic management decisions are endogenous to their expected effects on market performance. Therefore, empirical market response models that seek to estimate the causal effect of multiple marketing instruments need to account for such strategic planning of marketing activities, or otherwise may suffer from an endogeneity problem, leading to biased estimates of the effects of the marketing activities on performance [1, 3, 4, 7]. Dealing with endogeneity has been extensively discussed in the marketing literature, especially concerning different forms of regression and panel models [1, 5, 8–10], choice models [11, 12], endogeneity correction based on a control function approach [13, 14], as well as structural equations models [4]. However, little research addresses incorporating model uncertainty related to endogeneity in generalized linear models.

We consider the problem of incorporating instruments and covariate uncertainty into the Bayesian estimation of an instrumental variable (IV) regression system. The concepts of model uncertainty and model averaging have received widespread attention in the economics literature for the standard linear regression framework [15–18] and in generalized linear models [19–22]. For a good introduction to Bayesian model averaging (BMA), see [23]. Primarily, these frameworks do not directly address the case of multiple endogenous variables, and only recently has attention been paid to model uncertainty involving multiple endogenous variables. Unfortunately, the nested nature of IV estimation renders direct model comparison difficult. In the economics literature, this has led to several different approaches [24, 25]. Durlauf et al. [25] consider approximations of marginal likelihoods in a framework similar to two-stage least squares. Lenkoski et al. [16] continue this development with the two-stage Bayesian model averaging (2SBMA) methodology, which uses a framework developed by Kleibergen and Zivot [26] to propose a two-stage extension of the unit information prior [27]. Similar approaches in closely related models have been developed by [15, 28].

Koop et al. [29] developed a fully Bayesian methodology that does not utilize approximations to integrated likelihoods. They present a reversible jump Markov chain Monte Carlo (RJMCMC) algorithm [30], which extends the methodology of Holmes et al. [31]. The authors then show that the method can handle a variety of priors, including those of [32, 33], and [34]. However, the authors note that the direct application of RJMCMC leads to significant mixing difficulties and relies on a complicated model move procedure similar to simulated tempering to escape local model modes. There is a more straightforward and relatively general model search procedure. Madigan and York [35] proposed the Markov Chain Monte Carlo Model Composition (MC3) in which one applies the same idea of a Metropolis-Hastings step for model jumps from RJMCMC but in a simplified fashion.

We propose an alternative solution to this problem: Instrumental Variable Bayesian Model Averaging (IVBMA). Our method builds on a Gibbs sampler for the IV framework, extended from that discussed in Rossi et al. [36]. While direct model comparisons are intractable, we introduce the notion of a conditional Bayes factor (CBF), first discussed by Dickey and Gunel [37] and employed in a seemingly unrelated regression context by [31]. The CBF compares two models in a nested hierarchical system, conditional on parameters not influenced by the models under consideration. We show that the CBF for both outcome and instrumental equations is exceedingly straightforward to calculate and essentially reduces to the normalizing constants of a multivariate normal distribution.

Further, we note that our method can handle generalized linear mixed models with multiple endogenous variables in a straightforward fashion. This leads to a procedure in which model moves are embedded in a Gibbs sampler, which we term MC3-within-Gibbs. Based on this order of operations, IVBMA is only trivially more complicated than a Gibbs sampler that does not incorporate model uncertainty and thus appears to have limited issues regarding mixing. This feature is essential as it shows more complicated scenarios involving endogeneity, instrumentation, and model uncertainty can be handled within this framework, an important feature when constructing more involved Bayesian hierarchical models.

*Incorporating Model Uncertainty in Market Response Models with Multiple Endogenous… DOI: http://dx.doi.org/10.5772/intechopen.108927*

When working with a large system of equations subject to endogeneity and instrumentation, there is a natural concern that the instrument assumptions may not hold. A host of frequentist-type hypotheses has been proposed to examine the instrument conditions; the most familiar to applied researchers is the test of Sargan [38]. There have been, to our knowledge, no similar checks of instrument validity proposed in the Bayesian IV literature outside of the approximate method advocated in [16]. We offer a new verification of instrument validity, also based on CBFs, which appears to be the Bayesian analog of the Sargan test. This method can integrate seamlessly with the IVBMA framework and offers a check of instrument validity.

The article proceeds as follows. Section 2 discusses the basic framework we consider and the Gibbs sampler that ignores model uncertainty. Section 3 reviews the concept of model uncertainty, introduces the notion of CBFs, and derives the conditional model probabilities used by IVBMA. In Section 4, we propose our method of assessing instrument validity. Section 5 presents empirical illustrations of the proposed model for predicting box office revenues. Lastly, we summarize and conclude with potential applications of the IVBMA approach.

#### **2. The instrumental variable model with multiple endogenous variables**

We consider the following classic linear system model with multiple endogenous variables:

$$Y\_{ir} = \mathbf{U}\_{i}^{(r)'} \boldsymbol{\beta}\_r + \varepsilon\_{ir},\tag{1}$$

where *r* ∈ {1, …, *R*} denotes the *R* equations in the system and *i* ∈ {1, …, *n*} a set of *iid* observations. Throughout, we assume that *Y*<sub>*i*1</sub> represents the dependent outcome of interest and (*Y*<sub>*i*2</sub>, …, *Y*<sub>*iR*</sub>) represent endogenously determining variables for observation *i*. Thus, each covariate vector *U*<sub>*i*</sub><sup>(*r*)</sup> has length *p*<sub>*r*</sub> and is formed such that

$$\mathbf{U}\_i^{(1)} = \left(Y\_{i2}, \dots, Y\_{iR}, W\_{i1}, \dots, W\_{iq}\right)',$$

while

$$\mathbf{U}\_i^{(r)} = \left( Z\_{i1}, \dots, Z\_{is}, W\_{i1}, \dots, W\_{iq} \right)',$$

for *r* > 1. Letting *ε*<sub>*i*</sub> = (*ε*<sub>*i*1</sub>, …, *ε*<sub>*iR*</sub>)′, we assume

$$
\boldsymbol{\varepsilon}\_i \sim \mathcal{N}\_R(\mathbf{0}, \mathbf{K}^{-1}).\tag{2}
$$

When *K*<sub>1*r*</sub> ≠ 0 for a given *r* > 1, this implies a lack of conditional independence between the residuals for the response and the associated endogenous variable. This contaminates inference if unaccounted for, necessitating the existence of instruments *Z*<sub>*i*</sub> that do not appear in *U*<sub>*i*</sub><sup>(1)</sup> and joint estimation of the parameters in Eq. (1) and Eq. (2).
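To make the contamination concrete, the following is a minimal simulation sketch for *R* = 2 with a single instrument and no additional covariates. The coefficient values, the residual covariance, and the classical single-instrument IV ratio used for comparison are illustrative assumptions; the ratio estimator is not the Bayesian procedure developed in this chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Assumed residual covariance K^{-1}; its inverse K has K[0, 1] != 0.
sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
K = np.linalg.inv(sigma)
eps = rng.multivariate_normal(np.zeros(2), sigma, size=n)

z = rng.standard_normal(n)      # one instrument (hypothetical)
y2 = 1.0 * z + eps[:, 1]        # instrument equation (r = 2)
y1 = 0.5 * y2 + eps[:, 0]       # outcome equation (r = 1), true effect 0.5

# Least squares on the outcome equation ignores the residual dependence.
ols = (y2 @ y1) / (y2 @ y2)
# Classical IV ratio estimator, shown only as a reference point.
iv = (z @ y1) / (z @ y2)
print(f"OLS slope: {ols:.3f}, IV slope: {iv:.3f}")
```

Under these assumed values, the least squares slope concentrates near 0.5 + 0.8/2 = 0.9 rather than 0.5, while the instrument-based estimate stays close to the true effect.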

Generalized linear mixed models provide a unified approach that directly acknowledges multiple levels of dependency and models different data types [39–42]. The developments above implicitly assume a continuous response with Gaussian errors. Extending these developments to alternative sampling models is straightforward in the context of a random-effects framework. Let *g* be a link function such that for the response *Y*<sub>*i*1</sub>,

$$E[Y\_{i1}] = g^{-1} \left( \mathbf{U}\_i^{(1)'} \boldsymbol{\beta}\_1 + \varepsilon\_{i1} \right),\tag{3}$$

while the remaining *Y*<sub>*ir*</sub> have forms given by Eq. (1), and the residual vector remains distributed according to a 𝒩(**0**, *K*<sup>−1</sup>) distribution. Below we first develop the normal IVBMA with an identity link.

We proceed by discussing the Bayesian estimation of these parameters under standard conjugate priors, following the developments of [36]. Accordingly, with each parameter vector, we assume

$$
\boldsymbol{\beta}\_r \sim \mathcal{N}(\mathbf{0}, \mathbb{I}\_{p\_r}),
$$

and

$$\mathbf{K} \sim \mathcal{W}(3, \mathbb{I}\_{R}),$$

where *K* ∼ 𝒲(*δ*, *D*) represents a Wishart distribution with density

$$\operatorname{pr}(\mathbf{K}|\delta,\mathbf{D}) \propto |\mathbf{K}|^{\frac{\delta-2}{2}} \exp\left(-\frac{1}{2}\operatorname{tr}(\mathbf{K}\mathbf{D})\right) \mathbf{1}\_{\mathbf{K} \in \mathbb{P}\_{R}},$$

where *ℙ*<sub>*R*</sub> is the cone of symmetric positive definite matrices.

Let *θ* = {*β*<sub>1</sub>, …, *β*<sub>*R*</sub>, *K*} represent the collection of parameters to be estimated. Denote the data 𝒟 = {*Y*, *U*<sup>(1)</sup>, …, *U*<sup>(*R*)</sup>}, where *Y* is the *n* × *R* matrix of responses and endogenous variables, and each *U*<sup>(*r*)</sup> is an *n* × *p*<sub>*r*</sub> matrix. Then, our goal is to determine the posterior distribution *pr*(*θ*|𝒟). Rossi et al. [36] discuss the estimation of this model for the case when *R* = 2 and note that it is not possible to evaluate this posterior directly. However, an approximate inference may be performed via Gibbs sampling.

Fix *r* and suppose that *K* and all *β*<sub>*t*</sub> for *t* ≠ *r* are given. Note, by standard properties of multivariate normal variates, that

$$
\varepsilon\_{ir}|\mathbf{K}, \{\boldsymbol{\beta}\_{t}\}\_{t \neq r} \sim \mathcal{N}(\mu\_{ir}, K\_{rr}^{-1}),
$$

where

$$\mu\_{ir} = -\sum\_{t \neq r} \frac{K\_{rt}}{K\_{rr}} \left( Y\_{it} - \mathbf{U}\_{i}^{(t)'} \boldsymbol{\beta}\_{t} \right).$$

Set *Ỹ*<sub>*ir*</sub> = *Y*<sub>*ir*</sub> − *μ*<sub>*ir*</sub> and thus note that

$$
\tilde{Y}\_{ir} \sim \mathcal{N}\left(\mathbf{U}\_i^{(r)'} \boldsymbol{\beta}\_r, K\_{rr}^{-1}\right).
$$

The act of conditioning, therefore, turns the original system into a simple linear regression problem, and via standard results, we have that

$$
\boldsymbol{\beta}\_r | \mathbf{K}, \{ \boldsymbol{\beta}\_t \} \_{t \neq r} \sim \mathcal{N} \left( \hat{\boldsymbol{\beta}}\_r, \boldsymbol{\Omega}\_r^{-1} \right) \tag{4}
$$

where

$$\boldsymbol{\Omega}\_r = K\_{rr} \mathbf{U}^{(r)'} \mathbf{U}^{(r)} + \mathbb{I}\_{p\_r},$$

$$\hat{\boldsymbol{\beta}}\_r = K\_{rr} \boldsymbol{\Omega}\_r^{-1} \mathbf{U}^{(r)'} \tilde{\mathbf{Y}}\_r.$$
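The conditional update in Eq. (4) can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: the function name `draw_beta_r`, the zero-based equation index, and the list-based storage of the design matrices are assumptions made for the example.

```python
import numpy as np

def draw_beta_r(r, Y, U, betas, K, rng):
    """One Gibbs draw of beta_r from Eq. (4), given K and the other beta_t.

    Y: (n, R) array; U: list of R design matrices, U[t] of shape (n, p_t);
    betas: list of R coefficient vectors; K: (R, R) precision matrix.
    Equations are indexed from 0 here (hypothetical convention).
    """
    R = Y.shape[1]
    # Residuals of every equation at the current coefficient values.
    resid = np.column_stack([Y[:, t] - U[t] @ betas[t] for t in range(R)])
    others = [t for t in range(R) if t != r]
    # Conditional mean mu_ir of eps_ir given the other residuals.
    mu = -(resid[:, others] @ K[r, others]) / K[r, r]
    y_tilde = Y[:, r] - mu
    # Posterior precision and mean under the N(0, I) prior on beta_r.
    omega = K[r, r] * U[r].T @ U[r] + np.eye(U[r].shape[1])
    beta_hat = K[r, r] * np.linalg.solve(omega, U[r].T @ y_tilde)
    # Sample N(beta_hat, omega^{-1}) using the Cholesky factor of omega.
    L = np.linalg.cholesky(omega)
    z = rng.standard_normal(len(beta_hat))
    return beta_hat + np.linalg.solve(L.T, z)
```

Solving against the Cholesky factor of the precision matrix avoids forming its inverse explicitly, which is a common design choice for draws of this kind.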

Finally, suppose that all *β*<sub>*r*</sub> are given, then

$$\mathbf{K} \sim \mathcal{W}(\delta + n, \mathbf{E} + \mathbb{I}\_R), \tag{5}$$

where

$$\mathbf{E} = \sum\_{i=1}^{n} \boldsymbol{\varepsilon}\_i \boldsymbol{\varepsilon}\_i',$$

with each *ε*<sub>*i*</sub> computed relative to the current state of *β*<sub>1</sub>, …, *β*<sub>*R*</sub>.
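The precision update in Eq. (5) can be sketched as follows. The Wishart density used in this chapter has exponent (*δ* − 2)/2, so matching it to the textbook Wishart density gives degrees of freedom *δ* + *n* + *R* − 1 and scale matrix (*E* + 𝕀<sub>*R*</sub>)<sup>−1</sup>. The function name `draw_precision` and the use of a Bartlett decomposition are assumptions for this sketch, not part of the original algorithm description.

```python
import numpy as np

def draw_precision(resid, delta, rng):
    """Draw K from W(delta + n, E + I_R) in the chapter's parameterization.

    resid: (n, R) array whose rows are the current residual vectors eps_i.
    """
    n, R = resid.shape
    D = resid.T @ resid + np.eye(R)          # E + I_R
    # Density |K|^{(d - 2)/2} exp(-tr(K D)/2) with d = delta + n corresponds
    # to a textbook Wishart with df = d + R - 1 and scale matrix D^{-1}.
    df = (delta + n) + R - 1
    L = np.linalg.cholesky(np.linalg.inv(D))
    # Bartlett decomposition: lower-triangular A with chi-distributed
    # diagonal entries and standard normal entries below the diagonal.
    A = np.tril(rng.standard_normal((R, R)), k=-1)
    A[np.diag_indices(R)] = np.sqrt(rng.chisquare(df - np.arange(R)))
    LA = L @ A
    return LA @ LA.T
```

Alternating such a draw with the coefficient updates of Eq. (4) for each equation gives one sweep of the Gibbs sampler described above.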

Eq. (4) and Eq. (5) thereby give the full conditionals necessary for the Gibbs sampler. For a basic introduction to MCMC sampling with illustration, see [43]. Our approach differs slightly from that of Rossi et al. [36], in that their Gibbs sampler features a more involved manner of updating the instrumental coefficients *β*<sub>2</sub>. Though the two approaches evaluate the same posterior distribution, the application of [36] when *R* ≥ 3 is not straightforward, and it applies only to a linear regression model. We therefore find that the above approach leads to a more coherent implementation and description, and prefer it to that of [36] for generalized linear models with multiple endogenous variables.

For a Poisson regression using a log link in Eq. (3), the term *ε*<sub>*i*1</sub> is no longer observable, and the resulting model is often referred to as a Poisson random effect model [41, 44, 45]. However, in a Gibbs sampling framework, these terms may be treated as additional parameters to be determined in the posterior. Appendix 2 shows how MCMC methods can be implemented when *Y*<sub>*i*1</sub> in Eq. (3) has a Poisson likelihood.

#### **3. Incorporating model uncertainty**

We outline our method for incorporating model uncertainty into the framework in Eq. (1) and Eq. (2). To explain the motivation behind our CBF approach, we first review a classic Bayesian model selection method. We then show how the concept of Bayes Factors can be usefully embedded in a Gibbs sampler yielding CBFs. These CBFs are then shown to yield straightforward calculations.

#### **3.1 Model selection and Bayes factors**

In a general framework, incorporating model uncertainty involves considering a collection of candidate models ℐ, using the data 𝒟. Each model *I* consists of a collection of probability distributions for the data 𝒟, {*pr*(𝒟|*ψ*), *ψ* ∈ Ψ<sub>*I*</sub>}, where Ψ<sub>*I*</sub> denotes the parameter space for the parameters of model *I* and is a subset of the full parameter space Ψ.

By letting the model become an additional parameter to be assessed in the posterior, we aim to calculate the posterior model probabilities given the data 𝒟. By Bayes' rule

$$pr(I|\mathcal{D}) = \frac{pr(\mathcal{D}|I)pr(I)}{\sum\_{I' \in \mathcal{I}} pr(\mathcal{D}|I')pr(I')} \tag{6}$$

where *pr*(*I*) denotes the prior probability for model *I* ∈ ℐ. The integrated likelihood *pr*(𝒟|*I*) is defined by

$$\operatorname{pr}(\mathcal{D}|I) = \int\_{\Psi\_{I}} \operatorname{pr}(\mathcal{D}|\psi) \operatorname{pr}(\psi|I) d\psi \tag{7}$$

where *pr*(*ψ*|*I*) is the prior for *ψ* under model *I*, which by definition has all its mass on Ψ<sub>*I*</sub>.
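When the integrated likelihoods in Eq. (7) are available, Eq. (6) is usually evaluated on the log scale to avoid numerical underflow. The following sketch assumes the log integrated likelihoods have already been computed; the function name and the default uniform model prior are illustrative choices.

```python
import math

def posterior_model_probs(log_marginals, log_priors=None):
    """Eq. (6) on the log scale, using log-sum-exp for numerical stability."""
    k = len(log_marginals)
    if log_priors is None:
        log_priors = [-math.log(k)] * k      # uniform prior over models
    scores = [lm + lp for lm, lp in zip(log_marginals, log_priors)]
    top = max(scores)
    # Subtracting the largest score keeps every exponent at or below zero.
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Two models whose integrated likelihoods differ by a factor of 3.
print(posterior_model_probs([-1000.0, -1000.0 + math.log(3.0)]))
```

With equal prior probabilities, a Bayes factor of 3 in favor of the second model translates into posterior probabilities of 0.25 and 0.75.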

One possibility for pairwise comparison of models is offered by the Bayes factor (BF), which is in most cases defined together with the posterior odds [22, 46]. The posterior odds of model *I* versus model *I*′ are given by

$$\frac{pr(I|\mathcal{D})}{pr(I'|\mathcal{D})} = \frac{pr(\mathcal{D}|I)}{pr(\mathcal{D}|I')} \frac{pr(I)}{pr(I')},$$

where

$$\frac{pr(\mathcal{D}|I)}{pr(\mathcal{D}|I')} \quad \text{and} \quad \frac{pr(I)}{pr(I')}$$

denote the Bayes factor and the prior odds of *I* versus *I*′, respectively.

When the integrated likelihood in Eq. (7) and thus the BF can be computed directly, a straightforward method for exploring the model space, Markov Chain Monte Carlo Model Composition (MC3), was developed by Madigan and York [35]. MC3 determines posterior model probabilities by generating a stochastic process that moves through the model space ℐ and has equilibrium distribution *pr*(*I*|𝒟). Given the current state *I*<sup>(*s*)</sup>, MC3 (a) proposes a new model *I*′ according to a proposal distribution *q*(·|·), (b) calculates

$$\alpha = \frac{pr(\mathcal{D}|I')pr(I')q\left(I^{(s)}|I'\right)}{pr(\mathcal{D}|I^{(s)})pr(I^{(s)})q\left(I'|I^{(s)}\right)},$$

and (c) sets *I*<sup>(*s*+1)</sup> = *I*′ with probability min{*α*, 1}, otherwise setting *I*<sup>(*s*+1)</sup> = *I*<sup>(*s*)</sup>. It is important to note that moving between models via the MC3 approach constitutes a valid MCMC transition. This feature is critical in the development below, in that MC3 moves may be nested inside larger structures in a manner similar to Gibbs updates.
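Steps (a) to (c) can be sketched for variable-selection problems as follows. The sketch assumes that a model is a set of included variable indices and that the proposal adds or drops one uniformly chosen variable, which makes *q* symmetric so that it cancels in *α*. The function names and the toy score in the usage example are hypothetical.

```python
import math
import random

def mc3(log_score, n_vars, n_iter, seed=0):
    """MC3 over subsets of n_vars candidate variables.

    log_score(model) must return log pr(D | model) + log pr(model), where
    model is a frozenset of included variable indices.
    Returns the visit frequency of each model.
    """
    rng = random.Random(seed)
    current = frozenset()
    current_score = log_score(current)
    visits = {}
    for _ in range(n_iter):
        # (a) Propose adding or dropping one variable chosen uniformly.
        j = rng.randrange(n_vars)
        proposal = current ^ {j}
        # (b) Log acceptance ratio; the symmetric q terms cancel.
        proposal_score = log_score(proposal)
        log_alpha = proposal_score - current_score
        # (c) Accept with probability min(alpha, 1).
        if rng.random() < math.exp(min(log_alpha, 0.0)):
            current, current_score = proposal, proposal_score
        visits[current] = visits.get(current, 0) + 1
    return {model: count / n_iter for model, count in visits.items()}

# Toy score: each included variable multiplies the evidence by 3.
freq = mc3(lambda m: len(m) * math.log(3.0), n_vars=2, n_iter=40000, seed=1)
```

In the toy example each variable has posterior inclusion odds of 3, so the full model should be visited in roughly 0.75 × 0.75 ≈ 56% of the iterations.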

#### **3.2 Model determination**

Incorporating model uncertainty into the system Eq. (1) involves considering a separate model space ℳ<sub>*r*</sub> for each equation in the system. A given model *M*<sub>*r*</sub> ∈ ℳ<sub>*r*</sub> thus restricts certain elements of *β*<sub>*r*</sub> to zero, and we write *β*<sub>*M*<sub>*r*</sub></sub> to indicate the non-zero elements according to *M*<sub>*r*</sub>. We further let *Λ*<sub>*M*<sub>*r*</sub></sub> be the subspace of ℝ<sup>*p*<sub>*r*</sub></sup> spanned by *β*<sub>*M*<sub>*r*</sub></sub>. Ideally, we would be able to incorporate model uncertainty into this system in a manner analogous to that described above. Unfortunately, the following cannot be directly calculated in any discernible way.


$$pr(\mathcal{D}|M\_1, \dots, M\_R) = \int\_{\mathbb{P}\_R} \int\_{\Lambda\_{M\_1}} \cdots \int\_{\Lambda\_{M\_R}} pr\left(\mathcal{D}|\left\{\boldsymbol{\beta}\_{M\_r}\right\}\_{r=1}^R, \mathbf{K}\right) pr(\mathbf{K}) \prod\_{r=1}^R pr(\boldsymbol{\beta}\_{M\_r}) d\boldsymbol{\beta}\_{M\_1} \cdots d\boldsymbol{\beta}\_{M\_R} d\mathbf{K}$$

Therefore, an implementation of MC3 in the product space ℳ<sub>1</sub> × ⋯ × ℳ<sub>*R*</sub> is infeasible. What we show below, however, is that embedding MC3 within the Gibbs sampler, and thereby using CBFs to move between models, offers an extremely efficient solution. CBFs were initially discussed in Dickey and Gunel [37] in a different context.

Given the system Eq. (1), fix *r* and suppose that *θ*<sub>−*r*</sub> = {*K*, {*β*<sub>*t*</sub>}<sub>*t*≠*r*</sub>} is given. Now consider comparing two models *M*<sub>*r*</sub>, *L*<sub>*r*</sub> ∈ ℳ<sub>*r*</sub>. Finally, suppose that the prior over models in ℳ<sub>*r*</sub> is set independent of *θ*<sub>−*r*</sub>. We then have

$$\frac{pr(M\_r|\mathcal{D}, \theta\_{-r})}{pr(L\_r|\mathcal{D}, \theta\_{-r})} = \frac{pr(\mathcal{D}|M\_r, \theta\_{-r})}{pr(\mathcal{D}|L\_r, \theta\_{-r})} \times \frac{pr(M\_r)}{pr(L\_r)}.\tag{8}$$

Thus, the conditional posterior odds depend on calculating a Bayes factor conditional on the current state of *θ*<sub>−*r*</sub>.

Calculating the relevant terms in Eq. (8) is straightforward. In particular, we note that

$$pr(\mathcal{D}|M\_r, \theta\_{-r}) = \int\_{\Lambda\_{M\_r}} pr(\mathcal{D}|\boldsymbol{\beta}\_{M\_r}, \theta\_{-r}) pr(\boldsymbol{\beta}\_{M\_r}|M\_r) d\boldsymbol{\beta}\_{M\_r},$$

which is, in essence, an integrated likelihood for model *M*<sub>*r*</sub> conditional on fixed values of *θ*<sub>−*r*</sub>. In Appendix 1, we show that

$$\int\_{\Lambda\_{M\_r}} pr(\mathcal{D}|\boldsymbol{\beta}\_{M\_r}, \theta\_{-r}) pr(\boldsymbol{\beta}\_{M\_r}|M\_r) d\boldsymbol{\beta}\_{M\_r} \propto |\boldsymbol{\Omega}\_{M\_r}|^{-1/2} \exp\left(\frac{1}{2} \hat{\boldsymbol{\beta}}\_{M\_r}' \boldsymbol{\Omega}\_{M\_r} \hat{\boldsymbol{\beta}}\_{M\_r}\right) \tag{9}$$

where *β̂*<sub>*M*<sub>*r*</sub></sub> and *Ω*<sub>*M*<sub>*r*</sub></sub> are defined as in Eq. (4), relative to the subspace *Λ*<sub>*M*<sub>*r*</sub></sub>.

The power of this result is that the model *M*<sub>*r*</sub> and the associated parameter *β*<sub>*M*<sub>*r*</sub></sub> may then be updated in a block. In particular, we note that

$$pr(\boldsymbol{\beta}\_r, M\_r | \theta\_{-r}, \mathcal{D}) = pr(\boldsymbol{\beta}\_r | M\_r, \theta\_{-r}, \mathcal{D}) \times pr(M\_r | \theta\_{-r}, \mathcal{D}).$$

Since MC3 constitutes a valid MCMC transition in the model space ℳ<sub>*r*</sub>, we may first attempt to update *M*<sub>*r*</sub> via Eq. (8) and then subsequently resample *β*<sub>*M*<sub>*r*</sub></sub> via Eq. (4). By cycling through all *R* equations in Eq. (1) in this manner, and then subsequently updating *K*, we have proposed a computationally efficient estimation strategy for incorporating model uncertainty in IV frameworks.
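A single model move for equation *r* can be sketched as below, using the logarithm of the right-hand side of Eq. (9) as the conditional integrated likelihood. The sketch assumes a uniform prior over models and a symmetric add-or-drop proposal, so the acceptance probability reduces to min{CBF, 1}; the function names and argument layout are hypothetical.

```python
import numpy as np

def log_cond_marginal(cols, U, y_tilde, k_rr):
    """Log of the right-hand side of Eq. (9) for the model using columns cols."""
    if len(cols) == 0:
        return 0.0                       # empty model: determinant 1, no beta
    X = U[:, cols]
    omega = k_rr * X.T @ X + np.eye(len(cols))
    b = k_rr * X.T @ y_tilde             # equals omega @ beta_hat
    _, logdet = np.linalg.slogdet(omega)
    # beta_hat' omega beta_hat = b' omega^{-1} b
    return -0.5 * logdet + 0.5 * b @ np.linalg.solve(omega, b)

def update_model(cols, U, y_tilde, k_rr, rng):
    """One MC3 move for equation r, conditional on theta_{-r}."""
    j = int(rng.integers(U.shape[1]))
    proposal = sorted(set(cols) ^ {j})   # add or drop column j
    log_cbf = (log_cond_marginal(proposal, U, y_tilde, k_rr)
               - log_cond_marginal(cols, U, y_tilde, k_rr))
    # Uniform model prior and symmetric proposal: accept with min(CBF, 1).
    if rng.uniform() < np.exp(min(log_cbf, 0.0)):
        return proposal
    return list(cols)
```

After such a move, the coefficients of the selected columns would be resampled from Eq. (4) restricted to those columns, with the excluded coefficients set to zero.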

#### **4. Assessing instrument validity**

For the estimates *β*<sub>1</sub> to have appropriate inferential properties, it is critical that the instrumental variables *Z* be valid. In other words, *E*[*Z*<sub>*i*</sub>′*ε*<sub>*i*1</sub> | *ε*<sub>*i*2</sub>, …, *ε*<sub>*iR*</sub>] = **0**. Many tools exist for evaluating the validity of this assumption in frequentist settings, and the most popular method is the test of Sargan [38]. To our knowledge, consideration of similar assessments in a Bayesian framework has not been explored beyond the approximate analysis proposed in [16]. We offer a Bayesian evaluation of instrument validity, borrowing many of the ideas above and merging them with the idea of the Sargan test.

Suppose that all residuals were known. Let *ς* be such that

$$
\varsigma\_i = \varepsilon\_{i1} + \sum\_{r=2}^{R} \frac{K\_{1r}}{K\_{11}} \varepsilon\_{ir}.
$$
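Given the residuals and the precision matrix from one MCMC draw, this quantity is a one-line computation. The sketch below assumes the residuals are stored as an *n* × *R* array with the outcome equation in the first column; the function name is hypothetical.

```python
import numpy as np

def composite_residual(resid, K):
    """varsigma_i = eps_i1 + sum_{r >= 2} (K_1r / K_11) eps_ir for every i.

    resid: (n, R) residual array with the outcome equation in column 0;
    K: (R, R) precision matrix.
    """
    return resid[:, 0] + resid[:, 1:] @ K[0, 1:] / K[0, 0]

resid = np.array([[1.0, 2.0], [0.0, -2.0]])
K = np.array([[2.0, 1.0], [1.0, 2.0]])
print(composite_residual(resid, K))
```

Comparing with the conditional mean *μ*<sub>*i*1</sub> of Section 2, this is the outcome residual with its conditional mean given the other residuals removed.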

The essential notion of the Sargan test is to consider the model,

$$\varsigma\_{i} = \mathbf{Z}'\_{i}\boldsymbol{\xi} + \eta\_{i}, \quad \eta\_{i} \sim \mathcal{N}(0, \tau^{-1}),$$

and test whether *ξ* ≠ **0**. The mechanics of the Sargan test ultimately rely on asymptotic theory, and Lenkoski et al. [16] discuss its poor performance in low sample size environments.

Our approach is to model this in a Bayesian context. In particular, we consider two models: *J*<sub>0</sub>, which states that *ξ* = **0**, and *J*<sub>1</sub>, which puts *ξ* ∈ ℝ<sup>*q*</sup>. We then aim to determine whether *pr*(*J*<sub>0</sub>|𝒟) is large, indicating instrument validity. Note that this can be represented as the following marginalization

$$pr(J\_0|\mathcal{D}) = \int pr(J\_0|\boldsymbol{\varsigma}, \mathcal{D}) pr(\boldsymbol{\varsigma}|\mathcal{D}) d\boldsymbol{\varsigma}.\tag{10}$$

Let {*θ*<sup>(1)</sup>, …, *θ*<sup>(*S*)</sup>} be an MCMC sample from *pr*(*θ*|𝒟) and {*ς*<sup>(1)</sup>, …, *ς*<sup>(*S*)</sup>} be the associated realization from each MCMC draw. This sample then enables us to approximate Eq. (10) with

$$\int pr(J\_0|\boldsymbol{\varsigma}, \mathcal{D}) pr(\boldsymbol{\varsigma}|\mathcal{D}) d\boldsymbol{\varsigma} \approx \frac{1}{S} \sum\_{s=1}^{S} pr\left(J\_0|\boldsymbol{\varsigma}^{(s)}, \mathcal{D}\right).$$

Note that

$$pr\left(J\_0|\boldsymbol{\varsigma}^{(s)}, \mathcal{D}\right) = \frac{1}{1 + \frac{pr(J\_1|\boldsymbol{\varsigma}^{(s)}, \mathcal{D})}{pr(J\_0|\boldsymbol{\varsigma}^{(s)}, \mathcal{D})}}$$

and therefore, we have reduced the problem of assessing *pr*(*J*<sub>0</sub>|𝒟) to evaluating several CBFs. At this juncture, note that

$$pr\left(J\_0|\boldsymbol{\varsigma}^{(s)},\mathcal{D}\right) \propto pr\left(\boldsymbol{\varsigma}^{(s)}|J\_0,\mathcal{D}\right) \cdot pr(J\_0) = \int\_0^{\infty} pr\left(\boldsymbol{\varsigma}^{(s)}|\tau,\mathcal{D}\right) pr(\tau)d\tau \cdot pr(J\_0),$$

while

$$pr\left(J\_1|\boldsymbol{\varsigma}^{(s)},\mathcal{D}\right) \propto pr\left(\boldsymbol{\varsigma}^{(s)}|J\_1,\mathcal{D}\right) \cdot pr(J\_1) = \int\_0^\infty \int\_{\mathbb{R}^q} pr\left(\boldsymbol{\varsigma}^{(s)}|\tau,\boldsymbol{\xi},\mathcal{D}\right) pr(\boldsymbol{\xi},\tau)d\boldsymbol{\xi}d\tau \cdot pr(J\_1).$$

Evaluation of these integrals thus requires the specification of priors *pr*(*τ*) under *J*<sub>0</sub> and *pr*(*ξ*, *τ*) under *J*<sub>1</sub>. Under the model *J*<sub>0</sub>, we propose the standard prior

$$
\tau \sim \Gamma(1/2, 1/2)
$$


which yields

$$pr\left(J\_0|\boldsymbol{\varsigma}^{(s)}, \mathcal{D}\right) \propto \left(\frac{1}{2} + \frac{\boldsymbol{\varsigma}^{(s)'}\boldsymbol{\varsigma}^{(s)}}{2}\right)^{-(n+1)/2}.\tag{11}$$

For *J*<sub>1</sub>, we use the prior

$$
\tau \sim \Gamma(1/2, 1/2)
$$

$$
\boldsymbol{\xi}|\tau \sim \mathcal{N}(\mathbf{0}, \tau^{-1} \mathbb{I}\_q),
$$

which yields

$$pr\left(J\_1|\boldsymbol{\varsigma}^{(s)},\mathcal{D}\right) \propto |\boldsymbol{\Xi}|^{-\frac{1}{2}} \left(\frac{1}{2} + \frac{\left(\boldsymbol{\varsigma}^{(s)} - \mathbf{Z}\hat{\boldsymbol{\xi}}^{(s)}\right)' \left(\boldsymbol{\varsigma}^{(s)} - \mathbf{Z}\hat{\boldsymbol{\xi}}^{(s)}\right)}{2}\right)^{-\frac{n+1}{2}},\tag{12}$$

where

$$
\boldsymbol{\Xi} = \tau (\mathbf{Z}^\prime \mathbf{Z} + \mathbb{I}\_q),
$$

$$
\hat{\boldsymbol{\xi}}^{(s)} = \tau \boldsymbol{\Xi}^{-1} \mathbf{Z}' \boldsymbol{\varsigma}^{(s)}.
$$

This approach offers performance similar to that of the Sargan test, with the desirable features that it is fully Bayesian, as opposed to the approximate test of [16], and that it can be directly embedded in the Gibbs sampling procedures outlined above. We emphasize in the discussion section that further work can be done on this diagnostic.
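The Monte Carlo approximation of Eq. (10) can be sketched as follows. The sketch takes the per-draw log marginal likelihoods of *ς*<sup>(*s*)</sup> under *J*<sub>0</sub> and *J*<sub>1</sub> as given inputs, computed on a common scale so that their difference is the log CBF, and leaves their evaluation to the formulas above. The function name and the default prior probability of 0.5 for *J*<sub>0</sub> are assumptions.

```python
import math

def prob_instruments_valid(log_m0, log_m1, prior_j0=0.5):
    """Average pr(J0 | varsigma^(s), D) over MCMC draws.

    log_m0[s], log_m1[s]: log marginal likelihoods of varsigma^(s)
    under J0 and J1 for draw s, up to a shared additive constant.
    """
    log_prior_odds = math.log(1.0 - prior_j0) - math.log(prior_j0)
    total = 0.0
    for a, b in zip(log_m0, log_m1):
        log_odds_j1 = (b - a) + log_prior_odds   # log posterior odds, J1 vs J0
        # 1 / (1 + exp(log_odds_j1)), written to avoid overflow.
        if log_odds_j1 > 0:
            e = math.exp(-log_odds_j1)
            total += e / (1.0 + e)
        else:
            total += 1.0 / (1.0 + math.exp(log_odds_j1))
    return total / len(log_m0)

# Two draws: one with CBF 1 and one with CBF 3 in favor of J1.
print(prob_instruments_valid([0.0, 0.0], [0.0, math.log(3.0)]))
```

In the usage example the per-draw probabilities of *J*<sub>0</sub> are 0.5 and 0.25, so the averaged estimate is 0.375.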

#### **5. Empirical study: determinants of opening box office**

In this section, we consider a generalized linear model with an identity link in the presence of multiple endogenous variables and covariates based on the IVBMA framework incorporating model uncertainty. Based on previous studies of box office revenues, we estimate the effects of three endogenous predictors, prelaunch advertising spending, the number of screens, and production budget with other covariates on opening box office.

Several studies have established a significant link between advertising expenditures and box-office grosses [47–50]. Almost 90% of a movie's advertising budget is allocated in the weeks leading up to the theatrical launch [49], which shows the importance of prerelease advertising. The number of screens on which a movie is released has been recognized as one of the most significant factors related to the box office [51–53]. Prerelease advertising spending and the number of opening screens need to be considered endogenous because it is plausible for movies that are expected to generate high box office gross to receive more advertising and distribution. That is, advertising spending and distribution are more likely to be determined by expected box office revenues.

Major studios dominate the movie marketplace regarding film production and distribution. The production budget is an essential predictor because big budgets translate into the casting of top actors and directors, lavish sets and costumes, special effects, and expensive digital manipulations, leading to heightened audience attractiveness [54, 55]. Previous studies [55–57] used production budget as a direct influencer or moderating variable, but it is also the studio's strategic decision using knowledge about viewers and competitors' actions; that is, the data reflect the firm's strategic behavior [58]. While researchers examined endogeneity in advertising responsiveness using a control function approach [14] or price endogeneity using Gaussian copula [9], they did not simultaneously control for multiple endogenous variables or incorporate model uncertainty. The proposed approach can test the effects of three endogenous variables in a generalized framework.

#### **5.1 Description of the data**

Starting from all movies released by major studios from 2006 to 2007, we analyzed 130 movies, including 16 animation and 50 R-rated movies, based on the IMDb database. We excluded films without complete prerelease advertising information from TNS Media Intelligence. Advertising data include the total dollar value of prerelease media expenditure across 17 different media. The number of opening screens, production budget, and opening box office gross are obtained from IMDb.com and BoxOfficeMojo.com. **Table 1** shows the summary statistics of the dependent variable and three endogenous variables. Opening box office gross varies from less than a million to over 100 million dollars. The production budget represents the most significant expense for movie studios [49]. For movies in our sample, production budgets are about \$52 million on average and vary from \$4 million to \$210 million. It becomes crucial for films with high production costs to succeed at the box office to recover their costs, resulting in higher advertising spending and showing at more theaters.

The three endogenous predictors were regressed on eleven potential instruments and thirteen additional covariates, summarized in **Table 2**. Covariates such as genre, MPAA rating, animation, sequels, and release date are publicly available on IMDb and The Numbers. Genre is classified into seven categories (action, comedy, drama, horror, Sci-Fi, mystery/suspense, and romance), and the MPAA rating is coded with two dummy variables for R and PG-13, with other ratings grouped together.

The MPAA rating is related to the potential audience size. Movies that are not R-rated are open to more moviegoers from the outset, making wider releases and intensive advertising necessary. Critics' ratings are obtained from the Rotten Tomatoes website, which gives a composite score of 1–100 based on evaluations from movie critics. A monthly seasonality index was obtained by estimating a decomposition model using a time series of monthly box office gross. The seasonal parameter was optimized at 0.56 with a mean absolute percentage error of 10.5%.


#### **Table 1.**

*Summary statistics.*


#### **Table 2.**

*Description of the instruments (Z) and covariates (W).*

For the two endogenous variables, prerelease advertising and opening number of screens, we have used four common instruments of the 11 variables: (a) movie distributors, (b) release time, (c) average marketability ratings by three industry experts in one of the major studios, and (d) whether the same studio did production and distribution. Studios have considerable discretion over the amount and schedule of prelaunch advertising they allocate to each movie [51]. Because advertising elasticities for motion pictures are significantly higher compared to other industries [52], studios' decisions on prerelease advertising spending and opening screens would have a significant impact on the success at the box office. We have included eight major studios to examine any studio-specific effects on advertising and distribution. Release time is another critical characteristic since movie advertising is seasonal, as heavily supported movies are usually released in peak seasons [51]. Based on the monthly box office gross from 2001 to 2010, we have found a substantial increase in box office gross in May–July and December. A dummy variable is used to indicate those months. For the third endogenous variable, production budget, we exclude release time and expert ratings since they are unavailable at the time of the budget decision. Similarly, the seasonal index and critics' reviews were also excluded from the regression of the production budget. Some major studios like 20th Century Fox and Paramount are vertically integrated, having their own distribution divisions. A dummy variable *Direct* indicates whether both production and distribution divisions finance a movie. For the common instruments on each endogenous regressor, the proposed IVBMA approach has a built-in capability of variable selection using the posterior inclusion probability.

#### **5.2 Results**

**Table 3** shows the IVBMA posterior estimates of the first stage. The inclusion probability of a variable is the sum of the posterior probabilities of all models that contain it [16, 23]. In **Table 3**, column *IncProb* shows posterior inclusion probabilities in the first stage, which provide a direct interpretation of the efficacy of an instrumentation strategy. For prerelease advertising spending, we find a robust movie-type effect for animation, sequels, and PG-13. Animated family films have performed consistently well at the box office, and Pixar and DreamWorks Animation are the most represented studios. Movie sequels build on the original movies' commercial success and can be considered a brand extension of the experiential product [59]. Given the original movie's brand power, a sequel usually achieves box office success [60]. The negative coefficient of *Sequels* results from relatively low advertising costs, which is one of the benefits of brand extensions [61]. The posterior inclusion probabilities of *Animation* and *Sequels* are 0.9, which points to generous production budgets for those movies. The marketability ratings by industry experts are significant predictors of prerelease advertising and opening screens. Because the ratings are based on feedback from advance movie screenings, they are reliable indicators of box office performance accompanied by heavy advertising and broader release.
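As an illustration of the definition above, the following minimal sketch estimates inclusion probabilities from a list of sampled models. It assumes the sampler visits models in proportion to their posterior probability, so that the share of draws containing a variable approximates its inclusion probability. The function name, the variable labels, and the four draws are hypothetical and are not taken from the study's output.

```python
from collections import Counter


def inclusion_probabilities(model_draws, variables):
    """Return, for each variable, the share of sampled models that contain it."""
    n_draws = len(model_draws)
    # Count each variable at most once per sampled model.
    counts = Counter(v for model in model_draws for v in set(model))
    return {v: counts[v] / n_draws for v in variables}


# Hypothetical sampler output: each draw is the set of variables in the visited model.
draws = [
    {"Sequels", "Animation"},
    {"Sequels"},
    {"Sequels", "Animation", "PG13"},
    {"Animation"},
]
probs = inclusion_probabilities(draws, ["Sequels", "Animation", "PG13", "Direct"])
print(probs)
```

A variable that never appears in a sampled model, such as the hypothetical `Direct` here, receives an estimated inclusion probability of zero.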

As expected, the seasonal index shows a high inclusion probability for both endogenous variables, which aligns with the common belief that movies with high expected gross are carefully scheduled for release in peak seasons. *Release time*, however, shows no impact, mainly because more than 65% of the movies in the sample were released in historically non-peak months. Note that the seasonal index is calculated for the period under investigation (2006–2007), while *Release time* is based on a 10-year window. Therefore, the seasonal index captures short-term fluctuations more accurately.

For prerelease advertising, the PG-13 rating is included with probability one. It concerns the size of the potential audience, since non-R ratings imply greater reach among moviegoers, which may result in a higher level of advertising. More than one systematic investigation shows that R-rated movies generate smaller revenues than those with less restrictive ratings [47, 62]. We also find that the dummy variable GD5 for horror films is a significant predictor of prerelease advertising. This result may reflect the popular trend at that time. There are 15 horror movies in the sample, including *I Am Legend*, *Silent Hill*, and *Saw III*, which were very successful at the box office. Consistent with previous literature on critics' reviews [49, 63], we find a significant impact of reviews on movie advertising. The industry practice of using critics' quotations in film advertisements supports the continuing authority of film critics. The use of critics' reviews in movie advertisements indicates distributors' beliefs and the significance of critics as cultural intermediaries for audiences [64].

In contrast, critical reviews were not included in explaining opening screens. This is consistent with the findings that the relationship between reviews and distributors' decisions is spurious [65], and that exhibitors show only a positivity bias, such that an excellent review allows a movie to stay longer on screen while negative reviews do not shorten a film's run [66]. That is, critical reviews do not influence an exhibitor's decision to keep or withdraw a movie from a theater.

As shown in **Table 3**, regarding production budget, distributor effects are evident from the high inclusion probabilities of the studios, in addition to movie characteristics such as *Sequels* and *Animation*. Though 20th Century Fox and Columbia have released more movies than other studios (37 in the sample), Paramount, Universal, and Warner Brothers had a higher average production budget per movie among major studios, and Lions Gate was the leading independent producer/distributor from 2006 to 2007. A PG-13 rating combined with the Action/Adventure genre consistently performs better than others at the box office by broadening audience appeal [47]. Interestingly, the instrument *Direct* has a high inclusion probability only for the production budget. Deals struck between distributors and exhibitors that are separately owned differ from those of vertically integrated studios, which are keen to get more movies through their theaters at all times because this maximizes returns from ticket sales and ancillary items such as food and drink. When a movie's audience starts to fall, an exhibitor will prefer to end its run and show a new movie that will boost attendance figures again. Exhibitors favor signing short-run contracts for movies, but signing can be avoided if the same studio controls production, distribution, and exhibition [67].

**Table 3.** *IVBMA results (first stage).*

**Table 4** shows the IVBMA posterior estimates of the second-stage regression. As discussed in Section 4, we have tested instrument validity with a Bayesian approach, in which the validity score represents the probability that the instrument condition is not satisfied. The validity scores of all instruments used in the study are essentially zero, which strongly supports the validity of the instrumentation choices. In the second stage, several variables are essential predictors of opening box office revenues. As expected, the number of opening screens and prerelease advertising are significant determinants of opening box office gross with high inclusion probabilities. Though it is difficult to disentangle the causal effect of advertising on sales using data on actual box office receipts, this is consistent with previous findings that prerelease advertising has a positive and statistically significant impact on public awareness of a movie and its box office performance [47, 49, 50, 68]. While Elberse and Eliashberg [52] argue that movie attributes and advertising expenditures mostly influence revenues indirectly through their impact on exhibitors' screen allocations, this result supports a significant direct effect of advertising. The number of opening screens is the most important predictor, with an inclusion probability of one, which is also consistent with previous findings [53, 69, 70]. It seems that the more screens on which a new movie is released, the bigger its initial audience, and the higher the audience in the opening weekend, the higher its audience the following week. While audiences inevitably drop off over time, a movie's cinema run is longer if it gets off to a good start. Considering the typically high correlation between opening screens and prerelease advertising, studios' advertising and distribution approaches may be very similar. Other than these two factors, *Sequels* and *Drama* show high inclusion probabilities, which may only reflect the characteristics of successful movies in the sample. Though we initially expected a significant effect of seasonality, it turns out to have only a weak influence, although it remains relevant. Production budget has a low inclusion probability, which suggests that a movie's production cost is an indicator of the creative talent involved or the extent to which the movie incorporates expensive special effects or uses elaborate set designs [49], but not a good indicator of success. Of about 90 films released in the United States from 2008 to 2012 with budgets of more than \$100 million, most failed to generate enough box office revenue to cover their costs [71]. After all, big budgets do not guarantee success, and the only way to know how audiences react to a movie is to wait until it has been released and moviegoers have had the opportunity to see it.

**Table 4.** *IVBMA results (second stage).*

#### **6. Conclusion**

Market response models often involve endogenous regressors since marketing activities are nonrandom and reflect the firm's strategic behavior. Thus, ignoring the endogeneity of marketing actions will lead to incorrect estimates of response parameters and, consequently, to biased inferences [4, 58]. While researchers have developed various approaches to dealing with endogeneity, including the control function approach, Gaussian copulas, and instrument-free approaches, the IV approach remains the technique of choice in econometrics and other areas of applied research. Almost invariably, empirical work in economics and marketing is subject to considerable uncertainty about model specification. This may be the consequence of competing theories, of different ways in which theories can be implemented in empirical models, or of other aspects such as assumptions about heterogeneity or independence of the observables [72]. It is important to realize that this uncertainty is an inherent part of market response modeling.

We have proposed a computationally efficient solution to the problem of incorporating model uncertainty into IV estimation. The IVBMA method leverages an existing Gibbs sampler and shows that by nesting model moves inside this framework, model averaging can be performed with minimal additional effort. In contrast to the approximate solution proposed by [16], our method yields a theoretically justified, fully Bayesian procedure. The applied examples show this method's benefit, by enabling additional factors to be entertained by the researcher, which are either incorporated where appropriate or promptly dropped.

The CBF approach is only one way of incorporating model uncertainty in the framework considered. Two other options would be reversible jump schemes [29, 30] or spike and slab priors [73]. We have chosen our approach because it fits nicely into the Gibbs sampling framework, unlike the reversible jump procedure of Koop et al. [29], and still explicitly incorporates uncertainty at the model level, unlike spike and slab type priors, which operate at the variable level. However, additional research is needed to explore the tradeoffs between these alternative methods of incorporating model uncertainty.

One assumption crucial to the Gibbs sampler's functioning is the multivariate normality of the residuals in Eq. (2). Conley et al. [74] discuss a Bayesian approach that allows nonparametric estimation of the distribution of error terms in a set of simultaneous equations using a Dirichlet process mixture (DPM). We note that the IVBMA methodology can readily incorporate the DPM framework by simply replacing the IV kernel distributions of [36] with IVBMA kernel distributions. A nonparametric IVBMA approach based on non-normal errors is one of the planned model extensions. Another critical issue in implementing IV methods is assessing the validity of the instruments. The Bayesian version of the Sargan test that we have proposed serves as a natural starting point for more involved methodologies, including latent factors, though many features still need to be investigated on this front compared to other strategies.

IVBMA has the potential to be extended to more complicated likelihood frameworks. The proposed model can be extended to latent constructs in the context of structural equation modeling with latent Gaussian factors while, at the same time, selecting a suitable path model [75]. Survival analysis is another area that can benefit from the IVBMA approach in dealing with multiple endogenous regressors and implementing more flexible hazard specifications beyond the proportional hazard model [76]. Since the entire method uses a Gibbs framework, it can be incorporated in any setting where endogeneity, model uncertainty, and latent normality are present. In particular, the linear specification can be relaxed using semiparametric methods such as splines or more flexible approaches involving Gaussian processes. While the algorithms involved would understandably become more complex, the central concept of using CBFs to assess model uncertainty would remain pertinent.

#### **Appendix A: determining the CBF calculations**

Here we outline the calculation of $pr(\mathcal{D}|M\_r, \boldsymbol{\beta}\_{-r}, \mathbf{K})$. Note that

$$pr(\mathcal{D}|M\_r, \boldsymbol{\beta}\_{-r}, \mathbf{K}) = \int\_{\Lambda\_{M\_r}} pr(\mathcal{D}|\boldsymbol{\beta}\_r, \boldsymbol{\beta}\_{-r}, \mathbf{K})\, pr(\boldsymbol{\beta}\_r|M\_r)\, d\boldsymbol{\beta}\_r.$$

Let $\mathbf{U}\_{M\_r}^{(r)}$ be the submatrix of $\mathbf{U}^{(r)}$ associated with the variables in $M\_r$ and set $\tilde{\mathbf{Y}}\_r$ as above. Then

$$\int\_{\Lambda\_{M\_r}} pr(\mathcal{D}|\boldsymbol{\beta}\_r, \boldsymbol{\beta}\_{-r}, \mathbf{K})\, pr(\boldsymbol{\beta}\_r|M\_r)\, d\boldsymbol{\beta}\_r \propto \int\_{\Lambda\_{M\_r}} (2\pi)^{-\frac{|M\_r|}{2}} \exp\left(-\frac{1}{2} \left[-2\hat{\boldsymbol{\beta}}\_{M\_r}' \boldsymbol{\Omega}\_{M\_r} \boldsymbol{\beta}\_r + \boldsymbol{\beta}\_r' \boldsymbol{\Omega}\_{M\_r} \boldsymbol{\beta}\_r\right]\right) d\boldsymbol{\beta}\_r,$$

where

$$
\boldsymbol{\Omega}\_{M\_r} = K\_{rr} \mathbf{U}\_{M\_r}^{(r)'} \mathbf{U}\_{M\_r}^{(r)} + \mathbb{I}\_{|M\_r|},
$$

$$
\hat{\boldsymbol{\beta}}\_{M\_r} = K\_{rr} \boldsymbol{\Omega}\_{M\_r}^{-1} \mathbf{U}\_{M\_r}^{(r)'} \tilde{\mathbf{Y}}\_r.
$$

We can now see that the term in the integral is the canonical form of a Gaussian distribution. Appropriate completion therefore yields

$$pr(\mathcal{D}|M\_r, \boldsymbol{\beta}\_{-r}, \mathbf{K}) \propto |\boldsymbol{\Omega}\_{M\_r}|^{-1/2} \exp\left(\frac{1}{2} \hat{\boldsymbol{\beta}}\_{M\_r}' \boldsymbol{\Omega}\_{M\_r} \hat{\boldsymbol{\beta}}\_{M\_r}\right).$$
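As a numerical sanity check on the completion step, the sketch below works through the single-regressor case. It integrates the bracketed Gaussian kernel from the integral above with a trapezoid rule and compares the result with the value obtained by completing the square, $|\Omega|^{-1/2}\exp(\tfrac{1}{2}\hat{\beta}'\Omega\hat{\beta})$, whose quadratic term enters with a positive sign. The data values, function names, and grid settings are illustrative assumptions and are not taken from the chapter.

```python
import math


def closed_form_log_marginal(u, y_tilde, k_rr):
    """Single-regressor Omega and beta_hat, plus the log of the completed-square result."""
    omega = k_rr * sum(ui * ui for ui in u) + 1.0  # K_rr * U'U + I
    beta_hat = k_rr * sum(ui * yi for ui, yi in zip(u, y_tilde)) / omega
    log_value = -0.5 * math.log(omega) + 0.5 * beta_hat * omega * beta_hat
    return log_value, omega, beta_hat


def numeric_log_integral(omega, beta_hat, half_width=12.0, n=20001):
    """Trapezoid rule applied to the integrand before the square is completed."""
    sd = 1.0 / math.sqrt(omega)
    lo = beta_hat - half_width * sd
    step = 2.0 * half_width * sd / (n - 1)
    total = 0.0
    for j in range(n):
        b = lo + j * step
        weight = 0.5 if j in (0, n - 1) else 1.0
        total += weight * math.exp(-0.5 * (-2.0 * beta_hat * omega * b + b * omega * b))
    # The factor (2*pi)^(-|M_r|/2) with |M_r| = 1.
    return math.log(total * step / math.sqrt(2.0 * math.pi))


# Illustrative data for one candidate regressor (hypothetical values).
u = [1.0, -0.5, 2.0]
y_tilde = [0.8, -0.1, 1.5]
closed, omega, beta_hat = closed_form_log_marginal(u, y_tilde, k_rr=2.0)
numeric = numeric_log_integral(omega, beta_hat)
print(closed, numeric)
```

Provided the constants dropped by the proportionality sign are shared across the models being compared, the difference between two such log values gives the log conditional Bayes factor.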

#### **Appendix B: Posterior determination in the Poisson Case**

Let

$$Y\_{i1} \sim \mathcal{P}\left(\mathbf{U}\_i^{(1)'} \boldsymbol{\beta}\_1 + \varepsilon\_{i1}\right),$$

and for *r* > 1,

$$Y\_{ir} = \mathbf{U}\_i^{(r)'} \boldsymbol{\beta}\_r + \varepsilon\_{ir},$$

where

$$\boldsymbol{\varepsilon}\_i \sim \mathcal{N}(\mathbf{0}, \mathbf{K}^{-1}).$$

The MCMC for this model roughly follows the algorithm mentioned above, but with the additional handling of the random effect $\varepsilon\_{i1}$ and the subsequent updating of $\boldsymbol{\beta}\_1$. Note that

$$pr(\varepsilon\_{i1}|\cdot) \propto pr\left(Y\_{i1}|\mathbf{U}\_i^{(1)}, \boldsymbol{\beta}\_1, \varepsilon\_{i1}\right) pr(\varepsilon\_{i1}|\boldsymbol{\varepsilon}\_i\backslash\varepsilon\_{i1}, \mathbf{K})$$

where

$$pr(\varepsilon\_{i1}|\boldsymbol{\varepsilon}\_i\backslash\varepsilon\_{i1}, \mathbf{K}) = \mathcal{N}(\eta\_i, \kappa\_i^{-1})$$

with

$$\eta\_i = -\sum\_{r=2}^{R} \frac{K\_{1r}}{K\_{11}} \varepsilon\_{ir}$$

$$\kappa\_i = K\_{11}$$
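The conditional moments can be checked numerically. The minimal sketch below treats $\kappa\_i$ as the conditional precision, so that $\kappa\_i^{-1}$ is the conditional variance in $\mathcal{N}(\eta\_i, \kappa\_i^{-1})$, and compares the precision-matrix formulas with the textbook covariance-based formulas for a two-dimensional case. The matrix `K`, the residual value, and the function name are hypothetical.

```python
def conditional_first_component(K, eps_rest):
    """Mean and precision of eps_1 given the remaining components, for eps ~ N(0, K^{-1})."""
    k11 = K[0][0]
    # eta = -sum_{r>=2} (K_1r / K_11) * eps_r
    eta = -sum(K[0][r + 1] * e for r, e in enumerate(eps_rest)) / k11
    kappa = k11  # conditional precision; the conditional variance is 1 / kappa
    return eta, kappa


# Hypothetical 2x2 precision matrix and observed second residual.
K = [[2.0, -1.0], [-1.0, 2.0]]
eps_2 = 0.6
eta, kappa = conditional_first_component(K, [eps_2])

# Cross-check through the covariance matrix Sigma = K^{-1} (2x2 inverse by hand).
det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
s11, s12, s22 = K[1][1] / det, -K[0][1] / det, K[0][0] / det
eta_check = s12 / s22 * eps_2      # Sigma_12 / Sigma_22 * eps_2
var_check = s11 - s12 * s12 / s22  # Schur complement of Sigma_22
print(eta, eta_check, 1.0 / kappa, var_check)
```

Working with the precision matrix avoids inverting $\mathbf{K}$ at every update, since the conditional moments are read directly from its first row.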

Further, denote $\mu\_i = \mathbf{U}\_i^{(1)'} \boldsymbol{\beta}\_1$. Then

$$pr(\varepsilon\_{i1}|\cdot) \propto \exp\left(-\exp\left(\mu\_i + \varepsilon\_{i1}\right) + (\mu\_i + \varepsilon\_{i1})Y\_{i1}\right) \exp\left(-\frac{1}{2}\kappa\_i(\varepsilon\_{i1} - \eta\_i)^2\right).$$

Writing

$$f(\varepsilon\_{i1}) = -\exp\left(\mu\_i + \varepsilon\_{i1}\right) + \left(\mu\_i + \varepsilon\_{i1}\right)Y\_{i1} - \frac{1}{2}\kappa\_i(\varepsilon\_{i1} - \eta\_i)^2$$

we have

$$f'(\varepsilon\_{i1}) = -\exp\left(\mu\_i + \varepsilon\_{i1}\right) + Y\_{i1} - \kappa\_i(\varepsilon\_{i1} - \eta\_i)$$

$$f''(\varepsilon\_{i1}) = -\exp\left(\mu\_i + \varepsilon\_{i1}\right) - \kappa\_i$$
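The two derivatives can be verified against finite differences, as in the short sketch below. It also shows that $f''$ is negative for any argument when $\kappa\_i > 0$, so the log density is concave. The parameter values and step size are arbitrary illustrative choices.

```python
import math


def f(eps, mu, y, eta, kappa):
    """Log of the unnormalized conditional density of eps_i1."""
    return -math.exp(mu + eps) + (mu + eps) * y - 0.5 * kappa * (eps - eta) ** 2


def f_prime(eps, mu, y, eta, kappa):
    return -math.exp(mu + eps) + y - kappa * (eps - eta)


def f_double_prime(eps, mu, kappa):
    return -math.exp(mu + eps) - kappa


# Illustrative values (hypothetical): mu_i, Y_i1, eta_i, kappa_i, and a point eps.
mu, y, eta, kappa = 0.5, 3, 0.1, 2.0
eps, h = 0.2, 1e-4

# Central finite differences for the first and second derivative.
fd_first = (f(eps + h, mu, y, eta, kappa) - f(eps - h, mu, y, eta, kappa)) / (2 * h)
fd_second = (
    f(eps + h, mu, y, eta, kappa)
    - 2 * f(eps, mu, y, eta, kappa)
    + f(eps - h, mu, y, eta, kappa)
) / h ** 2

print(fd_first, f_prime(eps, mu, y, eta, kappa))
print(fd_second, f_double_prime(eps, mu, kappa))
```

Because $f''$ is strictly negative, a local Gaussian approximation built from $-f''$ always has a positive variance.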

Hence, by setting

$$b(\varepsilon\_{i1}) = f'(\varepsilon\_{i1}) - f''(\varepsilon\_{i1})\varepsilon\_{i1}$$

$$c(\varepsilon\_{i1}) = -f''(\varepsilon\_{i1})$$

we may sample $\varepsilon\_{i1}' \sim \mathcal{N}\left(b(\varepsilon\_{i1})/c(\varepsilon\_{i1}), 1/c(\varepsilon\_{i1})\right)$ and accept this update with probability $\min\{\alpha, 1\}$ where

$$\alpha = \frac{pr(Y\_{i1}|\mu\_i, \varepsilon\_{i1}')\, pr(\varepsilon\_{i1}'|\eta\_i, \kappa\_i)\, pr\left(\varepsilon\_{i1}|b(\varepsilon\_{i1}'), c(\varepsilon\_{i1}')\right)}{pr(Y\_{i1}|\mu\_i, \varepsilon\_{i1})\, pr(\varepsilon\_{i1}|\eta\_i, \kappa\_i)\, pr\left(\varepsilon\_{i1}'|b(\varepsilon\_{i1}), c(\varepsilon\_{i1})\right)}.$$

Once all $\varepsilon\_{i1}$ are updated, other updates mostly follow the steps above.
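The update of a single $\varepsilon\_{i1}$ can be sketched as a Metropolis-Hastings step. The proposal has mean $b/c$ and variance $1/c$, and the reverse-proposal density uses $b$ and $c$ evaluated at the proposed value, as detailed balance requires. The sketch assumes a log link for the Poisson mean, as implied by the likelihood term $\exp(-\exp(\mu\_i + \varepsilon\_{i1}) + (\mu\_i + \varepsilon\_{i1})Y\_{i1})$. All function names, parameter values, and the seed are hypothetical, and terms that cancel in the ratio (such as $Y\_{i1}!$) are omitted.

```python
import math
import random


def log_target(eps, mu, y, eta, kappa):
    """Poisson log-likelihood (log link) plus Gaussian conditional prior, up to constants."""
    return -math.exp(mu + eps) + (mu + eps) * y - 0.5 * kappa * (eps - eta) ** 2


def proposal_moments(eps, mu, y, eta, kappa):
    """Mean b/c and variance 1/c of the Gaussian proposal built at eps."""
    fp = -math.exp(mu + eps) + y - kappa * (eps - eta)
    fpp = -math.exp(mu + eps) - kappa
    b, c = fp - fpp * eps, -fpp
    return b / c, 1.0 / c


def log_normal_pdf(x, mean, var):
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mean) ** 2 / var


def mh_update(eps, mu, y, eta, kappa, rng):
    """One accept/reject step; returns the new state and whether the proposal was accepted."""
    m_fwd, v_fwd = proposal_moments(eps, mu, y, eta, kappa)
    eps_new = rng.gauss(m_fwd, math.sqrt(v_fwd))
    m_rev, v_rev = proposal_moments(eps_new, mu, y, eta, kappa)
    log_alpha = (
        log_target(eps_new, mu, y, eta, kappa) - log_target(eps, mu, y, eta, kappa)
        + log_normal_pdf(eps, m_rev, v_rev) - log_normal_pdf(eps_new, m_fwd, v_fwd)
    )
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return eps_new, True
    return eps, False


# Illustrative run for one observation (hypothetical values).
rng = random.Random(0)
mu, y, eta, kappa = 0.5, 3, 0.0, 2.0
eps, n_iter, accepted, chain = 0.0, 2000, 0, []
for _ in range(n_iter):
    eps, ok = mh_update(eps, mu, y, eta, kappa, rng)
    accepted += ok
    chain.append(eps)
print(accepted / n_iter, sum(chain) / n_iter)
```

Because the proposal mean $b/c$ equals one Newton step from the current value toward the mode of $f$, proposals tend to land in regions of high conditional density.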

#### **Author details**

Jonathan Lee<sup>1</sup> \* and Alex Lenkoski<sup>2</sup>

1 College of Business and Public Management, University of La Verne, La Verne, CA, United States

2 Norwegian Computing Center, Oslo, Norway

\*Address all correspondence to: jlee2@laverne.edu

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **References**

[1] Papies D, Ebbes P, Van Heerde HJ. Addressing endogeneity in marketing models. In: Leeflang PSH, Wieringa JE, Bijmolt THA, Pauwels KH, editors. Advanced Methods for Modeling Markets. Cham: Springer International Publishing; 2017. pp. 581-627. DOI: 10.1007/978-3-319-53469-5\_18

[2] Dong X, Chintagunta PK, Manchanda P. A new multivariate count data model to study multi-category physician prescription behavior. Quantitative Marketing and Economics. 2011;**9**:301-337. DOI: 10.1007/s11129-011-9102-7

[3] Chintagunta P, Erdem T, Rossi PE, Wedel M. Structural modeling in marketing: Review and assessment. Marketing Science. 2006;**25**:604-616. DOI: 10.1287/mksc.1050.0161

[4] Hult GTM, Hair JF, Proksch D, Sarstedt M, Pinkwart A, Ringle CM. Addressing endogeneity in international marketing applications of partial least squares structural equation modeling. Journal of International Marketing. 2018;**26**:1-21. DOI: 10.1509/jim.17.0151

[5] Manchanda P, Rossi PE, Chintagunta PK. Response modeling with nonrandom marketing-mix variables. Journal of Marketing Research. 2004;**41**:467-478. DOI: 10.1509/jmkr.41.4.467.47005

[6] Chintagunta PK. Endogeneity and heterogeneity in a probit demand model: Estimation using aggregate data. Marketing Science. 2001;**20**:442-456. DOI: 10.1287/mksc.20.4.442.9751

[7] Villas-Boas JM, Winer RS. Endogeneity in brand choice models. Management Science. 1999;**45**:1324-1338. DOI: 10.1287/mnsc.45.10.1324

[8] Rossi PE. Even the rich can make themselves poor: A critical examination of IV methods in marketing applications. Marketing Science. 2014;**33**:655-672. DOI: 10.1287/mksc.2014.0860

[9] Park S, Gupta S. Handling endogenous regressors by joint estimation using copulas. Marketing Science. 2012;**31**:567-586. DOI: 10.1287/mksc.1120.0718

[10] Ebbes P, Papies D, Van Heerde HJ. The sense and non-sense of holdout sample validation in the presence of endogeneity. Marketing Science. 2011;**30**:1115-1122. DOI: 10.1287/mksc.1110.0666

[11] Kuksov D, Villas-Boas JM. Endogeneity and individual consumer choice. Journal of Marketing Research. 2008;**45**:702-714. DOI: 10.1509/jmkr.45.6.702

[12] Louviere J, Train K, Ben-Akiva M, Bhat C, Brownstone D, Cameron TA, et al. Recent progress on endogeneity in choice modeling. Marketing Letters. 2005;**16**:255-265. DOI: 10.1007/s11002-005-5890-4

[13] Petrin A, Train K. A control function approach to endogeneity in consumer choice models. Journal of Marketing Research. 2010;**47**:3-13. DOI: 10.1509/jmkr.47.1.3

[14] Luan Y, Sudhir K. Forecasting marketing-mix responsiveness for new products. Journal of Marketing Research. 2010;**47**:444-457. DOI: 10.1509/jmkr.47.3.444

[15] Moral-Benito E. Dynamic panels with predetermined regressors: Likelihood-based estimation and Bayesian averaging with an application to cross-country growth. Banco de Espana Working Paper. 2011. DOI: 10.2139/ssrn.1844186

[16] Lenkoski A, Eicher TS, Raftery AE. Two-stage Bayesian model averaging in endogenous variable models. Econometric Reviews. 2014;**33**:37-41. DOI: 10.1080/07474938.2013.807150

[17] Fernández C, Ley E, Steel MFJ. Benchmark priors for Bayesian model averaging. Journal of Econometrics. 2001;**100**:381-427. DOI: 10.1016/S0304-4076(00)00076-2

[18] Moral-Benito E. Model averaging in economics: An overview. Journal of Economic Surveys. 2015;**29**:46-75. DOI: 10.1111/joes.12044

[19] Abrevaya J, Hausman JA, Khan S. Testing for causal effects in a generalized regression model with endogenous regressors. Econometrica. 2010;**78**:2043-2061. DOI: 10.3982/ecta7133

[20] Lewis SM, Eccleston JA, Russell KG. Designs for generalized linear models with several variables and model uncertainty. Technometrics. 2006;**48**:284-292. DOI: 10.1198/004017005000000571

[21] Clyde M, George EI. Model uncertainty. Statistical Science. 2004;**19**:81-94. DOI: 10.1214/088342304000000035

[22] Raftery A. Approximate Bayes factors and accounting for model uncertainty in generalized linear models. Biometrika. 1996;**83**:251-266. DOI: 10.1093/biomet/83.2.251

[23] Hinne M, Gronau QF, van den Bergh D, Wagenmakers E-J. A conceptual introduction to Bayesian model averaging. Advances in Methods and Practices in Psychological Science. 2020;**3**(2):200-215. DOI: 10.1177/2515245919898657

[24] Cohen-Cole E, Durlauf S, Fagan J, Nagin D. Model uncertainty and the deterrent effect of capital punishment. American Law and Economics Review. 2009;**11**:335-369. DOI: 10.1093/aler/ahn001

[25] Durlauf SN, Kourtellos A, Tan CM. Is god in the details? A reexamination of the role of religion in economic growth. Journal of Applied Econometrics. 2012;**27**:1059-1075. DOI: 10.1002/jae.1245

[26] Kleibergen F, Zivot E. Bayesian and classical approaches to instrumental variable regression. Journal of Econometrics. 2003;**114**:29-72. DOI: 10.1016/S0304-4076(02)00219-1

[27] Kass RE, Wasserman L. A reference test for nested hypotheses with large samples. Journal of the American Statistical Association. 1995;**90**:928-934. DOI: 10.1080/01621459.1995.10476592

[28] Mirestean AT, Tsangarides CG, Chen H. Limited information Bayesian model averaging for dynamic panels with short time periods. IMF Working Papers. 2009;**2009**:A001. DOI: 10.5089/9781451872217.001

[29] Koop G, Leon-Gonzalez R, Strachan R. Bayesian model averaging in the instrumental variable regression model. Journal of Econometrics. 2012;**171**:237-250. DOI: 10.1016/j.jeconom.2012.06.005

[30] Green PJ. Reversible jump Markov Chain Monte Carlo computation and Bayesian model determination. Biometrika. 1995;**82**:711-732. DOI: 10.1093/biomet/82.4.711

[31] Holmes CC, Denison DGT, Mallick BK. Accounting for model uncertainty in seemingly unrelated regressions. Journal of Computational and Graphical Statistics. 2002;**11**:533-551. DOI: 10.1198/106186002475

[32] Drèze JH. Bayesian limited information analysis of the simultaneous equations model. Econometrica. 1976;**44**:1045-1075. DOI: 10.2307/1911544

[33] Kleibergen F, van Dijk HK. Bayesian simultaneous equations analysis using reduced rank structures. Econometric Theory. 1998;**14**:701-743. DOI: 10.1017/S0266466698146017

[34] Strachan R, Inder B. Bayesian analysis of the error correction model. Journal of Econometrics. 2004;**123**:307-325. DOI: 10.1016/j.jeconom.2003.12.004

[35] Madigan D, York J. Bayesian graphical models for discrete data. International Statistical Review. 1995;**63**: 215-232. DOI: 10.2307/1403615

[36] Rossi PE, Allenby GM, McCulloch R. Bayesian Statistics and Marketing. New York: Wiley; 2006. DOI: 10.1002/0470863692

[37] Dickey JM, Gunel E. Bayes factors from mixed probabilities. Journal of the Royal Statistical Society: Series B: Methodological. 1978;**40**:43-46. DOI: 10.1111/j.2517-6161.1978.tb01645.x

[38] Sargan JD. The estimation of economic relationships with instrumental variables. Econometrica. 1958;**26**:393-415. DOI: 10.2307/1907619

[39] McCulloch CE, Searle S, Neuhaus JM. Generalized, Linear, and Mixed Models. 2nd ed. Hoboken, NJ: John Wiley & Sons, Inc.; 2008

[40] Natarajan R, Kass RE. Reference Bayesian methods for generalized linear mixed models. Journal of the American Statistical Association. 2000;**95**:227-237. DOI: 10.1080/01621459.2000.10473916

[41] Zeger SL, Karim MR. Generalized linear models with random effects: A Gibbs sampling approach. Journal of the American Statistical Association. 1991;**86**:79-86. DOI: 10.2307/2289717

[42] McCullagh P, Nelder JA. Generalized Linear Models. 2nd ed. Chapman and Hall; 1989

[43] van Ravenzwaaij D, Cassey P, Brown SD. A simple introduction to Markov Chain Monte–Carlo sampling. Psychonomic Bulletin & Review. 2018;**25**:143-154. DOI: 10.3758/s13423-016-1015-8

[44] Hall DB. Zero-inflated Poisson and binomial regression with random effects: A case study. Biometrics. 2000;**56**:1030-1039. DOI: 10.1111/j.0006-341X.2000.01030.x

[45] Albert J. A Bayesian analysis of a Poisson random effects model for home run hitters. The American Statistician. 1992;**46**:246-253. DOI: 10.2307/2685306

[46] Kass RE, Raftery AE. Bayes factors. Journal of the American Statistical Association. 1995;**90**:773-795. DOI: 10.1080/01621459.1995.10476572

[47] Gunter B. Predicting Movie Success at the Box Office. Palgrave Macmillan Cham: Springer International Publishing; 2018

[48] Rao VR, Ravid SAA, Gretz RT, Chen J, Basuroy S. The impact of advertising content on movie revenues. Marketing Letters. 2017;**28**:341-355. DOI: 10.1007/s11002-017-9418-5

[49] Elberse A, Anand B. The effectiveness of pre-release advertising for motion pictures: An empirical investigation using a simulated market. Information Economics and Policy. 2007;**19**:319-343. DOI: 10.1016/j.infoecopol.2007.06.003

[50] Zufryden FS. Linking advertising to box office performance of new film releases - a marketing planning model. Journal of Advertising Research. 1996;**36**:29

[51] Joshi A, Hanssens DM. The direct and indirect effects of advertising spending on firm value. Journal of Marketing. 2010;**74**:20-33. DOI: 10.1509/jmkg.74.1.20

[52] Elberse A, Eliashberg J. Demand and supply dynamics for sequentially released products in international markets: The case of motion pictures. Marketing Science. 2003;**22**:329-354. DOI: 10.1287/mksc.22.3.329.17740

[53] Neelamegham R, Chintagunta PK. A Bayesian model to forecast new product performance in domestic and international markets. Marketing Science. 1999;**18**:115-136. DOI: 10.1287/mksc.18.2.115

[54] Chang BH, Ki EJ. Devising a practical model for predicting theatrical movie success: Focusing on the experience good property. Journal of Media Economics. 2005;**18**:247-269. DOI: 10.1207/s15327736me1804\_2

[55] Basuroy S, Chatterjee S, Ravid SA. How critical are critical reviews? The box office effects of film critics, star power, and budgets. Journal of Marketing. 2003;**67**:103-117. DOI: 10.1509/jmkg.67.4.103.18692

[56] Wasserman M, Mukherjee S, Scott K, Zeng XHT, Radicchi F, Amaral LAN. Correlations between user voting data, budget, and box office for films in the internet movie database. Journal of the Association for Information Science and Technology. 2015;**66**:858-868. DOI: 10.1002/asi.23213

[57] Simonton DK. Cinematic creativity and production budgets: Does money make the movie? The Journal of Creative Behavior. 2005;**39**:1-15. DOI: 10.1002/j.2162-6057.2005.tb01246.x

[58] Dong X, Manchanda P, Chintagunta P. Quantifying the benefits of individual-level targeting in the presence of firm strategic behavior. Journal of Marketing Research. 2009;**46**:207-221. DOI: 10.1509/jmkr.46.2.207

[59] Sood S, Drèze X. Brand extensions of experiential goods: Movie sequel evaluations. Journal of Consumer Research. 2006;**33**:352-360. DOI: 10.1086/508520

[60] Basuroy S, Chatterjee S. Fast and frequent: Investigating box office revenues of motion picture sequels. Journal of Business Research. 2008;**61**:798-803. DOI: 10.1016/j.jbusres.2007.07.030

[61] Smith D, Park C. The effects of brand extensions on market share and advertising efficiency. Journal of Marketing Research. 1992;**29**(3):296-313. DOI: 10.2307/3172741

[62] Ravid SA. Information, blockbusters and stars? A study of the film industry. The Journal of Business. 1999;**72**:463-492. DOI: 10.1086/209624

[63] Joshi M, Das D, Gimpel K, Smith NA. Movie reviews and revenues: An experiment in text regression. Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, USA: Association for Computational Linguistics: 2010, p. 293–296.

[64] Debenedetti S, Ghariani G. To quote or not to quote? Critics' quotations in film advertisements as indicators of the continuing authority of film criticism. Poetics. 2018;**66**:30-41. DOI: 10.1016/j.poetic.2018.02.003

[65] Reinstein D, Snyder MC. The influence of expert reviews on consumer demand for experience goods: A case study of movie critics. Journal of Industrial Economics. 2005;**53**:27-51. DOI: 10.1111/j.0022-1821.2005.00244.x

[66] Legoux R, Larocque D, Laporte S, Belmati S, Boquet T. The effect of critical reviews on exhibitors' decisions: Do reviews affect the survival of a movie on screen? International Journal of Research in Marketing. 2016;**33**:357-374. DOI: 10.1016/j.ijresmar.2015.07.003

[67] Filson D, Switzer D, Besocke P. At the movies: The economics of exhibition contracts. Economic Inquiry. 2005;**43**: 354-369. DOI: 10.1093/ei/cbi024

[68] Eliashberg J, Jonker JJ, Sawhney MS, Wierenga B. MOVIEMOD: An implementable decision-support system for prerelease market evaluation of motion pictures. Marketing Science. 2000;**19**:226-243. DOI: 10.1287/mksc.19.3.226.11796

[69] Rao A, Hartmann W. Quality vs. variety: Trading larger screens for more shows in the era of digital cinema. Quantitative Marketing and Economics. 2015;**13**:117-134. DOI: 10.1007/s11129-015-9156-z

[70] Moul CC, Shugan SM. Theatrical release and the launching of motion pictures. In: Moul CC, editor. A Concise Handbook of Movie Industry Economics. Cambridge: Cambridge University Press; 2005. pp. 80-137. DOI: 10.1017/CBO9780511614422.005

[71] Ghiassi M, Lio D, Moon B. Preproduction forecasting of movie revenues with a dynamic artificial neural network. Expert Systems with Applications. 2015;**42**:3176-3193. DOI: 10.1016/j.eswa.2014.11.022

[72] Steel MFJ. Model averaging and its use in economics. Journal of Economic Literature. 2020;**58**:644-719. DOI: 10.1257/jel.20191385

[73] George EI, McCulloch RE. Variable selection via Gibbs sampling. Journal of the American Statistical Association. 1993;**88**:881-889. DOI: 10.1080/01621459.1993.10476353

[74] Conley TG, Hansen CB, McCulloch RE, Rossi PE. A semi-parametric Bayesian approach to the instrumental variable problem. Journal of Econometrics. 2008;**144**:276-305. DOI: 10.1016/j.jeconom.2008.01.007

[75] Kaplan D, Lee C. Bayesian model averaging over directed acyclic graphs with implications for the predictive performance of structural equation models. Structural Equation Modeling: A Multidisciplinary Journal. 2016;**23**:343-353. DOI: 10.1080/10705511.2015.1092088

[76] Volinsky CT, Madigan D, Raftery AE, Kronmal RA. Bayesian model averaging in proportional hazard models: Assessing the risk of a stroke. Journal of the Royal Statistical Society: Series C: Applied Statistics. 1997;**46**:433-448. DOI: 10.1111/1467-9876.00082

### *Edited by Brian W. Sloboda*

Econometrics uses statistical methods and real-world data to predict and establish specific trends. This analytical method sustains limitless potential, but the necessary research for professionals to understand and implement this is often lacking. *Econometrics - Recent Advances and Applications* explores the theoretical and practical aspects of detailed econometric theories and applications within economics, policymaking, and finance. This book covers various topics such as dynamic stochastic general equilibrium (DSGE) models, machine learning, spatial econometrics, and time series analysis. This book is a useful resource for economists, policymakers, financial analysts, researchers, academicians, and graduate students seeking research on the various applications of econometrics.

### *Taufiq Choudhry, Business, Management and Economics Series Editor*

Published in London, UK © 2023 IntechOpen © monsitj / iStock

IntechOpen Series: Business, Management and Economics, Volume 10