**Step VII**

Assume that the variables with indices *i* = *gr* and *j* = *sm* are the candidates for being out-of-control, and that *N* decision-making steps are available. Define *V*(*N*, *d<sub>i,j</sub>*(*k*)) as the probability of a correct choice between the variables *i* and *j*, where *d<sub>i,j</sub>*(*k*) is the acceptable belief. Also, let *CS* denote the event of correct selection and *E<sub>i,j</sub>* the event that the two out-of-control candidate variables are *i* and *j*. Then, we have:

$$V\left(N, d\_{i,j}(k)\right) = \Pr\{CS \Big| E\_{i,j}\} \triangleq \Pr\_{i,j}\{CS\} \tag{38}$$

where " ≜ " means "defined as."

Letting *d<sub>i,j</sub>*<sup>\*</sup>(*k*), called the minimum acceptable belief, denote the point that maximizes *V*(*N*, *d<sub>i,j</sub>*(*k*)), we have

$$V\left(N, d\_{i,j}^{\ast}(k)\right) \triangleq V\_{i,j}^{\ast}(N) = \max\_{d\_{i,j}(k)} \left\{ V\left(N, d\_{i,j}(k)\right) \right\} \triangleq \text{Max}\left\{ \Pr\_{i,j}\left\{CS\right\} \right\} \tag{39}$$

Using Dynamic Programming Based on Bayesian Inference in Selection Problems

http://dx.doi.org/10.5772/57423

Let *Si* and *Sj* be the event of selecting *i* and *j* as the out-of-control variables, respectively, and *N Si*, *<sup>j</sup>* be the event of not selecting any. Then, by conditioning on the probability, we have:

$$\begin{aligned} V\_{i,j}^{\ast}(N) &= \text{Max}\left\{ \Pr\_{i,j}\left\{CS\right\} \right\} \\ &= \text{Max}\left\{ \Pr\_{i,j}\left\{CS \mid S\_{i}\right\} \Pr\_{i,j}\left\{S\_{i}\right\} + \Pr\_{i,j}\left\{CS \mid S\_{j}\right\} \Pr\_{i,j}\left\{S\_{j}\right\} + \Pr\_{i,j}\left\{CS \mid NS\_{i,j}\right\} \Pr\_{i,j}\left\{NS\_{i,j}\right\} \right\} \end{aligned} \tag{40}$$

At the *k*th iteration, the conditional bivariate distribution of the sample means for variables *gr* and *sm*, i.e., *X<sub>k,j=gr,sm</sub>* | *X<sub>k,j≠gr,sm</sub>*, is determined using the conditional property of the multivariate normal distribution given in Appendix 1. Moreover, knowing *E*(*x<sub>k,j</sub>*) = *μ<sub>j</sub>* and evaluating the conditional mean and standard deviation (see Appendix 1) results in

$$E\left(\mathbf{X}\_{k,i}\middle|\mathbf{X}\_{k,j}\right) = \mu\_i + \rho \frac{\sigma\_i}{\sigma\_j} \left(\mathbf{X}\_{k,j} - \mu\_j\right) \tag{41}$$

and

$$E\left(\mathbf{X}\_{k,i}\middle|\mathbf{X}\_{k,j}\right) = \mu\_i + \rho \frac{\sigma\_i}{\sigma\_j} \left(\mathbf{X}\_{k,j} - \mu\_j\right) \tag{42}$$
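The conditional standard deviation that appears in the denominator of the statistic in (44) can be written out as well. The expression below is a sketch, assuming *ρ* denotes the correlation between variables *i* and *j* and *σ<sub>i</sub>* is on the same scale as in (41); it is the standard conditional-variance result for a bivariate normal distribution rather than a formula quoted from this section:

$$\sigma\_{X\_{k,i}|X\_{k,j}} = \sigma\_{i}\sqrt{1 - \rho^{2}}$$

Unlike the conditional mean in (41), this quantity does not depend on the observed value of *X<sub>k,j</sub>*.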

Based on the decomposition method of Mason et al. [9], define statistics *T<sub>k,j</sub>* and *T<sub>k,i|j</sub>* as

$$T\_{k,j} = \left(\frac{X\_{k,j} - \mu\_j}{\sigma\_j}\right) \tag{43}$$

$$T\_{k,i|j} = \left(\frac{X\_{k,i} - E\left(X\_{k,i} \mid X\_{k,j}\right)}{\sigma\_{X\_{k,i}|X\_{k,j}}}\right) \tag{44}$$

Thus, when the process is in-control, the statistics *T<sub>k,j</sub>* and *T<sub>k,i|j</sub>* follow a standard normal distribution [9].
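The two statistics can be computed directly from the sample means. The following is a minimal sketch, not the authors' implementation: the function and argument names are illustrative, `rho` is assumed to be the correlation between variables *i* and *j*, and the conditional standard deviation uses the standard bivariate-normal result σ<sub>i</sub>√(1 − ρ²), which is an assumption not quoted from the text.

```python
import math


def conditional_mean(x_j, mu_i, mu_j, sigma_i, sigma_j, rho):
    """Conditional mean of X_i given X_j = x_j, as in equation (41)."""
    return mu_i + rho * (sigma_i / sigma_j) * (x_j - mu_j)


def conditional_std(sigma_i, rho):
    """Conditional standard deviation of X_i given X_j.

    Assumed here: the standard bivariate-normal result sigma_i * sqrt(1 - rho^2).
    """
    return sigma_i * math.sqrt(1.0 - rho ** 2)


def t_statistics(x_i, x_j, mu_i, mu_j, sigma_i, sigma_j, rho):
    """Return (T_kj, T_ki_given_j) following equations (43) and (44)."""
    t_j = (x_j - mu_j) / sigma_j
    m = conditional_mean(x_j, mu_i, mu_j, sigma_i, sigma_j, rho)
    t_i_given_j = (x_i - m) / conditional_std(sigma_i, rho)
    return t_j, t_i_given_j


# Illustrative numbers only (hypothetical, not data from the chapter).
t_j, t_i_given_j = t_statistics(
    x_i=1.2, x_j=0.5, mu_i=0.0, mu_j=0.0, sigma_i=1.0, sigma_j=1.0, rho=0.6
)
```

Standardizing *X<sub>k,i</sub>* against its conditional distribution, rather than its marginal one, removes the part of its deviation that is already explained by *X<sub>k,j</sub>*.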

Now, let *B<sub>i,j</sub>*(*i*; *x<sub>k</sub>*, *O<sub>k−1</sub>*) denote the probability of correct selection conditioned on selecting *i* as the out-of-control variable. Hence,

$$\begin{aligned} B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}) &= \frac{B\_{i}(O\_{k-1})e^{\left(0.5T\_{k,i|j}\right)^{2}}}{B\_{j}(O\_{k-1})e^{\left(0.5T\_{k,j}\right)^{2}} + B\_{i}(O\_{k-1})e^{\left(0.5T\_{k,i|j}\right)^{2}}} \\ B\_{i,j}(j; \mathbf{x}\_{k}, O\_{k-1}) &= \frac{B\_{j}(O\_{k-1})e^{\left(0.5T\_{k,j}\right)^{2}}}{B\_{j}(O\_{k-1})e^{\left(0.5T\_{k,j}\right)^{2}} + B\_{i}(O\_{k-1})e^{\left(0.5T\_{k,i|j}\right)^{2}}} \end{aligned} \tag{45}$$
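The update in (45) normalizes the two prior beliefs after weighting each by an exponential of its squared, halved statistic. A minimal sketch follows, assuming that the weight for variable *i* uses *T<sub>k,i|j</sub>* and the weight for variable *j* uses *T<sub>k,j</sub>*; the function and argument names are hypothetical.

```python
import math


def pairwise_beliefs(prior_i, prior_j, t_i_given_j, t_j):
    """Posterior beliefs for candidates i and j in the spirit of equation (45).

    prior_i, prior_j: beliefs B_i(O_{k-1}) and B_j(O_{k-1}) from the previous stage.
    t_i_given_j, t_j: the statistics T_{k,i|j} and T_{k,j}.
    """
    w_i = prior_i * math.exp((0.5 * t_i_given_j) ** 2)
    w_j = prior_j * math.exp((0.5 * t_j) ** 2)
    total = w_i + w_j
    # Sharing one denominator guarantees that the two beliefs sum to 1.
    return w_i / total, w_j / total


# Illustrative values: equal priors, a large conditional statistic for i.
b_i, b_j = pairwise_beliefs(prior_i=0.5, prior_j=0.5, t_i_given_j=2.0, t_j=0.0)
```

With equal priors, the variable whose statistic is larger in absolute value receives the larger belief, which matches the intuition that a larger standardized deviation is stronger evidence of a shift.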

Then, the probability measure Pr<sub>*i,j*</sub>{*CS* | *S<sub>i</sub>*} is calculated using the following equation,

$$\Pr\_{i,j}\left\{CS \mid S\_{i}\right\} = B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}) \tag{46}$$

The probability measure Pr<sub>*i,j*</sub>{*S<sub>i</sub>*} is defined as the probability of selecting variable *i* to be out-of-control. According to the strategy explained above, we have:

$$S\_{i} = \left\{ B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}) = \text{Max}\left\{ B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}),\ B\_{i,j}(j; \mathbf{x}\_{k}, O\_{k-1}) \right\};\ B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}) \ge d\_{i,j}(k) \right\} \tag{47}$$

Since *B<sub>i,j</sub>*(*i*; *x<sub>k</sub>*, *O<sub>k−1</sub>*) + *B<sub>i,j</sub>*(*j*; *x<sub>k</sub>*, *O<sub>k−1</sub>*) = 1 and the beliefs are non-negative, we conclude

$$\max \left\{ B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}),\ B\_{i,j}(j; \mathbf{x}\_{k}, O\_{k-1}) \right\} \ge 0.5 \tag{48}$$

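Therefore, without violating any assumption, we can restrict the range of *d<sub>i,j</sub>*(*k*) from [0,1] to [0.5,1]. For a threshold of at least 0.5, a belief that reaches the threshold is also the larger of the two beliefs, so the maximum condition in (47) becomes redundant. Hence,

$$S\_{i} = \left\{ B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}) \ge d\_{i,j}(k) \right\} \tag{49}$$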

By similar reasoning, we have:

140 Dynamic Programming and Bayesian Inference, Concepts and Applications

$$S\_{j} = \left\{ B\_{i,j}(j; \mathbf{x}\_{k}, O\_{k-1}) \ge d\_{i,j}(k) \right\} \tag{50}$$
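The selection events in (49) and (50), together with the event of selecting neither candidate, amount to a simple threshold rule. The sketch below is illustrative only: the function name and the string labels are hypothetical, and resolving the tie at a belief of exactly 0.5 in favour of *i* is an assumption, since the text does not specify it.

```python
def select(b_i, b_j, d):
    """Return "i", "j", or None (no selection) for a threshold d in [0.5, 1].

    b_i, b_j: beliefs B_{i,j}(i; x_k, O_{k-1}) and B_{i,j}(j; x_k, O_{k-1}),
              assumed to sum to 1.
    d:        acceptable belief d_{i,j}(k).
    """
    if not 0.5 <= d <= 1.0:
        raise ValueError("threshold d must lie in [0.5, 1]")
    if b_i >= d:
        return "i"   # event S_i, equation (49)
    if b_j >= d:
        return "j"   # event S_j, equation (50)
    return None      # event NS_{i,j}: defer the decision to the next stage


# Illustrative calls with hypothetical beliefs.
first = select(0.73, 0.27, d=0.7)
second = select(0.27, 0.73, d=0.7)
third = select(0.60, 0.40, d=0.7)
```

Because the beliefs sum to 1 and the threshold is at least 0.5, at most one candidate can exceed the threshold, except for the tie at exactly 0.5 noted above.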

The term Pr<sub>*i,j*</sub>{*CS* | *NS<sub>i,j</sub>*} denotes the probability of correct selection conditioned on excluding both candidates *i* and *j* as the solution. In other words, the maximum belief is less than the threshold (the minimum acceptable belief) *d<sub>i,j</sub>*<sup>\*</sup>(*k*), and the decision-making process continues to the next stage. In terms of the stochastic dynamic programming approach, the probability of this event is equal to the maximum probability of correct selection when there are *N* − 1 stages remaining. The discounted value of this probability in the current stage, using the discount factor *α*, equals *αV<sub>i,j</sub>*<sup>\*</sup>(*N* − 1). Further, since we partitioned the decision space into the events {*NS<sub>i,j</sub>*; *S<sub>i</sub>*; *S<sub>j</sub>*}, we have:

$$\Pr\_{i,j}\left(\text{NS}\_{i,j}\right) = 1 - \left(\Pr\_{i,j}\left\{\mathcal{S}\_i\right\} + \Pr\_{i,j}\left\{\mathcal{S}\_j\right\}\right) \tag{51}$$
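The pieces derived so far can be combined numerically. The sketch below evaluates the decomposition in (40) for one fixed threshold, using (46) and its counterpart for *j*, the complement rule in (51), and the discounted continuation value described above. It assumes that the selection probabilities Pr<sub>*i,j*</sub>{*S<sub>i</sub>*} and Pr<sub>*i,j*</sub>{*S<sub>j</sub>*} are already available as numbers; the names `p_i`, `p_j`, and `v_next` are hypothetical.

```python
def stage_value(b_i, b_j, p_i, p_j, alpha, v_next):
    """Probability of correct selection for a fixed threshold, following (40).

    b_i, b_j: Pr{CS | S_i} and Pr{CS | S_j}, i.e. the two beliefs.
    p_i, p_j: Pr{S_i} and Pr{S_j} for the threshold under consideration.
    alpha:    discount factor.
    v_next:   maximum probability of correct selection with N - 1 stages left.
    """
    p_ns = 1.0 - (p_i + p_j)            # equation (51)
    continuation = alpha * v_next       # discounted value of not selecting
    return b_i * p_i + b_j * p_j + continuation * p_ns


# Illustrative numbers only (hypothetical, not results from the chapter).
value = stage_value(b_i=0.8, b_j=0.2, p_i=0.6, p_j=0.1, alpha=0.9, v_next=0.7)
```

Maximizing this quantity over the threshold is what the dynamic programming recursion does at each stage; the sketch only evaluates it at a single threshold.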

Now we evaluate *V<sub>i,j</sub>*<sup>\*</sup>(*N*) as follows,

$$\begin{aligned} V\_{i,j}^{\ast}(N) &= \underset{0.5 \le d\_{i,j}(k) \le 1}{\text{Max}} \left\{ \begin{aligned} &B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}) \Pr\_{i,j}\left\{B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}) \ge d\_{i,j}(k)\right\} + \\ &B\_{i,j}(j; \mathbf{x}\_{k}, O\_{k-1}) \Pr\_{i,j}\left\{B\_{i,j}(j; \mathbf{x}\_{k}, O\_{k-1}) \ge d\_{i,j}(k)\right\} + \\ &\Pr\_{i,j}\left\{CS \mid NS\_{i,j}\right\} \left(1 - \Pr\_{i,j}\left\{B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}) \ge d\_{i,j}(k)\right\} - \Pr\_{i,j}\left\{B\_{i,j}(j; \mathbf{x}\_{k}, O\_{k-1}) \ge d\_{i,j}(k)\right\}\right) \end{aligned} \right\} \\ &= \underset{0.5 \le d\_{i,j}(k) \le 1}{\text{Max}} \left\{ \begin{aligned} &B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}) \Pr\_{i,j}\left\{B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}) \ge d\_{i,j}(k)\right\} + \\ &B\_{i,j}(j; \mathbf{x}\_{k}, O\_{k-1}) \Pr\_{i,j}\left\{B\_{i,j}(j; \mathbf{x}\_{k}, O\_{k-1}) \ge d\_{i,j}(k)\right\} + \\ &\alpha V\_{i,j}^{\ast}(N-1) \left(1 - \Pr\_{i,j}\left\{B\_{i,j}(i; \mathbf{x}\_{k}, O\_{k-1}) \ge d\_{i,j}(k)\right\} - \Pr\_{i,j}\left\{B\_{i,j}(j; \mathbf{x}\_{k}, O\_{k-1}) \ge d\_{i,j}(k)\right\}\right) \end{aligned} \right\} \end{aligned} \tag{52}$$

In other words,

$$V\_{i,j}^{\ast}(N) = \underset{0.5 \le d\_{i,j}(k) \le 1}{\text{Max}} \left\{ \begin{aligned} &B\_{i,j}(i; O\_{k}) \Pr\_{i,j}\left\{B\_{i,j}(i; O\_{k}) \ge d\_{i,j}(k)\right\} + \\ &B\_{i,j}(j; O\_{k}) \Pr\_{i,j}\left\{B\_{i,j}(j; O\_{k}) \ge d\_{i,j}(k)\right\} + \\ &\alpha V\_{i,j}^{\ast}(N-1) \left(1 - \Pr\_{i,j}\left\{B\_{i,j}(i; O\_{k}) \ge d\_{i,j}(k)\right\} - \Pr\_{i,j}\left\{B\_{i,j}(j; O\_{k}) \ge d\_{i,j}(k)\right\}\right) \end{aligned} \right\} \tag{53}$$

The method of evaluating the minimum acceptable belief *d<sub>gr,sm</sub>*<sup>\*</sup>(*k*) is given in Appendix 2.

**Step VIII: The Decision Step**

If the belief *B<sub>gr,sm</sub>*(*gr*; *x<sub>k</sub>*, *O<sub>k−1</sub>*) in the candidate set (*sm*, *gr*) is equal to or greater than *d<sub>gr,sm</sub>*<sup>\*</sup>(*k*), then choose the variable with index *gr* to be out-of-control. In this case, the decision-making process ends. Otherwise, without making any selection at this stage, obtain another observation, lower the number of remaining decision stages to *N* − 1, set *k* = *k* + 1, and return to step **V** above. The process continues until either the stopping condition is reached or the number of stages is exhausted. This process yields the optimal strategy with *N* decision-making stages that maximizes the probability of correct selection.

**3.4. Method of evaluating *V<sub>i,j</sub>*<sup>\*</sup>(*N*)**

In what follows, the procedure to evaluate *V<sub>i,j</sub>*<sup>\*</sup>(*N*) of equation (53) is given in detail.

Using *d<sub>i,j</sub>*<sup>\*</sup>(*k*) as the minimum acceptable belief, from equation (53) we have

$$\begin{aligned} V\_{i,j}^{\ast}(N) = {} &\left(B\_{i,j}(i; O\_{k}) - \alpha V\_{i,j}^{\ast}(N-1)\right) \Pr\_{i,j}\left\{B\_{i,j}(i; O\_{k}) \ge d\_{i,j}^{\ast}(k)\right\} + \\ &\left(B\_{i,j}(j; O\_{k}) - \alpha V\_{i,j}^{\ast}(N-1)\right) \Pr\_{i,j}\left\{B\_{i,j}(j; O\_{k}) \ge d\_{i,j}^{\ast}(k)\right\} + \alpha V\_{i,j}^{\ast}(N-1) \end{aligned} \tag{54}$$

Then, for the decision-making problem at hand, three cases may occur:

**1.** *B<sub>i,j</sub>*(*i*; *O<sub>k</sub>*) < *αV<sub>i,j</sub>*<sup>\*</sup>(*N* − 1)

In this case, both (*B<sub>i,j</sub>*(*i*; *O<sub>k</sub>*) − *αV<sub>i,j</sub>*<sup>\*</sup>(*N* − 1)) and (*B<sub>i,j</sub>*(*j*; *O<sub>k</sub>*) − *αV<sub>i,j</sub>*<sup>\*</sup>(*N* − 1)) are negative. Since we are maximizing *V<sub>i,j</sub>*(*N*, *d<sub>i,j</sub>*(*k*)), the two probability terms in equation (54) must be minimized. This only happens when *d<sub>i,j</sub>*<sup>\*</sup>(*k*) = 1, making the probability terms equal to zero. In other words, since *B<sub>i,j</sub>*(*i*; *O<sub>k</sub>*) < *d<sub>i,j</sub>*<sup>\*</sup>(*k*) = 1, we continue to the next stage.

**2.** *B<sub>i,j</sub>*(*j*; *O<sub>k</sub>*) > *αV<sub>i,j</sub>*<sup>\*</sup>(*N* − 1)

In this case, (*B<sub>i,j</sub>*(*i*; *O<sub>k</sub>*) − *αV<sub>i,j</sub>*<sup>\*</sup>(*N* − 1)) and (*B<sub>i,j</sub>*(*j*; *O<sub>k</sub>*) − *αV<sub>i,j</sub>*<sup>\*</sup>(*N* − 1)) are both positive, and to maximize *V<sub>i,j</sub>*(*N*, *d<sub>i,j</sub>*(*k*)) we need the two probability terms in equation (54) to be maximized. This only happens when *d<sub>i,j</sub>*<sup>\*</sup>(*k*) = 0.5. In other words, since *B<sub>i,j</sub>*(*i*; *O<sub>k</sub>*) > *d<sub>i,j</sub>*<sup>\*</sup>(*k*) = 0.5, the variable with the index *i* is selected.

**3.** *B<sub>i,j</sub>*(*j*; *O<sub>k</sub>*) < *αV<sub>i,j</sub>*<sup>\*</sup>(*N* − 1) < *B<sub>i,j</sub>*(*i*; *O<sub>k</sub>*)

In this case, one of the probability terms in equation (54) has a positive and the other a negative coefficient. Then, in order to maximize *V<sub>i,j</sub>*(*N*, *d<sub>i,j</sub>*(*k*)), the first derivative with respect to *d<sub>i,j</sub>*(*k*) must be equated to zero. To do this, define *h*(*d<sub>gr,sm</sub>*(*k*)) and *r*(*d<sub>gr,sm</sub>*(*k*)) as follows:
