142 Dynamic Programming and Bayesian Inference, Concepts and Applications

the threshold (minimum acceptable belief) $d_{i,j}^{*}(k)$ and the decision making process continues to the next stage. In terms of the stochastic dynamic programming approach, the probability of this event is equal to the maximum probability of correct selection when there are $N-1$ stages remaining. The discounted value of this probability in the current stage, using the discount factor $\alpha$, equals $\alpha V_{i,j}^{*}(N-1)$. Further, since we partitioned the decision space into the events $\{NS_{i,j}; S_i; S_j\}$, we have:

$$\Pr\nolimits_{i,j}\{NS_{i,j}\} = 1 - \left( \Pr\nolimits_{i,j}\{S_i\} + \Pr\nolimits_{i,j}\{S_j\} \right) \tag{51}$$

Now we evaluate $V_{i,j}^{*}(N)$ as follows,

$$V_{i,j}^{*}(N) = \max_{0.5 \le d_{i,j}(k) \le 1} \left\{ \begin{aligned} & B_{i,j}(i; x_k, O_{k-1}) \Pr\left\{ B_{i,j}(i; x_k, O_{k-1}) \ge d_{i,j}(k) \right\} \\ & + B_{i,j}(j; x_k, O_{k-1}) \Pr\left\{ B_{i,j}(j; x_k, O_{k-1}) \ge d_{i,j}(k) \right\} \\ & + \alpha V_{i,j}^{*}(N-1) \left( 1 - \Pr\left\{ B_{i,j}(i; x_k, O_{k-1}) \ge d_{i,j}(k) \right\} - \Pr\left\{ B_{i,j}(j; x_k, O_{k-1}) \ge d_{i,j}(k) \right\} \right) \end{aligned} \right\} \tag{52}$$

In other words,

$$V_{i,j}^{*}(N) = \max_{0.5 \le d_{i,j}(k) \le 1} \left\{ \begin{aligned} & B_{i,j}(i; O_k) \Pr\left\{ B_{i,j}(i; O_k) \ge d_{i,j}(k) \right\} \\ & + B_{i,j}(j; O_k) \Pr\left\{ B_{i,j}(j; O_k) \ge d_{i,j}(k) \right\} \\ & + \alpha V_{i,j}^{*}(N-1) \left( 1 - \Pr\left\{ B_{i,j}(i; O_k) \ge d_{i,j}(k) \right\} - \Pr\left\{ B_{i,j}(j; O_k) \ge d_{i,j}(k) \right\} \right) \end{aligned} \right\} \tag{53}$$

The method of evaluating the minimum acceptable belief $d_{gr,sm}^{*}(k)$ is given in Appendix 2.

**Step VIII: The Decision Step**

If the belief $B_{gr,sm}(gr; x_k, O_{k-1})$ in the candidate set $(sm, gr)$ is equal to or greater than $d_{gr,sm}^{*}(k)$, then choose the variable with index $gr$ to be out-of-control. In this case, the decision-making process ends. Otherwise, without making any selection at this stage, obtain another observation, lower the number of remaining decision stages to $N-1$, set $k = k+1$, and return to step **V** above. The process continues until either the stopping condition is reached or no stages remain. This process yields the optimal strategy with $N$ decision-making stages, that is, the strategy that maximizes the probability of correct selection.
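The stopping rule of this step can be illustrated with a minimal Python sketch. It assumes that the beliefs $B_{gr,sm}(gr; x_k, O_{k-1})$ and the thresholds $d_{gr,sm}^{*}(k)$ have already been computed for each stage and are passed in as plain sequences; the function name `run_decision_stages` and its arguments are hypothetical and do not come from the chapter.

```python
from typing import Optional, Sequence


def run_decision_stages(beliefs: Sequence[float],
                        thresholds: Sequence[float],
                        n_stages: int) -> Optional[int]:
    """Return the 0-based stage at which `gr` is declared out-of-control.

    Returns None when the available stages run out without a selection.
    `beliefs[k]` and `thresholds[k]` are assumed to be precomputed inputs.
    """
    remaining = n_stages
    for stage, (belief, d_star) in enumerate(zip(beliefs, thresholds)):
        if remaining == 0:
            break  # no decision stages left
        if belief >= d_star:
            return stage  # select `gr`; the process ends here
        remaining -= 1  # no selection: take another observation
    return None


# Illustrative numbers only: the belief crosses the threshold at the third stage.
print(run_decision_stages([0.6, 0.7, 0.95], [0.9, 0.9, 0.9], n_stages=3))
```

With only two stages available, the same inputs would end without a selection, which corresponds to the case where the number of stages is exhausted.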

In what follows, the procedure to evaluate $V_{i,j}^{*}(N)$ of equation (53) is given in detail.

#### **3.4. Method of evaluating** $V_{i,j}^{*}(N)$

Using $d_{i,j}^{*}(k)$ as the minimum acceptable belief, from equation (53) we have

$$\begin{aligned} V_{i,j}^{*}(N) ={}& \left( B_{i,j}(i; O_k) - \alpha V_{i,j}^{*}(N-1) \right) \Pr\left\{ B_{i,j}(i; O_k) \ge d_{i,j}^{*}(k) \right\} \\ &+ \left( B_{i,j}(j; O_k) - \alpha V_{i,j}^{*}(N-1) \right) \Pr\left\{ B_{i,j}(j; O_k) \ge d_{i,j}^{*}(k) \right\} + \alpha V_{i,j}^{*}(N-1) \end{aligned} \tag{54}$$
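As a quick numerical check of equation (54), the following Python sketch evaluates its right-hand side for fixed inputs and compares it with the bracketed expression of equation (53). The two probability terms are treated here as given numbers `p_i` and `p_j`, which is a simplifying assumption: in the chapter they depend on the threshold and are evaluated in Appendix 2. All function and argument names are hypothetical.

```python
import math


def stage_value(b_i, b_j, p_i, p_j, alpha, v_next):
    """Right-hand side of equation (54) for fixed probability terms."""
    continuation = alpha * v_next  # discounted value of waiting one more stage
    return (b_i - continuation) * p_i + (b_j - continuation) * p_j + continuation


def stage_value_expanded(b_i, b_j, p_i, p_j, alpha, v_next):
    """Same quantity written as the bracketed term of equation (53)."""
    continuation = alpha * v_next
    return b_i * p_i + b_j * p_j + continuation * (1.0 - p_i - p_j)


# Illustrative numbers only (not taken from the chapter).
args = (0.8, 0.2, 0.5, 0.0, 0.9, 0.7)
assert math.isclose(stage_value(*args), stage_value_expanded(*args))
```

Written this way, the sign of each coefficient `(b - continuation)` shows directly whether increasing the corresponding probability term raises or lowers the stage value.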

Then, for the decision-making problem at hand, three cases may occur (taking $i$ as the index with the larger belief, as the ordering in case 3 indicates):

$$\mathbf{1.}\quad B_{i,j}(i; O_k) \le \alpha V_{i,j}^{*}(N-1)$$

In this case, both $\left(B_{i,j}(i; O_k) - \alpha V_{i,j}^{*}(N-1)\right)$ and $\left(B_{i,j}(j; O_k) - \alpha V_{i,j}^{*}(N-1)\right)$ are non-positive. Since we are maximizing $V_{i,j}(N, d_{i,j}(k))$, the two probability terms in equation (54) must be minimized. This only happens when $d_{i,j}^{*}(k) = 1$, making the probability terms equal to zero. In other words, since $B_{i,j}(i; O_k) < d_{i,j}^{*}(k) = 1$, we continue to the next stage.

$$\mathbf{2.}\quad B_{i,j}(j; O_k) \ge \alpha V_{i,j}^{*}(N-1)$$

In this case, $\left(B_{i,j}(i; O_k) - \alpha V_{i,j}^{*}(N-1)\right)$ and $\left(B_{i,j}(j; O_k) - \alpha V_{i,j}^{*}(N-1)\right)$ are both non-negative, and to maximize $V_{i,j}(N, d_{i,j}(k))$ we need the two probability terms in equation (54) to be maximized. This only happens when $d_{i,j}^{*}(k) = 0.5$. In other words, since $B_{i,j}(i; O_k) \ge d_{i,j}^{*}(k) = 0.5$, the variable with the index $i$ is selected.

$$\mathbf{3.}\quad B_{i,j}(j; O_k) < \alpha V_{i,j}^{*}(N-1) < B_{i,j}(i; O_k)$$

In this case, one of the probability terms in equation (54) has a positive coefficient and the other a negative one. Then, in order to maximize $V_{i,j}(N, d_{i,j}(k))$, the first derivative with respect to $d_{i,j}(k)$ must be set to zero. To do this, define $h\left(d_{gr,sm}(k)\right)$ and $r\left(d_{gr,sm}(k)\right)$ as follows:

$$h\left(d_{gr,sm}(k)\right) = \frac{d_{gr,sm}(k)\, B_{gr}\left(gr, O_{k-1}\right)}{\left(1 - d_{gr,sm}(k)\right) B_{sm}\left(sm, O_{k-1}\right)} \tag{55}$$

$$r\left(d_{gr,sm}(k)\right) = \frac{d_{gr,sm}(k)\, B_{sm}\left(sm, O_{k-1}\right)}{\left(1 - d_{gr,sm}(k)\right) B_{gr}\left(gr, O_{k-1}\right)} \tag{56}$$

We first present the method of evaluating $\Pr\left\{B_{gr,sm}(sm; O_k) \ge d_{gr,sm}(k)\right\}$ as follows.

$$\begin{aligned} \Pr\left\{ B_{gr,sm}(sm; O_k) \ge d_{gr,sm}(k) \right\} &= \Pr\left\{ \frac{B_{sm}\left(sm, O_{k-1}\right) e^{\left(T_{k,sm}\right)^2}}{B_{sm}\left(sm, O_{k-1}\right) e^{\left(T_{k,sm}\right)^2} + B_{gr}\left(gr, O_{k-1}\right) e^{\left(T_{k,gr}\right)^2}} \ge d_{gr,sm}(k) \right\} \\ &= \Pr\left\{ e^{\left(T_{k,sm}\right)^2} \ge h\left(d_{gr,sm}(k)\right) e^{\left(T_{k,gr}\right)^2} \right\} \end{aligned} \tag{57}$$

The method of evaluating the probability terms in equation (57) is given in Appendix 2. With similar reasoning, we have

$$\Pr\left\{ B_{gr,sm}\left(gr; O_k\right) \ge d_{gr,sm}(k) \right\} = \Pr\left\{ e^{\left(T_{k,gr}\right)^2} \ge r\left(d_{gr,sm}(k)\right) e^{\left(T_{k,sm}\right)^2} \right\} \tag{58}$$

The method of determining the minimum acceptable belief is given in Appendix 2.

Using Dynamic Programming Based on Bayesian Inference in Selection Problems

http://dx.doi.org/10.5772/57423

ering our experiences and finally, the beliefs are updated and the optimal decision is selected based on the current situation. In an SPC problem, a similar decision-making process exists. First, the decision space can be divided into two candidates: an in-control or an out-of-control production process. Second, the problem solution is one of the candidates (in-control or out-of-control process). Finally, a belief is assigned to each candidate so that the belief shows the probability of a fault being present in the process. Based upon the updated belief, we may decide about the state of the process (in-control or out-of-control).

**4.1. Learning: The beliefs and approach for its improvement**

For simplicity, an individual observation on the quality characteristic of interest is gathered in each iteration of the data gathering process. At iteration $k$ of the data gathering process, $O_k = (x_1, x_2, \ldots, x_k)$ is defined as the observation vector containing the observations for iterations $1, 2, \ldots, k$. After taking a new observation $x_k$ following $O_{k-1}$, the belief of being in an out-of-control state is defined as $B(x_k, O_{k-1}) = \Pr\{\text{Out-of-control} \mid x_k, O_{k-1}\}$. At this iteration, we want to update the belief of being in an out-of-control state based on the observation vector $O_{k-1}$ and the new observation $x_k$. If we define $B(O_{k-1}) = B(x_{k-1}, O_{k-2})$ as the prior belief of an out-of-control state, then, in order to obtain the posterior belief $B(x_k, O_{k-1})$, and since we may assume that the observations are taken independently in each iteration, we have

$$\Pr\{x_k \mid \text{Out-of-control}, O_{k-1}\} = \Pr\{x_k \mid \text{Out-of-control}\} \tag{59}$$

With this feature, the updated belief is obtained using the Bayesian rule:

$$\begin{aligned} B(x_k, O_{k-1}) &= \Pr\{\text{Out-of-control} \mid x_k, O_{k-1}\} = \frac{\Pr\{\text{Out-of-control}, x_k \mid O_{k-1}\}}{\Pr\{x_k \mid O_{k-1}\}} \\ &= \frac{\Pr\{\text{Out-of-control} \mid O_{k-1}\} \Pr\{x_k \mid \text{Out-of-control}, O_{k-1}\}}{\Pr\{x_k \mid O_{k-1}\}} \end{aligned} \tag{60}$$

Since the in-control and out-of-control states partition the decision space, we can write equation (60) as

$$\begin{aligned} B(x_k, O_{k-1}) &= \frac{\Pr\{\text{Out-of-control} \mid O_{k-1}\} \Pr\{x_k \mid \text{Out-of-control}\}}{\Pr\{\text{Out-of-control} \mid O_{k-1}\} \Pr\{x_k \mid \text{Out-of-control}\} + \Pr\{\text{In-control} \mid O_{k-1}\} \Pr\{x_k \mid \text{In-control}\}} \\ &= \frac{B(O_{k-1}) \Pr\{x_k \mid \text{Out-of-control}\}}{B(O_{k-1}) \Pr\{x_k \mid \text{Out-of-control}\} + \left(1 - B(O_{k-1})\right) \Pr\{x_k \mid \text{In-control}\}} \end{aligned} \tag{61}$$

Assuming the quality characteristic of interest follows a normal distribution with mean μ and variance σ², we use equation (61) to calculate both beliefs for the occurrence of positive or negative shifts in the process mean μ.
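The Bayesian update of equation (61) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the observation is taken to be normally distributed with a known standard deviation, the in-control mean `mu_in` and the shifted out-of-control mean `mu_out` are hypothetical parameters chosen for the example, and the function names are not from the chapter.

```python
import math


def normal_pdf(x, mean, sigma):
    """Density of a normal distribution with the given mean and std. deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def update_belief(prior, x, mu_in, mu_out, sigma):
    """Posterior belief of an out-of-control state, following equation (61).

    `prior` plays the role of B(O_{k-1}); `mu_in` and `mu_out` are assumed
    in-control and out-of-control means (hypothetical example parameters).
    """
    like_out = normal_pdf(x, mu_out, sigma)  # Pr{x_k | Out-of-control}
    like_in = normal_pdf(x, mu_in, sigma)    # Pr{x_k | In-control}
    numerator = prior * like_out
    return numerator / (numerator + (1.0 - prior) * like_in)


# Sequential use: each posterior becomes the prior for the next observation.
belief = 0.5
for x in [0.2, 1.1, 0.9]:  # illustrative observations only
    belief = update_belief(belief, x, mu_in=0.0, mu_out=1.0, sigma=1.0)
```

An observation exactly halfway between the two assumed means is equally likely under both states, so it leaves the belief unchanged; observations closer to `mu_out` raise it.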
