## **2.1 Problem formulation**

The *L*-sensor parallel Bayesian detection network structure with two hypotheses $H_0$ and $H_1$ in the presence of nonideal channels is considered (see **Figure 1**). Assume that $y_1, y_2, \dots, y_L$ are the sensor observations and that the $j$th sensor compresses the $n_j$-dimensional vector observation $y_j$ to one bit: $I_j(y_j): \mathbb{R}^{n_j} \to \{0, 1\}$, $j = 1, \dots, L$. For notational convenience, $n_j = 1$ in the following description. The $L$ sensors transmit the compressed data to the fusion center, and the fusion center makes the decision between $H_0$ and $H_1$. Since external interference and internal errors may occur, the channels are not reliable and the fusion center may not correctly receive the symbol $I_j$ sent by the $j$th sensor. Let $I_j^0$ denote the bit received by the fusion center for $j = 1, 2, \dots, L$. Generally speaking, $I_j^0$ may not be equal to $I_j$. The definition of and assumptions on channel errors (see, e.g., [29, 31]) are summarized below:

Definition 1: The channel errors between the $j$th sensor and the fusion center are described by $P_j^{ce1} = P\big(I_j^0 = 0 \mid I_j = 1\big)$ and $P_j^{ce0} = P\big(I_j^0 = 1 \mid I_j = 0\big)$ for $j = 1, 2, \dots, L$, where $P_j^{ce1}$ is the probability of channel error when the $j$th sensor sends 1 but the fusion center receives 0, and $P_j^{ce0}$ is the probability of channel error when the $j$th sensor sends 0 but the fusion center receives 1.

### **Figure 1.**

*The L-sensor parallel binary Bayesian detection network structure in the presence of nonideal channels.*

Assumption 1: The probabilities of channel error are statistically independent of the hypotheses, namely $P\big(I_j^0 \mid I_j, H_\nu\big) = P\big(I_j^0 \mid I_j\big)$, $\nu = 0, 1$.

Remark 1: Assumption 1 follows from the hierarchical structure of the network and the Markov property (see [29]).

Assumption 2: The channels that connect the sensors to the fusion center are mutually independent, i.e., $P\big(I_1^0, I_2^0, \dots, I_L^0 \mid I_1, I_2, \dots, I_L\big) = \prod_{j=1}^{L} P\big(I_j^0 \mid I_j\big)$.

We consider the parallel binary Bayesian detection network with nonideal channels that is built on the above definition and assumptions. The final decision is made by the fusion center based on the binary bits $\big(I_1^0, I_2^0, \dots, I_L^0\big)$ received from the $L$ sensors. From the definition of a general Bayesian cost function given in [25], the $L$-sensor binary Bayesian cost function with channel errors at the fusion center can be written as follows:

$$\begin{aligned} C\big(I_1^0(y_1), \dots, I_L^0(y_L); F^0\big) &= C_{00} P_0 P\big(F^0 = 0 \mid H_0\big) + C_{01} P_1 P\big(F^0 = 0 \mid H_1\big) \\ &\quad + C_{10} P_0 P\big(F^0 = 1 \mid H_0\big) + C_{11} P_1 P\big(F^0 = 1 \mid H_1\big) \end{aligned} \tag{1}$$

$$= c + a P\big(F^0 = 0 \mid H_1\big) - b P\big(F^0 = 0 \mid H_0\big), \tag{2}$$

where $C_{\alpha\beta}$, $\alpha, \beta = 0, 1$, are suitable cost coefficients, $P_0$ and $P_1$ are the prior probabilities of the hypotheses $H_0$ and $H_1$, respectively, $F^0$ is the fusion rule, and $P\big(F^0 = \mu \mid H_\nu\big)$, $\mu, \nu = 0, 1$, denotes the conditional probability that the fusion center decides in favor of hypothesis $H_\mu$ when the true hypothesis is $H_\nu$. The cost function (1) is simplified to (2) by using $P\big(F^0 = 1 \mid H_\nu\big) = 1 - P\big(F^0 = 0 \mid H_\nu\big)$ and defining $c = C_{10} P_0 + C_{11} P_1$, $a = P_1 (C_{01} - C_{11})$, and $b = P_0 (C_{10} - C_{00})$. $F^0$ is actually a function of the disjoint set of all possible binary messages $\big(I_1^0, I_2^0, \dots, I_L^0\big)$. The received decisions are divided into two sets, denoted $H_0^0$ and $H_1^0$, which are given by

$$H_0^0 = \left\{ \big(u_1^0, u_2^0, \dots, u_L^0\big) : F^0\big(I_1^0, I_2^0, \dots, I_L^0\big) = 0,\; I_j^0 = u_j^0,\; u_j^0 = 0/1,\; j = 1, \dots, L \right\};$$

$$H_1^0 = \left\{ \big(u_1^0, u_2^0, \dots, u_L^0\big) : F^0\big(I_1^0, I_2^0, \dots, I_L^0\big) = 1,\; I_j^0 = u_j^0,\; u_j^0 = 0/1,\; j = 1, \dots, L \right\}.$$

Obviously, $H^0 = \left\{ \big(u_1^0, u_2^0, \dots, u_L^0\big) : I_j^0 = u_j^0,\; u_j^0 = 0/1,\; j = 1, \dots, L \right\} = H_0^0 \cup H_1^0$.
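To fix ideas, a fusion rule over the received bits can be stored as a truth table with $2^L$ entries, and $H_0^0$ and $H_1^0$ are simply its preimages of 0 and 1. A minimal Python sketch follows (a hypothetical majority-vote rule; the names are ours, not from the chapter):

```python
from itertools import product

L = 3
# A fusion rule as a truth table: here, majority vote over the received bits.
F0 = {bits: int(sum(bits) > L / 2) for bits in product((0, 1), repeat=L)}

H0_0 = [bits for bits, dec in F0.items() if dec == 0]  # received tuples mapped to H_0
H0_1 = [bits for bits, dec in F0.items() if dec == 1]  # received tuples mapped to H_1
assert len(H0_0) + len(H0_1) == 2 ** L                 # H^0 = H_0^0 (union) H_1^0
```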

For any binary decisions $\big(I_1^0, I_2^0, \dots, I_L^0\big)$ received by the fusion center, the original sensor decision bits before transmission are $(I_1, I_2, \dots, I_L)$, and they constitute the set $H = \left\{ (u_1, u_2, \dots, u_L) : I_j = u_j,\; u_j = 0/1,\; j = 1, \dots, L \right\}$. Therefore, based on the law of total probability, the conditional probability formula, and Assumption 1:

$$P\big(F^0 = 0 \mid H_\nu\big) = \sum_{s^0 \in H_0^0} P\big(D^0 \mid H_\nu\big) = \sum_{s^0 \in H_0^0} \sum_{s \in H} P\big(D^0 \mid D\big) P\big(D \mid H_\nu\big), \tag{3}$$

where *<sup>D</sup>*<sup>0</sup> <sup>¼</sup> *<sup>I</sup>* 0 <sup>1</sup> , *I* 0 <sup>2</sup> , … , *I* 0 *L* � �, *s* <sup>0</sup> <sup>¼</sup> *<sup>s</sup>* <sup>0</sup>ð Þ<sup>1</sup> , … , *<sup>s</sup>* <sup>0</sup> ð Þ ð Þ *<sup>L</sup>* , *<sup>I</sup>* 0 *<sup>j</sup>* ¼ *s* <sup>0</sup>ð Þ*<sup>j</sup>* , and *<sup>s</sup>* <sup>0</sup>ðÞ¼ *<sup>j</sup>* <sup>0</sup>*=*1 is a specific value of *I* 0 *<sup>j</sup>* ; in the same way, *D* ¼ ð Þ *I*1, *I*2, … , *IL* , *s* ¼ ð Þ *s*ð Þ1 , … , *s L*ð Þ , *Ij* ¼ *s j*ð Þ, and *s j*ðÞ¼ <sup>0</sup>*=*1 is a specific value of *Ij*. Strictly speaking, we should use *P D*<sup>0</sup> <sup>¼</sup> *<sup>s</sup>* <sup>0</sup>j*H<sup>ν</sup>* � � to represent *P D*<sup>0</sup>j*H<sup>ν</sup>* � � and we use the latter for notational simplicity. It is similar to *P D*ð Þ j*H<sup>ν</sup>* . Based on Assumption 2:


$$P\big(D^0 \mid D\big) = \prod_{j=1}^{L} P\big(I_j^0 \mid I_j\big), \tag{4}$$

where, for any $1 \le j \le L$,

$$P\big(I_j^0 \mid I_j\big) = \big(1 - P_j^{ce0}\big)\big(1 - I_j^0\big)\big(1 - I_j\big) + P_j^{ce0}\, I_j^0 \big(1 - I_j\big) + \big(1 - P_j^{ce1}\big) I_j^0 I_j + P_j^{ce1}\big(1 - I_j^0\big) I_j. \tag{5}$$
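To make the channel model concrete, here is a minimal Python sketch of (5) and (4); the function names are ours, not from the chapter:

```python
import numpy as np

def channel_prob(i_rx, i_tx, p_ce0, p_ce1):
    """P(I_j^0 = i_rx | I_j = i_tx) for one binary asymmetric channel,
    written exactly in the polynomial form of Eq. (5)."""
    return ((1 - p_ce0) * (1 - i_rx) * (1 - i_tx)  # 0 sent, 0 received
            + p_ce0 * i_rx * (1 - i_tx)            # 0 sent, 1 received (error)
            + (1 - p_ce1) * i_rx * i_tx            # 1 sent, 1 received
            + p_ce1 * (1 - i_rx) * i_tx)           # 1 sent, 0 received (error)

def joint_channel_prob(rx_bits, tx_bits, p_ce0, p_ce1):
    """P(D^0 | D) under Assumption 2 (independent channels), Eq. (4)."""
    return float(np.prod([channel_prob(r, t, e0, e1)
                          for r, t, e0, e1 in zip(rx_bits, tx_bits, p_ce0, p_ce1)]))
```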

Thus, the cost function (2) becomes

$$C\big(I_1^0(y_1), \dots, I_L^0(y_L); F^0\big) = c + \sum_{s^0 \in H_0^0} \sum_{s \in H} P\big(D^0 \mid D\big)\big[a P(D \mid H_1) - b P(D \mid H_0)\big] \tag{6}$$

$$\triangleq C\big(I_1(y_1), \dots, I_L(y_L); F^0; P^{ce0}, P^{ce1}\big), \tag{7}$$

where $P^{ce0} = \big(P_1^{ce0}, \dots, P_L^{ce0}\big)$ and $P^{ce1} = \big(P_1^{ce1}, \dots, P_L^{ce1}\big)$. Hence, the cost function is now a function of the sensor rules $(I_1, \dots, I_L)$, the probabilities of channel errors $P^{ce0}, P^{ce1}$, and the fusion rule $F^0$. The goal of this chapter is to optimize the sensor rules and the fusion rule so as to minimize the cost function when the probabilities of channel errors are known.

We rewrite $a P(D \mid H_1) - b P(D \mid H_0)$ as follows:

$$\begin{aligned} a P(D \mid H_1) - b P(D \mid H_0) &= \int_{\Omega_s} \big[a\, p(y_1, \dots, y_L \mid H_1) - b\, p(y_1, \dots, y_L \mid H_0)\big]\, dy_1 \cdots dy_L \\ &= \int I_{\Omega_s} \big[a\, p(y_1, \dots, y_L \mid H_1) - b\, p(y_1, \dots, y_L \mid H_0)\big]\, dy_1 \cdots dy_L, \end{aligned} \tag{8}$$

where $\Omega_s = \big\{ (y_1, \dots, y_L) : I_1(y_1) = s(1), \dots, I_L(y_L) = s(L) \big\}$, $I_{\Omega_s}$ is an indicator function on $\Omega_s$, and the region of integration in the second line of (8) is the full space. Assume that $p(y_1, y_2, \dots, y_L \mid H_\nu)$, $\nu = 0, 1$ (or $p(y \mid H_\nu)$) are the known conditional joint probability density functions. If they are unknown, the joint probability density functions can be learned from training data using copula functions (see, e.g., [17]). Note that $I_1(y_1), \dots, I_L(y_L)$ are indicator functions and $s(j) = 0/1$, $j = 1, \dots, L$, so

$$\begin{aligned} I_{\Omega_s} &= I_{\{(y_1, \dots, y_L) \,:\, I_1(y_1) = s(1), \dots, I_L(y_L) = s(L)\}} \\ &= I_{\{y_1 \,:\, I_1(y_1) = s(1)\}} \cdots I_{\{y_L \,:\, I_L(y_L) = s(L)\}} \\ &= \big[(1 - I_1)(1 - s(1)) + I_1 s(1)\big] \cdots \big[(1 - I_L)(1 - s(L)) + I_L s(L)\big]. \end{aligned} \tag{9}$$

For simplicity, denote $Q_j(I_j) = (1 - I_j)(1 - s(j)) + I_j s(j)$. Substituting (8) into (6),

$$\begin{aligned} &C\big(I_1(y_1), \dots, I_L(y_L); F^0; P^{ce0}, P^{ce1}\big) \\ &= c + \sum_{s^0 \in H_0^0} \sum_{s \in H} P\big(D^0 \mid D\big) \int Q_1(I_1) \cdots Q_L(I_L)\big[a\, p(y \mid H_1) - b\, p(y \mid H_0)\big]\, dy = c + \int P_{H_0^0}\, \hat{L}(y)\, dy, \end{aligned} \tag{10}$$

where $P_{H_0^0} = \sum_{s^0 \in H_0^0} \sum_{s \in H} P\big(D^0 \mid D\big) Q_1(I_1) \cdots Q_L(I_L)$ and $\hat{L}(y) = a\, p(y \mid H_1) - b\, p(y \mid H_0)$. Note that from the definitions of $H_0^0$, $H_1^0$, $H^0$, and $H$, we have

$$\begin{aligned} P_{H_0^0} &= \sum_{s^0 \in H^0} \big[1 - F^0(D^0)\big] \sum_{s \in H} P\big(D^0 \mid D\big) Q_1(I_1) \cdots Q_L(I_L) \\ &= \sum_{k'=1}^{2^L} \sum_{k=1}^{2^L} \big[1 - F^0(s_{k'})\big] P(s_{k'} \mid s_k) \cdot \prod_{j=1}^{L} \big\{\big[1 - I_j(y_j)\big]\big[1 - s_k(j)\big] + I_j(y_j)\, s_k(j)\big\}, \end{aligned} \tag{11}$$

where $s_{k'}$ is an element of $H^0$ and $s_k$ is an element of $H$. The fact that $F^0(D^0) = 0/1$ is used in the first equality. The second equality holds since there are $2^L$ elements in both $H$ and $H^0$.
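As a sanity check on (11), $P_{H_0^0}$ can be evaluated by brute force, enumerating the $2^L$ sent patterns $s_k$ and the $2^L$ received patterns $s_{k'}$. The sketch below reuses `joint_channel_prob` from the channel-model sketch above; it is exponential in $L$ and meant only to illustrate the formula:

```python
from itertools import product

def p_H00(sensor_bits, fusion_rule, p_ce0, p_ce1):
    """Brute-force evaluation of P_{H_0^0} in Eq. (11).  `sensor_bits` holds the
    realized sensor decisions I_j(y_j) in {0, 1}; `fusion_rule` maps a received
    L-tuple to 0 or 1."""
    L = len(sensor_bits)
    total = 0.0
    for s_kp in product((0, 1), repeat=L):       # received pattern s_{k'}
        if fusion_rule(s_kp) == 1:               # factor [1 - F^0(s_{k'})] vanishes
            continue
        for s_k in product((0, 1), repeat=L):    # sent pattern s_k
            q = 1.0                              # product of the Q_j terms
            for i_j, s_kj in zip(sensor_bits, s_k):
                q *= (1 - i_j) * (1 - s_kj) + i_j * s_kj
            total += joint_channel_prob(s_kp, s_k, p_ce0, p_ce1) * q
    return total
```

For binary $I_j(y_j)$ the product of $Q_j$ terms vanishes unless $s_k$ coincides with the realized decisions, so the double sum collapses to a single sum over $s_{k'}$; the full form is kept here to mirror (11).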

## **2.2 Monte Carlo cost function**

An essential difficulty of the Bayesian cost function (10) is the high-dimensional integration it requires for large-scale sensor networks. Monte Carlo importance sampling is an attractive method for dealing with this problem. In this subsection, we approximate the Bayesian cost function (10) by the Monte Carlo importance sampling method (see, e.g., [34, 35]). According to (10),

$$C\big(I_1(y_1), \dots, I_L(y_L); F^0; P^{ce0}, P^{ce1}\big) = c + \int P_{H_0^0}\big(I_1(y_1), \dots, I_L(y_L); F^0; P^{ce0}, P^{ce1}\big)\, \frac{\hat{L}(y)}{g(y)}\, g(y)\, dy \tag{12}$$

$$= \mathbb{E}_g\!\left[\frac{P_{H_0^0}(Y)\, \hat{L}(Y)}{g(Y)}\right] + c, \tag{13}$$

where $y = (y_1, y_2, \dots, y_L)$ and $g(y)$ is a given importance sampling density such that (12) is well defined (i.e., $g(y) > 0$). In (13), the expectation is taken with respect to the importance sampling density $g$. Consequently, assume that $N$ samples $Y_1, \dots, Y_N$ are generated from the density $g$, that is, $Y_i \sim g(y)$, where $Y_i = [Y_{i1}, Y_{i2}, \dots, Y_{iL}]$. Then

$$C\big(I_1(y_1), \dots, I_L(y_L); F^0; P^{ce0}, P^{ce1}\big) \approx \frac{1}{N} \sum_{i=1}^{N} \frac{P_{H_0^0}(Y_{i1}, Y_{i2}, \dots, Y_{iL})\, \hat{L}(Y_i)}{g(Y_i)} + c \tag{14}$$

$$\triangleq C_{MC}\big(I_1(y_1), \dots, I_L(y_L); F^0; P^{ce0}, P^{ce1}, N\big). \tag{15}$$

Based on the strong law of large numbers, the expectation (13) can be approximated by the empirical average (14). We denote (14), namely the Monte Carlo cost function, by $C_{MC}\big(I_1(y_1), \dots, I_L(y_L); F^0; P^{ce0}, P^{ce1}, N\big)$. The optimal importance sampling density satisfies $g(y_1, y_2, \dots, y_L) \propto \big|P_{H_0^0}\, \hat{L}(y_1, y_2, \dots, y_L)\big|$ (see, e.g., [34, 35]).
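A minimal sketch of the Monte Carlo approximation (14), assuming the `p_H00` helper from the previous sketch and user-supplied densities (all names here are ours):

```python
import numpy as np

def mc_cost(rules, fusion_rule, p_ce0, p_ce1, a, b, c,
            sample_g, pdf_g, pdf_h0, pdf_h1, N=10_000, seed=0):
    """Monte Carlo approximation (14) of the Bayesian cost (10).
    `rules[j]` maps the scalar observation y_j to a bit; `sample_g`/`pdf_g`
    define the importance sampling density g; `pdf_h0`/`pdf_h1` are the joint
    observation densities under H_0 and H_1."""
    rng = np.random.default_rng(seed)
    Y = sample_g(rng, N)                       # N x L array of samples Y_i ~ g
    est = 0.0
    for y in Y:
        bits = [rule(yj) for rule, yj in zip(rules, y)]
        L_hat = a * pdf_h1(y) - b * pdf_h0(y)  # \hat{L}(y)
        est += p_H00(bits, fusion_rule, p_ce0, p_ce1) * L_hat / pdf_g(y)
    return c + est / N
```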

The initial goal is to minimize the Bayesian cost function (10). Instead, we can minimize the Monte Carlo cost function (15) by selecting a set of optimal sensor rules $I_1(y_1), I_2(y_2), \dots, I_L(y_L)$ and an optimal fusion rule $F^0$. In this manner, the high-dimensional integration problem is converted into a problem with a single-summation objective function for large-scale sensor networks. Thus, for dependent observations with channel errors, the computational complexity is reduced significantly by the Monte Carlo importance sampling method. In the following sections, we assume that the samples drawn from the importance sampling density are fixed, so that $C_{MC}\big(I_1, \dots, I_L; F^0; P^{ce0}, P^{ce1}, N\big)$ does not have any randomness, since only deterministic decision rules are considered in this chapter.
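For instance, under assumed unit-variance Gaussian shift-in-mean hypotheses with equal priors and 0/1 costs (so $a = b = c = 0.5$), threshold sensor rules, an OR fusion rule, and $g$ taken as the observation density under $H_1$, the sketch above could be exercised as follows (again, a hypothetical setup, not an example from the chapter):

```python
import numpy as np
from scipy.stats import multivariate_normal

L = 3
h0 = multivariate_normal(mean=np.zeros(L))   # p(y | H_0)
h1 = multivariate_normal(mean=np.ones(L))    # p(y | H_1)

rules = [lambda yj: int(yj > 0.5)] * L       # I_j(y_j) = 1{y_j > 0.5}
fusion = lambda bits: int(any(bits))         # F^0: OR rule
p_ce0 = p_ce1 = [0.05] * L                   # 5% crossover probability per channel

cost = mc_cost(rules, fusion, p_ce0, p_ce1, a=0.5, b=0.5, c=0.5,
               sample_g=lambda rng, N: rng.multivariate_normal(np.ones(L), np.eye(L), N),
               pdf_g=h1.pdf, pdf_h0=h0.pdf, pdf_h1=h1.pdf, N=20_000)
print(cost)
```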
