**2. Mathematical models**

First, we describe the measurement model. The concentration measurements are modelled using a Lagrange encounters model developed in [13], based on an open field assumption and a two-dimensional geometry. Let the position of the *i*th robotic vehicle ($i = 1, 2, \dots, N$) at time $t_k$ be denoted by $\mathbf{r}_k^i \in \mathbb{R}^2$. Suppose that the emitting source is located at coordinates specified by the vector $\mathbf{r}_0 = [X_0, Y_0]^\intercal$ and that its release rate, or strength, is $Q_0$. The goal of the search is to detect and estimate the source-parameter vector $\eta_0 = [\mathbf{r}_0^\intercal \; Q_0]^\intercal$ in the shortest possible time. The particles released from the source propagate with combined molecular and turbulent isotropic diffusivity $D$, but can also be advected by wind. The released particles have an average lifetime $\tau$ before being absorbed. Let the *average* wind be characterised by its speed $U$ and its direction, which by convention coincides with the direction of the $x$ axis. Suppose a spherical concentration-measuring sensor of small radius $a$ is mounted on the *i*th robot, whose position at time $k$ is<sup>1</sup> $\mathbf{r}_k^i = [x_k^i, y_k^i]^\intercal$. This sensor will experience a series of encounters with the particles released from the emitting source. The average rate of encounters can be modelled as follows [13]:

$$R\left(\eta_{0},\mathbf{r}_{k}^{i}\right) = \frac{Q_{0}}{\ln\left(\frac{\lambda}{a}\right)} \exp\left[\frac{\left(X_{0} - x_{k}^{i}\right)U}{2D}\right] \cdot K_{0}\left(\frac{d_{k}^{i}\left(\mathbf{r}_{0},\mathbf{r}_{k}^{i}\right)}{\lambda}\right) \tag{1}$$

where $D$, $\tau$ and $U$ are known environmental parameters,

$$d_k^i\left(\mathbf{r}_0, \mathbf{r}_k^i\right) = \sqrt{\left(x_k^i - X_0\right)^2 + \left(y_k^i - Y_0\right)^2}$$

is the distance between the source and the *i*th sensor platform, $K_0$ is the modified Bessel function of the second kind of order zero, and

$$\lambda = \sqrt{\frac{D\tau}{1 + \frac{U^2\tau}{4D}}}$$

depends on environmental parameters only.
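To make Eq. (1) concrete, the following minimal Python sketch evaluates the mean encounter rate for one sensor position. It is an illustration under stated assumptions, not the authors' implementation: the function names (`bessel_k0`, `plume_length_scale`, `encounter_rate`) and the parameter values in the usage lines are hypothetical, and $K_0$ is approximated with the standard integral representation $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$ so that only the standard library is needed.

```python
import math


def bessel_k0(x, t_max=20.0, n=2000):
    """Approximate K0(x), x > 0, by the trapezoidal rule applied to
    K0(x) = integral_0^inf exp(-x * cosh(t)) dt.

    Assumption: the integral is truncated at t_max, which is adequate
    for x larger than about 1e-6 (the integrand is negligible beyond).
    """
    if x <= 0.0:
        raise ValueError("K0 is evaluated here only for x > 0")
    h = t_max / n
    total = 0.5 * math.exp(-x)  # half weight for the end point t = 0
    for j in range(1, n + 1):
        total += math.exp(-x * math.cosh(j * h))
    return h * total


def plume_length_scale(D, tau, U):
    """lambda = sqrt(D * tau / (1 + U^2 * tau / (4 * D)))."""
    return math.sqrt(D * tau / (1.0 + U * U * tau / (4.0 * D)))


def encounter_rate(source, robot_xy, D, tau, U, a):
    """Mean rate of encounters R(eta_0, r) of Eq. (1).

    source   = (X0, Y0, Q0): source position and release rate
    robot_xy = (x, y): sensor position, which must differ from the source
    """
    X0, Y0, Q0 = source
    x, y = robot_xy
    lam = plume_length_scale(D, tau, U)
    d = math.hypot(x - X0, y - Y0)
    if d == 0.0:
        raise ValueError("sensor position must not coincide with the source")
    wind_term = math.exp((X0 - x) * U / (2.0 * D))
    return Q0 / math.log(lam / a) * wind_term * bessel_k0(d / lam)


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the formula.
    src = (10.0, 5.0, 2.0)
    print(encounter_rate(src, (4.0, 5.0), D=1.0, tau=250.0, U=0.7, a=0.1))
```

Keeping $\lambda$ in its own function mirrors the remark in the text that it depends on the environmental parameters only, so it can be computed once per search.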

The number $z \in \mathbb{Z}^{+} \cup \{0\}$ of dispersed particles (a non-negative integer) that hit a sensor at location $\mathbf{r}_k^i$ during a time interval $t_0$ is Poisson distributed, i.e., the probability of $z$ hits is

$$\mathcal{P}(z; \mu_k^i) = \frac{\left(\mu_k^i\right)^z}{z!} e^{-\mu_k^i}.\tag{2}$$

The parameter $\mu_k^i = t_0 \cdot R\left(\eta_0, \mathbf{r}_k^i\right)$ in (2) is the mean number of particles expected to reach the sensor at location $\mathbf{r}_k^i$ during the interval $t_0$. Eq. (2) expresses the likelihood function of a concentration measurement $z_k^i$ collected by the *i*th sensor, i.e., $\ell\left(z_k^i \mid \eta_0\right) = \mathcal{P}\left(z_k^i; \mu_k^i\right)$.
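The likelihood of Eq. (2) can be sketched in a few lines of Python. This is a minimal illustration, assuming the mean encounter rate $R$ has already been computed; the function names `poisson_pmf` and `measurement_likelihood` are hypothetical and not part of the original text. The probability is evaluated in log space to avoid overflow of $z!$ for large counts.

```python
import math


def poisson_pmf(z, mu):
    """P(z; mu) = mu**z / z! * exp(-mu) for a non-negative integer z."""
    if z < 0 or mu < 0.0:
        raise ValueError("z must be a non-negative integer and mu >= 0")
    if mu == 0.0:
        return 1.0 if z == 0 else 0.0
    # lgamma(z + 1) equals log(z!)
    return math.exp(z * math.log(mu) - math.lgamma(z + 1) - mu)


def measurement_likelihood(z, rate, t0):
    """Likelihood of a count z given mean rate R and sensing interval t0.

    The Poisson mean is mu = t0 * R, as defined after Eq. (2).
    """
    return poisson_pmf(z, t0 * rate)
```

For example, `measurement_likelihood(0, rate, t0)` gives the probability of a zero reading, which is the typical outcome far from the source where $\mu_k^i$ is small.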

The motion model of a coordinated group of robots is described next. Let the pose vector of the *i*th robot platform at time $t_k$ be denoted by $\theta_k^i = \left[\left(\mathbf{r}_k^i\right)^\intercal, \phi_k^i\right]^\intercal$, where $\mathbf{r}_k^i = [x_k^i, y_k^i]^\intercal$ has already been introduced and $\phi_k^i$ is the vehicle heading. The group of searching vehicles moves in a formation. The centroid of the formation at time $t_k$ is specified by the coordinates:

$$x_k^c = \frac{1}{N} \sum_{i=1}^N x_k^i, \quad y_k^c = \frac{1}{N} \sum_{i=1}^N y_k^i. \tag{3}$$

For each platform $i = 1, \dots, N$, the offset $(\Delta x_i, \Delta y_i)$ from the centroid $(x_k^c, y_k^c)$ is predefined and known to it (i.e., $x_k^i = x_k^c + \Delta x_i$, $y_k^i = y_k^c + \Delta y_i$).
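The relation between the centroid of Eq. (3) and the predefined offsets can be sketched as follows. This is a minimal illustration: the function names and the three-platform formation are hypothetical. Note that Eq. (3) combined with $x_k^i = x_k^c + \Delta x_i$ implies that the offsets sum to zero in each coordinate, which the example formation respects.

```python
def formation_centroid(positions):
    """Centroid (x_c, y_c) of Eq. (3) from a list of (x, y) positions."""
    n = len(positions)
    xc = sum(x for x, _ in positions) / n
    yc = sum(y for _, y in positions) / n
    return xc, yc


def nominal_positions(centroid, offsets):
    """Platform positions implied by the centroid and the known offsets."""
    xc, yc = centroid
    return [(xc + dx, yc + dy) for dx, dy in offsets]


if __name__ == "__main__":
    # Hypothetical formation of three platforms; offsets sum to zero.
    offsets = [(1.0, 0.0), (-0.5, 0.5), (-0.5, -0.5)]
    positions = nominal_positions((2.0, 3.0), offsets)
    print(positions, formation_centroid(positions))
```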

The measurements of concentration are taken at time instants $t_k$, $k = 1, 2, \cdots$. Between two consecutive sensing instants, each platform is moving. Let the duration of this interval (referred to as the *travel time*) for the *i*th platform be $T_k^i \geq 0$. The assumption is that sensing is suppressed during the travel time.

Motion of the *i*th platform during the interval $T_k^i$ is controlled by the linear velocity $V_k^i$ and the angular velocity $\Omega_k^i$. Given that the motion control vector $\mathbf{u}_k^i = [V_k^i, \Omega_k^i, T_k^i]^\intercal$ is applied to the *i*th platform, its dynamics during a short integration time interval $\delta \ll T_k^i$ can be modelled by a Markov process whose transitional density is $\pi\left(\theta_t^i \mid \theta_{t-\delta}^i, \mathbf{u}_k^i\right) = \mathcal{N}\left(\theta_t^i; \beta\left(\theta_{t-\delta}^i, \mathbf{u}_k^i\right), \mathbf{Q}\right)$. The process noise covariance matrix $\mathbf{Q}$ captures the uncertainty in motion due to unforeseen disturbances. The vehicle motion function $\beta\left(\theta_{t-\delta}^i, \mathbf{u}_k^i\right)$ is given by Eq. (4).

*Unmanned Robotic Systems and Applications*

disconnected patches. The information gain-based methods [13] have been developed specifically for searching in turbulent flows. In the absence of a smooth distribution of concentration (e.g., due to turbulence), this strategy directs the searching robot(s) towards the highest information gain. As a theoretically principled approach, where the source-parameter estimation is carried out in the Bayesian framework and the searching platform motion control is based on information-theoretic principles, the infotaxic (or cognitive) search strategies have attracted a great deal of interest [3, 14–23].

This chapter summarizes our recent results in the development of an autonomous infotaxic coordinated search strategy for a group of robots searching for an emitting hazardous source in open terrain under turbulent conditions. The assumption is that the search platforms can move and sense. Two types of sensor measurements are collected sequentially: (a) the concentration of the hazardous substance; (b) the platform location within the search domain. Due to the turbulent transport of the emitted substance, the concentration measurements are typically sporadic and fluctuating. The searching platforms form a moving sensor network, thus enabling the exchange of data and cooperative behaviour. Multi-robot infotaxis has already been studied in [16, 17, 20, 24]. However, all mentioned references assumed an *all-to-all* (i.e., fully connected) communication network with *centralised fusion and control* of the searching group.

We develop an approach where the group of searching robots operates in a fully decentralised coordinated manner. Decentralised operation means that each searching robot performs the computations (i.e., source estimation and path planning) locally and independently of other platforms. Having a common task, however, the robotic platforms must perform in a coordinated manner. This coordination is achieved by exchanging the data with immediate neighbours only, in a manner which does not require global knowledge of the communication network topology. For this reason, the proposed approach is scalable in the sense that the complexities of sensing, communication, and computing per sensor platform are independent of the sensor network size. In addition, because all sensor platforms are treated equally (no leader-follower hierarchy), this approach is robust to the failure of any of the searching agents. The only requirement for avoiding the break-up of the searching formation is that the communication graph of the sensor network remains *connected* at all times. Source-parameter estimation is carried out sequentially, and on each platform independently, using a Rao-Blackwellised particle filter. Platform path planning, in the spirit of *infotaxis*, is based on entropy reduction and is also carried out independently on every platform.

<sup>1</sup> Robot locations are assumed not to coincide with the source location $\mathbf{r}_0$.

$$\beta\left(\theta_{t-\delta}^{i}, \mathbf{u}_{k}^{i}\right) = \theta_{t-\delta}^{i} + \delta \begin{bmatrix} V_{k}^{i} \cos\left(\phi_{k-1}^{i}\right) \\ V_{k}^{i} \sin\left(\phi_{k-1}^{i}\right) \\ \Omega_{k}^{i} \end{bmatrix} + \mathbf{B}_{k-1}^{i},\tag{4}$$

where the vector $\mathbf{B}_{k-1}^i = \left[\varepsilon_x^i \frac{\delta}{T_k^i}, \; \varepsilon_y^i \frac{\delta}{T_k^i}, \; 0\right]^\intercal$ is introduced to compensate for a distortion of the formation due to process noise, with parameters:

$$
\varepsilon_x^i = \overline{x}_{k-1}^i - \left(x_{k-1}^i - \Delta x_i\right) \tag{4a}
$$

$$
\varepsilon_y^i = \overline{y}_{k-1}^i - \left(y_{k-1}^i - \Delta y_i\right). \tag{4b}
$$

Here, $\overline{x}_{k-1}^i$ and $\overline{y}_{k-1}^i$ are the estimates of the coordinates of the formation centroid at $k-1$ (that is, of $x_{k-1}^c$ and $y_{k-1}^c$, respectively) available to the *i*th platform. Coordinates $x_{k-1}^i$ and $y_{k-1}^i$ refer to the *known* *i*th vehicle position at $k-1$. **Figure 1** illustrates the trajectories of $N = 7$ autonomous vehicles in a formation using the described transitional density $\pi\left(\theta_t^i \mid \theta_{t-\delta}^i, \mathbf{u}_k^i\right)$. In the absence of process noise (i.e., $\mathbf{Q} = 0$), the vehicles would move in a perfect formation if (a) all control vectors are identical (i.e., $\mathbf{u}_k^1 = \mathbf{u}_k^2 = \cdots = \mathbf{u}_k^N$), and (b) all headings are identical (i.e., $\phi_{k-1}^1 = \phi_{k-1}^2 = \cdots = \phi_{k-1}^N$). In this case, each platform would know the true coordinates of the formation centroid (i.e., $\overline{x}_k^i = x_k^c$, $\overline{y}_k^i = y_k^c$, for $i = 1, \dots, N$), and hence the correction vectors $\mathbf{B}_{k-1}^i$ would be zero.

A robotic platform can communicate with another platform of the formation if their mutual distance is smaller than a certain range $R_{\max}$. Because of process noise in motion, the distance between the vehicles in the formation will vary, and consequently the topology of the communication network graph may also vary. For simplicity, we will assume that communication links (when established) are error free. **Figure 1** illustrates the communication graphs of a formation consisting of $N = 7$ searching platforms at two consecutive time instants.

#### **Figure 1.**

*An example of a formation of* $N = 7$ *searching platforms at* $k = 1, 2$. *The communication graphs (based on established links between the platforms) are indicated with green lines. Note that the communication network topology is time-varying. The red line, starting from the centroid of the formation, indicates the instantaneous velocity vector.*

*Decentralised Scalable Search for a Hazardous Source in Turbulent Conditions DOI: http://dx.doi.org/10.5772/intechopen.86540*

**3. Decentralised sequential estimation**

Estimation and robot motion control are carried out using the measurement dissemination-based decentralised fusion architecture [25]. Measurement locations<sup>2</sup> and the corresponding measured concentration values, i.e., the triples $\left(x_k^i, y_k^i, z_k^i\right)$, are exchanged via the communication network. The protocol is iterative. In the first iteration, platform $i$ broadcasts its triple to its neighbours and receives from them their measurement triples. In the second, third and all subsequent iterations, platform $i$ broadcasts its newly acquired triples to the neighbours, and accepts from them only the triples that this platform has not seen before (newly acquired). Provided that the communication graph is connected, after a sufficient number of iterations (which depends on the topology of the graph), a complete list of measurement triples from all platforms in the formation, denoted $d_k = \left\{\left(x_k^i, y_k^i, z_k^i\right)\right\}_{1 \le i \le N}$, will be available at each platform.

Suppose the posterior density function of the source at discrete time $k-1$ and platform $i$ is denoted $p_i(\eta_0 \mid d_{1:k-1})$, where $d_{1:k-1} \equiv d_1, d_2, \cdots, d_{k-1}$. Given $p_i(\eta_0 \mid d_{1:k-1})$ and $d_k$, the problem of sequential estimation is to compute the posterior at time $k$, i.e., $p_i(\eta_0 \mid d_{1:k})$. Using the Bayes rule, the posterior is

$$p_i(\eta_0 \mid d_{1:k}) = \frac{g(d_k \mid \eta_0)\, p_i(\eta_0 \mid d_{1:k-1})}{\int g(d_k \mid \eta_0)\, p_i(\eta_0 \mid d_{1:k-1})\, d\eta_0} \tag{5}$$

where $g(d_k \mid \eta_0)$ is the likelihood function. Assuming that individual platform measurements are conditionally independent, $g(d_k \mid \eta_0)$ can be expressed as

$$g(d_k \mid \eta_0) = \prod_{i=1}^{N} \ell\left(z_k^i \mid \eta_0\right) = \prod_{i=1}^{N} \mathcal{P}\left(z_k^i; Q_0\, \rho\left(\mathbf{r}_0, \mathbf{r}_k^i\right)\right) \tag{6}$$

where

$$\rho\left(\mathbf{r}_0, \mathbf{r}_k^i\right) = t_0 R\left(\eta_0, \mathbf{r}_k^i\right) / Q_0 = \frac{t_0}{\ln\left(\frac{\lambda}{a}\right)} \exp\left[\frac{\left(X_0 - x_k^i\right)U}{2D}\right] \cdot K_0\left(\frac{d_k^i\left(\mathbf{r}_0, \mathbf{r}_k^i\right)}{\lambda}\right) \tag{7}$$

is independent of $Q_0$. The posterior density $p_i(\eta_0 \mid d_{1:k})$ is computed using the Rao-Blackwell dimension reduction scheme [26]. Using the chain rule, the posterior can be expressed as:

$$p_i(\eta_0 \mid d_{1:k}) = p_i(Q_0 \mid \mathbf{r}_0, d_{1:k}) \cdot p_i(\mathbf{r}_0 \mid d_{1:k}) \tag{8}$$

where the posterior of source strength $p_i(Q_0 \mid \mathbf{r}_0, d_{1:k})$ will be worked out analytically, while the posterior of source position $p_i(\mathbf{r}_0 \mid d_{1:k})$ will be computed using a particle filter. Following [27], we express the posterior $p_i(Q_0 \mid \mathbf{r}_0, d_{1:k-1})$ with the Gamma distribution whose shape and scale parameters are $\kappa_{k-1}$ and $\vartheta_{k-1}$, respectively. That is

$$p_i(Q_0 \mid \mathbf{r}_0, d_{1:k-1}) = \mathcal{G}(Q_0; \kappa_{k-1}, \vartheta_{k-1}) = \frac{Q_0^{\kappa_{k-1}-1}\, e^{-Q_0/\vartheta_{k-1}}}{\vartheta_{k-1}^{\kappa_{k-1}}\, \Gamma(\kappa_{k-1})}. \tag{9}$$

<sup>2</sup> Because the measurement locations are assumed to be known exactly, they will not be treated as random variables.
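The joint likelihood of Eq. (6) and the Gamma density of Eq. (9) from Section 3 can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: the function names are hypothetical, the values $\rho(\mathbf{r}_0, \mathbf{r}_k^i)$ are assumed to be precomputed from Eq. (7), and `conjugate_update` is an assumption that is not stated in the text. It applies the standard Gamma-Poisson conjugacy result, which is one way the source-strength posterior can be obtained analytically.

```python
import math


def log_poisson(z, mu):
    """log P(z; mu) for the Poisson distribution of Eq. (2), with mu > 0."""
    return z * math.log(mu) - math.lgamma(z + 1) - mu


def log_joint_likelihood(counts, rhos, q0):
    """log g(d_k | eta_0) of Eq. (6): sum over platforms of log P(z_i; Q0 * rho_i)."""
    return sum(log_poisson(z, q0 * rho) for z, rho in zip(counts, rhos))


def gamma_pdf(q0, kappa, theta):
    """Gamma density G(Q0; kappa, theta) of Eq. (9); shape kappa, scale theta."""
    if q0 <= 0.0:
        return 0.0
    log_pdf = ((kappa - 1.0) * math.log(q0) - q0 / theta
               - kappa * math.log(theta) - math.lgamma(kappa))
    return math.exp(log_pdf)


def conjugate_update(kappa, theta, counts, rhos):
    """ASSUMPTION (standard Gamma-Poisson conjugacy, not given in the text):
    a Gamma(kappa, theta) prior on Q0 combined with Poisson counts of mean
    Q0 * rho_i yields a Gamma posterior with the parameters returned here.
    """
    kappa_new = kappa + sum(counts)
    theta_new = theta / (1.0 + theta * sum(rhos))
    return kappa_new, theta_new
```

The update follows from multiplying Eq. (9) by Eq. (6): the product is proportional to $Q_0^{\kappa_{k-1} + \sum_i z_k^i - 1} \exp\left[-Q_0\left(1/\vartheta_{k-1} + \sum_i \rho_i\right)\right]$, which is again a Gamma density in $Q_0$.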

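The iterative measurement dissemination protocol described in Section 3 can be sketched as a small synchronous simulation. This is an illustration under stated assumptions rather than the authors' implementation: the helper names (`neighbours_within_range`, `disseminate`), the synchronous rounds and the example coordinates are hypothetical. The link rule follows the text: two platforms are connected when their mutual distance is smaller than $R_{\max}$.

```python
import math


def neighbours_within_range(positions, r_max):
    """Communication graph: i and j are linked if their distance is < r_max.

    positions: dict mapping platform id -> (x, y)
    """
    ids = list(positions)
    graph = {}
    for i in ids:
        xi, yi = positions[i]
        graph[i] = [j for j in ids
                    if j != i
                    and math.hypot(xi - positions[j][0], yi - positions[j][1]) < r_max]
    return graph


def disseminate(neighbours, own_triples):
    """Flood measurement triples (x, y, z) over the communication graph.

    Each round, every platform broadcasts only the triples it acquired in
    the previous round and keeps only triples it has not seen before.
    Returns (known, rounds), where known[i] is the set held by platform i.
    """
    known = {i: {t} for i, t in own_triples.items()}
    fresh = {i: {t} for i, t in own_triples.items()}  # to broadcast next
    rounds = 0
    while any(fresh.values()):
        rounds += 1
        inbox = {i: set() for i in known}
        for i, outgoing in fresh.items():
            for j in neighbours[i]:
                inbox[j] |= outgoing
        fresh = {i: inbox[i] - known[i] for i in known}  # newly acquired only
        for i in known:
            known[i] |= fresh[i]
    return known, rounds


if __name__ == "__main__":
    # Hypothetical line formation with unit spacing and a range of 1.5.
    pos = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0)}
    triples = {i: (x, y, 0) for i, (x, y) in pos.items()}
    known, rounds = disseminate(neighbours_within_range(pos, 1.5), triples)
    print(rounds, all(known[i] == set(triples.values()) for i in pos))
```

No platform needs the network topology: the loop stops once no platform has anything new to send, and if the graph is connected every platform then holds the complete list $d_k$.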
