**2. Geostatistical analysis**

2 Will-be-set-by-IN-TECH

collecting relevant data using the onboard sensors. A Sea-Bird Electronics 49 FastCAT CTD had already been installed onboard the MARES AUV to measure conductivity, temperature and depth. MARES' missions for environmental monitoring of wastewater discharges are conducted using a GUI software that fully automates the operational procedures of the campaign (Abreu et al., 2010). By providing visual and audio information, this software guides the user through a series of steps which include: (1) real time data acquisition from CTD and ADCP sensors, (2) effluent plume parameter modeling using the CTD and ADCP data collected, (3) automatic path creation using the plume model parameters, (4) acoustic buoys and vehicle deployment, (5) automatic acoustic network setup and (6) real time tracking

Data processing is the last step of a sewage outfall discharge monitoring campaign. This processing involves the ability to extrapolate from monitoring samples to unsampled locations. Although very chaotic due to turbulent diffusion, the effluent's dispersion process tends to a natural variability mode when the plume stops rising and the intensity of turbulent fluctuations approaches to zero (Hunt et al., 2010). It is likely that after this point the pollutant substances are spatially correlated. In this case, geostatistics appears to be an appropriate technique to model the spatial distribution of the effluent. In fact, geostatistics has been used with success to analyze and characterize the spatial variability of soil properties, to obtain information for assessing water and wind resources, to design sampling strategies for monitoring estuarine sediments, to study the thickness of effluent-affected sediment in the vicinity of wastewater discharges, to obtain information about the spatial distribution of sewage pollution in coastal sediments, among others. As well as giving the estimated values, geostatistics provides a measure of the accuracy of the estimate in the form of the kriging variance. This is one of the advantages of geostatistics over traditional methods of assessing pollution. In this work ordinary block kriging is used to model and map the spatial distribution of temperature and salinity measurements gathered by an AUV on a Portuguese sea outfall monitoring campaign. The aim is to distinguish the effluent plume from the receiving waters, characterize its spatial variability in the vicinity of the discharge

of the AUV mission.

**1.2 Data processing**

and estimate dilution.

Fig. 1. Autonomous Underwater Vehicle MARES.

#### **2.1 Stationary random function models**

The most widely used geostatistical estimation procedures use stationary random function models. A random function is a set of random variables that have some spatial locations and whose dependence on each other is specified by some probabilistic mechanism. A random function is stationary if all the random variables have the same probability distribution and if any pair of random variables has a joint probability distribution that depends only on the separation between the two points and not on their locations. If the random function is stationary, then the expected value and the variance can be used to summarize the univariate behavior of the set of random variables. The parameter that is commonly used to summarize the bivariate behavior of a stationary random function is its covariance function, its correlogram, and its variogram. The complete definition of the probabilistic generating mechanism of a random function is usually difficult even in one dimension. Fortunately, for many of the problems we typically encounter, we do not need to know the probabilistic generating mechanism. We usually adopt a stationary random function as our model and specify only its covariance or variogram (Isaaks & Srivastava, 1989; Kitanidis, 1997; Wackernagel, 2003).

#### **2.2 Ordinary kriging**

Ordinary kriging method is often referred with the acronym BLUE which stands for "Best Linear Unbiased Estimator". "Linear" because its estimates are weighted linear combinations of the available data; "Unbiased" since it tries to have the mean error equal to 0; and "Best"because it aims at minimizing the variance of the errors. Let us then see how the concept of a random function model can be used to decide how to weight the nearby samples so that our estimates are unbiased. For any point at which we want to estimate the unknown value, our model is a stationary random function that consists of *n* random variables, one for the value at each of the *n* sample locations, *Z*(**x**1), *Z*(**x**2),..., *Z*(**x***n*), and one for the unknown value at the point we are trying to estimate *Z*(**x**0). Each of these random variables has the same probability law; at all locations, the expected value of the random variable is *m* and the variance is *σ*2. Every value in this model is seen as an outcome (or realization) of the random variable. Our estimate is also a random variable since it is a weighted linear combination of the random variables at the *n* sampled locations (Cressie, 1993; Goovaerts, 1997; Isaaks & Srivastava, 1989; Kitanidis, 1997; Stein, 1999; Wackernagel, 2003; Webster & Oliver, 2007):

$$
\hat{Z}(\mathbf{x}\_0) = \sum\_{i=1}^n w\_i \cdot Z(\mathbf{x}\_i). \tag{1}
$$

The estimation error is defined as the difference between the random variable modeling the true value and the estimate:

$$
\varepsilon(\mathbf{x}\_0) = Z(\mathbf{x}\_0) - \hat{Z}(\mathbf{x}\_0). \tag{2}
$$

The estimation error is also a random variable. Its expected value, often referred to as the bias, is

$$E\left[\varepsilon(\mathbf{x}\_0)\right] = m\left(1 - \sum\_{i=1}^n w\_i\right). \tag{3}$$

produces the following system of equations:

*<sup>∂</sup>*(*w*1) <sup>=</sup> <sup>−</sup><sup>2</sup> *<sup>C</sup>*<sup>10</sup> <sup>+</sup> <sup>2</sup>

*<sup>∂</sup>*(*wi*) <sup>=</sup> <sup>−</sup><sup>2</sup> *Ci*<sup>0</sup> <sup>+</sup> <sup>2</sup>

*<sup>∂</sup>*(*wn*) <sup>=</sup> <sup>−</sup><sup>2</sup> *Cn*<sup>0</sup> <sup>+</sup> <sup>2</sup>

which can also be written in a compact way as *n* ∑ *j*=1

> ⎡ ⎢ ⎢ ⎢ ⎣

*Z*(**x**0) with the minimum error variance are then given by

*n* ∑ *i*=1

*σ*2

*n* ∑ *j*=1

. . . . . . ... . . .

*∂*(*σ*<sup>2</sup> *ε*(**x**0) ) *<sup>∂</sup>*(*μ*) <sup>=</sup> <sup>2</sup>

*n* ∑ *j*=1

> *n* ∑ *j*=1

*n* ∑ *j*=1

> � *n* ∑ *i*=1

*wjCij* + *μ* = *Ci*0, ∀*i* = 1, 2, . . . , *n*;

*C*<sup>11</sup> ... *C*1*<sup>n</sup>* 1

*Cn*<sup>1</sup> ... *Cnn* 1 1 ... 1 0

� �� � (*n*+1)×(*n*+1)

. .

Based on Geostatistics Using an Autonomous Underwater Vehicle

. . *wjC*1*<sup>j</sup>* + 2 *μ* = 0 ⇒

<sup>243</sup> Mapping and Dilution Estimation of Wastewater Discharges

*wjCij* + 2 *μ* = 0 ⇒

*wjCnj* + 2 *μ* = 0 ⇒

�

*wi* − 1

This system of equations, often referred to as the *ordinary kriging system*, can be written in

⎤ ⎥ ⎥ ⎥ ⎦

·

The set of weights and the Lagrange multiplier that will produce an unbiased estimate of

Multiplying each of the *n* equations given in Eq. 10 by *wi* and summing these *n* equations

� *n* ∑ *i*=1

*n* ∑ *i*=1

*wi wj Cij* =

Substituting this into Equation 8 the minimized error variance comes as follows:

*<sup>ε</sup>*(**x**0) <sup>=</sup> *<sup>σ</sup>*<sup>2</sup> <sup>−</sup>

**C** · **W** = **D**

*w*1 . . . *wn μ*

� �� � (*n*+1)×1

*<sup>ε</sup>*(*x*0) can be obtained in a quicker way using an alternative expression to Eq. 8.

*wi Ci*<sup>0</sup> − *μ*.

�

*wi Ci*<sup>0</sup> + *μ*

⎤ ⎥ ⎥ ⎥ ⎦

=

⎡ ⎢ ⎢ ⎢ ⎣

*C*<sup>10</sup> . . . *Cn*<sup>0</sup> 1

� �� � (*n*+1)×1

⎤ ⎥ ⎥ ⎥ ⎦

**<sup>W</sup>** <sup>=</sup> **<sup>C</sup>**−<sup>1</sup> · **<sup>D</sup>**. (12)

⎡ ⎢ ⎢ ⎢ ⎣

. .

. .

= 0 ⇒

*n* ∑ *j*=1

*n* ∑ *j*=1

*n* ∑ *j*=1

*n* ∑ *i*=1

> *n* ∑ *i*=1

*wjC*1*<sup>j</sup>* + *μ* = *C*<sup>10</sup>

*wjCij* + *μ* = *Ci*<sup>0</sup>

*wjCnj* + *μ* = *Cn*<sup>0</sup>

*wi* = 1. (10)

, (13)

(11)

. .

. .

*wi* = 1

*∂*(*σ*<sup>2</sup> *ε*(**x**0) )

*∂*(*σ*<sup>2</sup> *ε*(**x**0) )

*∂*(*σ*<sup>2</sup> *ε*(**x**0) )

matrix notation as

The value of *σ*<sup>2</sup>

leads to the following:

Setting this expected value to 0, to ensure an unbiasedness estimate results in:

$$\sum\_{i=1}^{n} w\_i = 1.\tag{4}$$

This is known as the condition of unbiasedness (Isaaks & Srivastava, 1989; Kitanidis, 1997; Wackernagel, 2003). The expression of the variance of the modeled error is

$$
\sigma\_{\varepsilon(\mathbf{x}\_0)}^2 = var \left[ Z(\mathbf{x}\_0) \right] - 2\operatorname{cov} \left[ \hat{Z}(\mathbf{x}\_0), Z(\mathbf{x}\_0) \right] + var \left[ \hat{Z}(\mathbf{x}\_0) \right]. \tag{5}
$$

Since we have already assumed that all of the random variables have the same variance *σ*2, then *var* [*Z*(**x**0)] = *σ*2. The second term in Equation 5 can be written as

$$\begin{aligned} -2\operatorname{cov}\left[\mathcal{Z}(\mathbf{x}\_{0}), \mathcal{Z}(\mathbf{x}\_{0})\right] &= -2\sum\_{i=1}^{n} w\_{i} \cdot \operatorname{cov}\left[\mathcal{Z}(\mathbf{x}\_{i}), \mathcal{Z}(\mathbf{x}\_{0})\right] \\ &= -2\sum\_{i=1}^{n} w\_{i}\operatorname{\mathbb{C}}\_{i0}. \end{aligned} \tag{6}$$

The third term of Equation 5 can be expressed as

$$\begin{split} var\left[\hat{Z}(\mathbf{x}\_{0})\right] &= \sum\_{i=1}^{n} \sum\_{j=1}^{n} w\_{i} \cdot w\_{j} \cdot cov\left[Z(\mathbf{x}\_{i}), Z(\mathbf{x}\_{j})\right] \\ &= \sum\_{i=1}^{n} \sum\_{j=1}^{n} w\_{i} \, w\_{j} \, \mathsf{C}\_{ij}. \end{split} \tag{7}$$

Then, the expression of the error variance comes in the following way:

$$
\sigma\_{\varepsilon(\mathbf{x}\_0)}^2 = \sigma^2 - 2\sum\_{i=1}^n w\_i \, \mathbb{C}\_{i0} + \sum\_{i=1}^n \sum\_{j=1}^n w\_i \, w\_j \, \mathbb{C}\_{ij}. \tag{8}
$$

Equation 8 expresses the error variance as function of the *n* weights, once chosen the random model function parameters, namely the variance *σ*<sup>2</sup> and all the covariances *Cij*. The minimization of *σ*<sup>2</sup> *<sup>ε</sup>*(**x**0) is constrained by the unbiasedness condition imposed earlier, which can be solved using the method of Lagrange multipliers. We start by introducing a new parameter *μ*, called the Lagrange multiplier, in Equation 8 in the following way:

$$\sigma\_{\varepsilon(\mathbf{x}\_0)}^2 = \sigma^2 - 2\sum\_{i=1}^n w\_i \mathbf{C}\_{i0} + \sum\_{i=1}^n \sum\_{j=1}^n w\_i \, w\_j \mathbf{C}\_{ij} + 2\underbrace{\mu\left(\sum\_{i=1}^n w\_i - 1\right)}\_{=0}.\tag{9}$$

Then we minimize *σ*<sup>2</sup> *<sup>ε</sup>*(**x**0) by calculating the *<sup>n</sup>* + 1 partial first derivatives of Equation 9 with respect to the *n* weights and the Lagrange multiplier, and setting each one to 0, which produces the following system of equations:

4 Will-be-set-by-IN-TECH

This is known as the condition of unbiasedness (Isaaks & Srivastava, 1989; Kitanidis, 1997;

Since we have already assumed that all of the random variables have the same variance *σ*2,

= −2

*Z*ˆ(**x**0), *Z*(**x**0)

*n* ∑ *i*=1

*n* ∑ *i*=1

*wi* · *wj* · *cov*

*wi Ci*<sup>0</sup> +

Equation 8 expresses the error variance as function of the *n* weights, once chosen the random model function parameters, namely the variance *σ*<sup>2</sup> and all the covariances *Cij*. The

be solved using the method of Lagrange multipliers. We start by introducing a new parameter

*n* ∑ *j*=1

respect to the *n* weights and the Lagrange multiplier, and setting each one to 0, which

*n* ∑ *i*=1

*n* ∑ *i*=1

*<sup>ε</sup>*(**x**0) is constrained by the unbiasedness condition imposed earlier, which can

*wi wj Cij* + 2 *μ*

*<sup>ε</sup>*(**x**0) by calculating the *<sup>n</sup>* + 1 partial first derivatives of Equation 9 with

 *n* ∑ *i*=1

*wi* − 1

 = 0

. (9)

*n* ∑ *j*=1

 + *var*

*wi* · *cov* [*Z*(**x***i*), *Z*(**x**0)]

*Z*(**x***i*), *Z*(**x***j*)

*wi* = 1. (4)

*Z*ˆ(**x**0) 

*wi Ci*0. (6)

*wi wj Cij*. (7)

*wi wj Cij*. (8)

. (5)

Setting this expected value to 0, to ensure an unbiasedness estimate results in:

Wackernagel, 2003). The expression of the variance of the modeled error is

*<sup>ε</sup>*(**x**0) <sup>=</sup> *var* [*Z*(**x**0)] <sup>−</sup> <sup>2</sup> *cov*

then *var* [*Z*(**x**0)] = *σ*2. The second term in Equation 5 can be written as

*Z*ˆ(**x**0), *Z*(**x**0)

 = −2

*n* ∑ *j*=1

*n* ∑ *j*=1

*n* ∑ *i*=1

*σ*2

<sup>−</sup> <sup>2</sup> *cov*

The third term of Equation 5 can be expressed as

*var*

*σ*2

minimization of *σ*<sup>2</sup>

Then we minimize *σ*<sup>2</sup>

*σ*2

*<sup>ε</sup>*(**x**0) <sup>=</sup> *<sup>σ</sup>*<sup>2</sup> <sup>−</sup> <sup>2</sup>

*Z*ˆ(**x**0) = *n* ∑ *i*=1

> = *n* ∑ *i*=1

Then, the expression of the error variance comes in the following way:

*<sup>ε</sup>*(**x**0) <sup>=</sup> *<sup>σ</sup>*<sup>2</sup> <sup>−</sup> <sup>2</sup>

*μ*, called the Lagrange multiplier, in Equation 8 in the following way:

*wi Ci*<sup>0</sup> +

*n* ∑ *i*=1

*n* ∑ *i*=1

$$\begin{aligned} \frac{\partial(\sigma\_{\varepsilon(\mathbf{x}\_{0})}^{2})}{\partial(w\_{l})} &= -2\operatorname{C}\_{10} + 2\sum\_{j=1}^{n} w\_{j}\mathbf{C}\_{1j} + 2\,\mu = 0 \quad \Rightarrow & \sum\_{j=1}^{n} w\_{j}\mathbf{C}\_{1j} + \mu = \mathbf{C}\_{10} \\ & \vdots & \vdots \\ \frac{\partial(\sigma\_{\varepsilon(\mathbf{x}\_{0})}^{2})}{\partial(w\_{i})} &= -2\operatorname{C}\_{i0} + 2\sum\_{j=1}^{n} w\_{j}\mathbf{C}\_{ij} + 2\,\mu = 0 \quad \Rightarrow & \sum\_{j=1}^{n} w\_{j}\mathbf{C}\_{ij} + \mu = \mathbf{C}\_{i0} \\ & \vdots & \vdots \\ \frac{\partial(\sigma\_{\varepsilon(\mathbf{x}\_{0})}^{2})}{\partial(w\_{n})} &= -2\operatorname{C}\_{n0} + 2\sum\_{j=1}^{n} w\_{j}\mathbf{C}\_{nj} + 2\,\mu = 0 \quad \Rightarrow & \sum\_{j=1}^{n} w\_{j}\mathbf{C}\_{nj} + \mu = \mathbf{C}\_{n0} \\ & \frac{\partial(\sigma\_{\varepsilon(\mathbf{x}\_{0})}^{2})}{\partial(\mu)} = 2\left(\sum\_{i=1}^{n} w\_{i} - 1\right) = 0 \quad \Rightarrow & \sum\_{i=1}^{n} w\_{i} = 1 \end{aligned}$$

which can also be written in a compact way as

$$\sum\_{j=1}^{n} w\_j \mathbf{C}\_{ij} + \mu = \mathbf{C}\_{i0} \quad \forall i = 1, 2, \dots, n; \qquad \sum\_{i=1}^{n} w\_i = 1. \tag{10}$$

This system of equations, often referred to as the *ordinary kriging system*, can be written in matrix notation as

$$\begin{array}{ccccc} \mathbf{C} & \cdot & \mathbf{W} & = & \mathbf{D} \\ \begin{bmatrix} \mathbf{C}\_{11} \ \dots \ \mathbf{C}\_{1n} \ 1 \\ \vdots \ \vdots \ \ddots \ \vdots \\ \mathbf{C}\_{n1} \ \dots \ \mathbf{C}\_{nn} \ 1 \\ 1 \ \dots \ \mathbf{1} & 0 \end{bmatrix} & \underbrace{\begin{bmatrix} w\_{1} \\ \vdots \\ w\_{n} \\ \mu \end{bmatrix}}\_{(n+1)\times(n+1)} = \underbrace{\begin{bmatrix} \mathbf{C}\_{10} \\ \vdots \\ \mathbf{C}\_{n0} \\ \mathbf{C}\_{n0} \\ 1 \end{bmatrix}}\_{(n+1)\times 1} \end{array} \tag{11}$$

The set of weights and the Lagrange multiplier that will produce an unbiased estimate of *Z*(**x**0) with the minimum error variance are then given by

$$\mathbf{W} = \mathbf{C}^{-1} \cdot \mathbf{D}.\tag{12}$$

The value of *σ*<sup>2</sup> *<sup>ε</sup>*(*x*0) can be obtained in a quicker way using an alternative expression to Eq. 8. Multiplying each of the *n* equations given in Eq. 10 by *wi* and summing these *n* equations leads to the following:

$$\sum\_{i=1}^{n}\sum\_{j=1}^{n}w\_{i}\,w\_{j}\,\mathsf{C}\_{ij} = \sum\_{i=1}^{n}w\_{i}\,\mathsf{C}\_{i0} - \mu.$$

Substituting this into Equation 8 the minimized error variance comes as follows:

$$
\sigma\_{\varepsilon(\mathbf{x\_0})}^2 = \sigma^2 - \left(\sum\_{i=1}^n w\_i \mathbb{C}\_{i0} + \mu\right) \tag{13}
$$

distance and direction **h** (known as lag) have the same joint probability distribution. The covariance function, *C*(**h**) is the covariance between random variables separated by a lag **h** (Isaaks & Srivastava, 1989; Kitanidis, 1997; Wackernagel, 2003). For a stationary random

<sup>245</sup> Mapping and Dilution Estimation of Wastewater Discharges

The covariance between random variables at identical locations is the variance of the random

The semivariogram, or simply variogram, is half the expected squared difference between

The quantity *γ*(**h**) is known as the semivariance at lag **h**. The "semi"refers to the fact that it is half of a variance. The variogram between random variables at identical locations is zero, i.e. *γ*(**0**) = 0. Using Equations 19, 20 and 21, we can relate the variogram with the covariance

In practice, the pattern of spatial continuity chosen for the random function is usually taken from the spatial continuity evident in the sample data set. Geostatisticians usually define the spatial continuity of the sample data set through the variogram and solve the ordinary kriging system using covariance (Isaaks & Srivastava, 1989). The maximum value reached by the variogram is called the sill. The distance at which the sill is reached is called the range. The vertical jump from zero at the origin to the value of semivariance at extremely small separation distances is called the nugget effect. The estimator of the variogram usually used, known as Matheron's method-of-moments estimator (MME) is (Matheron, 1965; Webster & Oliver,

 <sup>=</sup> <sup>1</sup>

*<sup>C</sup>*(**h**) = *<sup>E</sup>* [*Z*(**x**) *<sup>Z</sup>*(**<sup>x</sup>** <sup>+</sup> **<sup>h</sup>**)] <sup>−</sup> {*<sup>E</sup>* [*Z*(**x**)]}<sup>2</sup> . (19)

*<sup>γ</sup>*(**h**) = *<sup>C</sup>*(**0**) <sup>−</sup> *<sup>C</sup>*(**h**) = *<sup>σ</sup>*<sup>2</sup> <sup>−</sup> *<sup>C</sup>*(**h**). (22)

[*Z*(**x***i*) − *Z*(**x***<sup>i</sup>* + **h**)]


*<sup>N</sup>*(**h**) <sup>+</sup> 0.045 [*N*(**h**)] 2 1/2<sup>4</sup>

<sup>−</sup> {*<sup>E</sup>* [*Z*(**x**)]}<sup>2</sup> <sup>=</sup> *var* [*Z*(**x**)] <sup>=</sup> *<sup>σ</sup>*2. (20)

<sup>2</sup> *var* [*Z*(**x**) <sup>−</sup> *<sup>Z</sup>*(**<sup>x</sup>** <sup>+</sup> **<sup>h</sup>**)] . (21)

<sup>2</sup> , (23)

. (24)

function, the covariance function *C*(**h**) is:

random variables separated by a lag **h**:

*<sup>γ</sup>*(**h**) = <sup>1</sup> 2 *E* 

follows (Cressie & Hawkins, 1980):

*<sup>γ</sup>*(**h**) = <sup>1</sup> 2 ×

*C*(**0**) = *E*

Based on Geostatistics Using an Autonomous Underwater Vehicle

*<sup>γ</sup>*(**h**) = <sup>1</sup>

2*N*(**h**)

 1 *N*(**h**) *N*(**h**) ∑ *i*=1

*N*(**h**) ∑ *i*=1

0.457 + 0.494

Once the sample variogram has been calculated, a function (called the variogram model) has to be fit to it. First, because the matrices **C** and **D** may need semivariance values for lags that are not available from the sample data. And second, because the use of the variogram does

where *z*(**x***i*) is the value of the variable of interest at location **x***<sup>i</sup>* and *N*(**h**)is the number of pairs of points separated by the particular lag vector **h**. Cressie and Hawkins (Cressie & Hawkins, 1980) developed an estimator of the variogram that should be robust to the presence of outliers and enhance the variogram spatial continuity, having also the advantage of not spreading the effect of outliers in computing the maps. This estimator (CRE) is defined as

{*Z*(**x**)}<sup>2</sup>

{*Z*(**x**) <sup>−</sup> *<sup>Z</sup>*(**<sup>x</sup>** <sup>+</sup> **<sup>h</sup>**)}<sup>2</sup>

function:

function as:

2007)

or, in terms of matrices as

$$
\sigma\_{\varepsilon(\mathbf{x}\_0)}^2 = \sigma^2 - \mathbf{W}^T \cdot \mathbf{D}.\tag{14}
$$

The minimized error variance is usually called the *ordinary kriging variance*.

#### **2.3 Block kriging**

A consideration in many environmental applications has been that ordinary kriging usually exhibits large prediction errors (Bivand et al., 2008). This is due to the larger variability in the observations. When predicting averages over larger areas, i.e. within blocks, much of the variability averages out and consequently block mean values have lower prediction errors. If the blocks are not too large the spatial patterns do not disappear. The block kriging system is similar to the point kriging system given by Equation 11. The matrix **C** is the same since it is independent of the location at which the block estimate is required. The covariances for the vector **D** are point-to-block covariances. Supposing that the mean value over a block *V* is approximated by the arithmetic average of the *N* point variables contained within that block (Goovaerts, 1997; Isaaks & Srivastava, 1989), i.e.

$$Z\_V \approx \frac{1}{N} \sum\_{j=1}^{N} Z(\mathbf{x}\_j),\tag{15}$$

the point-to-block covariances required for vector **D** are

$$\overline{\mathbf{C}}\_{iV} = \text{cov}\left[Z(\mathbf{x}\_{i}), Z\_{V}\right] = \frac{1}{N} \sum\_{j=1}^{N} \mathbf{C}\_{ij\prime} \quad \forall i = 1, 2, \dots, n. \tag{16}$$

The block kriging variance is

$$
\sigma\_V^2 = \overline{\mathbb{C}}\_{VV} - \left(\sum\_{i=1}^n w\_i \overline{\mathbb{C}}\_{IV} + \mu\right),
\tag{17}
$$

where *CVV* is the average covariance between pairs of points within *V*:

$$\overline{\mathbb{C}}\_{VV} = \frac{1}{N^2} \sum\_{i=1}^{N} \sum\_{j=1}^{N} \mathbb{C}\_{ij}. \tag{18}$$

An equivalent procedure, that can be computationally more expensive than block kriging, is to obtain the block estimate by averaging the *N* kriged point estimates within the block (Goovaerts, 1997; Isaaks & Srivastava, 1989).

#### **2.4 Spatial continuity**

Spatial continuity exists in most earth science data sets. When we look at a contour map, or anything similar, the values do not appear to be randomly located, but rather, low values tend to be near other low values and high values tend to be near other high values. I.e. two measurements close to each other are most likely to have similar values than two measurements far apart (Isaaks & Srivastava, 1989). To compute the set of weights and the Lagrange multiplier, that will produce each estimate and the resulting minimized error variance, we need to know the covariances of **C** and **D** matrices. As we said before, since our random function is stationary, all pairs of random variables separated by a 6 Will-be-set-by-IN-TECH

A consideration in many environmental applications has been that ordinary kriging usually exhibits large prediction errors (Bivand et al., 2008). This is due to the larger variability in the observations. When predicting averages over larger areas, i.e. within blocks, much of the variability averages out and consequently block mean values have lower prediction errors. If the blocks are not too large the spatial patterns do not disappear. The block kriging system is similar to the point kriging system given by Equation 11. The matrix **C** is the same since it is independent of the location at which the block estimate is required. The covariances for the vector **D** are point-to-block covariances. Supposing that the mean value over a block *V* is approximated by the arithmetic average of the *N* point variables contained within that

> *ZV* <sup>≈</sup> <sup>1</sup> *N*

*N* ∑ *j*=1

*N*

 *n* ∑ *i*=1

*N*<sup>2</sup>

An equivalent procedure, that can be computationally more expensive than block kriging, is to obtain the block estimate by averaging the *N* kriged point estimates within the

Spatial continuity exists in most earth science data sets. When we look at a contour map, or anything similar, the values do not appear to be randomly located, but rather, low values tend to be near other low values and high values tend to be near other high values. I.e. two measurements close to each other are most likely to have similar values than two measurements far apart (Isaaks & Srivastava, 1989). To compute the set of weights and the Lagrange multiplier, that will produce each estimate and the resulting minimized error variance, we need to know the covariances of **C** and **D** matrices. As we said before, since our random function is stationary, all pairs of random variables separated by a

*N* ∑ *i*=1

*N* ∑ *j*=1

*N* ∑ *j*=1

*wiCiV* + *μ*

*<sup>ε</sup>*(**x**0) <sup>=</sup> *<sup>σ</sup>*<sup>2</sup> <sup>−</sup> **<sup>W</sup>***<sup>T</sup>* · **<sup>D</sup>**. (14)

*Z*(**x***j*), (15)

*Cij*, ∀*i* = 1, 2, . . . , *n*. (16)

, (17)

*Cij*. (18)

*σ*2

The minimized error variance is usually called the *ordinary kriging variance*.

block (Goovaerts, 1997; Isaaks & Srivastava, 1989), i.e.

the point-to-block covariances required for vector **D** are

block (Goovaerts, 1997; Isaaks & Srivastava, 1989).

*CiV* <sup>=</sup> *cov* [*Z*(**x***i*), *ZV*] <sup>=</sup> <sup>1</sup>

*σ*2

*<sup>V</sup>* = *CVV* −

*CVV* <sup>=</sup> <sup>1</sup>

where *CVV* is the average covariance between pairs of points within *V*:

or, in terms of matrices as

The block kriging variance is

**2.4 Spatial continuity**

**2.3 Block kriging**

distance and direction **h** (known as lag) have the same joint probability distribution. The covariance function, *C*(**h**) is the covariance between random variables separated by a lag **h** (Isaaks & Srivastava, 1989; Kitanidis, 1997; Wackernagel, 2003). For a stationary random function, the covariance function *C*(**h**) is:

$$\mathcal{C}(\mathbf{h}) = \mathbb{E}\left[Z(\mathbf{x})Z(\mathbf{x} + \mathbf{h})\right] - \left\{\mathbb{E}\left[Z(\mathbf{x})\right]\right\}^2. \tag{19}$$

The covariance between random variables at identical locations is the variance of the random function:

$$\mathcal{C}(\mathbf{0}) = E\left[ \left\{ Z(\mathbf{x}) \right\}^2 \right] - \left\{ E\left[ Z(\mathbf{x}) \right] \right\}^2 = var\left[ Z(\mathbf{x}) \right] = \sigma^2. \tag{20}$$

The semivariogram, or simply variogram, is half the expected squared difference between random variables separated by a lag **h**:

$$\gamma(\mathbf{h}) = \frac{1}{2} \operatorname{E} \left[ \left\{ Z(\mathbf{x}) - Z(\mathbf{x} + \mathbf{h}) \right\}^2 \right] = \frac{1}{2} \operatorname{var} \left[ Z(\mathbf{x}) - Z(\mathbf{x} + \mathbf{h}) \right]. \tag{21}$$

The quantity *γ*(**h**) is known as the semivariance at lag **h**. The "semi"refers to the fact that it is half of a variance. The variogram between random variables at identical locations is zero, i.e. *γ*(**0**) = 0. Using Equations 19, 20 and 21, we can relate the variogram with the covariance function as:

$$
\gamma(\mathbf{h}) = \mathbb{C}(\mathbf{0}) - \mathbb{C}(\mathbf{h}) = \sigma^2 - \mathbb{C}(\mathbf{h}).\tag{22}
$$

In practice, the pattern of spatial continuity chosen for the random function is usually taken from the spatial continuity evident in the sample data set. Geostatisticians usually define the spatial continuity of the sample data set through the variogram and solve the ordinary kriging system using covariance (Isaaks & Srivastava, 1989). The maximum value reached by the variogram is called the sill. The distance at which the sill is reached is called the range. The vertical jump from zero at the origin to the value of semivariance at extremely small separation distances is called the nugget effect. The estimator of the variogram usually used, known as Matheron's method-of-moments estimator (MME) is (Matheron, 1965; Webster & Oliver, 2007)

$$\gamma(\mathbf{h}) = \frac{1}{2N(\mathbf{h})} \sum\_{i=1}^{N(\mathbf{h})} \left[ Z(\mathbf{x}\_i) - Z(\mathbf{x}\_i + \mathbf{h}) \right]^2,\tag{23}$$

where *z*(**x***i*) is the value of the variable of interest at location **x***<sup>i</sup>* and *N*(**h**)is the number of pairs of points separated by the particular lag vector **h**. Cressie and Hawkins (Cressie & Hawkins, 1980) developed an estimator of the variogram that should be robust to the presence of outliers and enhance the variogram spatial continuity, having also the advantage of not spreading the effect of outliers in computing the maps. This estimator (CRE) is defined as follows (Cressie & Hawkins, 1980):

$$\gamma(\mathbf{h}) = \frac{1}{2} \times \frac{\left\{ \frac{1}{N(\mathbf{h})} \sum\_{i=1}^{N(\mathbf{h})} \left| Z(\mathbf{x}\_i) - Z(\mathbf{x}\_i + \mathbf{h}) \right|^{1/2} \right\}^4}{0.457 + \frac{0.494}{N(\mathbf{h})} + \frac{0.045}{\left[ N(\mathbf{h}) \right]^2}}. \tag{24}$$

Once the sample variogram has been calculated, a function (called the variogram model) has to be fit to it. First, because the matrices **C** and **D** may need semivariance values for lags that are not available from the sample data. And second, because the use of the variogram does

made of HDPE, has a diameter of 710 mm. The diffuser, which consists of 10 ports spaced 8 or 12 meters apart, is 93.5 m long. The ports, nominally 0.175 m in diameter, are discharging upwards at an angle of 90◦ to the pipe horizontal axis; the port height is about 1 m. The outfall direction is southeast-northwest (315.5◦ true bearing) and is discharging at a depth of about 31 m. In that area the coastline itself runs at about a 225◦ angle with respect to true north and the isobaths are oriented parallel to the coastline. A seawater quality monitoring program for the outfall has already started in May 2006. Its main purposes are to evaluate the background seawater quality both in offshore and nearshore locations around the vicinity of the sea outfall and to follow the impacts of wastewater discharge in the area. During the campaign the discharge remained fairly constant with an average flowrate of approximately 0.11 m3/s. The operation area specification was based on the outputs of a plume prediction model (Hunt et al., 2010) which include mixing zone length, spreading width, maximum rise height and thickness. The model inputs are, besides the diffuser physical characteristics, the water column stratification, the current velocity and direction, and the discharge flowrate. Information on density stratification was obtained from a vertical profile of temperature and salinity acquired in the vicinity of the diffuser two weeks before the campaign (see Fig. 3). The water column was weakly stratified due to both low-temperature and salinity variations. The total difference in density over the water column was about 0.13 *σ*-unit. The current direction of 110◦ was estimated based on predictions of wind speed and direction of the day of the campaign. A current velocity of 0.12 m/s was estimated based on historic data. The effluent flowrate consider for the plume behavior simulation was 0.11 m3/s. According to the predictions of the model, the plume was spreading 1 m from the surface, detached from the bottom and forming a two-layer flow. The end of the mixing zone length was predicted to be 141 m downstream from the diffuser. Fig. 2(b) shows the diffuser and a plan view of the AUV operation area (specified according to the model predictions), mainly in the northeast

<sup>247</sup> Mapping and Dilution Estimation of Wastewater Discharges

Based on Geostatistics Using an Autonomous Underwater Vehicle

The vehicle collected CTD data at 1.5 m and 3 m depth, in accordance to the plume minimum dilution height prediction. During the mission transited at a fairly constant velocity of 1 m/s (2 knots) recording data at a rate of 16 Hz. Maximum vertical oscillations of the AUV in

> −150 −100 −50 0 50 100

North (m)

−50 0 50 100 150

East (m)

(b) AUV operation area.

performing the horizontal trajectories were less than 0.5 m (up and down).

direction from the diffuser, covering about 20000 m2.

(a) Map of the study site (©2011 Google -

Fig. 2. Vicinity of Foz do Arelho sea outfall.

Images).

not guarantee the existence and uniqueness of solution to the ordinary kriging system. The most commonly used variogram models are the spherical model, the exponential model, the Gaussian model and the Matern model (Isaaks & Srivastava, 1989).

#### **2.5 Cross-validation**

Cross-validation is a procedure used to compare the performance of several competing models (Webster & Oliver, 2007). It starts by splitting the data set into two sets: a modelling set and a validation set. Then the modelling set is used for variogram modelling and kriging on the locations of the validation set. Finally the measurements of the validation set are compared to their predictions (Bivand et al., 2008). If the average of the cross-validation errors (or Mean Error, ME) is close to 0,

$$\text{ME} = \frac{1}{m} \sum\_{i=1}^{m} \left[ Z(\mathbf{x}\_{i}) - \hat{Z}(\mathbf{x}\_{i}) \right]. \tag{25}$$

we may say that apparently the estimates are unbiased (*Z*(**x***i*) and *Z*ˆ(**x***i*) are, respectively, the measurement and estimate at point *xi* and *m* is the number of measurements of the validation set). A significant negative (positive) mean error can represent systematic overestimation (underestimation). The magnitude of the Root Mean Squared Error (RMSE) is particularly interesting for comparing different models (Wackernagel, 2003; Webster & Oliver, 2007):

$$\text{RMSE} = \sqrt{\frac{1}{m} \sum\_{i=1}^{m} \left[ Z(\mathbf{x}\_i) - \hat{Z}(\mathbf{x}\_i) \right]^2}. \tag{26}$$

The RMSE value should be as small as possible indicating that estimates are close to measurements. The kriging standard deviation represents the error predicted by the estimation method. Dividing the cross-validation error by the corresponding kriging standard deviation allows to compare the magnitudes of both actual and predicted error (Wackernagel, 2003; Webster & Oliver, 2007). Therefore, the average of the standardized squared cross-validation errors (or Mean Standardized Squared Error, MSSE)

$$\text{MSSE} = \frac{1}{m} \sum\_{i=1}^{m} \frac{\left[ Z(\mathbf{x}\_i) - \hat{Z}(\mathbf{x}\_i) \right]^2}{\sigma\_{R(\mathbf{x}\_i)}^2},\tag{27}$$

should be about one, indicating that the model is accurate. A scatterplot of true versus predicted values provides additional evidence on how well an estimation method has performed. The coefficient of determination R<sup>2</sup> is a good index for summarizing how close the points on the scatterplot come to falling on the 45-degree line passing through the origin (Isaaks & Srivastava, 1989). R<sup>2</sup> should be close to one.

#### **3. Results**

#### **3.1 Study site**

A map of the study site is shown in Fig. 2(a). Foz do Arelho outfall is located off the Portuguese west coast near Óbidos lagoon. In operation since June 2005, is presently discharging about 0.11 m3/s of mainly domestic wastewater from the WWTPs of Óbidos, Carregal, Caldas da Rainha, Gaeiras, Charneca and Foz do Arelho, but it can discharge up to 0.35 m3/s. The total length of the outfall, including the diffuser, is 2150 m. The outfall pipe, 8 Will-be-set-by-IN-TECH

not guarantee the existence and uniqueness of solution to the ordinary kriging system. The most commonly used variogram models are the spherical model, the exponential model, the

Cross-validation is a procedure used to compare the performance of several competing models (Webster & Oliver, 2007). It starts by splitting the data set into two sets: a modelling set and a validation set. Then the modelling set is used for variogram modelling and kriging on the locations of the validation set. Finally the measurements of the validation set are compared to their predictions (Bivand et al., 2008). If the average of the cross-validation errors

we may say that apparently the estimates are unbiased (*Z*(**x***i*) and *Z*ˆ(**x***i*) are, respectively, the measurement and estimate at point *xi* and *m* is the number of measurements of the validation set). A significant negative (positive) mean error can represent systematic overestimation (underestimation). The magnitude of the Root Mean Squared Error (RMSE) is particularly interesting for comparing different models (Wackernagel, 2003; Webster & Oliver, 2007):

> *m* ∑ *i*=1

The RMSE value should be as small as possible indicating that estimates are close to measurements. The kriging standard deviation represents the error predicted by the estimation method. Dividing the cross-validation error by the corresponding kriging standard deviation allows to compare the magnitudes of both actual and predicted error (Wackernagel, 2003; Webster & Oliver, 2007). Therefore, the average of the standardized

*<sup>Z</sup>*(**x***i*) <sup>−</sup> *<sup>Z</sup>*ˆ(**x***i*)

*<sup>Z</sup>*(**x***i*) <sup>−</sup> *<sup>Z</sup>*ˆ(**x***i*)

*<sup>Z</sup>*(**x***i*) <sup>−</sup> *<sup>Z</sup>*ˆ(**x***i*)

*σ*2 *R*(**x***i*) 2

2

. (25)

. (26)

, (27)

Gaussian model and the Matern model (Isaaks & Srivastava, 1989).

ME <sup>=</sup> <sup>1</sup> *m*

RMSE =

 1 *m*

squared cross-validation errors (or Mean Standardized Squared Error, MSSE)

*m*

*m* ∑ *i*=1 should be about one, indicating that the model is accurate. A scatterplot of true versus predicted values provides additional evidence on how well an estimation method has performed. The coefficient of determination R<sup>2</sup> is a good index for summarizing how close the points on the scatterplot come to falling on the 45-degree line passing through the

A map of the study site is shown in Fig. 2(a). Foz do Arelho outfall is located off the Portuguese west coast near Óbidos lagoon. In operation since June 2005, is presently discharging about 0.11 m3/s of mainly domestic wastewater from the WWTPs of Óbidos, Carregal, Caldas da Rainha, Gaeiras, Charneca and Foz do Arelho, but it can discharge up to 0.35 m3/s. The total length of the outfall, including the diffuser, is 2150 m. The outfall pipe,

MSSE <sup>=</sup> <sup>1</sup>

origin (Isaaks & Srivastava, 1989). R<sup>2</sup> should be close to one.

*m* ∑ *i*=1 

**2.5 Cross-validation**

**3. Results**

**3.1 Study site**

(or Mean Error, ME) is close to 0,

made of HDPE, has a diameter of 710 mm. The diffuser, which consists of 10 ports spaced 8 or 12 meters apart, is 93.5 m long. The ports, nominally 0.175 m in diameter, are discharging upwards at an angle of 90◦ to the pipe horizontal axis; the port height is about 1 m. The outfall direction is southeast-northwest (315.5◦ true bearing) and is discharging at a depth of about 31 m. In that area the coastline itself runs at about a 225◦ angle with respect to true north and the isobaths are oriented parallel to the coastline. A seawater quality monitoring program for the outfall has already started in May 2006. Its main purposes are to evaluate the background seawater quality both in offshore and nearshore locations around the vicinity of the sea outfall and to follow the impacts of wastewater discharge in the area. During the campaign the discharge remained fairly constant with an average flowrate of approximately 0.11 m3/s. The operation area specification was based on the outputs of a plume prediction model (Hunt et al., 2010) which include mixing zone length, spreading width, maximum rise height and thickness. The model inputs are, besides the diffuser physical characteristics, the water column stratification, the current velocity and direction, and the discharge flowrate. Information on density stratification was obtained from a vertical profile of temperature and salinity acquired in the vicinity of the diffuser two weeks before the campaign (see Fig. 3). The water column was weakly stratified due to both low-temperature and salinity variations. The total difference in density over the water column was about 0.13 *σ*-unit. The current direction of 110◦ was estimated based on predictions of wind speed and direction of the day of the campaign. A current velocity of 0.12 m/s was estimated based on historic data. The effluent flowrate consider for the plume behavior simulation was 0.11 m3/s. According to the predictions of the model, the plume was spreading 1 m from the surface, detached from the bottom and forming a two-layer flow. The end of the mixing zone length was predicted to be 141 m downstream from the diffuser. Fig. 2(b) shows the diffuser and a plan view of the AUV operation area (specified according to the model predictions), mainly in the northeast direction from the diffuser, covering about 20000 m2.

The vehicle collected CTD data at 1.5 m and 3 m depth, in accordance to the plume minimum dilution height prediction. During the mission transited at a fairly constant velocity of 1 m/s (2 knots) recording data at a rate of 16 Hz. Maximum vertical oscillations of the AUV in performing the horizontal trajectories were less than 0.5 m (up and down).

(a) Map of the study site (©2011 Google - Images).

(b) AUV operation area.

Fig. 2. Vicinity of Foz do Arelho sea outfall.

Temperature@1.5 m Temperature@3.0 m

Salinity@1.5 m Salinity@3.0 m

Density

Fig. 4. Histograms of temperature measurements at depths of 1.5 m (left) and 3 m (right).

Temperature (°C)

15.35 15.40 15.45 15.50 15.55 15.60

Samples 20,026 10,506 Mean 15.463ºC 15.469ºC Median 15.466ºC 15.472ºC Minimum 15.359ºC 15.393ºC Maximum 15.562ºC 15.536ºC Coefficient of skewness -0.31 -0.70 Coefficient of variation 0.002 0.001

<sup>249</sup> Mapping and Dilution Estimation of Wastewater Discharges

Samples 20,026 10,506 Mean 35.991 psu 35.996 psu Median 35.990 psu 35.998 psu Minimum 35.957 psu 35.973 psu Maximum 36.003 psu 36.008 psu

Coefficient of skewness -0.63 -1.1 Coefficient of variation 0.0002 0.0001

Table 1. Summary statistics of temperature measurements.

Based on Geostatistics Using an Autonomous Underwater Vehicle

Table 2. Summary statistics of salinity measurements.

Temperature (°C)

15.35 15.40 15.45 15.50 15.55 15.60

Density

Fig. 3. Vertical STD profile used in the plume behavior simulation.

#### **3.2 Exploratory analysis**

In order to obtain elementary knowledge about the temperature and salinity data sets, conventional statistical analysis was conducted (see the results in Table 1 and Table 2). At the depth of 1.5 m the temperature ranged from 15.359ºC to 15.562ºC and at the depth of 3 m the temperature ranged from 15.393ºC to 15.536ºC. The mean value of the data sets was 15.463ºC and 15.469ºC, respectively at the depths of 1.5 m and 3 m, which was very close to the median value that was respectively 15.466ºC and 15.472ºC. The coefficient of skewness is relatively low (-0.309) for the 1.5 m data set and not very high (-0.696) for the 3 m data set, indicating that in the first case the histogram is approximately symmetric and in the second case that distribution is only slightly asymmetric. The very low values of the coefficient of variation (0.002 and 0.001) reflect the fact that the histograms do not have a tail of high values. At the depth of 1.5 m the salinity ranged from 35.957 psu to 36.003 psu and at the depth of 3 m the salinity ranged from 35.973 psu to 36.008 psu. The mean value of the data sets was 35.991 psu and 35.996 psu, respectively at the depths of 1.5 m and 3 m, which was very close to the median value that was respectively 35.990 and 35.998 psu. The coefficient of skewness is not to much high in both data sets (-0.63 and -1.1) indicating that distributions are only slightly asymmetric. The very low values of the coefficient of variation (0.0002 and 0.0001) reflect the fact that the histograms do not have a tail of high values. The ordinary kriging method works better if the distribution of the data values is close to a normal distribution. Therefore, it is interesting to see how close the distribution of the data values comes to being normal. Fig. 4 shows the plots of the normal distribution adjusted to the histograms of the temperature measured at depths of 1.5 m and 3 m, and Fig. 5 shows the plots of the normal distribution adjusted to the histograms of the salinity measured at depths of 1.5 m and 3 m. The density value in the histogram is the ratio between the number of samples in a bin and the total number of samples divided by the width of the bin (constant). Apart from some erratic high values it can be seen that the histograms are reasonably close to the normal distribution. 10 Will-be-set-by-IN-TECH

In order to obtain elementary knowledge about the temperature and salinity data sets, conventional statistical analysis was conducted (see the results in Table 1 and Table 2). At the depth of 1.5 m the temperature ranged from 15.359ºC to 15.562ºC and at the depth of 3 m the temperature ranged from 15.393ºC to 15.536ºC. The mean value of the data sets was 15.463ºC and 15.469ºC, respectively at the depths of 1.5 m and 3 m, which was very close to the median value that was respectively 15.466ºC and 15.472ºC. The coefficient of skewness is relatively low (-0.309) for the 1.5 m data set and not very high (-0.696) for the 3 m data set, indicating that in the first case the histogram is approximately symmetric and in the second case that distribution is only slightly asymmetric. The very low values of the coefficient of variation (0.002 and 0.001) reflect the fact that the histograms do not have a tail of high values. At the depth of 1.5 m the salinity ranged from 35.957 psu to 36.003 psu and at the depth of 3 m the salinity ranged from 35.973 psu to 36.008 psu. The mean value of the data sets was 35.991 psu and 35.996 psu, respectively at the depths of 1.5 m and 3 m, which was very close to the median value that was respectively 35.990 and 35.998 psu. The coefficient of skewness is not to much high in both data sets (-0.63 and -1.1) indicating that distributions are only slightly asymmetric. The very low values of the coefficient of variation (0.0002 and 0.0001) reflect the fact that the histograms do not have a tail of high values. The ordinary kriging method works better if the distribution of the data values is close to a normal distribution. Therefore, it is interesting to see how close the distribution of the data values comes to being normal. Fig. 4 shows the plots of the normal distribution adjusted to the histograms of the temperature measured at depths of 1.5 m and 3 m, and Fig. 5 shows the plots of the normal distribution adjusted to the histograms of the salinity measured at depths of 1.5 m and 3 m. The density value in the histogram is the ratio between the number of samples in a bin and the total number of samples divided by the width of the bin (constant). Apart from some erratic high values it can be seen that the histograms are reasonably close to the normal distribution.

Fig. 3. Vertical STD profile used in the plume behavior simulation.

**3.2 Exploratory analysis**


Table 1. Summary statistics of temperature measurements.


Table 2. Summary statistics of salinity measurements.

Fig. 4. Histograms of temperature measurements at depths of 1.5 m (left) and 3 m (right).

Depth Variogram

Based on Geostatistics Using an Autonomous Underwater Vehicle

Depth Variogram

<sup>a</sup> The preferred model.

depth of 1.5 m and 3 m fitted by the preferred models.

RMSE was 0.0019793 psu (Table 6).

1.5

3.0

1.5

3.0

1.5

3.0

and 3.0 m.

3 m.

Estimator Model Nugget Sill Range

MME Matern (*ν* = 0.4) 0.000 0.001 75.0 CRE Matern (*ν* = 0.5) 0.000 0.002 80.1

<sup>251</sup> Mapping and Dilution Estimation of Wastewater Discharges

MME Matern (*ν* = 0.3) 0.000 0.0002 101.3 CRE Matern (*ν* = 0.7) 0.000 0.002 107.5

Estimator Model Nugget Sill Range

MME Matern (*ν* = 0.6) 0.436 11.945 134.6 CRE Matern (*ν* = 0.6) 0.153 10786.109 51677.1

MME Matern (*ν* = 0.8) 0.338 11.724 181.6 CRE Gaussian 0.096 120.578 390.1

MBK 0.9184 2.0174e-4 8.0530e-5 8.9739e-3 CBKa 0.9211 1.6758e-4 7.7880e-5 8.8248e-3

MBK 0.8748 1.0338e-4 3.6295e-5 6.0244e-3 CBKa 0.8827 0.6538e-4 3.4008e-5 5.8316e-3

Table 4. Parameters of the fitted variogram models for salinity measured at depths of 1.5 and

Depth Method *R*<sup>2</sup> ME MSE RMSE

Table 5. Cross-validation results for the temperature maps at depths of 1.5 and 3 m.

difference in performance between the two estimators: block kriging using the MME estimator (MBK) or block kriging using the CRE estimator (CBK) is not substantial. Fig. 6 shows the omnidirectional sample variograms for temperature at the depth of 1.5 m and 3 m fitted by the preferred models. Fig. 7 shows the omnidirectional sample variograms for salinity at the

Fig. 8 and Fig. 9 show the scatterplots of true versus estimated values for the most satisfactory models. The dark line is the 45º line passing through the origin and the discontinuous line is the OLS (Ordinary Least Squares) regression line. These plots show that observed and predicted values are highly positively correlated. The *R*<sup>2</sup> value for the temperature at the depth of 1.5 m was 0.9211 and the RMSE was 0.0088248ºC, and at the depth of 3 m was 0.8827 and the RMSE was 0.0058316ºC (Table 5). The *R*<sup>2</sup> value for the salinity at the depth of 1.5 m was 0.9513 and the RMSE was 0.0016435 psu, and at the depth of 3 m was 0.8982 and the

Table 3. Parameters of the fitted variogram models for temperature measured at depths of 1.5

Fig. 5. Histograms of salinity measurements at depths of 1.5 m (left) and 3 m (right).

#### **3.3 Variogram modeling**

For the purpose of this analysis, the temperature and the salinity measurements were divided into a modeling set (comprising 90% of the samples) and a validation set (comprising 10% of the samples). Modeling and validation sets were then compared, using Student's-t test, to check that they provided unbiased sub-sets of the original data. Furthermore, sample variograms for the modeling sets were constructed using the MME estimator and the CRE estimator. This robust estimator was chosen to deal with outliers and enhance the variogram's spatial continuity. An estimation of semivariance was carried out using a lag distance of 2 m. Table 3 and Table 4 show the parameters of the fitted models to the omnidirectional sample variograms constructed using MME and CRE estimators. All the variograms were fitted to Matern models (for several shape parameters *ν*) with the exception to the salinity data measured at the depth of 3 m. The range value (in meters) is an indicator of extension where autocorrelation exists. The variograms of salinity show significant differences in range. The autocorrelation distances are always larger for the CRE estimator which may demonstrate the enhancement of the variogram's spatial continuity. All variograms have low nugget values which indicates that local variations could be captured due to the high sampling rate and to the fact that the variables under study have strong spatial dependence. Anisotropy was investigated by calculating directional variograms. However, no anisotropy effect could be shown.

#### **3.4 Cross-validation**

The block kriging method was preferred since it produced smaller prediction errors and smoother maps than the point kriging. Using the 90% modeling sets of the two depths, a two-dimensional ordinary block kriging, with blocks of 10 <sup>×</sup> 10 m2, was applied to estimate temperature at the locations of the 10% validation sets. The validation results for both parameters measured at depths of 1.5 m and 3 m depths are shown in Table 5 and Table 6. At both depths temperature was best estimated by the variogram constructed using CRE. Salinity at the depth of 1.5 m was best estimated by the variogram constructed using CRE and at the depth of 3 m was best estimated using the Gaussian model with the MME. The 12 Will-be-set-by-IN-TECH

Density

0

Fig. 5. Histograms of salinity measurements at depths of 1.5 m (left) and 3 m (right).

For the purpose of this analysis, the temperature and the salinity measurements were divided into a modeling set (comprising 90% of the samples) and a validation set (comprising 10% of the samples). Modeling and validation sets were then compared, using Student's-t test, to check that they provided unbiased sub-sets of the original data. Furthermore, sample variograms for the modeling sets were constructed using the MME estimator and the CRE estimator. This robust estimator was chosen to deal with outliers and enhance the variogram's spatial continuity. An estimation of semivariance was carried out using a lag distance of 2 m. Table 3 and Table 4 show the parameters of the fitted models to the omnidirectional sample variograms constructed using MME and CRE estimators. All the variograms were fitted to Matern models (for several shape parameters *ν*) with the exception to the salinity data measured at the depth of 3 m. The range value (in meters) is an indicator of extension where autocorrelation exists. The variograms of salinity show significant differences in range. The autocorrelation distances are always larger for the CRE estimator which may demonstrate the enhancement of the variogram's spatial continuity. All variograms have low nugget values which indicates that local variations could be captured due to the high sampling rate and to the fact that the variables under study have strong spatial dependence. Anisotropy was investigated by calculating directional variograms. However, no anisotropy effect could be

The block kriging method was preferred since it produced smaller prediction errors and smoother maps than the point kriging. Using the 90% modeling sets of the two depths, a two-dimensional ordinary block kriging, with blocks of 10 <sup>×</sup> 10 m2, was applied to estimate temperature at the locations of the 10% validation sets. The validation results for both parameters measured at depths of 1.5 m and 3 m depths are shown in Table 5 and Table 6. At both depths temperature was best estimated by the variogram constructed using CRE. Salinity at the depth of 1.5 m was best estimated by the variogram constructed using CRE and at the depth of 3 m was best estimated using the Gaussian model with the MME. The

 20

 40

 60

 80

 100

Salinity (psu)

35.95 35.96 35.97 35.98 35.99 36.00 36.01

Salinity (psu)

35.95 35.96 35.97 35.98 35.99 36.00 36.01

Density

0

shown.

**3.4 Cross-validation**

**3.3 Variogram modeling**

 20

 40

 60

 80

 100


Table 3. Parameters of the fitted variogram models for temperature measured at depths of 1.5 and 3.0 m.


Table 4. Parameters of the fitted variogram models for salinity measured at depths of 1.5 and 3 m.


<sup>a</sup> The preferred model.

Table 5. Cross-validation results for the temperature maps at depths of 1.5 and 3 m.

difference in performance between the two estimators: block kriging using the MME estimator (MBK) or block kriging using the CRE estimator (CBK) is not substantial. Fig. 6 shows the omnidirectional sample variograms for temperature at the depth of 1.5 m and 3 m fitted by the preferred models. Fig. 7 shows the omnidirectional sample variograms for salinity at the depth of 1.5 m and 3 m fitted by the preferred models.

Fig. 8 and Fig. 9 show the scatterplots of true versus estimated values for the most satisfactory models. The dark line is the 45º line passing through the origin and the discontinuous line is the OLS (Ordinary Least Squares) regression line. These plots show that observed and predicted values are highly positively correlated. The *R*<sup>2</sup> value for the temperature at the depth of 1.5 m was 0.9211 and the RMSE was 0.0088248ºC, and at the depth of 3 m was 0.8827 and the RMSE was 0.0058316ºC (Table 5). The *R*<sup>2</sup> value for the salinity at the depth of 1.5 m was 0.9513 and the RMSE was 0.0016435 psu, and at the depth of 3 m was 0.8982 and the RMSE was 0.0019793 psu (Table 6).

15.35 15.40 15.45 15.50 15.55 15.60

Based on Geostatistics Using an Autonomous Underwater Vehicle

15.35 15.40 15.45 15.50 15.55 15.60

Observed temperature (°C)

35.95 35.96 35.97 35.98 35.99 36.00 36.01

Observed salinity (psu)

15.35 15.40 15.45 15.50 15.55 15.60

35.95 35.96 35.97 35.98 35.99 36.00 36.01

Fig. 9. Predicted versus observed salinity at the depths of 1.5 m (left) and 3 m (right) using

Fig. 10 shows the block kriged maps of temperature on a 2 <sup>×</sup> 2 m<sup>2</sup> grid using the preferred models. Fig. 13 shows the block kriged maps of salinity on a 2 <sup>×</sup> 2 m<sup>2</sup> grid using the preferred models. In the 1.5 m kriged map the temperature ranges between 15.407ºC and 15.523ºC and the average value is 15.469ºC (the measured range is 15.359ºC–15.562ºC and the average value is 15.463ºC). In the 3 m kriged map the temperature ranges between 15.429ºC and 15.502ºC and the average value is 15.467ºC (the measured range is 15.393ºC–15.536ºC and the average value is 15.469ºC). We may say that estimated values are in accordance with the measurements since their distributions are similar (identical average values, medians, and quartiles). The difference in the ranges width is due to only 5.0% of the samples in the 1.5 m depth map (2.5% on each side of the distribution) and only 5.3% of the samples in the 3.0 m depth map

Predicted salinity (psu)

Fig. 8. Predicted versus observed temperature at the depths of 1.5 m (left) and 3 m (right)

Predicted temperature (°C)

<sup>253</sup> Mapping and Dilution Estimation of Wastewater Discharges

Observed temperature (°C)

35.95 35.96 35.97 35.98 35.99 36.00 36.01

Observed salinity (psu)

15.35 15.40 15.45 15.50 15.55 15.60

35.95 35.96 35.97 35.98 35.99 36.00 36.01

**3.5 Mapping**

the preferred models.

Predicted salinity (psu)

using the preferred models.

Predicted temperature (°C)


<sup>a</sup> The preferred model.

Table 6. Cross-validation results for the salinity maps at depths of 1.5 and 3 m.

Fig. 6. Variograms for temperature at depths of 1.5 m (left) and 3 m (right).

Fig. 7. Variograms for salinity at depths of 1.5 m (left) and 3 m (right).

Fig. 8. Predicted versus observed temperature at the depths of 1.5 m (left) and 3 m (right) using the preferred models.

Fig. 9. Predicted versus observed salinity at the depths of 1.5 m (left) and 3 m (right) using the preferred models.

#### **3.5 Mapping**

14 Will-be-set-by-IN-TECH

MBK 0.9471 3.1113e-5 2.8721e-6 1.6947e-3 CBKa 0.9513 -3.1579e-5 2.7010e-6 1.6435e-3

MBKa 0.8982 -7.1735e-5 3.9175e-6 1.9793e-3 CBK 0.7853 -8.1264e-5 8.2589e-6 2.8738e-3

Semivariance (°C2

Semivariance (psu2

2

4

6

)

0.0002

0.0004

0.0006

0.0008

)

Distance (m)

Distance (m)

20 40 60 80 100 120

20 40 60 80 100 120

Depth Method *R*<sup>2</sup> ME MSE RMSE

Table 6. Cross-validation results for the salinity maps at depths of 1.5 and 3 m.

1.5

3.0

Semivariance (°C2

Semivariance (psu2

2

4

6

8

10

12

)

0.0005

0.0010

)

<sup>a</sup> The preferred model.

Distance (m)

Distance (m)

20 40 60 80 100 120

Fig. 7. Variograms for salinity at depths of 1.5 m (left) and 3 m (right).

20 40 60 80 100 120

Fig. 6. Variograms for temperature at depths of 1.5 m (left) and 3 m (right).

Fig. 10 shows the block kriged maps of temperature on a 2 <sup>×</sup> 2 m<sup>2</sup> grid using the preferred models. Fig. 13 shows the block kriged maps of salinity on a 2 <sup>×</sup> 2 m<sup>2</sup> grid using the preferred models. In the 1.5 m kriged map the temperature ranges between 15.407ºC and 15.523ºC and the average value is 15.469ºC (the measured range is 15.359ºC–15.562ºC and the average value is 15.463ºC). In the 3 m kriged map the temperature ranges between 15.429ºC and 15.502ºC and the average value is 15.467ºC (the measured range is 15.393ºC–15.536ºC and the average value is 15.469ºC). We may say that estimated values are in accordance with the measurements since their distributions are similar (identical average values, medians, and quartiles). The difference in the ranges width is due to only 5.0% of the samples in the 1.5 m depth map (2.5% on each side of the distribution) and only 5.3% of the samples in the 3.0 m depth map

East (m)

East (m)

0 50 100 150

0 50 100 150

Based on Geostatistics Using an Autonomous Underwater Vehicle

35.95

0.00000 0.00005 0.00010 0.00015 0.00020 0.00025 0.00030 0.00035

Fig. 12. Variance of the estimation error for the maps of temperature distribution at depths of

Environmental effects are all related to concentration *C* of a particular contaminant *X*. Defining *Ca* as the background concentration of substance *X* in ambient water and *C*<sup>0</sup> as the concentration of *X* in the effluent discharge, the local dilution comes as follows (Fischer et al.,

> *<sup>S</sup>* <sup>=</sup> *<sup>C</sup>*<sup>0</sup> <sup>−</sup> *Ca C* − *Ca*

Fig. 11. Prediction map of salinity distribution (psu) at depths of 1.5 m (left) and 3 m (right).

m. Results of the same order were obtained for salinity. It's interesting to observe that, as expected, the variance of the estimation error is less the closer is the prediction from the trajectory of the vehicle. The dark blue regions correspond to the trajectory of MARES AUV.

> North (m)

> > −150

−100

−50

0

50

East (m)

East (m)

0 50 100 150

, (28)

0 50 100 150

35.95

0.00000 0.00005 0.00010 0.00015 0.00020 0.00025 0.00030 0.00035

35.96

35.97

35.98

35.99

36.00

35.96

35.97

35.98

North (m)

−150

−100

−50

0

50

35.99

36.00

<sup>255</sup> Mapping and Dilution Estimation of Wastewater Discharges

North (m)

−150

North (m)

−150

1979):

1.5 m (left) and 3 m (right)

**3.6 Dilution estimation**

−100

−50

0

50

−100

−50

0

50

(3.1% on the left side and 2.2% on the rigth side of the distribution). These samples should then be identified as outliers not representing the behaviour of the plume in the established area. In the 1.5 m kriged map the salinity ranges between 35.960 psu and 36.004 psu and the average value is 35.992 psu, which is in accordance with the measurements (the measured range is 35.957psu – 36.003psu and the average value is 35.991 psu). In the 3 m kriged map the salinity ranges between 35.977 psu and 36.004 psu and the average value is 35.995 psu, which is in accordance with the measurements (the measured range is 35.973psu – 36.008psu and the average value is 35.996 psu). As predicted by the plume prediction model, the effluent was found dispersing close to the surface. From the temperature and salinity kriged maps it is possible to distinguish the effluent plume from the background waters. It appears as a region of lower temperature and lower salinity when compared to the surrounding ocean waters at the same depth. At the depth of 1.5 m the major difference in temperature compared to the surrounding waters is about -0.116ºC while at the depth of 3 m this difference is about -0.073ºC. At the depth of 1.5 m the major difference in salinity compared to the surrounding waters is about -0.044 psu while at the depth of 3 m this difference is about -0.027 psu. It is important to note that these very small differences in temperature and salinity were detected due to the high resolution of the CTD sensor. (Washburn et al., 1992) observed temperature and salinity anomalies in the plume in the order, respectively of -0.3ºC and -0.1 psu, when compared with the surrounding waters within the same depth range. The small plume-related anomalies observed in the maps are evidence of the rapid mixing process. Due to the large differences in density between the rising effluent plume and ambient ocean waters, entrainment and mixing processes are vigorous and the properties within the plume change rapidly (Petrenko et al., 1998; Washburn et al., 1992). The effluent plume was found northeast from the diffuser beginning, spreading downstream in the direction of current. Using the navigation data, we could later estimate current velocity and direction and the values found were, respectively, 0.4 m/s and 70ºC, which is in accordance with the location of the plume.

Fig. 10. Prediction map of temperature distribution (ºC) at depths of 1.5 m (left) and 3 m (right).

Fig. 12 shows the variance of the estimation error (kriging variance) for the maps of temperature distribution at depths of 1.5 m and 3 m. The standard deviation of the estimation error is less than 0.0195ºC at the depth of 1.5 m and less than 0.0111ºC at the depth of 3 16 Will-be-set-by-IN-TECH

(3.1% on the left side and 2.2% on the rigth side of the distribution). These samples should then be identified as outliers not representing the behaviour of the plume in the established area. In the 1.5 m kriged map the salinity ranges between 35.960 psu and 36.004 psu and the average value is 35.992 psu, which is in accordance with the measurements (the measured range is 35.957psu – 36.003psu and the average value is 35.991 psu). In the 3 m kriged map the salinity ranges between 35.977 psu and 36.004 psu and the average value is 35.995 psu, which is in accordance with the measurements (the measured range is 35.973psu – 36.008psu and the average value is 35.996 psu). As predicted by the plume prediction model, the effluent was found dispersing close to the surface. From the temperature and salinity kriged maps it is possible to distinguish the effluent plume from the background waters. It appears as a region of lower temperature and lower salinity when compared to the surrounding ocean waters at the same depth. At the depth of 1.5 m the major difference in temperature compared to the surrounding waters is about -0.116ºC while at the depth of 3 m this difference is about -0.073ºC. At the depth of 1.5 m the major difference in salinity compared to the surrounding waters is about -0.044 psu while at the depth of 3 m this difference is about -0.027 psu. It is important to note that these very small differences in temperature and salinity were detected due to the high resolution of the CTD sensor. (Washburn et al., 1992) observed temperature and salinity anomalies in the plume in the order, respectively of -0.3ºC and -0.1 psu, when compared with the surrounding waters within the same depth range. The small plume-related anomalies observed in the maps are evidence of the rapid mixing process. Due to the large differences in density between the rising effluent plume and ambient ocean waters, entrainment and mixing processes are vigorous and the properties within the plume change rapidly (Petrenko et al., 1998; Washburn et al., 1992). The effluent plume was found northeast from the diffuser beginning, spreading downstream in the direction of current. Using the navigation data, we could later estimate current velocity and direction and the values found were, respectively, 0.4 m/s and 70ºC, which is in accordance with the location of the plume.

East (m)

0 50 100 150

15.38 15.40 15.42 15.44 15.46 15.48 15.50 15.52

Fig. 10. Prediction map of temperature distribution (ºC) at depths of 1.5 m (left) and 3 m

Fig. 12 shows the variance of the estimation error (kriging variance) for the maps of temperature distribution at depths of 1.5 m and 3 m. The standard deviation of the estimation error is less than 0.0195ºC at the depth of 1.5 m and less than 0.0111ºC at the depth of 3

North (m)

−150

−100

−50

0

50

East (m)

0 50 100 150

15.38 15.40 15.42 15.44 15.46 15.48 15.50 15.52

North (m)

−150

(right).

−100

−50

0

50

Fig. 11. Prediction map of salinity distribution (psu) at depths of 1.5 m (left) and 3 m (right).

m. Results of the same order were obtained for salinity. It's interesting to observe that, as expected, the variance of the estimation error is less the closer is the prediction from the trajectory of the vehicle. The dark blue regions correspond to the trajectory of MARES AUV.

Fig. 12. Variance of the estimation error for the maps of temperature distribution at depths of 1.5 m (left) and 3 m (right)

#### **3.6 Dilution estimation**

Environmental effects are all related to concentration *C* of a particular contaminant *X*. Defining *Ca* as the background concentration of substance *X* in ambient water and *C*<sup>0</sup> as the concentration of *X* in the effluent discharge, the local dilution comes as follows (Fischer et al., 1979):

$$S = \frac{\mathbb{C}\_0 - \mathbb{C}\_a}{\mathbb{C} - \mathbb{C}\_a} \,\prime \tag{28}$$

kriging, suggested the Matern model (*ν* = 0.5 − 1.5 m and *ν* = 0.7 − 3.0 m) with semivariance estimated by CRE. In the case of salinity, the validation results, using a two-dimensional ordinary block kriging, suggested the Matern model (*ν* = 0.6 − 1.5 m and *ν* = 0.8 − 3.0 m) with semivariance estimated by CRE, for the depth of 1.5 m, and with semivariance estimated by MME, for the depth of 3 m. The difference in performance between the two estimators was not substantial. Block kriged maps of temperature and salinity at depths of 1.5 m and 3 m show the spatial variation of these parameters in the area studied and from them it is possible to identify the effluent plume that appears as a region of lower temperature and lower salinity when compared to the surrounding waters, northeast from the diffuser beginning, spreading downstream in the direction of current. Using salinity distribution at depths of 1.5 m and 3 m we estimated dilution at those depths. The values found are in accordance with Portuguese legislation. The results presented demonstrate that geostatistical methodology can provide good estimates of the dispersion of effluent that are very valuable in assessing the

<sup>257</sup> Mapping and Dilution Estimation of Wastewater Discharges

This work was partially funded by the Foundation for Science and Technology (FCT) under the Program for Research Projects in all scientific areas (Programa de Projectos de Investigação em todos os domínos científicos) in the context of WWECO project - Environmental Assessment and Modeling of Wastewater Discharges using Autonomous

Abreu, N., Matos, A., Ramos, P. & Cruz, N. (2010). Automatic interface for AUV mission

Abreu, N. & Ramos, P. (2010). An integrated application for geostatistical analysis of sea

Bellingham, J. (1997). New Oceanographic Uses of Autonomous Underwater Vehicles, *Marine*

Bellingham, J., Goudey, C., Consi, T. R. & Chryssostomidis, C. (1992). A Small Long Range

Bivand, R. S., Pebesma, E. J. & Gómez-Rubio, V. (2008). *Applied spatial data analysis with R*.

Cressie, N. & Hawkins, D. M. (1980). Robust estimation of the variogram, I, *Jour. Int. Assoc.*

Fernandes, P. G., Brierley, A. S., Simmonds, E. J., Millard, N. W., McPhail, S. D., Armstrong,

Fischer, H. B., List, J. E., Koh, R. C. Y., Imberger, J. & Brooks, N. H. (1979). *Mixing in Inland and*

Goovaerts, P. (1997). *Geostatistics for natural resources evaluation, Applied Geostatistics Series*.

planning and supervision, *MTS/IEEE International Conference Oceans 2010*, Seattle,

outfall discharges based on R software, *MTS/IEEE International Conference Oceans*

Vehicle for Deep Ocean Exploration, *Proceedings of the International Offshore and Polar*

F., Stevenson, P. & Squires, M. (2000). Fish do not Avoid Survey Vessels, *Nature*

Underwater Vehicles Bio-optical Observations (Ref. PTDC/MAR/74059/2006).

environmental impact and managing sea outfalls.

Based on Geostatistics Using an Autonomous Underwater Vehicle

**5. Acknowledgment**

**6. References**

USA.

*2010*, Seattle, USA.

ISBN: 97 -0-387-78170-9.

*Math. Geol.* 12(2): 115–125.

*Coastal Waters*, Academic Press.

ISBN13: 9780195115383, ISBN10: 0195115384.

404: 35–36.

*Technology Society Journal* 31(3): 34–47.

Cressie, N. (1993). *Statistics for spatial data*, New York.

*Engineering Conference, San Francisco*, pp. 151–159.

which can be rearranged to give *C* = *Ca <sup>S</sup>*−<sup>1</sup> *S* + <sup>1</sup> *S C*0. In the case of variability of the background concentration of substance *X* in ambient water the local dilution is given by

$$S = \frac{\mathbb{C}\_0 - \mathbb{C}\_{a0}}{\mathbb{C} - \mathbb{C}\_a} \text{ } \tag{29}$$

where *Ca*<sup>0</sup> is the background concentration of substance *X* in ambient water at the discharge depth. This expression in 29 can be arranged to give *C* = *Ca* + <sup>1</sup> *S* (*C*<sup>0</sup> − *Ca*0), which in simple terms means that the increment of concentration above background is reduced by the dilution factor *S* from the point of discharge to the point of measurement of *C*. Using salinity distribution at depths of 1.5 m and 3 m we estimated dilution using Equation 29 (see the contour maps in Fig. 13). We assumed *C*<sup>0</sup> = 2.3 psu, *Ca*<sup>0</sup> = 35.93 psu, *Ca* = 36.008 psu at 1.5 m depth and *Ca* = 36.006 psu at 3 m depth. The minimum dilution estimated at the depth of 1.5 m was 705 and at the depth of 3.0 m was 1164 which is in accordance with Portuguese legislation that suggests that outfalls should be designed to assure a minimum dilution of 50 when the plume reaches surface (INAG, 1998). (Since dilution increases with the plume rising we should expect that the minimum values would be greater if the plume reached surface (Hunt et al., 2010)).

Fig. 13. Dilution maps at depths of 1.5 m (left) and 3 m (right).

#### **4. Conclusion**

Through geostatistical analysis of temperature and salinity obtained by an AUV at depths of 1.5 m and 3 m in an ocean outfall monitoring campaign it was possible to produce kriged maps of the sewage dispersion in the field. The spatial variability of the sampled data has been analyzed and the results indicated an approximated normal distribution of the temperature and salinity measurements, which is desirable. The Matheron's classical estimator and Cressie and Hawkins' robust estimator were then used to compute the omnidirectional variograms that were fitted to Matern models (for several shape parameters) and to a Gaussian model. The performance of each competing model was compared using a split-sample approach. In the case of temperature, the validation results, using a two-dimensional ordinary block kriging, suggested the Matern model (*ν* = 0.5 − 1.5 m and *ν* = 0.7 − 3.0 m) with semivariance estimated by CRE. In the case of salinity, the validation results, using a two-dimensional ordinary block kriging, suggested the Matern model (*ν* = 0.6 − 1.5 m and *ν* = 0.8 − 3.0 m) with semivariance estimated by CRE, for the depth of 1.5 m, and with semivariance estimated by MME, for the depth of 3 m. The difference in performance between the two estimators was not substantial. Block kriged maps of temperature and salinity at depths of 1.5 m and 3 m show the spatial variation of these parameters in the area studied and from them it is possible to identify the effluent plume that appears as a region of lower temperature and lower salinity when compared to the surrounding waters, northeast from the diffuser beginning, spreading downstream in the direction of current. Using salinity distribution at depths of 1.5 m and 3 m we estimated dilution at those depths. The values found are in accordance with Portuguese legislation. The results presented demonstrate that geostatistical methodology can provide good estimates of the dispersion of effluent that are very valuable in assessing the environmental impact and managing sea outfalls.
