**2. Methods**

### **2.1 Outline**

The presented forecast and hindcast methods are rooted in the theory of linear dynamic systems (e.g. [40]). The concept is to treat the dynamics of semi-enclosed basins (e.g. [41]) as linear, at least for technical purposes. Linearity makes it possible to compute the unit response function of the basin and to use it, through the convolution integral, to determine the water level response to any wind time series.

In more detail, the basin is first discretized into small areas, within each of which the wind is considered homogeneous and constant. The number of areas must be chosen so as to capture the spatial variation (i.e. the gradients) of the wind field across the basin. Considering a generic point of interest in the domain (hereinafter referred to as the POI), the elevation *η<sub>i</sub>*(*t*) due to one wind stress impulse acting on a generic area *i* of the domain is

$$\eta_i(t) = U F_i^U(t) + V F_i^V(t) \tag{1}$$

where *F<sub>i</sub><sup>U</sup>*(*t*) and *F<sub>i</sub><sup>V</sup>*(*t*) are the unit response functions induced by a wind of duration Δ*t* acting on area *i*, while *U* and *V* are the components of the wind stress impulse along the two Cartesian directions (zonal and meridional, respectively) acting on the same area.

Considering *M* wind stress impulses, Eq. (1) becomes

$$\eta_i(t) = \sum_{j=1}^{j \le M} \left[ U_{ij} F_i^U(t - j\Delta t + \Delta t) + V_{ij} F_i^V(t - j\Delta t + \Delta t) \right] \tag{2}$$

The value obtained with Eq. (2) is the water elevation at the POI due to a series of wind stress impulses *U<sub>ij</sub>* and *V<sub>ij</sub>* occurring between *t* = (*j* − 1)Δ*t* and *t* = *j*Δ*t* on area *i*. The level due to the contribution of *N* areas can be calculated by superposing the contributions of the individual areas; Eq. (2) is therefore extended as

$$\eta(t_k) = \sum_{i=1}^{N} \sum_{j=1}^{j \le M} \left[ U_{ij}^{(\tau)} F_i^U(t_k - j\Delta t + \Delta t) + V_{ij}^{(\tau)} F_i^V(t_k - j\Delta t + \Delta t) \right] \tag{3}$$

with *t<sub>k</sub>* = (*k* − 1)Δ*τ*.
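The double sum of Eq. (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `raw_level`, the list layout of the wind stresses and the callable interface of the response functions are hypothetical, and the response functions are taken to be zero for negative time lags (causality), so impulses that have not yet started are skipped. The exponential response used in the example is made up and does not come from any numerical model.

```python
import math


def raw_level(k, U, V, FU, FV, dt, dtau):
    """Raw level at the POI at t_k = (k - 1) * dtau, following Eq. (3).

    U[i][j - 1], V[i][j - 1]: wind stress impulse components acting on
        area i during the j-th interval, from (j - 1) * dt to j * dt.
    FU[i], FV[i]: callables giving the unit response of area i at a
        non-negative time lag (hypothetical interface).
    """
    t_k = (k - 1) * dtau
    eta = 0.0
    for i in range(len(U)):                 # superposition over the N areas
        for j in range(1, len(U[i]) + 1):   # sum over the M impulses
            lag = t_k - j * dt + dt         # time since impulse j began
            if lag < 0:
                break                       # assumption: no response before the impulse starts
            eta += U[i][j - 1] * FU[i](lag) + V[i][j - 1] * FV[i](lag)
    return eta


# Made-up response function and wind stresses, for illustration only.
decay = lambda lag: math.exp(-lag / 6.0)
U = [[0.10, 0.20, 0.05], [0.00, 0.10, 0.10]]   # N = 2 areas, M = 3 impulses
V = [[0.00, 0.05, 0.05], [0.02, 0.02, 0.00]]
levels = [raw_level(k, U, V, [decay, decay], [decay, decay], dt=1.0, dtau=1.0)
          for k in range(1, 6)]
```

Because the response functions enter only as lookups, they can be tabulated once per basin and reused for every new wind series; only the two loops are repeated.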

The values of *U* and *V* can be forecast or hindcast data provided by global general circulation models and must be taken at one point in each area.

The unit response functions *F<sub>i</sub><sup>U</sup>* and *F<sub>i</sub><sup>V</sup>*, in turn, can be evaluated using a numerical model (e.g. [42, 43]). It should be stressed that the unit response functions need to be computed only once for each considered basin, which limits the computational cost of the methods.

However, the computed level is due to the wind field alone and does not account for the role of the barometric (atmospheric pressure) field in storm surge generation. For this reason, it can be viewed as a "raw level".

The pressure field is accounted for using statistical techniques. The forecast and hindcast models share the physics-based module but correct the raw level in different ways.

### **2.2 Forecast method statistical correction**

As previously underlined, statistical corrections are often carried out using regression models or, alternatively, artificial neural networks (ANNs).

In the proposed method, a series of ANNs is suggested, each aimed at correcting the forecast for one lead time. Without claiming to be exhaustive, an ANN can be described as a statistical tool that mimics the human ability to learn. It consists of a layer of input neurons that, through weighted connections (together with hidden layers, activation functions, etc.), produces one or more output neurons.

The learning phase is the way in which the system acquires information in order to predict future events on the basis of past experience. Mathematically, this phase consists of a training aimed at reducing the mean square error (MSE) between the output neuron(s) and the desired output by changing the weights of the links between neurons. The procedure is iterative, and it is stopped after a fixed number of training epochs or when the MSE falls below a given threshold. After the learning phase, the net needs to be tested in order to check its performance. If the results do not reach the desired accuracy, the training phase can be repeated. It should be noted that this phase is a "black box", so the results of two training runs can differ considerably. At the end of this process, the network can be used in operational situations.
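The iterative reduction of the MSE can be illustrated with a deliberately small network. The sketch below is not the architecture used in the method, which is not specified here: the single hidden layer, the tanh activation, full-batch gradient descent and all parameter values (`n_hidden`, `lr`, `max_epochs`, `mse_tol`, `seed`) are assumptions chosen for illustration, and the training data are synthetic. Training stops when the MSE drops below the threshold, with the epoch limit acting as a cap.

```python
import math
import random


def train_mlp(X, y, n_hidden=4, lr=0.05, max_epochs=2000, mse_tol=1e-3, seed=0):
    """Train a one-hidden-layer tanh network by full-batch gradient descent.

    Returns (predict, history): a prediction function and the MSE per epoch.
    """
    rng = random.Random(seed)
    n, n_in = len(X), len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = [0.0]  # one-element list so that forward() sees the updates

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(W1[m], x)) + b1[m])
             for m in range(n_hidden)]
        return h, sum(w * hm for w, hm in zip(W2, h)) + b2[0]

    history = []
    for _ in range(max_epochs):
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gW2 = [0.0] * n_hidden
        gb2 = 0.0
        sq_err = 0.0
        for x, target in zip(X, y):
            h, out = forward(x)
            err = out - target
            sq_err += err * err
            g_out = 2.0 * err / n                        # d(MSE)/d(out) for this sample
            gb2 += g_out
            for m in range(n_hidden):
                gW2[m] += g_out * h[m]
                g_h = g_out * W2[m] * (1.0 - h[m] ** 2)  # back through tanh
                gb1[m] += g_h
                for q in range(n_in):
                    gW1[m][q] += g_h * x[q]
        history.append(sq_err / n)
        if history[-1] < mse_tol:
            break                                        # accuracy target reached
        for m in range(n_hidden):                        # gradient descent update
            W2[m] -= lr * gW2[m]
            b1[m] -= lr * gb1[m]
            for q in range(n_in):
                W1[m][q] -= lr * gW1[m][q]
        b2[0] -= lr * gb2
    return (lambda x: forward(x)[1]), history


# Synthetic data, for illustration only.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(30)]
y = [0.5 * a - 0.3 * b for a, b in X]
predict, history = train_mlp(X, y)
```

Changing `seed` changes the initial weights and therefore the trained network, which is one concrete reason why two training runs on the same data can give different results.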

In the presented case, a series of ANNs was used instead of a single one because each ANN operates on one lead time only (to correct a forecast of 48 h, 48 ANNs are needed).

A crucial point in ANNs is the choice of input neurons. In the case at hand, the proposed inputs are (a) the raw level time series, (b) recent level measurements at the POI and (c) the pressures at the centre of each area used in the physics-based module. Of course, a sensitivity analysis could be useful to guide this choice in other cases, since the procedure is site-dependent.
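The organisation into one corrector per lead time, fed with the three input groups (a) to (c), can be sketched as follows. The names `build_inputs` and `correct_forecast`, the flat concatenation of the inputs and the use of the same input vector for every lead time are assumptions made for illustration. The correctors in the example are simple stand-in functions, not trained ANNs.

```python
def build_inputs(raw_levels, recent_obs, area_pressures):
    """Concatenate the three input groups into one flat input vector."""
    return list(raw_levels) + list(recent_obs) + list(area_pressures)


def correct_forecast(correctors, raw_levels, recent_obs, area_pressures):
    """Apply one corrector per lead time.

    correctors[h] is a callable (e.g. a trained ANN) that maps the input
    vector to the corrected level for lead time h + 1.
    """
    if len(correctors) != len(raw_levels):
        raise ValueError("need exactly one corrector per lead time")
    x = build_inputs(raw_levels, recent_obs, area_pressures)
    return [corrector(x) for corrector in correctors]


# Stand-in correctors for a 3-step forecast: each adds the latest
# measured level (index 3 of the input vector) to its own raw level.
stand_ins = [lambda x, h=h: x[h] + x[3] for h in range(3)]
corrected = correct_forecast(stand_ins,
                             raw_levels=[1.0, 2.0, 3.0],
                             recent_obs=[0.5],
                             area_pressures=[1010.0, 1005.0])
```

For an hourly forecast of 48 h, `correctors` would hold 48 separately trained networks.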

### **2.3 Hindcast method statistical correction**

The statistical correction in the hindcast method differs from that of the forecast method because the two methods serve different purposes. Indeed, when the measured time series are not long enough to be statistically representative of high return periods, the use of hindcast methods is the only way to work with a subset of reliable data.

*Simplified Methods for Storm Surge Forecast and Hindcast in Semi-Enclosed Basins: A Review DOI: http://dx.doi.org/10.5772/intechopen.92171*

In these cases, the interest is not focused on the timing of the reconstructed time series, but on the reliability in reproducing the extreme events.

For these reasons, the raw level (obtained using reanalysis data) is corrected by adding the contribution of the pressure field through a coefficient *C<sub>p</sub>*, which can be estimated by comparing synchronous observed residual levels and pressure values at a generic point of interest. This means that Eq. (3) is modified as follows:

$$\eta(t_k) = \sum_{i=1}^{N} \sum_{j=1}^{j \le M} \left[ U_{ij}^{(\tau)} F_i^U(t_k - j\Delta t + \Delta t) + V_{ij}^{(\tau)} F_i^V(t_k - j\Delta t + \Delta t) \right] + C_p \Delta p(t_k) \tag{4}$$

where Δ*p*(*t<sub>k</sub>*) represents the atmospheric pressure anomaly.
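One possible way to estimate *C<sub>p</sub>* and apply the last term of Eq. (4) is sketched below. The estimator is an assumption, since the text does not prescribe one: a least-squares fit through the origin of the part of the observed residual not explained by the raw level against the synchronous pressure anomaly. The function names, the units (metres and hPa) and the sample values are hypothetical.

```python
def estimate_cp(target, dp):
    """Least-squares slope through the origin: target ≈ C_p * dp."""
    num = sum(r * p for r, p in zip(target, dp))
    den = sum(p * p for p in dp)
    if den == 0.0:
        raise ValueError("pressure anomaly is identically zero")
    return num / den


def corrected_level(raw, dp, cp):
    """Add the pressure term C_p * dp(t_k) to the raw level, as in Eq. (4)."""
    return [r + cp * p for r, p in zip(raw, dp)]


# Made-up synchronous samples: levels in m, pressure anomalies in hPa.
observed = [0.32, 0.21, 0.10, 0.02]
raw = [0.22, 0.16, 0.10, 0.07]
dp = [-10.0, -5.0, 0.0, 5.0]

# Assumption: fit the residual left unexplained by the wind-driven raw level.
cp = estimate_cp([o - r for o, r in zip(observed, raw)], dp)
hindcast = corrected_level(raw, dp, cp)
```

As a plausibility check, the inverse barometer effect corresponds to roughly 1 cm of level rise per hPa of pressure drop, so an estimate far from about −0.01 m/hPa would deserve a closer look at the data.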

Given the aim of the method (i.e. achieving reliable estimates of extreme events), a correction of the maximum hindcast error for a given temporal window has to be defined. When measured data are available, the maximum error (*ϵ<sub>max</sub>*) can be defined by comparing the maximum (not necessarily synchronous) measured (*η<sup>M</sup><sub>max</sub>*) and hindcast (*η<sup>H</sup><sub>max</sub>*) values occurring within a time frame of a given duration. Following, e.g., [44], this error can be defined as the ratio between *η<sup>M</sup><sub>max</sub>* and *η<sup>H</sup><sub>max</sub>*, which leads to the definition of a calibration coefficient as

$$C_{cal} = \frac{\eta_{\text{max}}^{M}}{\eta_{\text{max}}^{H}}. \tag{5}$$

Following the technique proposed by [44], it is possible to evaluate the quantile of the empirical cumulative distribution function (ECDF) of the random variable *C<sub>cal</sub>*. This quantile should be viewed as a comparison between the extreme values of the observations and those of the corrected hindcast series. This method clearly requires measured tide levels and pressure values to be available, even if only for a few years.
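The computation of *C<sub>cal</sub>* per temporal window and of a quantile of its ECDF can be sketched as follows. Several choices here are assumptions rather than part of the technique in [44]: the windows are taken as non-overlapping, windows whose hindcast maximum is not positive are skipped because the ratio of Eq. (5) is then not meaningful, and the quantile is read as the smallest sample value at which the ECDF reaches the requested level `q`. The sample series are made up.

```python
import math


def window_maxima(series, window):
    """Maximum of each consecutive, non-overlapping window of the series."""
    return [max(series[s:s + window])
            for s in range(0, len(series) - window + 1, window)]


def calibration_coefficients(measured, hindcast, window):
    """C_cal of Eq. (5) for each window; the maxima need not be synchronous."""
    m_max = window_maxima(measured, window)
    h_max = window_maxima(hindcast, window)
    # Assumption: skip windows where the hindcast maximum is not positive.
    return [m / h for m, h in zip(m_max, h_max) if h > 0.0]


def ecdf_quantile(values, q):
    """Smallest sample value at which the ECDF reaches q, with 0 < q <= 1."""
    if not values or not 0.0 < q <= 1.0:
        raise ValueError("need a non-empty sample and 0 < q <= 1")
    xs = sorted(values)
    return xs[math.ceil(q * len(xs)) - 1]


# Made-up level series (m), split into two windows of three samples each.
measured = [1.0, 3.0, 2.0, 4.0, 2.0, 1.0]
hindcast = [2.0, 1.0, 1.0, 2.0, 2.0, 1.0]
c_cal = calibration_coefficients(measured, hindcast, window=3)
median_c_cal = ecdf_quantile(c_cal, 0.5)
```

With only a few years of measurements the sample of *C<sub>cal</sub>* values is small, so the window duration controls how many values enter the ECDF.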
