**8. Applications in hydraulics**

216 Genetic Programming – New Approaches and Successful Applications

presented in the paper.

were calibrated to forecast the temporal difference between the current and future discharge rather than absolute value of discharge for the lead times of 1 to 12 hours. In fact the paper discusses the phase lag associated with temporal time series forecasting models and removal of it by forecasting the temporal difference. The results were superior to conceptual numerical model. The model was then calibrated using a hybrid method in that the surface runoff value was first forecasted by using a conceptual forecasting model and then using the simulation error and GP to forecast the stream flow. The hybrid models provided a many fold improvement over the raw GP models. The paper in our opinion serves as a basic paper in the field of application of GP in Hydrology and readers may read the paper in original for all details about the GP models developed. The details are not produced here to save the space. Linear Genetic Programming technique was used to predict daily river discharge one day ahead using previous values of the same at Schuylkill River at Berne, PA, USA [8]. Additionally the models were developed using multilayer perceprton as well as Generalized Regression Neural Networks (GRNN). The statistical ARMA method was also used to develop the stream flow forecasting model. The results showed that both LGP and NN techniques predicted the daily time series of discharge with quite good agreement as indicated by high value of coefficient of determination and low values of error measures with the observed data. LGP models generally predicted the maximum and minimum discharge values better than the NN models though LGP results were also far from accurate. The robustness of the developed models was tested by using applied data which was neither used in training or testing and the results were judged using Akaike Information Criterion (AIC). For LGP parameters readers are requested to refer the comprehensive list

The potential of the GP-based model for flood routing between two river gauging stations on river Walla in USA was explored for single peaked as well as multi-peaked flood hydrographs by [32]. The accuracy of GP models was far superior than modified Muskingum method which is a traditional physics based hydrologic flood routing model which also showed time lag in predictions. The inputs were current and antecedent discharge at upstream station and antecedent discharge at downstream station while the output was current discharge at the downstream station. The LGP was employed for the flood routing exercise. The optimal GP parameters used in this study were: crossover rate, 0.9; mutation rate, 0.5; population size, 200; number of generations, 500; and functional set,

The utility of genetic programming in modeling the eddy-covariance (EC) measured evapotranspiration flux was investigated by [33]. The performance of the GP technique was compared with artificial neural network and Penman-Monteith model estimates. EC measured evapo-transpiration fluxes from two distinct case-studies with different climatic and topographic conditions were considered for the analysis and latent heat is modeled as a function of net radiation, ground temperature, air temperature, wind speed and relative humidity. Results from the study indicated that both data-driven models (ANN and GP) performed better than the Penman-Monteith method. However, the performance of the GP

i.e. simple arithmetic functions (plus, minus, multiply, divide).

A few applications of GP in Hydraulic Engineering are also reported in reputed journals which are from open channel hydraulics. Various GP models were developed by [36] to predict velocities across a compound channel with vegetated floodplains. The velocity data was collected in a laboratory flume with steady flow and deep channel and relatively shallow vegetated floodplain on either side. The GP model was developed with all 12 variables in dimensional form depicted accurate results though the evolved equation was complex. The GP

models were developed with dimensionless variables and separate for main channel and floodplain. Both the velocity prediction on flood plain and main channels showed good correlations with measured values. However the resulting expressions were complex. A dimensionally aware GP was then used to predict the velocity separately in main channel and flood plains. The performance of the symbolic expressions induced by the dimensionless GP for the floodplain and main channel was marginally better than those for the dimensionally aware GP. However, the expressions were more complex and not particularly useful for knowledge induction. The dimensionally aware GP was shown to hold more scientific information, as units of measurement were included, although it was also shown to be open ended in that it does not strictly adhere to the dimensional analysis framework, thereby allowing improved goodness-offit whilst yielding on goodness-of-dimension. The paper provides no information about the initial values of GP parameters used in evolving the GP model. GP was applied to the determination of the Chezy's roughness coefficient for corrugated channels in wakeinterference flow, i.e. hyper-turbulent flow by [37]. The GP models were calibrated using the experimental data devised by carrying out experiments for 3 plastic corrugated pipes with variations of discharge and slope. GP quite easily and quickly supplied at least two good formulae that fit the experimental data better and are more parsimonious than the monomial formula (mathematical). Moreover, GP has supplied six parsimonious expressions (one or two constants compared to four for the monomial formula) for the Chezy's resistance coefficient, all confirming the dependencies on hydraulic radius, slope and roughness index. It can be said that the two new formulae for the Chezys resistance coefficient, derived from these GP formulae by means of 'mathematical/physical post-refinement', are suitable for explaining the effect of the macro-roughness elements, with respect to the behavior of the rough commercial channels and their traditional expressions for resistance coefficients. The work indicated that this approach, which combines data-mining techniques together with a theoretical understanding, provides very good results. It was also commented that strictly speaking, GP is a data-driven technique, but prior knowledge during the setting up of the evolutionary search and final physical postrefinement of the hypothesis should make it very close to a white box technique, especially when GP is used in scientific discovery problems. The initial model parameters can be found in the original paper. To save space the list is not provided here.

Genetic Programming: A Novel Computing Approach in Modeling Water Flows 219

buoys 42001, 42003, 42007, 42036, 42039,42040) were developed in the Gulf Of Mexico (Figure 2) around USA coastline to estimate the wave heights at a location using wave heights at other five locations in the network. The required data from these six buoys was measured by National Data Buoy Center (NDBC, http://www.ndbc.noaa.gov) of National Oceanic and Atmospheric administration of USA (NOAA, http://www.noaa.gov ). The common wave data at all the above six locations for the years 2002-2004 was used in the present work. The networks were developed by having one station as target location at a time and remaining five locations as inputs turn by turn. Approximately 70% of the total values were used to calibrate the model and the remaining was kept unseen for testing. While doing this a particular event which occurred during Hurricane Ivan in 2004 at buoy 42040 which involved a Significant Wave Height of 15.96 m was focused for studying the performance of developed models during extreme events. It is to be noted that the exercise was of estimation and not of forecasting for which both the tools did not performed well as

Thus a network was developed with wave buoy 42040 as the target and buoys 42001, 42003, 42007, 42036, 42039 as inputs. Along with 42040 the other locations namely 42003, 42007, 42039 also experienced largest ever wave heights of 11.04, 9.09, 12.05 making the entire event a truly extra ordinary event having a return period of over 5000 years [39]. The initial parameters selected for a GP run were as follows: initial population size 500, mutation frequency 95%, and crossover frequency 50%. The fitness criterion was the mean squared

Additionally a three layer Feed Forward Neural Network was also developed for the same buoy network. The results were also compared with a large-scale continuous wave modeling /forecasting systems (NOAA's WAVEWATCH III model) which follows the approach of physics-based model. Though WAVEWATCH III is a continuous running forecasting model it was the only source of information for wave environment at a location and therefore in absence of any reliable observed data, these results were used for comparison. The GP model estimated a wave height of 13.67m as against 15.96 m as compared to 9.05m that of ANN model and 7.82m of WAVEWATCH III, which was an excellent result as far as GP approach is considered. Figure 3 shows the wave plot at 42040

From results of all the models developed by both the approaches (ANN & GP), it was observed that all models performed reasonably well in testing as evident by wave height plots, scatter plots along with the correlation coefficient ranging from 0.85 to 0.98, MAE from 0.13 to 0.28, RMSE from 0.20 to 0.45 m and coefficient of efficiency from 0.67 and 0.96. When it was tried to remove 42001 from the network as it is away from the prevailing wind direction by training a separate GP model with 42003, 42007, 42036, and 42039 as 'input buoys' and 42040 as 'target buoy', though the value of correlation coefficient was increased, the peak prediction was not in a fair range of accuracy for extreme event of Hurricane Ivan. Due to better performance of the network with inclusion of buoy 42001 especially for extreme event, buoy 42001 was retained in the network. Also it was found that 42039 was a potential candidate for redeployment in any other suitable position outside the network as

noted in the section on applications of GP in Ocean Engineering.

error.

in testing.

An alternative approach of GP was proposed in the estimation of relative scour depth using field data by [38]. The comparison between the GP model with ANN found that the GP model has good ability of forecasting the scour depth. The discharge intensity and height of fall were used as inputs to estimate scour depth below tail water. The predictive ability of this approach is however clouded by use of very small number of data (total 91 data sets) used for calibration and testing of the model. The values of initial model parameters can be referred from the original paper.
