202 Genetic Programming – New Approaches and Successful Applications

Engineering, Environmental Engineering and lastly Hydraulic Engineering, which is the focus of this chapter. The technique of ANN is now well established in the field of Civil Engineering for modeling various random and complex phenomena. Other techniques such as FL and EL caught the attention of many researchers as complementary or alternative techniques to ANN, particularly once the drawbacks of ANN became known [2]. The soft computing tool of Genetic Programming (GP), which is essentially classified as an Evolutionary Computation (EC) technique, has found its footing in Hydraulic Engineering in general, and in the modeling of water flows in particular, over the last 12 years or so. Modeling of water flows is perhaps the most daunting task faced by researchers in Hydraulic Engineering, owing to the randomness involved in many of the natural processes associated with water flows. In pursuit of ever greater accuracy in the estimation and forecasting of water-related variables, researchers have made use of Genetic Programming for tasks such as forecasting of runoff with or without rainfall, forecasting of ocean waves and currents, and spatial mapping of waves, to name a few. The present chapter takes stock of the applications of GP to modeling water flows, which should help future researchers who wish to pursue research in this field.

The chapter is organized as follows. The next section deals with the basics of GP. A review of applications of GP in Ocean Engineering is presented after that, followed by a review of applications in hydrology. A few applications in Hydraulics are discussed in the subsequent section. Only papers published in reputed international journals are considered for review. Two case studies, based on publications of the first author, are presented next. The concluding remarks and future scope as envisaged by the authors are discussed at the end.

**2. The evolutionary computation**

The paradigm of evolutionary processes distinguishes between an organism's genotype, which is constructed of genetic material inherited from its parent or parents, and the organism's phenotype, which is the full physical realization of the organism in a given environment, represented by a body and its associated collection of characteristics or phenotypic traits. Within this paradigm, there are three main criteria for an evolutionary process to occur [3]:

Criterion of Heredity: Offspring are similar to their parents: the genotype copying process maintains a high fidelity.

Criterion of Variability: Offspring are not exactly the same as their parents: the genotype copying process is not perfect.

Criterion of Fecundity: Variants leave different numbers of offspring: specific variations have an effect on behavior, and behavior has an effect on reproductive success.

The evolutionary techniques can be differentiated into four main streams of Evolutionary Algorithm (EA) development [4], namely Evolution Strategies (ES), Evolutionary Programming (EP), Genetic Algorithms (GA) and Genetic Programming (GP) [5]. However, all evolutionary algorithms share the common property of applying evolutionary processes.

**3. Genetic programming:**

Like the genetic algorithm (GA), Genetic Programming (GP) follows the principle of 'survival of the fittest' borrowed from the process of evolution occurring in nature. Unlike GA, however, its solution is a computer program or an equation rather than a set of numbers, and hence it is more convenient to use as a regression tool than as an optimization tool like the GA. GP operates on parse trees rather than on bit strings as in a GA, to approximate the equation (in symbolic form) or computer program that best describes how the output relates to the input variables. A good explanation of the various concepts related to GP can be found in Koza (1992) [5].

GP starts with a population of randomly generated computer programs on which a computerized evolution process operates. A 'tournament' or competition is then conducted by randomly selecting four programs from the population. GP measures how well each program performs the user-designated task. The two programs that perform the task best 'win' the tournament. The GP algorithm then copies the two winning programs and transforms these copies into two new programs via the crossover and mutation operators, i.e. the winners now have 'children'. These two new child programs are then inserted into the population, replacing the two losing programs from the tournament. Crossover is inspired by the exchange of genetic material occurring in sexual reproduction in biology. The creation of offspring continues iteratively until a specified number of offspring have been produced in a generation, and further until a specified number of generations have been created. The resulting offspring at the end of this process (an equation or a computer program) is the solution of the problem. GP thus transforms one population of individuals into another in an iterative manner by applying genetic operations such as reproduction, mutation and crossover. Figure 1 shows the general flowchart of GP as given by [5].
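The tournament procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in [5]: the function names (`tournament_step`, `swap_crossover`, `gaussian_mutation`, `run`) are hypothetical, and, to keep the sketch short, each "program" is a toy stand-in, a pair (a, b) encoding the equation y = a*x + b, rather than a full parse tree.

```python
import random

def tournament_step(population, fitness, crossover, mutate, rng):
    """One tournament: 4 random programs compete, 2 winners breed, 2 losers are replaced."""
    # Pick four distinct population slots at random.
    idx = rng.sample(range(len(population)), 4)
    # Rank the contestants by error (lower is better).
    idx.sort(key=lambda i: fitness(population[i]))
    win_a, win_b, lose_a, lose_b = idx
    # Transform copies of the winners into two children via crossover and mutation.
    child_a, child_b = crossover(population[win_a], population[win_b], rng)
    # The children replace the two losers; the winners stay in the population.
    population[lose_a] = mutate(child_a, rng)
    population[lose_b] = mutate(child_b, rng)

# ASSUMPTION: toy "program" = pair (a, b) standing for the equation y = a*x + b.
DATA = [(x, 2.0 * x + 1.0) for x in range(-5, 6)]  # samples of the target y = 2x + 1

def error(prog):
    """Sum of squared errors of the encoded equation on DATA."""
    a, b = prog
    return sum((a * x + b - y) ** 2 for x, y in DATA)

def swap_crossover(p, q, rng):
    # Exchange the second component between the two parents (tuples are immutable,
    # so the parents themselves are left unchanged).
    return (p[0], q[1]), (q[0], p[1])

def gaussian_mutation(prog, rng, rate=0.5, scale=0.3):
    # Perturb each component with probability `rate`.
    return tuple(v + rng.gauss(0.0, scale) if rng.random() < rate else v for v in prog)

def run(pop_size=40, tournaments=2000, seed=0):
    rng = random.Random(seed)
    population = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(pop_size)]
    start = min(error(p) for p in population)  # best error before evolution
    for _ in range(tournaments):
        tournament_step(population, error, swap_crossover, gaussian_mutation, rng)
    best = min(population, key=error)
    return start, best, len(population)
```

Calling `run()` returns the best initial error, the best surviving pair and the population size. Because the winners of a tournament are never overwritten, the best error in the population cannot get worse from one tournament to the next.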

Tree-based GP corresponds to the expressions (syntax trees) of a 'functional programming language' [5]. In this type, functions are located at the inner nodes, while the leaves of the tree hold input values and constants. A population of random trees representing the programs is initially constructed, and genetic operations are performed on these trees to generate new individuals with the help of two distinct sets: the terminal set T and the function set F.
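As an illustration of such a syntax tree, the following Python sketch stores a program as nested tuples, with function names at the inner nodes and inputs or constants at the leaves. The encoding, the three-operator function set `F` and the helper names `evaluate` and `to_infix` are assumptions made for this example only.

```python
import operator

# ASSUMPTION: a small function set F of binary operators, chosen for illustration.
F = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, inputs):
    """Evaluate a syntax tree for a dict of input values."""
    if isinstance(tree, tuple):      # inner node: (function name, left child, right child)
        name, left, right = tree
        return F[name](evaluate(left, inputs), evaluate(right, inputs))
    if isinstance(tree, str):        # leaf holding an input variable
        return inputs[tree]
    return tree                      # leaf holding a constant

def to_infix(tree):
    """Render the tree as the symbolic equation it represents."""
    if isinstance(tree, tuple):
        name, left, right = tree
        return f"({to_infix(left)} {name} {to_infix(right)})"
    return str(tree)

# Tree for the expression x*x + 3*y.
tree = ("+", ("*", "x", "x"), ("*", 3.0, "y"))
```

Here `evaluate(tree, {"x": 2.0, "y": 1.0})` gives `7.0`, and `to_infix(tree)` gives the string `((x * x) + (3.0 * y))`, which is the kind of symbolic equation GP returns as its solution.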

**Population:** These are the programs, initially constructed from the data sets in the form of trees, on which genetic operations are performed using the terminal set and the function set. The function set for a run comprises the operators to be used in evolving programs, e.g. addition, subtraction, absolute value, logarithm and square root. The terminal set for a run is made up of the values on which the function set operates. There can be four types of terminals, namely inputs, constants, temporary variables and conditional flags. The population size is the number of programs in the population to be evolved. A larger population can generally solve more complicated problems. The maximum population size depends upon the RAM of the computer and the length of the programs in the population.
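A possible way to build such an initial population is sketched below in Python. It is a simplified illustration, not a prescribed procedure: the names `FUNCTION_SET`, `TERMINAL_SET`, `random_tree` and `initial_population`, the 0.3 probability of stopping early at a terminal, and the depth limit of 4 are all assumptions. Only two of the four terminal types (inputs and constants) are used, and the square root and logarithm are "protected" so that every random program returns a finite value.

```python
import math
import random

# Function set: name -> (arity, callable). Operators follow the examples in the text.
FUNCTION_SET = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "abs": (1, abs),
    "sqrt": (1, lambda a: math.sqrt(abs(a))),                    # protected square root
    "log": (1, lambda a: math.log(abs(a)) if a != 0 else 0.0),   # protected logarithm
}
# Terminal set: two inputs (strings) and two constants (floats).
TERMINAL_SET = ["x0", "x1", 1.0, 2.0]

def random_tree(rng, max_depth):
    """Grow a random tree: terminals at the leaves, functions at the inner nodes."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINAL_SET)
    name = rng.choice(sorted(FUNCTION_SET))
    arity, _ = FUNCTION_SET[name]
    return (name,) + tuple(random_tree(rng, max_depth - 1) for _ in range(arity))

def depth(tree):
    """Number of function levels above the deepest leaf."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(child) for child in tree[1:])

def evaluate(tree, inputs):
    """Evaluate a tree for a dict mapping input names to values."""
    if isinstance(tree, tuple):
        _, func = FUNCTION_SET[tree[0]]
        return func(*(evaluate(child, inputs) for child in tree[1:]))
    return inputs[tree] if isinstance(tree, str) else tree

def initial_population(size, max_depth=4, seed=0):
    """Create `size` random programs; `size` is the population size."""
    rng = random.Random(seed)
    return [random_tree(rng, max_depth) for _ in range(size)]
```

For example, `initial_population(50)` returns 50 random trees, none deeper than 4 levels. Memory use grows with both the population size and the depth limit, which mirrors the dependence on RAM and program length noted above.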

Genetic Programming: A Novel Computing Approach in Modeling Water Flows 205

The second variant of GP is Linear Genetic Programming (LGP), which uses a specific linear representation of computer programs. The name 'linear' refers only to the structure of the (imperative) program representation and does not stand for functional genetic programs that are restricted to a linear list of nodes. On the contrary, it usually represents highly nonlinear solutions. Each individual (program) in LGP is represented by a variable-length sequence of simple C language instructions, which operate on registers or constants from predefined sets. The function set of the system can be composed of arithmetic operations (+, −, ×, /), conditional branches, and function calls (f ∈ {x, x^n, sqrt, e^x, sin, cos, tan, log, ln}). Each function implicitly includes an assignment to a variable, which facilitates the use of multiple program outputs in LGP. LGP utilizes two-point string crossover: a segment of random position and random length is selected from each parent, and the two segments are exchanged. If one of the resulting children exceeds the maximum length, this crossover is abandoned and restarted by exchanging equalized segments. Mutation changes an operand or operator of an instruction into another symbol from the same set. Readers are referred to [7] and [8] for further details.

Gene-Expression Programming (GEP) is an extension of GP, developed by [5]. The genome is encoded as linear chromosomes of fixed length, as in the Genetic Algorithm (GA); however, in GEP the genes are then expressed as a phenotype in the form of expression trees. GEP combines the advantages of both its predecessors, GA and GP, and removes their limitations. GEP is a full-fledged genotype/phenotype system in which the two are dealt with separately, whereas GP is a simple replicator system. As a consequence of this difference, the complete genotype/phenotype GEP system is reported to surpass the older GP system by a factor of 100 to 60,000. In GEP, just as in other evolutionary methods, the process starts with the random generation of an initial population consisting of individual chromosomes of fixed length. The chromosomes may contain one or more genes. Each individual chromosome in the initial population is then expressed, and its fitness is evaluated using one of the fitness function equations available in the literature. The chromosomes are then selected based on their fitness values using a roulette wheel selection process, so that fitter chromosomes have greater chances of selection for passage to the next generation. After selection, they are reproduced with some modifications performed by the genetic operators. In GEP, genetic operators such as mutation, inversion, transposition and recombination are used for these modifications. Mutation is the most efficient genetic operator, and it is sometimes used as the only means of modification. The new individuals are then subjected to the same process of modification, and the process continues until the maximum number of generations is reached or the required accuracy is achieved.

**5. Why use GP in modeling water flows?**

It is well known that many variables in the domain of Hydraulic Engineering are random in nature and are governed by complex underlying phenomena. For example, the generation
