
144 Genetic Programming – New Approaches and Successful Applications

Most of the existing literature on hedging a target contract using other exchange-traded options focuses on static strategies, motivated at least in part by the desire to avoid the high costs of frequent trading. The goal of static hedging is to construct a buy-and-hold portfolio of exchange-traded claims that perfectly replicates the payoff of a given over-the-counter product [24,25]. The static hedging strategy does not require any rebalancing and therefore does not incur significant transaction costs. Unfortunately, the odds of constructing a perfect static hedge for a given over-the-counter claim are small, given the limited number of exchange-listed option contracts with sufficient trading volume. In other words, a static hedge can only be efficient if traded options are available with sufficiently similar maturity and moneyness to the over-the-counter product that has to be hedged. Under stochastic volatility, a perfect hedge can in principle be constructed with a dynamically rebalanced portfolio consisting of the underlying and one additional option. In practice, the dynamic replication strategy for European options will only be perfect if all of the assumptions underlying the Black-Scholes formula hold. For general contingent claims on a stock, under market frictions, the delta might still be used as a first-order approximation to set up a riskless portfolio. However, if the volatility of the underlying stock varies stochastically, then delta hedging might fail severely. A simple method to limit the volatility risk is to consider the volatility sensitivity (vega) of the contract. The portfolio then has to be rebalanced frequently to ensure delta-vega neutrality. With transaction costs, frequent rebalancing might result in considerable losses. In practice, investors can rebalance their portfolios only at discrete intervals of time to reduce transaction costs.

Non-parametric hedging strategies have been proposed as an alternative to the existing parametric model-based strategies [26,27]. Those studies estimated pricing formulas by non-parametric or semi-parametric statistical methods such as neural networks and kernel regression, and measured their performance in terms of delta-hedging. Few studies, however, have focused on dynamic hedging using genetic programming. Chen et al. [28] applied genetic programming to price and hedge S&P 500 index options. Distinguishing the in-the-money case from the out-of-the-money case, the performance of genetic programming is compared with the Black-Scholes model in terms of hedging accuracy. Based on the post-sample performance, genetic programming has a lower tracking error than the Black-Scholes formula in approximately 20% of the 97 test paths.

**3. Research design and methodology**

Based on the literature survey, one can conclude that genetic programming can be used to efficiently forecast volatility and implement accurate dynamic hedging strategies, which opens up an alternative path besides other data-based approaches.

Accurate volatility forecasting is an essential element of good dynamic hedging strategies. The first thrust of this paper deals with the generation of implied volatility from option markets using static and dynamic training of genetic programming, respectively. While static training [8] is characterized by training the genetic programming independently on a single sub-sample, dynamic training allows the genetic programming to learn across all training sub-samples by changing the sub-sample during the run.

Data used in this study consist of daily prices for the European-style S&P 500 index call and put options traded on the Chicago Board Options Exchange from 02 January to 29 August 2003. The database includes the time of the quote, the expiration date, the exercise price and the daily bid and ask quotes for call and put options. Similar information for the underlying S&P 500 index is also available on a daily basis. S&P 500 index options are among the most actively traded financial derivatives in the world. The minimum tick for series trading below 3 is 1/16 and for all other series 1/8. Strike price intervals are 5 points, and 25 points for far months. The expiration months are three near-term months followed by three additional months from the March quarterly cycle (March, June, September, and December). Following standard practice, we use the average of an option's bid and ask prices as a stand-in for the market value of the option. The risk-free interest rate is approximated using 3-month US Treasury bill rates. It is assumed that there are no transaction costs and no dividends.

<sup>1</sup> The GP system is built around the Evolving Objects (EO) library, an ANSI C++ evolutionary computation framework.

To reduce the likelihood of errors, data screening procedures are used [29,30]. We apply four exclusion filters to construct the final option sample. First, as implied volatilities of short-term options are very sensitive to small errors in the option price and may convey liquidity-related biases, options with time to maturity of less than 10 days are excluded. Second, options with low quotes are eliminated to mitigate the impact of price discreteness on option valuation. Third, deep-in-the-money and deep-out-of-the-money option prices are also excluded due to the lack of trading volume. Finally, option prices not satisfying the arbitrage restriction [31], $C \geq S - Ke^{-r\tau}$, are not included.
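As a sketch, the four filters above might be applied as follows. The 10-day maturity rule and the no-arbitrage lower bound come from the text; the low-quote cutoff and the moneyness bounds that define "deep" options are illustrative assumptions.

```cpp
#include <cmath>

// One daily option record; fields mirror the data described in the text.
struct OptionQuote {
    double call_mid;   // mid of the bid and ask quotes
    double S;          // underlying index level
    double K;          // strike price
    double tau_days;   // time to maturity in days
    double r;          // risk-free rate (annualized)
};

bool passes_filters(const OptionQuote& q) {
    double tau_years = q.tau_days / 365.0;
    if (q.tau_days < 10.0) return false;    // 1. short-maturity filter (from the text)
    if (q.call_mid < 0.125) return false;   // 2. low-quote filter (assumed cutoff)
    double m = q.S / q.K;
    if (m < 0.80 || m > 1.20) return false; // 3. deep ITM/OTM filter (assumed bounds)
    // 4. arbitrage lower bound C >= S - K*exp(-r*tau) (from the text)
    if (q.call_mid < q.S - q.K * std::exp(-q.r * tau_years)) return false;
    return true;
}
```

Applying `passes_filters` to every daily record and keeping the survivors yields the final sample.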

Dynamic Hedging Using Generated Genetic Programming Implied Volatility Models 147


The final sample contains 6670 daily option quotes, with at-the-money (ATM), in-the-money (ITM) and out-of-the-money (OTM) options respectively taking up 37%, 34% and 29% of the total sample.

**3.1. Data division schemes**

In this paper, two data division schemes are used. The full sample is sorted first by time series (TS) and second by moneyness-time to maturity (MTM). For time series, the data are divided into 10 successive samples (S1, S2…S10), each containing 667 daily observations. The first nine samples are used as training sub-samples. For moneyness-time to maturity, the data are divided into nine classes with respect to moneyness and time-to-maturity criteria. According to the moneyness criterion, a call option is out-of-the-money (OTM) if $S/K < 0.98$; at-the-money (ATM) if $S/K \in [0.98, 1.03]$; and in-the-money (ITM) if $S/K > 1.03$. According to the time-to-maturity criterion, a call option is short-term (ST) if $\tau < 60$ days; medium-term (MT) if $\tau \in [60, 180]$ days; and long-term (LT) if $\tau > 180$ days. Each class Ci is divided into a training set CiL and a test set CiT, which produces nine training and nine test MTM sub-classes, respectively. Figure 2 illustrates the two division schemes.

**Figure 2.** Data division schemes
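The moneyness-maturity classification above can be sketched as follows; the "moneyness-term" label format is an illustrative choice.

```cpp
#include <string>

// Map an option to one of the nine MTM classes of Section 3.1.
// Moneyness: OTM if S/K < 0.98, ATM if 0.98 <= S/K <= 1.03, ITM otherwise.
// Maturity:  ST if tau < 60 days, MT if 60 <= tau <= 180, LT otherwise.
std::string mtm_class(double S, double K, double tau_days) {
    double m = S / K;
    std::string money = (m < 0.98) ? "OTM" : (m <= 1.03 ? "ATM" : "ITM");
    std::string term  = (tau_days < 60.0) ? "ST" : (tau_days <= 180.0 ? "MT" : "LT");
    return money + "-" + term;
}
```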

**3.2. Implied volatility forecasting using genetic programming**

This subsection describes the design of genetic programming and the experiments performed using the genetic programming method to forecast implied volatility. In the first experiment, genetic programming is trained using a static training-subset selection method; in the second, dynamic training-subset selection methods are used. We also describe the training and test samples used in these experiments.

*3.2.1. The design of genetic programming*


Our genetic programming software, written in C++, performs symbolic regression: it is designed to find a function that relates a set of inputs to an output without making any assumptions about the structure of that function. Symbolic regression was one of the earliest applications of genetic programming [3] and has continued to be widely studied [32-35]. The following pseudo code describes the genetic programming algorithm structure used in this paper.


*Begin*
*Initialize population*
*While (termination condition not satisfied) do*
*Evaluate the performance of each individual according to the fitness criterion*
*Until the offspring population is fully populated do*
*- Select individuals in the population using the selection algorithm*
*- Perform crossover and mutation operations on the selected individuals*
*- Insert new individuals in the offspring population*
*Replace the existing population by the new population*
*End while*
*Report the best solution found*
*End*

**Algorithm 1** Pseudo code of genetic programming

The genetic programming algorithm consists of the following steps: node definition, initialization, fitness evaluation, selection, genetic operators (crossover and mutation) and termination condition.

*Nodes Definition*: The nodes in the tree structure of genetic programming can be classified into terminal (leaf) nodes and function (non-terminal) nodes. The terminal and function sets used are described in Table 1.

The terminal set includes the input variables, notably the option price divided by the strike price ($C/K$ for calls and $P/K$ for puts), the index price divided by the strike price ($S/K$) and the time to maturity ($\tau$). The function set includes unary and binary nodes. Unary nodes consist of mathematical functions, notably the cosine function (cos), the sine function (sin), the log function (ln), the exponential function (exp), the square root function (sqrt) and the normal cumulative distribution function (Ncdf). Binary nodes consist of the four basic mathematical operators: addition (+), subtraction (−), multiplication (*) and division (%). The division operation is protected against division by zero, and the log and square root functions are protected against negative arguments.



| Expression | Definition |
|---|---|
| **Terminal set** | |
| C/K | Call price / strike price |
| P/K | Put price / strike price |
| S/K | Index price / strike price |
| τ | Time to maturity |
| **Function set** | |
| + (plus) | Addition |
| − (minus) | Subtraction |
| * (multiply) | Multiplication |
| % (divide) | Protected division: x % y = 1 if y = 0; x % y = x / y otherwise |
| ln | Protected natural log: ln(x) = ln\|x\| |
| exp | Exponential function: exp(x) = e^x |
| sqrt | Protected square root: sqrt(x) = √\|x\| |
| cos | Cosine function |
| sin | Sine function |
| Ncdf | Normal cumulative distribution function |

**Table 1.** Terminal set and function set

Individuals are encoded as LISP S-expressions which can also be depicted as a parse tree. The search space for genetic programming is the space of all possible parse trees that can be recursively created from the terminal and function sets.

**Figure 3.** Example of a tree structure for GP and the corresponding functions

*Initialization*: The genetic programming volatility models are initialized using the ramped half-and-half method [3]. This method involves generating an equal number of trees for each maximum initial depth ranging from 2 to 6, as specified in Table 2. For each depth level, 50% of the initial trees are generated via the full method and the other 50% via the grow method. In the full method, the initial trees have the property that every path from root to leaf is of maximum depth. In the grow method, initial trees can be of various depths, subject to the constraint that they do not exceed the maximum depth.
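A minimal sketch of ramped half-and-half initialization follows; the node representation, the restriction to binary functions, and the convention that the root is always a function node are illustrative assumptions, not the chapter's implementation.

```cpp
#include <algorithm>
#include <cstdlib>
#include <memory>
#include <string>
#include <vector>

// Minimal expression-tree node (assumed representation).
struct Node {
    std::string symbol;
    std::vector<std::unique_ptr<Node>> children;
};

const std::vector<std::string> kTerminals = {"C/K", "S/K", "tau"};
const std::vector<std::string> kFunctions = {"+", "-", "*", "%"};  // all binary here

// Full method: terminals only at max_depth. Grow method: terminals may
// appear earlier (the root is always a function node in this sketch).
std::unique_ptr<Node> random_tree(int depth, int max_depth, bool full) {
    auto node = std::make_unique<Node>();
    bool choose_terminal =
        depth == max_depth || (!full && depth > 0 && std::rand() % 2 == 0);
    if (choose_terminal) {
        node->symbol = kTerminals[std::rand() % kTerminals.size()];
    } else {
        node->symbol = kFunctions[std::rand() % kFunctions.size()];
        node->children.push_back(random_tree(depth + 1, max_depth, full));
        node->children.push_back(random_tree(depth + 1, max_depth, full));
    }
    return node;
}

int tree_depth(const Node& n) {
    int d = 0;
    for (const auto& c : n.children) d = std::max(d, tree_depth(*c));
    return d + 1;  // a single leaf has depth 1 under this convention
}

// Ramped half and half: equal numbers of trees per maximum depth in [2, 6],
// half built with the full method and half with the grow method.
std::vector<std::unique_ptr<Node>> ramped_half_and_half(int per_depth) {
    std::vector<std::unique_ptr<Node>> pop;
    for (int max_depth = 2; max_depth <= 6; ++max_depth)
        for (int i = 0; i < per_depth; ++i)
            pop.push_back(random_tree(0, max_depth, i < per_depth / 2));
    return pop;
}
```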

*Fitness function:* The fitness function assigned to a particular individual in the population must reflect how closely the output of the individual program comes to the target function. In this paper, the Black-Scholes implied volatility $\sigma_t^{BS}$ is used as the target output. It is defined as the standard deviation which equates the Black-Scholes price $C_{BS}$<sup>2</sup> to the market option price $C_t^*$ [36]:

$$\exists !\; \sigma_t^{BS}(K,T) > 0, \qquad C_{BS}\left(S_t, K, \tau, \sigma_t^{BS}(K,T)\right) = C_t^*(K,T) \tag{1}$$
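Equation (1) has a unique solution because the Black-Scholes call price is strictly increasing in volatility, so the implied volatility can be recovered by simple bisection. The following is a sketch; the function names and the bracketing interval are our own.

```cpp
#include <cmath>

// Standard normal CDF via the complementary error function.
double norm_cdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

// Black-Scholes price of a European call (no dividends).
double bs_call(double S, double K, double r, double tau, double sigma) {
    double d1 = (std::log(S / K) + (r + 0.5 * sigma * sigma) * tau)
                / (sigma * std::sqrt(tau));
    double d2 = d1 - sigma * std::sqrt(tau);
    return S * norm_cdf(d1) - K * std::exp(-r * tau) * norm_cdf(d2);
}

// Solve equation (1) for sigma by bisection; the price is monotone in
// sigma, so the root is unique. [1e-6, 5.0] is an assumed bracket.
double implied_vol(double S, double K, double r, double tau, double c_mkt) {
    double lo = 1e-6, hi = 5.0;
    for (int i = 0; i < 100; ++i) {
        double mid = 0.5 * (lo + hi);
        (bs_call(S, K, r, tau, mid) < c_mkt ? lo : hi) = mid;
    }
    return 0.5 * (lo + hi);
}
```

Pricing an option at a known volatility and inverting the price recovers that volatility to high accuracy.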

The generated genetic programming trees provide at each time $t$ the forecast value $\hat{\sigma}_t$, and the fitness function used to measure the accuracy of the forecast is the mean squared error (MSE) between the target ($\sigma_t^{BS}$) and forecasted ($\hat{\sigma}_t$) output volatility, computed as follows:

$$MSE = \frac{1}{N}\sum_{t=1}^{N}\left(\sigma_t^{BS} - \hat{\sigma}_t\right)^2 \tag{2}$$

where $N$ is the number of data samples.


*Selection:* Based on the fitness criterion, the selection of the individuals for reproduction is done with the tournament selection algorithm. A group of individuals is selected from the population with a uniform random probability distribution. The fitness values of each member of this group are compared and the actual best is selected. The size of the group is given by the tournament size which is equal to 4, as indicated in Table 2.
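Assuming each individual's fitness is stored as its MSE, tournament selection with the tournament size of 4 used here can be sketched as:

```cpp
#include <cstdlib>
#include <vector>

// Tournament selection: draw `tournament_size` individuals uniformly at
// random and return the index of the fittest (lowest MSE). Individuals
// themselves are abstracted away; only their fitness values appear.
int tournament_select(const std::vector<double>& mse, int tournament_size) {
    int best = std::rand() % mse.size();
    for (int i = 1; i < tournament_size; ++i) {
        int candidate = std::rand() % mse.size();
        if (mse[candidate] < mse[best]) best = candidate;
    }
    return best;
}
```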

*Genetic operators:* Crossover and mutation are the two basic operators which are applied to the selected individuals in order to generate new individuals for the next generation. As described in Figure 4, the subtree crossover creates new offspring trees from two selected parents by exchanging their sub-trees. As indicated in Table 2, the crossover operator is used to generate about 60% of the individuals in the population. The maximum tree size (measured by depth) allowed after the crossover is 17. This is a popular number used to limit the size of tree [3]. It is large enough to accommodate complicated formulas and works in practice.

<sup>2</sup> The Black-Scholes call price is given by

$$C_{BS} = S\,N(d_1) - Ke^{-r\tau}N(d_2), \qquad d_1 = \frac{\ln\left(\frac{S}{K}\right) + \left(r + 0.5\sigma^2\right)\tau}{\sigma\sqrt{\tau}}, \qquad d_2 = d_1 - \sigma\sqrt{\tau}$$


**Figure 4.** Example of subtree crossover

The mutation operator randomly changes a tree by randomly altering nodes or sub-trees to create a new offspring. Often multiple types of mutation are beneficially used simultaneously [37,38]. In this paper, three mutation operators are used simultaneously, they are described below:

Branch (or subtree) mutation operator randomly selects an internal node in the tree, and then it replaces the subtree rooted at that node with a new randomly-generated subtree [3].

**Figure 5.** Example of subtree mutation

Point mutation operator consists of replacing a single node in a tree with another randomly-generated node of the same arity [39].

**Figure 6.** Example of point mutation

Expansion mutation operator randomly selects a terminal node in the tree, and then replaces it with a new randomly-generated subtree.

**Figure 7.** Example of expansion mutation

As indicated in Table 2, branch mutation is applied with a rate of 20%; point and expansion mutations are each applied with a rate of 10%.

*Replacement:* The method of replacing parents for the next generation is comma replacement strategy [40], which selects the best offspring to replace the parents. It assumes that offspring size is higher than parents' size. If µ is the population size and λ is the number of the new individuals (which can be larger than µ), the population is constructed using the best µ out of the λ new individuals.
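The (µ, λ) comma replacement described above can be sketched as follows, with individuals again represented only by their fitness (lower MSE is better):

```cpp
#include <algorithm>
#include <vector>

// Comma replacement: the next population is the best mu of the lambda
// offspring; parents are always discarded (requires lambda >= mu).
std::vector<double> comma_replacement(std::vector<double> offspring_mse,
                                      std::size_t mu) {
    std::partial_sort(offspring_mse.begin(), offspring_mse.begin() + mu,
                      offspring_mse.end());   // best mu moved to the front
    offspring_mse.resize(mu);
    return offspring_mse;
}
```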

*Termination criterion:* The stopping criterion is the maximum number of generations, fixed at 400 and 1000 for static and dynamic training-subset selection, respectively. In the dynamic training-subset selection approach, the maximum number of generations is increased to allow the genetic programming to train on the maximum number of samples simultaneously. The number of generations between sample changes varies between 20 and 100.


The implementation of genetic programming involves a series of trial and error experiments to determine the optimal set of genetic parameters which is listed in Table 2. By varying genetic parameters, each program is run ten times with ten different random seeds. The choice of the best genetic program is made according to the mean and median of Mean Squared Errors (MSE) for training and testing sets.
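The model-selection rule above — summarizing the ten seeded runs of each parameter setting by the mean and median of their MSEs — can be sketched as:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Mean of the MSEs obtained across seeded runs.
double mean(std::vector<double> v) {
    double s = 0.0;
    for (double x : v) s += x;
    return s / v.size();
}

// Median of the MSEs (average of the two middle values for even counts).
double median(std::vector<double> v) {
    std::sort(v.begin(), v.end());
    std::size_t n = v.size();
    return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}
```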


| Parameter | Value |
|---|---|
| Population size | — |
| Offspring size | — |
| Tournament size | 4 |
| Crossover probability | 60% |
| Mutation probability | 40% |
| Branch mutation | 20% |
| Point mutation | 10% |
| Expansion mutation | 10% |
| Maximum number of generations (static method) | 400 |
| Maximum number of generations (dynamic method) | 1000 |
| Generations' number to change sample | 20-100 |
| Maximum initial depth | 2-6 |
| Maximum depth of the tree | 17 |
| Maximum depth of new individual | — |

**Table 2.** Summary of genetic programming parameters

*3.2.2. Dynamic training-subset selection method*

As the data are divided into several sub-samples, the genetic programming is first trained independently on each sub-sample relative to each data division scheme (Algorithm 1). This approach is called the static training-subset selection method [8]. Second, the genetic programming is trained simultaneously on all data sub-samples relative to each data division scheme, rather than on just a single subset, by changing the training sub-sample during the run. This approach is called the dynamic training-subset selection method. Its main goal is to make genetic programming adaptive to all training samples and able to generate general models and solutions that are more robust across different learning data samples.

In the context of evolutionary algorithms, there are at least three approaches for defining the frequency of resampling [41]. The first approach, called "individual-wise", consists of extracting a new sample of data instances from the training set for each individual of the population. As a result, different individuals will probably be evaluated on different data samples, which casts some doubt on the fairness of the selection procedure of the evolutionary algorithm. The second approach, called "run-wise", consists of extracting a single fixed sample of data instances from the training set to evaluate the fitness of all individuals throughout the evolutionary run, which will probably reduce significantly the robustness and predictive accuracy of the evolutionary algorithm. The third approach, called "generation-wise", consists of extracting a single fixed sample of data instances from the training set at each generation; all individuals of that generation have their fitness evaluated on that data sample. This method avoids the disadvantages of the two previous approaches and as such seems more effective. In particular, an individual will only survive for several generations if it has good predictive accuracy across different data samples.

The dynamic approach proposed in this study differs from the three previous approaches in that it does not extract a fixed sample of data instances from the training set, but selects a sample from the whole set of sub-samples which are already built up, and uses it to evaluate the fitness of all individuals whenever the number of generations to change sample is reached. In this paper, we propose four dynamic training-subset selection methods: the *Random Subset Selection* method (*RSS*), the *Sequential Subset Selection* method (*SSS*), the *Adaptive-Sequential Subset Selection* method (*ASSS*) and the *Adaptive-Random Subset Selection* method (*ARSS*). The *RSS* and *SSS* allow the genetic programming to learn on all training samples in turn (*SSS*) or randomly (*RSS*). With these methods, however, there is no certainty that genetic programming will focus on the samples which are difficult to learn. The *ASSS* and *ARSS*, which are variants of *adaptive subset selection* (*ASS*), are therefore introduced to focus the genetic programming's attention on the difficult samples, i.e. those having the greatest MSE, and thus to improve the learning algorithm.

Dynamic subset selection is easily added to the basic GP algorithm with no additional computational cost compared to the static subset selection.

Let S be the set of training samples Si (i = 1…k), where k is the total number of samples. A selection probability P(Si) is allocated to each sample Si from S. The training sample Si is changed every g generations (g is the number of generations to change sample) according to this selection probability and the dynamic training-subset selection method used. Once a new training sample is selected, the best individuals are used as the population for the next training sample. This procedure is repeated until the maximum number of generations is reached. This permits genetic programming to adapt its generating process to changing data in response to feedback from the fitness function, which is the mean squared error computed as in the static approach. By the end of the evolution, only individuals with the desirable characteristics, well adapted to the environmental changes, will survive.
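The procedure above can be sketched as a training loop; `evolve` and `select_sample` are hypothetical stand-ins for the GP generation step and for whichever selection rule (RSS, SSS, ASSS or ARSS) is in use, and the MSE fitness is reduced to a toy helper:

```python
# Minimal sketch of the dynamic training-subset selection loop (assumed
# helper names; not the authors' implementation).

def mse(individual, sample):
    # Placeholder fitness: mean squared error of an individual on a sample
    # of (input, target) pairs.
    return sum((individual(x) - y) ** 2 for x, y in sample) / len(sample)

def dynamic_training(samples, population, evolve, g, T, select_sample):
    """Train on the pre-built sub-samples, switching every g generations."""
    current = select_sample(samples)
    for t in range(T):
        if t > 0 and t % g == 0:
            # Change the training sample; the best individuals found so far
            # become the starting population for the next training sample.
            population = sorted(population, key=lambda ind: mse(ind, current))
            current = select_sample(samples)
        population = evolve(population, current)
    return min(population, key=lambda ind: mse(ind, current))
```

The sketch only shows the bookkeeping of the sample switch; the GP-specific parts (tree representation, crossover, mutation) are abstracted behind `evolve`.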

The implementation of genetic programming involves a series of trial and error experiments to determine the optimal set of genetic parameters, which is listed in Table 2. By varying the genetic parameters, each program is run ten times with ten different random seeds. The choice of the best genetic program is made according to the mean and median of the Mean Squared Errors (MSE) for the training and testing sets.

| Parameter | Value |
|---|---|
| Population size | 100 |
| Offspring size | 200 |
| Maximum number of generations for static method | 400 |
| Maximum number of generations for dynamic method | 1000 |
| Generations' number to change sample | 20-100 |
| Maximum depth of new individual | 6 |
| Maximum depth of the tree | 17 |
| Tournament size | 4 |
| Crossover probability | 60% |
| Mutation probability | 40% |
| Branch mutation | 20% |
| Point mutation | 10% |
| Expansion mutation | 10% |

**Table 2.** Summary of genetic programming parameters

*3.2.2. Dynamic training-subset selection method*

In the dynamic training-subset selection approach, the maximum number of generations is increased to allow the genetic programming to train on the maximum number of samples simultaneously. The number of generations to change sample is varied between 20 and 100 generations.

a. Random training-Subset Selection method (RSS):

> It selects the training samples randomly, with replacement. Every g generations, all the samples from S have the same probability of being selected as the current training sample: P(Si) = 1/k, 1 ≤ i ≤ k. This method differs from that proposed by Gathercole and Ross [42] in that the random selection concerns training samples which are already constructed according to the data division scheme, rather than data instances.

> As the selection of training samples is random, the performance of the current population changes with the training sample used for evolving the genetic program. Figure 8 illustrates an example of the best-fitness (MSE) curve along the evolution using the RSS method. With each sample change, the MSE may increase, but it improves during the following generations, as the population adapts itself to the new environment.

Figure 8 shows that some training samples could be duplicated, while some others could be eliminated.
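Assuming the pre-built sub-samples are held in a simple list, the RSS rule amounts to a uniform draw with replacement (a sketch, not the authors' code):

```python
import random

# RSS: every g generations, a training sample is drawn uniformly at random,
# with replacement, from the k pre-built sub-samples, so P(Si) = 1/k.

def rss_select(samples, rng=random):
    return rng.choice(samples)
```

Because the draw is with replacement, repeated switches may pick the same sample several times while never picking others, which is exactly the duplication/elimination behaviour visible in Figure 8.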

Dynamic Hedging Using Generated Genetic Programming Implied Volatility Models 155

**Figure 8.** Example of fitness curve of the best individuals generated by genetic programming using RSS method for time series samples

b. Sequential training-Subset Selection method (SSS):

It selects all the training samples in order. If, at generation g-1, the current training sample is Si, then at generation g: P(Sj) = 1, with j = i+1 if i < k, or j = 1 if i = k.

As illustrated in Figure 9, all the learning subsets are used during the evolution in an iterative way.

**Figure 9.** Example of fitness curve of the best individuals generated by genetic programming using SSS method for moneyness-time to maturity classes

c. Adaptive training-Subset Selection method (ASS):

Instead of selecting a training subset in a random or sequential way, one can use an adaptive approach to dynamically select the difficult training subsets, which are frequently misclassified. This approach is inspired by the dynamic subset selection method proposed by Gathercole and Ross [42], which is based on the idea of dynamically selecting instances, not training samples, which are difficult and/or have not been selected for several generations. Selection is made according to a weight computed proportionally to the sample's average fitness. Every g generations, the weights are updated as follows:

$$W(S_i) = \frac{\sum_{t=1}^{g}\sum_{j=1}^{M} f(X_j)}{M \times g} \tag{3}$$

Where, $M$ is the size of $S_i$ ($X_j \in S_i$), $g$ is the number of generations to change sample, and $f(X_j)$ is the MSE of the individual $X_j$.

Every g generations, the training samples are re-ordered, so that the most difficult training samples, which have higher weights, are moved to the beginning of the ordered training list, and the easiest training samples, which have smaller weights, are moved to the end of the ordered training list.

1. Adaptive-Sequential training-Subset Selection method (ASSS):

It uses the following procedure (Step 1 to Step 3):

**Step 1.** Let the first generation t be set to 0. Each training sample is assigned an equal weight, i.e., W(Si) = 1 for 1 ≤ i ≤ k.

**Step 2.** The probability P(Si) that a training sample Si is selected to be included in the training set and evolve genetic programming is determined using the Roulette wheel selection scheme:

$$P(S_i) = \frac{W(S_i)}{\sum_i W(S_i)}$$

Where, the summation is over all training samples. Moreover, the probability P(Si) is positively related to the fitness of the parse trees generated on the corresponding training sample:

$$P(S_i) = \frac{\bar{f}(S_i)}{\sum_i \bar{f}(S_i)}$$

Where, $\bar{f}(S_i)$ is the average fitness of the individuals relative to the training sample. Compute the fitness function, which is the mean squared error, for each individual in the training sample and then the average fitness. Update the weights according to Eq. (3).

**Step 3.** t = t + g. If t < T (T is the total number of generations), then go to Step 2.

As illustrated in Figure 10, selection of training samples is made in order for the first t generations using the SSS method. Some training samples could be duplicated to improve the genetic programming learning. Later, samples are selected for the next run according to the adaptive approach based on the re-ordering procedure.
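Under the stated assumptions (one recorded MSE per individual per generation while a sample is active), the weight update of Eq. (3), the roulette-wheel probabilities and the re-ordering step can be sketched as:

```python
# ASS/ASSS bookkeeping sketch. `f_values[t][j]` is assumed to hold the MSE
# of individual j at generation t while sample S_i was the training sample.

def update_weight(f_values, M, g):
    # Eq. (3): average MSE over the g generations and M individuals.
    total = sum(f for generation in f_values for f in generation)
    return total / (M * g)

def roulette_probabilities(weights):
    # P(S_i) = W(S_i) / sum_i W(S_i): high-MSE (difficult) samples are
    # selected more often.
    s = sum(weights)
    return [w / s for w in weights]

def reorder(samples, weights):
    # Most difficult samples (highest weight) move to the front of the list.
    return [s for _, s in sorted(zip(weights, samples), key=lambda t: -t[0])]
```

ARSS would differ only in initializing the weights with random values in [0, 1] instead of 1.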

**Figure 10.** Example of fitness curve of the best individuals generated by genetic programming using ASSS method for time series samples

2. Adaptive-Random training-Subset Selection method (ARSS):

The ARSS method uses the same procedure as the ASSS method, except that the initial weights are generated randomly at the start of the run, rather than initialized with a constant: for t = 0, W(Si) = Pi, with Pi ∈ [0,1], 1 ≤ i ≤ k. Then, for the first few generations, samples are selected using the RSS method. Afterwards, the selection of samples is made using the adaptive approach based on the re-ordering procedure.

**Figure 11.** Example of fitness curve of the best individuals generated by genetic programming using ARSS method for moneyness-time to maturity classes

*3.2.3. Training and test samples*

Different forecasting genetic programming volatility models are estimated from the training set and judged upon their performance on the test set. Table 3 summarizes the training and test data samples used for the static and dynamic training-subset selection methods, respectively.

In the static training-subset selection approach, first, the genetic program is trained separately on each of the first nine TS sub-samples (S1,…, S9) using ten different seeds and is tested on the subset data from the immediately following date (S2,…, S10). Second, using the same genetic parameters and random seeds applied for the TS data, the genetic programming is trained separately on each of the first nine MTM sub-classes (C1L,…, C9L) and is tested on the second nine MTM sub-classes (C1T,…, C9T).

| Subset Selection | Learning data sample | Test data sample |
|---|---|---|
| Static Subset Selection | TS samples Si (S1, …, S9) *(1 subset for a run)* | The successive TS sample Sj, j = i+1 |
| | MTM training samples CiL (C1L, …, C9L) *(1 subset for a run)* | The corresponding MTM test samples CiT |
| Dynamic Subset Selection (RSS/SSS/ASSS/ARSS) | TS samples S1, …, S9 *(9 subsets for a run)* | The last subset in the TS samples set (S10) |
| | MTM samples C1L, …, C9L *(9 subsets for a run)* | The nine MTM test samples (C1T + C2T + … + C9T) |
| | TS samples + MTM samples (S1, …, S9; C1L, …, C9L) *(18 subsets for a run)* | The last TS sample with the nine MTM test samples (S10 + C1T + C2T + … + C9T) |

**Table 3.** Definition of training and test data samples for static and dynamic training-subset selection methods

In dynamic training-subset selection approach, first, the genetic program is trained on the first nine TS sub-samples simultaneously (S1,…, S9) using ten different seeds and it is tested only on the tenth sub-sample data (S10). Second, the genetic programming is trained on the first nine MTM sub-classes simultaneously (C1L,…, C9L) and it is tested on the second nine MTM sub-classes regrouped in one test sample data (C1T + C2T …+ C9T). Third, the genetic programming is trained on both the nine TS sub-samples and the nine MTM sub-classes simultaneously (S1, …, S9 ; C1L, …, C9L ) and it is tested on one test sample data composed of the TS and MTM test data (S10 + C1T + C2T …+ C9T).

Based on the training and test MSE, the best generated genetic programming volatility models relative to static and dynamic training-subset selection methods respectively are selected. These models are then compared with each other according to the MSE total and the best ones are used to implement the dynamic hedging strategies as described in the following section.

#### **3.3. Dynamic hedging**


To assess the accuracy of the selected generated genetic programming volatility models in hedging with respect to the Black-Scholes model, three dynamic hedging strategies are employed, namely delta-neutral, delta-gamma-neutral and delta-vega-neutral strategies.

For delta hedging, at date zero, a delta hedge portfolio consisting of a short position in one call (or put) option and a long (short) position in the underlying index is formed. At any time t, the value of the delta hedge portfolio $P(t)$ is given by:

$$P(t) = V(t) + \Delta_V(t)S(t) + B(t) \tag{4}$$


Where, $P(t)$, $V(t)$, $S(t)$, $\Delta_V(t)$ and $B(t)$ denote the values of the portfolio, the hedging option (call or put), the underlying, the delta hedge factor and the bond (money market account), respectively.

The portfolio is assumed self-financed, so the initial value of the hedge portfolio at the beginning of the hedge horizon is zero:

$$P(0) = V(0) + \Delta_V(0)S(0) + B(0) = 0 \tag{5}$$

$$\Rightarrow B(0) = -(V(0) + \Delta_V(0)S(0)) \tag{6}$$

A dynamic trading strategy in the underlying and the bond is performed to hedge the option during the hedge horizon. The portfolio rebalancing takes place at intervals of length $\delta t$ during the hedge horizon $[0, T]$, where $T$ is the maturity of the option. At each rebalancing time $t_i$, the hedge factor $\Delta_V(t_i)$ is recomputed and the money market account is adjusted:

$$B(t_i) = e^{r\delta t}B(t_{i-1}) - S(t_i)(\Delta_V(t_i) - \Delta_V(t_{i-1})) \tag{7}$$

The delta hedge error is defined as the absolute value of the delta hedge portfolio at the end of the hedge horizon of the option, *P*.
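A minimal sketch of this bookkeeping, assuming the option values, index prices and deltas have already been computed at each rebalancing date (the function name and inputs are illustrative):

```python
import math

# Delta-hedging bookkeeping of Eqs. (4)-(7). V, S and delta are lists
# sampled at the rebalancing dates t_0, ..., t_n; r is the risk-free rate
# and dt the rebalancing interval.

def delta_hedge_error(V, S, delta, r, dt):
    # Eq. (6): initial money market position makes the portfolio worth zero.
    B = -(V[0] + delta[0] * S[0])
    for i in range(1, len(S)):
        # Eq. (7): accrue interest, then pay/receive for the delta adjustment.
        B = math.exp(r * dt) * B - S[i] * (delta[i] - delta[i - 1])
    # Hedge error: absolute portfolio value at the end of the horizon.
    return abs(V[-1] + delta[-1] * S[-1] + B)
```

With constant prices and deltas the adjustment term vanishes and the error is zero, which is a convenient sanity check.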

For delta-gamma hedging, a new position in a traded option is required. Then, the delta-gamma hedge portfolio is formed with:

$$P(t) = V(t) + x(t)S(t) + y(t)V\_1(t) + B(t) \tag{8}$$

Where, $V_1(t)$ is the value of an additional option which depends on the same underlying, with the same maturity but a different strike price than the hedging option $V(t)$. $x(t)$ and $y(t)$ are the proportions of the underlying and the additional option, respectively. They are chosen such that the portfolio $P(t)$ is both delta and gamma neutral:

$$\begin{cases} \text{Delta neutral: } \Delta_V(t) + x(t) + y(t)\Delta_{V_1}(t) = 0 \\ \text{Gamma neutral: } \Gamma_V(t) + y(t)\Gamma_{V_1}(t) = 0 \end{cases} \tag{9}$$

$$\Rightarrow \begin{cases} y(t) = \dfrac{-\Gamma_V(t)}{\Gamma_{V_1}(t)} \\ x(t) = -\Delta_V(t) - y(t)\Delta_{V_1}(t) \end{cases} \tag{10}$$

Where, $\Delta_V(t)$ and $\Gamma_V(t)$ are the delta and gamma factors for the option $V(t)$; $\Delta_{V_1}(t)$ and $\Gamma_{V_1}(t)$ are the delta and gamma factors for the option $V_1(t)$.
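The solution in Eq. (10) translates directly into code; a sketch assuming the four Greeks are given:

```python
# Eq. (10): proportions x(t), y(t) making the portfolio of Eq. (8) both
# delta- and gamma-neutral, given the Greeks of the hedging option (V) and
# of the additional option (V1).

def delta_gamma_proportions(delta_v, gamma_v, delta_v1, gamma_v1):
    y = -gamma_v / gamma_v1          # gamma neutrality
    x = -delta_v - y * delta_v1      # delta neutrality
    return x, y
```

The same function serves for delta-vega hedging by passing the vegas in place of the gammas, since Eqs. (14)-(15) have the same structure.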

At the beginning of the hedge horizon, the value of the hedge portfolio is zero:


$$P(0) = V(0) + x(0)S(0) + y(0)V\_1(0) + B(0) = 0\tag{11}$$

$$\implies B(0) = -(V(0) + x(0)S(0) + y(0)V\_1(0))\tag{12}$$

At each rebalancing time $t_i$, both delta and gamma hedge factors are recomputed and the money market account is adjusted:

$$B(t\_i) = e^{r\delta t} B(t\_{i-1}) - (\mathbf{x}(t\_i) - \mathbf{x}(t\_{i-1})) S(t\_i) - (y(t\_i) - y(t\_{i-1})) V\_1(t\_i) \tag{13}$$

The delta-gamma hedge error is defined as the absolute value of the delta-gamma hedge portfolio at the end of the hedge horizon of the option, *P*.

For delta-vega hedging, a new position in a traded option is required, as in delta-gamma hedging. The proportions of the underlying $x(t)$ and the additional option $y(t)$ are chosen such that the portfolio $P(t)$ is both delta and vega neutral:

$$\begin{cases} \text{Delta neutral: } \Delta_V(t) + x(t) + y(t)\Delta_{V_1}(t) = 0 \\ \text{Vega neutral: } \nu_V(t) + y(t)\nu_{V_1}(t) = 0 \end{cases} \tag{14}$$

$$\Rightarrow \begin{cases} y(t) = \dfrac{-\nu_V(t)}{\nu_{V_1}(t)} \\ x(t) = -\Delta_V(t) - y(t)\Delta_{V_1}(t) \end{cases} \tag{15}$$

Where, $\nu_V(t)$ and $\nu_{V_1}(t)$ are the vega factors for the options $V(t)$ and $V_1(t)$, respectively.

As in delta-gamma hedging, at each rebalancing time $t_i$, both delta and vega hedge factors are recomputed and the money market account is adjusted. The delta-vega hedge error is defined as the absolute value of the delta-vega hedge portfolio at the end of the hedge horizon of the option, $P$.

In total, 35 option contracts are used as hedging options, and 35 other contracts which depend on the same underlying, with the same maturity but different strike prices, are used as additional options. The contracts used to implement the hedging strategies are divided according to moneyness and time-to-maturity criteria, which produces nine classes.

The delta, gamma and vega hedge factors are computed using the Black-Scholes formula, by taking the derivative of the option value with respect to the index price, the derivative of delta with respect to the index price, and the derivative of the option value with respect to volatility, respectively. For the genetic programming models, the hedge ratios are computed using the same formulas, replacing the Black-Scholes implied volatilities with the generated genetic programming volatilities. Two rebalancing frequencies are considered: 1-day and 7-day revision.
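For reference, the standard Black-Scholes Greeks for a European call can be computed with only the standard library; this is a generic textbook sketch, not the chapter's code:

```python
import math

# Black-Scholes delta, gamma and vega for a European call: dV/dS,
# d(delta)/dS and dV/dsigma. The GP models reuse these formulas with the
# generated implied volatility in place of sigma.

def bs_greeks(S, K, T, r, sigma):
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    pdf = math.exp(-0.5 * d1 ** 2) / math.sqrt(2 * math.pi)   # standard normal density
    cdf = 0.5 * (1 + math.erf(d1 / math.sqrt(2)))             # standard normal CDF
    delta = cdf
    gamma = pdf / (S * sigma * math.sqrt(T))
    vega = S * pdf * math.sqrt(T)
    return delta, gamma, vega
```

Note that gamma and vega are identical for calls and puts, while the put delta is the call delta minus one.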

The average hedging error is used as performance measure. For a particular moneyness-time to maturity class, the tracking error is given by:

$$\varepsilon_M = \frac{\sum_{i=1}^{n} \varepsilon_i(\tau)}{n}, \qquad \varepsilon_i = e^{-rT} \times \frac{\left|P_i(\tau)\right|}{N \times V(0)} \tag{16}$$

Where, n is the number of options corresponding to a particular moneyness-time to maturity class and $\varepsilon_i$ is the present value of the absolute hedge error of the portfolio $P$ over the observation path N (as a function of the rebalancing frequency), divided by the initial option price $V(0)$.

Table 4 shows that the generated genetic programming volatility models M4S4, M4C4 and M6C6 present the smallest MSE on the enlarged sample for TS and MTM samples, respectively. Comparison between these models reveals that the TS model M4S4 outperforms the MTM models M4C4 and M6C6 on the enlarged sample. Furthermore, the results show that the performance of the TS models is more uniform than that of the MTM models. The MTM models are not able to fit the entire data sample as well as the TS models, as they have a large total MSE. Indeed, the total MSE exceeds 1 for some MTM classes, whereas it does not reach 0.006 for any TS sample. Figure 12 describes the evolution pattern of the squared errors given by the TS models and the MTM models for all observations in the enlarged data sample. Some extreme MSE values for the MTM data are not shown in this figure.

**Figure 12.** Evolution of the squared errors for the total sample of the best generated GP volatility models, using the static training-subset selection method: (a) MSE pattern for TS samples; (b) MSE pattern for MTM classes

It appears from Figure 12 that the TS models are adaptive not only to the training samples, but also to the enlarged sample. In contrast, the MTM models, such as M1C1, are adaptive to the training classes, but not at all to the enlarged sample. A first plausible explanation of these unsatisfactory results is an insufficient search intensity, making it difficult to obtain a general model suitable for the entire benchmark input data. To enhance exploration intensity during learning, and thus improve the genetic programming performance, we introduced the dynamic subset selection into the evolution procedure, which aims to obtain a general model that can be adaptive to both TS and MTM classes simultaneously.

For the dynamic training-subset selection methods (RSS, SSS, ASSS and ARSS), four generated genetic programming volatility models are selected for the TS classification (MSR, MSS, MSAS and MSAR). Similarly, four generated genetic programming volatility models are selected for the MTM classification (MCR, MCS, MCAS and MCAR), and four generated genetic programming volatility models are selected for the global classification, both TS and MTM classes (MGR, MGS, MGAS and MGAR). Table 5 reports the best generated genetic programming volatility models, using dynamic training-subset selection, relative to the TS samples, the MTM classes and both TS and MTM data.
