Programs are again vectors of floating point values. Each instruction is represented as a single value, which is decoded in the same way as operators are in the regression problems, that is, using Eq. 6. Instructions are decoded sequentially, and the virtual machine is refined to handle jumps over an instruction or group of instructions, so that it can deal with IF-FOOD-AHEAD instructions. Incomplete programs may be encountered, for example if a PROGN2 is decoded for the last value of a program vector. In this case the incomplete instruction is simply dropped and we consider that the program has reached normal termination (the program is iterated if there are remaining time steps).

The Santa Fe trail being composed of 89 pieces of food, the fitness function is the remaining food (89 minus the number of food pellets taken by the ant before it runs out of time). So, the lower the fitness, the better the program; a hit solution is a program with fitness 0, i.e. a program able to pick up all the food on the grid.

Results are summed up in Table 4. Contrary to the regression experiment, the generational variant of LDEP is now better than the steady-state one. We think this behavior is explained by the hardness of the problem: more exploration is needed, and it no longer pays to accelerate convergence.

| steps | Fit. | % hits | Eval. | Fit. | % hits | Eval. |
|------:|-----:|-------:|------:|-----:|-------:|------:|
| 400 | 37.45 | 0% | NA | 17.3 | 3% | 24450 |
| 600 | 27.05 | 0% | NA | 1.14 | 66% | 22530 |

**Table 4.** Santa Fe Trail artificial ant problem. The 1st column is the number of allowed time steps; then, for each heuristic, over 40 independent runs, we give the average of the best fitness (taken from [17] for TreeDE), the percentage of runs reaching a hit solution (a solution that found all 89 food pellets), and the average number of evaluations to produce the first hit solution (or NA if no run ever produced a hit solution).

GP gives the best results for 400 time steps, but it is LDEP that provides the best average fitness for 600 steps, at the cost of a greater number of evaluations, meaning LDEP is better at exploiting the available amount of computing time. LDEP is also better than TreeDE for both step limits. For CMA-LEP, two values for *σ* ∈ {1, 10} and two values for *λ* ∈ {10, 100} were again tried, the best setting being *σ* = 10 and *λ* = 100 (whose results are reported here). CMA-LEP performed really poorly; its first results were so bad that they motivated us to try this rather high initial variance level (*σ* = 10), which brought a noticeable but insufficient improvement. We think that the lack of elitism is, here again, a probable cause of CMA-ES's bad behavior on a very chaotic fitness landscape with many neutral zones (many programs exhibit the same fitness).
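As a concrete illustration of this decoding scheme, here is a minimal sketch of such a virtual machine. The instruction set ordering, the gene-to-opcode mapping (a crude stand-in for Eq. 6), and the ant interface are illustrative assumptions, not the chapter's actual implementation.

```python
# Illustrative sketch only: opcode ordering and the gene-to-opcode
# mapping below are assumptions standing in for Eq. 6 of the chapter.
OPS = ["MOVE", "LEFT", "RIGHT", "PROGN2", "IF-FOOD-AHEAD"]

def decode(gene):
    # Stand-in for Eq. 6: map a float gene onto an opcode.
    return OPS[int(abs(gene)) % len(OPS)]

def subtree_end(prog, pos):
    """Index just past the subtree starting at pos; this is what lets the
    virtual machine jump over an instruction or group of instructions."""
    if pos >= len(prog):
        return pos
    op = decode(prog[pos])
    if op in ("PROGN2", "IF-FOOD-AHEAD"):
        mid = subtree_end(prog, pos + 1)
        return subtree_end(prog, mid)
    return pos + 1

def execute(prog, pos, ant):
    """Execute the subtree at pos; an incomplete instruction at the end of
    the vector is silently dropped (normal termination)."""
    if pos >= len(prog):
        return pos
    op = decode(prog[pos])
    if op == "PROGN2":
        return execute(prog, execute(prog, pos + 1, ant), ant)
    if op == "IF-FOOD-AHEAD":
        then_end = subtree_end(prog, pos + 1)
        if ant.food_ahead():
            execute(prog, pos + 1, ant)      # run "then", skip "else"
        else:
            execute(prog, then_end, ant)     # skip "then", run "else"
        return subtree_end(prog, then_end)
    ant.do(op)                               # MOVE / LEFT / RIGHT
    return pos + 1
```

Under this sketch, a vector decoding to `PROGN2` as its last value simply ends the program, matching the "dropped instruction" rule described above.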

As the LDEP continuous approach for evolving programs achieved interesting results on the previous GP benchmarks, we decided to move forward and test whether we were able to evolve a more complex data structure: a stack. Langdon showed in [30] that GP was able to evolve not only a stack with its minimal set of operations (push, pop, makenull), but also two other optional operations (top, empty), which are considered inessential. We followed this setting; the five operations to evolve are described in Table 6.


| Operation | Description |
|-----------|-------------|
| makenull | empty the stack |
| push(x) | put x on top of the stack (no return value) |
| pop | remove the top of the stack and return it |
| top | return the top of the stack without removing it |
| empty | return true if the stack contains no element, false otherwise |

**Table 6.** The five operations to evolve.

This is in our opinion a more complex problem than the previous ones, since the correctness of each trial solution is established using only the values returned by the stack operations, and only pop, top and empty return values.
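This testing principle can be sketched in code: a candidate stack is scored exclusively through the values returned by pop, top and empty, compared against a trusted reference. The scoring routine, random operation mix and constants below are our own illustrative assumptions; the actual fitness function, described later, uses Langdon's fixed test sequences.

```python
# Illustrative sketch (not Langdon's actual test sequences): score a
# candidate stack only through the return values of pop, top and empty.
import random

def score(candidate, n_ops=40, seed=0):
    """Count matches between the candidate's returned values and those of
    a trusted reference stack; push and makenull have no return value and
    are therefore only tested indirectly."""
    rng = random.Random(seed)
    reference, hits, trials = [], 0, 0
    candidate.makenull()
    for _ in range(n_ops):
        op = rng.choice(["push", "pop", "top", "empty", "makenull"])
        if op == "push":
            v = rng.randint(-1000, 1000)
            reference.append(v)
            candidate.push(v)          # no return value: checked indirectly
        elif op == "makenull":
            reference.clear()
            candidate.makenull()       # no return value: checked indirectly
        elif op == "empty":
            trials += 1
            hits += (candidate.empty() == (len(reference) == 0))
        elif reference:                # pop/top only tried on non-empty stacks
            trials += 1
            expected = reference.pop() if op == "pop" else reference[-1]
            got = candidate.pop() if op == "pop" else candidate.top()
            hits += (got == expected)
    return hits, trials
```

A fully correct candidate matches the reference on every scored operation, so an evolved stack is judged perfect exactly when `hits == trials` on all test sequences.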

#### *Choice of primitives*

As explained in [30], the set of primitives chosen to solve this problem is one that a human programmer might use. It basically consists of functions that read from and write to an indexed memory, functions that modify the stack pointer, and functions that perform simple arithmetic operations. The terminal set consists of zero-arity functions (stack pointer increment and decrement) and some constants.
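A hypothetical rendering of this primitive set may help fix ideas. The memory size, the modular addressing, and the argument order and return value of `write` are illustrative assumptions; only the behaviour of `write_aux` (set the pointer, return its original value) is specified in the primitive list given below.

```python
# Hypothetical rendering of the primitive set: an indexed memory, a stack
# pointer `aux`, and zero-arity pointer-moving terminals. Memory size and
# return conventions of read/write are assumptions, not the chapter's code.
MAX = 63                               # assumed size of the indexed memory

class Machine:
    def __init__(self):
        self.stack = [0] * (MAX + 1)   # indexed memory
        self.aux = 0                   # stack pointer

    # functions reading and writing the indexed memory
    def read(self, addr):
        return self.stack[int(addr) % len(self.stack)]

    def write(self, addr, value):
        self.stack[int(addr) % len(self.stack)] = value
        return value                   # assumed to return the written value

    # zero-arity terminals modifying the stack pointer
    def inc_aux(self):
        self.aux += 1
        return self.aux

    def dec_aux(self):
        self.aux -= 1
        return self.aux

    def write_aux(self, value):
        old, self.aux = self.aux, int(value)
        return old                     # returns the original value of aux
```

With these conventions, an expression such as `write(dec_aux, arg1)` decrements the pointer and stores `arg1` at the new top, which is exactly the side effect exploited by the evolved push operation shown in Table 7.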


The following set was available for LDEP:



• functions to modify the stack pointer: inc\_aux to increment the stack pointer, dec\_aux to decrement it, write\_aux to set the stack pointer to its argument and return the original value of aux.

### *Algorithm and fitness function*

We used a slightly modified version of our continuous scheme, as the stack problem requires the simultaneous evolution of the five operations (push, pop, makenull, top, empty). An individual is composed of 5 vectors, one for each operation. Mutation and crossover are only performed between vectors of the same type (e.g. vectors evolving the push operation).

Programs are coded in prefix notation, which means that an operation like (arg1 + MAX) is coded as + arg1 MAX. We did not impose any restriction on program size, except that each vector has a maximum length of 100 (several times more than sufficient to code any of the five operations needed to manipulate the stack).

In his original work, Langdon chose to use a population of 1,000 individuals with 101 generations. In the DE case, it is known from experience that using large populations is usually inadequate. So, we fixed a population of 10 individuals with 10,000 generations for LDEP, amounting to about the same number of evaluations.

We used the same fitness function as defined by Langdon. It consists of 4 test sequences, each composed of 40 stack operations. As explained in the previous section, the makenull and push operations do not return any value, so they can only be tested indirectly, by checking whether the other operations perform correctly.

#### *Results*

In Langdon's experiments, 4 runs out of 60 produced successful individuals (i.e. a fully operational stack). We obtained the same success ratio with LDEP: 4 out of the first 60 runs yielded perfect solutions. Extending the number of runs, LDEP evolved 6 perfect solutions out of 100 runs, providing a convincing proof of feasibility. Regarding CMA-LEP, results are less convincing, since only one run out of 100 was able to successfully evolve a stack.

An example of a successful solution is given in Table 7, with the raw evolved code and a simplified version where redundant code is removed.

| Operation | Evolved operation | Simplified operation |
|-----------|-------------------|----------------------|
| push | `write(1, write(dec_aux, arg1))` | `aux = aux - 1;`<br>`stack[aux] = arg1` |
| pop | `write(aux, ((aux + (dec_aux + inc_aux)) + read(inc_aux)))` | `tmp = stack[aux];`<br>`aux = aux + 1;`<br>`stack[aux] = tmp + aux;`<br>`return tmp` |
| top | `read(aux)` | `return stack[aux]` |
| makenull | `write((MAX - (0 + write_aux(1))), MAX)` | `aux = 1` |
| empty | `aux` | `if (aux > 0) return true`<br>`else return false` |

**Table 7.** Example of an evolved push-down stack.

**5. Conclusions**

This chapter explores evolutionary continuous optimization engines applied to automatic programming. We work with Differential Evolution (LDEP) and CMA-Evolution Strategy (CMA-LEP), and we translate the continuous representation of individuals into linear imperative programs. Unlike the TreeDE heuristic, our schemes include the use of float constants (e.g. in symbolic regression problems).

Comparisons with GP confirm that LDEP is a promising optimization engine for automatic programming. In the most realistic case of regression problems, when using constants, steady-state LDEP slightly outperforms standard GP on 5 out of 6 problems. On the artificial ant problem, the leading heuristic depends on the number of steps: for the 400-step version GP is the clear winner, while for 600 steps generational LDEP yields the best average fitness. LDEP improves on the TreeDE results for both versions of the ant problem, without needing a fine-tuning of the solutions' tree depth.

For both regression and artificial ant, CMA-LEP performs poorly with the same representation of solutions as LDEP. This is not really surprising, since the problems we tackle are clearly outside the domain targeted by the CMA-ES heuristic that drives evolution. Nonetheless, this is also true for DE, which still produces interesting solutions, so this points to a fundamental difference in behavior between these two heuristics. We suspect that CMA-ES's lack of elitism may be an explanation. It also points to a possible inherent robustness of the DE method on fitness landscapes that are possibly more chaotic than the usual continuous benchmarks.

The promising results of LDEP on the artificial ant and stack problems are a great incentive to deepen the exploration of this heuristic. Many interesting questions remain open. In the beginnings of GP, experiments showed that the probability of crossover had to be set differently for internal and terminal nodes: is it possible to improve LDEP in similar ways? Note that in our experiments the individual vector components take their values in the range (−∞, +∞), since this is required by the standard CMA-ES algorithm. It could be interesting to experiment with DE-based algorithms using a reduced range of vector component values, for example [−1.0, 1.0], which would require modifying the mapping of constant indices.
