differences. In contrast with the other methods, CMA-LEP results are an order of magnitude worse. Tuning the CMA-ES engine to tackle the problem as separable did not improve the results. We think this behavior may result from the high dimensionality of the problem (N = 128), which certainly disrupts the process of modeling an ideal mean solution from a comparatively tiny set of search points. This is combined with the lack of elitism inherent to the CMA-ES method: when it comes to generating new test points, the heuristic is left solely with a probably imperfect model.

| Problem | generational LDEP: Fit. | generational LDEP: % hits | generational LDEP: Eval. | steady state LDEP: Fit. | steady state LDEP: % hits | steady state LDEP: Eval. |
| --- | --- | --- | --- | --- | --- | --- |
| *f*<sup>1</sup> | 0.0 | 100% | 7957 | 0.0 | 100% | 7355 |
| *f*<sup>2</sup> | 0.02 | 95% | 16282 | 0.0 | 100% | 14815 |
| *f*<sup>3</sup> | 0.4 | 52.5% | 24767 | 0.0 | 100% | 10527 |
| *f*<sup>4</sup> | 0.36 | 42.5% | 21941 | 0.278 | 45% | 26501 |
| *f*<sup>5</sup> | 0.13 | 2.5% | 34820 | 0.06 | 15% | 29200 |
| *f*<sup>6</sup> | 0.59 | 0% | NA | 0.63 | 0% | NA |

| Problem | standard GP: Fit. | standard GP: % hits | standard GP: Eval. | CMA-LEP: Fit. | CMA-LEP: % hits | CMA-LEP: Eval. |
| --- | --- | --- | --- | --- | --- | --- |
| *f*<sup>1</sup> | 0.002 | 98% | 3435 | 0.03 | 20% | 6500 |
| *f*<sup>2</sup> | 0.0 | 100% | 4005 | 2.76 | 0% | NA |
| *f*<sup>3</sup> | 0.02 | 93% | 7695 | 5.33 | 0% | NA |
| *f*<sup>4</sup> | 0.33 | 23% | 24465 | 2.06 | 6% | 10900 |
| *f*<sup>5</sup> | 0.07 | 0% | NA | 13.35 | 0% | NA |
| *f*<sup>6</sup> | 0.21 | 0% | NA | 5.12 | 0% | NA |

For each heuristic, over 40 independent runs, the column Fit. gives the average of the best fitness, then we have the percentage of runs reaching a hit solution, then the average number of evaluations needed to produce the first hit solution (NA if no run produced a hit solution).

**Table 3.** Results for symbolic regression problems with constants.

Overall, these results confirm that DE is an interesting heuristic, even when the continuous representation hides a combinatorial type problem, and thus the heuristic is used outside its original field. The LDEP mix of linear programs and constant management appears competitive with the standard GP approach.

## **4.2. Santa Fe ant trail**

The Santa Fe ant trail is a famous problem in the GP field. The objective is to find a computer program that is able to control an artificial ant so that it can find all 89 pieces of food located on a discontinuous trail within a specified number of time steps. The trail is drawn on a discrete 32 × 32 toroidal grid illustrated in Figure 1. The problem is known to be rather hard, at least for standard GP (see [29]), with many local and global optima, which may explain why the size of the TreeDE population was increased to *N* = 30 in [17].

Only a few actions are allowed to the ant. It can turn left or right, move one square forward, and look into the next square in the direction it is facing to determine whether it contains a piece of food. Turns and moves cost one time step, and a maximum number of time steps is set at the start (typical values are 400 or 600 time steps). If the program finishes before the time steps are exhausted, it is restarted (which amounts to iterating the whole program).

**Figure 1.** Illustration of the Santa Fe Trail (the ant starts in the upper left corner, heading to the east, large dots are food pellets, and small dots are empty cells on the ideal path).

We need neither mathematical operators nor registers; only the following instructions are available: MOVE, LEFT, RIGHT, IF-FOOD-AHEAD, PROGN2 and PROGN3.
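The ant rules described in this section can be illustrated with a small simulator. This is a minimal sketch under stated assumptions, not the LDEP virtual machine: programs are written in the tree form used by the example solution of Table 5 (nested tuples), all names (`Ant`, `execute`, `remaining_food`, `TOY_TRAIL`, `FOLLOW`) are hypothetical, and the trail is a five-pellet toy trail rather than the real Santa Fe trail.

```python
# Headings as (d_row, d_col) in clockwise order: east, south, west, north.
HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


class Ant:
    """Ant on a toroidal grid, starting in the upper left corner, facing east."""

    def __init__(self, food, max_steps, size=32):
        self.food = set(food)          # remaining pellets as (row, col)
        self.size = size
        self.steps_left = max_steps
        self.row, self.col = 0, 0
        self.heading = 0               # index into HEADINGS (east)

    def ahead(self):
        """Cell in front of the ant, wrapping around the grid edges."""
        d_row, d_col = HEADINGS[self.heading]
        return ((self.row + d_row) % self.size, (self.col + d_col) % self.size)

    def act(self, action):
        """Turns and moves cost one time step; nothing happens once time is up."""
        if self.steps_left == 0:
            return
        self.steps_left -= 1
        if action == "left":
            self.heading = (self.heading - 1) % 4
        elif action == "right":
            self.heading = (self.heading + 1) % 4
        elif action == "move":
            self.row, self.col = self.ahead()
            self.food.discard((self.row, self.col))
        else:
            raise ValueError(f"unknown action: {action}")


def execute(node, ant):
    """Run one pass of a program given as nested tuples (assumed encoding)."""
    if isinstance(node, str):
        ant.act(node)
    elif node[0] == "if_food":         # looking ahead costs no time step
        _, then_branch, else_branch = node
        execute(then_branch if ant.ahead() in ant.food else else_branch, ant)
    elif node[0] == "progn":           # sequence of two or three instructions
        for child in node[1:]:
            execute(child, ant)
    else:
        raise ValueError(f"unknown node: {node!r}")


def remaining_food(program, food, max_steps, size=32):
    """Restart the program until time runs out; return the pellets left."""
    ant = Ant(food, max_steps, size)
    while ant.steps_left > 0 and ant.food:
        before = ant.steps_left
        execute(program, ant)
        if ant.steps_left == before:   # program performs no action: stop
            break
    return len(ant.food)


# Toy trail: three pellets to the east, then two to the south.
TOY_TRAIL = [(0, 1), (0, 2), (0, 3), (1, 3), (2, 3)]
# Move when food is ahead, otherwise turn right.
FOLLOW = ("if_food", "move", "right")

print(remaining_food(FOLLOW, TOY_TRAIL, max_steps=10))  # 0
print(remaining_food(FOLLOW, TOY_TRAIL, max_steps=4))   # 2
```

The guard in `remaining_food` stops a program that never turns or moves, which would otherwise be restarted forever without consuming any time step.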


Genetic Programming – New Approaches and Successful Applications: Continuous Schemes for Program Evolution


Programs are again vectors of floating point values. Each instruction is represented as a single value which is decoded in the same way as operators are in the regression problems, that is, using Eq. 6. Instructions are decoded sequentially, and the virtual machine is refined to handle jumps over an instruction or group of instructions, so that it can deal with IF-FOOD-AHEAD instructions. Incomplete programs may be encountered, for example if a PROGN2 is decoded for the last value of a program vector. In this case the incomplete instruction is simply dropped and we consider that the program has reached normal termination (and the program is iterated if there are remaining time steps).

The Santa Fe trail being composed of 89 pieces of food, the fitness function is the remaining food (89 minus the number of food pellets taken by the ant before it runs out of time). So, the lower the fitness, the better the program, a hit solution being a program with fitness 0, i.e. a program able to pick up all the food on the grid.

Results are summarized in Table 4. Contrary to the regression experiment, the generational variant of LDEP is now better than the steady state variant. We think this behavior is explained by the hardness of the problem: more exploration is needed, and accelerating convergence no longer pays off.

| # steps | generational LDEP: Fit. | generational LDEP: % hits | generational LDEP: Eval. | steady state LDEP: Fit. | steady state LDEP: % hits | steady state LDEP: Eval. | standard GP: Fit. | standard GP: % hits | standard GP: Eval. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 400 | 11.55 | 12.5% | 101008 | 14.65 | 7.5% | 46320 | 8.87 | 37% | 126100 |
| 600 | 0.3 | 82.5% | 88483 | 1.275 | 70% | 44260 | 1.175 | 87% | 63300 |

CMA-LEP TreeDE

The first column is the number of allowed time steps; then, for each heuristic, over 40 independent runs, we give the average of the best fitness (taken from [17] for TreeDE), the percentage of runs reaching a hit solution (a solution that found all 89 food pellets), and the average number of evaluations needed to produce the first hit solution (NA if no run produced a hit solution).

**Table 4.** Santa Fe Trail artificial ant problem.

GP gives the best results for 400 time steps, but it is LDEP that provides the best average fitness for 600 steps, at the cost of a greater number of evaluations, meaning LDEP is better at exploiting the available amount of computing time. LDEP is also better than TreeDE for both step limits. For CMA-LEP, two values for *σ* ∈ {1, 10} and two values for *λ* ∈ {10, 100} were again tried, the best setting being *σ* = 10 and *λ* = 100 (whose results are reported here). CMA-LEP performed really poorly, and its first results were so bad that they motivated us to try this rather high initial variance level (*σ* = 10), which brought a noticeable but insufficient improvement. We think that the lack of elitism is, here again, a probable cause of the poor behavior of CMA-ES on a very chaotic fitness landscape with many neutral zones (many programs exhibit the same fitness).

```
If food{ Move } else {
  Progn3{
    Progn3{
      Progn3{ Right ;
        If food{ Right } else { Left } ;
        Progn2{ Left ;
          If food{ Progn2{ Move ; Move } }
              else { Right } } } ; // end Progn3
      Move ;
      Right } ; // end Progn3
    If food{ Move } else { Left } ; //end Progn3
  Move } }
```
**Table 5.** Example of a perfect solution for the Ant Problem found by LDEP in 400 time steps

Here again LDEP appears as a possible competitor to GP. Table 5 shows an example of a perfect solution found by LDEP for 400 time steps.

## **4.3. Evolving a stack**

As the LDEP continuous approach for evolving programs achieved interesting results on the previous GP benchmarks, we decided to move forward and test whether we could evolve a more complex data structure: a stack. Langdon showed in [30] that GP was able to evolve not only a stack with its minimal set of operations (push, pop, makenull), but also two other optional operations (top, empty), which are considered to be inessential. We followed this setting, and the five operations to evolve are described in Table 6.

| Operation | Comment |
| --- | --- |
| makenull | initialize stack |
| empty | is stack empty? |
| top | return top of stack |
| pop | return top of stack and remove it |
| push(*x*) | store *x* on top of stack |

**Table 6.** The five operations to evolve

This is in our opinion a more complex problem than the previous ones, since the correctness of each trial solution is established using only the values returned by the stack operations, and only pop, top and empty return values.

*Choice of primitives*

As explained in [30], the set of primitives that was chosen to solve this problem is a set that a human programmer might use. The set basically consists of functions that are able to read and write in an indexed memory, functions that can modify the stack pointer and functions that can perform simple arithmetic operations. The terminal set consists of zero-arity functions (stack pointer increment and decrement) and some constants.
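To make the evaluation setting concrete, the five operations of Table 6 can be written by hand on top of an indexed memory and a stack pointer, together with a helper that records only the values returned by pop, top and empty. This is a sketch under stated assumptions: the names `IndexedMemoryStack` and `returned_values` are hypothetical, the code is a human-written reference rather than an evolved program or the primitive set of [30], and it assumes call sequences that never overflow the memory or pop an empty stack.

```python
class IndexedMemoryStack:
    """Hand-written reference stack using an indexed memory and a stack pointer."""

    def __init__(self, capacity=10):
        self.memory = [0] * capacity   # indexed memory cells
        self.sp = 0                    # stack pointer: number of stored items

    def makenull(self):
        """Initialize the stack."""
        self.sp = 0

    def empty(self):
        """Is the stack empty?"""
        return self.sp == 0

    def push(self, x):
        """Store x on top of the stack."""
        self.memory[self.sp] = x
        self.sp += 1

    def top(self):
        """Return the top of the stack without removing it."""
        return self.memory[self.sp - 1]

    def pop(self):
        """Return the top of the stack and remove it."""
        self.sp -= 1
        return self.memory[self.sp]


def returned_values(stack, calls):
    """Run a call sequence and keep only what pop, top and empty return."""
    observed = []
    for name, *args in calls:
        result = getattr(stack, name)(*args)
        if name in ("pop", "top", "empty"):
            observed.append(result)
    return observed


calls = [("makenull",), ("empty",), ("push", 3), ("push", 7),
         ("top",), ("pop",), ("pop",), ("empty",)]
print(returned_values(IndexedMemoryStack(), calls))  # [True, 7, 7, 3, True]
```

A candidate implementation could then be compared with this reference on the same call sequences, looking only at the observed values, since makenull and push return nothing that can be checked directly.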
