CS W4771: Machine Learning
 

Final Project: Genetic Programming Project Othello

Kayuri Mitsui
mkayuri@cs.columbia.edu
Note: this paper includes the contents of the previous project.

Introduction

This paper describes my project applying Genetic Programming (GP) to evolve Othello players, where each individual player is represented as an S-expression function tree that evaluates the Othello game board and determines the next move. The default GP application for Othello, specifically the GPJPP Java package together with an Othello-playing Java package, contains several sources of randomness, ambiguity, and uncertainty, yet still produces relatively good Othello players. My interest in this project is to eliminate some of these sources and see how the modified GP methods perform in this specific domain.

Approach 1: Randomness on Crossover

The regular crossover may hurt the evolution because there is no guarantee that crossover produces offspring whose fitness measures are better than those of their parents. This is a source of randomness in the default GP algorithm.

Elitism Evolution

To eliminate this source of randomness, in other words, to avoid crossovers that produce worse offspring and hurt the entire evolution, I introduce the following crossover variation, called "Elitism Evolution."

The standard evolution selects two parents and replaces them with two offspring. It does not always succeed in producing even one offspring better than both of its parents. From this point of view, new individuals should be evaluated before being added to the population, to decide whether they are kept or discarded. A first version of Elitism Evolution evaluates both offspring together with their two parents. Once their fitness scores are measured, the two best individuals may turn out to be one of the parents and one of the offspring. If such a combination of two similar individuals is kept into the next generation many times, very similar individuals would take over a large fraction of the population. That would reduce diversity and slow further progress (the problem of crowding).

Therefore, the Elitism Evolution in this project evaluates the two parents together with the two offspring produced by the regular crossover operation, then keeps only the single individual with the best fitness measure. The chosen one may be one of the parents: in the GP world there is no limited lifespan, so parents can survive over generations as long as they are better than any offspring they produce. Since two parents are taken out and only one individual is kept, the GPJPP package fills up the population, keeping its size constant, by adding a new randomly created individual.

In summary, the Elitism Evolution in this assignment follows these steps:

  1. Select two parents (same as regular crossover)
  2. Apply the crossover operation to get two offspring (same as regular crossover)
  3. Evaluate both offspring and get their fitness measures
  4. Compare the fitness of the two parents and the two offspring together
  5. Keep only the individual with the best fitness measure
  6. Add a new randomly created individual to the population
Please note that the terminology "Elitism" in the regular Genetic Programming literature indicates that only the elite individuals (the top x%) of the population are allowed to reproduce. I misused the terminology here, but I keep it as in the original report.
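The steps above can be sketched as follows. This is a minimal illustration, not the GPJPP implementation: `crossover`, `evaluate`, and `random_individual` are hypothetical callables, parents are picked uniformly at random as a stand-in for fitness-based selection, and (as elsewhere in this paper) a lower fitness value is better.

```python
import random

def elitism_step(population, crossover, evaluate, random_individual):
    """One Elitism Evolution step: replace two parents with the single best
    of {parents, offspring} plus one fresh random individual, keeping the
    population size constant. Lower fitness is better."""
    # 1. Select two parents (uniform stand-in for real selection).
    i, j = random.sample(range(len(population)), 2)
    p1, p2 = population[i], population[j]
    # 2-3. Produce two offspring via crossover; they get evaluated below.
    c1, c2 = crossover(p1, p2)
    # 4-5. Keep only the fittest of the four candidates (may be a parent).
    best = min([p1, p2, c1, c2], key=evaluate)
    # 6. Refill the vacated slot with a new random individual.
    population[i] = best
    population[j] = random_individual()
    return population
```

As a toy usage, with individuals represented as numbers and fitness equal to the value itself, one step keeps the best candidate and injects one fresh individual while the population size stays unchanged.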

Result of Elitism Evolution

The following table shows three cases, each tested with and without Elitism Evolution for 11 generations. Each cell is the average fitness measure of the population at a given generation; the three cases differ in their other parameter settings.
Case                   1               2               3
Elitism Evolution   No      Yes     No      Yes     No      Yes
Generation
 0                966.45  973.47  393.56  374.88  966.56  962.32
 1                908.80  835.99  294.54   43.90  952.70  788.86
 2                907.34  520.71  257.20   36.38  919.31  475.04
 3                874.68  316.13  232.79   38.99  871.92  307.00
 4                826.84  308.96  209.07   35.14  781.88  301.58
 5                758.25  305.54  177.90   38.20  656.66  319.95
 6                760.84  326.38  145.81   40.53  579.12  298.76
 7                708.25  314.94   95.14   39.83  550.94  302.07
 8                719.78  319.47   86.57   43.00  514.95  303.80
 9                715.27  323.52   69.03   35.84  451.15  305.10
10                652.21  311.18   73.42   41.35  415.96  309.00
11                648.17  310.81   55.23   41.47  402.75  305.08
All three pairs show that Elitism Evolution converges faster to lower (better) fitness values than the non-elitism case.

Brood Selection

There is another crossover scheme, called Brood Selection [Tackett and Carmi, 1994]. Brood Selection applies the regular crossover to the same pair of parents many times, say n times, producing n pairs of offspring, then chooses the best from this pool of children.

More precisely:

  1. Select two parents.
  2. For i = 1 to n:
     a. Pick random crossover points in the parents.
     b. Swap the subtrees at these points to create the i-th pair of offspring.
  3. Evaluate all 2n offspring and select the best two to return.
When n, called the brood size factor, is equal to 1, this reduces to the regular crossover. After the loop there are 2n newly created individuals. Brood Selection then applies a function, called the culling function, to evaluate all 2n individuals and selects the best two. The culling function is often similar to the fitness function but can be something else.
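The scheme can be sketched as below. This is an illustrative skeleton, not the implementation from [Tackett and Carmi, 1994]: `crossover` and `cull` are hypothetical callables, and lower culling scores are taken to be better, matching this paper's fitness convention.

```python
def brood_crossover(p1, p2, crossover, cull, n):
    """Brood Selection: cross the same two parents n times, pooling all
    2n offspring, then keep the two that score best under the culling
    function (lower score = better)."""
    brood = []
    for _ in range(n):
        c1, c2 = crossover(p1, p2)   # random crossover points each time
        brood.extend([c1, c2])
    brood.sort(key=cull)             # rank the whole brood by culling score
    return brood[0], brood[1]        # best two of the 2n offspring
```

With n = 1 the function degenerates to returning the two offspring of a single regular crossover, as noted in the text.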

The cost of Brood Selection depends on the choice of the culling function. If the culling function actually plays Othello games to evaluate, it is expensive. It is also greedy, trying many possible crossover operations to find the best. Compared with the regular crossover, Brood Selection chooses the best individuals from a large number of siblings, so it is more likely to produce better offspring than the regular crossover does.

Now compare Brood Selection with the Elitism Evolution described above. Because Elitism Evolution chooses the single best individual from the two parents and the two children, the fitness of the selected individual is guaranteed to be at least as good as that of both parents. Note that Elitism Evolution may select one of the parents if its fitness is best. In contrast, the two offspring selected by Brood Selection are not always better than their parents. Roughly speaking, this suggests the following:

Pr(fitness(Elitism_offspring) > fitness(parent))
 <= Pr(fitness(BroodSelection_offspring) > fitness(parent))
 <= Pr(fitness(RegularCrossover_offspring) > fitness(parent))

(recall that a larger fitness value means a worse player, so each probability is the chance that the kept individual is worse than its parent; for Elitism Evolution this probability is zero by construction).
However, this comparison does not say that the offspring selected by Elitism Evolution is always better than the offspring selected by Brood Selection.

Moreover, remember that Elitism Evolution keeps only one individual, in order to avoid decaying the population diversity, and adds one randomly created individual to keep the population size constant. Taking this into account and comparing the two offspring produced by Brood Selection with the pair consisting of the one individual kept by Elitism Evolution plus the randomly created new one, neither method is guaranteed to be better than the other.

Approach 2: Ambiguous Fitness Measure

The fitness value is measured by playing actual Othello games and assigning the opponent's score (the GP system considers lower fitness to mean a better player). This is weak information, because each game result can tell more.

Description of Number of Moves

While running some GP programs, several evolution processes converged to a "local minimum" individual consisting of the single function "(white)", whose measured fitness looks very good. Since the GP program learns to play as white, this tree merely restates the objective of the game rather than a winning strategy. In order to better differentiate such individuals, I introduce the number of moves taken into the fitness measure. Note that a better individual has a lower fitness value; better fitness does not mean a larger value but a lower one. Therefore, when a game runs longer, the fitness value becomes larger, which counts as worse fitness.

Also, each individual's fitness is measured by playing 10 games instead of 5, to differentiate individuals further.

This addition is more domain-specific. Most games usually run to about the same length. However, the "local minimum" individual "(white)" is evaluated with a very good fitness value, in most cases caused by a result score of 27-0: the player won with 27 pieces while the random opponent had 0 pieces, the game finishing after 23 moves rather than the 60 moves needed to fill all 64 cells.
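A minimal sketch of this extended fitness measure is shown below. The `play_game` runner is hypothetical (it stands in for the Othello package and is assumed to return the opponent's final score and the number of moves played), and the sketch follows the paper's stated convention that a longer game adds more to the fitness value.

```python
def fitness_with_moves(play_game, games=10):
    """Fitness over several games: the opponent's total score plus the
    number of moves taken, so otherwise-identical results are told apart
    by game length (lower total = better, per this paper's convention)."""
    total = 0
    for _ in range(games):
        opponent_score, moves = play_game()  # hypothetical game runner
        total += opponent_score + moves
    return total
```

For example, ten identical 27-0 games finished in 23 moves would now contribute 23 per game instead of 0, separating such short degenerate wins from full-length games.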

Result of Number of Moves

The following table shows the frequency of this local minimum over 10 runs of 11 generations for each case; the maximum possible frequency is therefore 120.
Case                        1         2         3         4
Number of Moves          No  Yes   No  Yes   No  Yes   No  Yes
Frequency of "(white)"   88   44    0    0   97   88    0    0
Fortunately, Cases 2 and 4 show no local minimum at all, although some runs in the other cases may still contain the same "(white)" function tree among their worst individuals. In those cases, the Number of Moves term helped to decrease the occurrences of the local minimum.

Approach 3: Random Othello Player

As described before, in the default GP Othello application the fitness is measured by playing several Othello games against a random player. This is another source of randomness.

Playing Against Edgar

To eliminate the randomness caused by the random player, which picks a random move from the legal moves given the board status, I replace the current fitness function with one that plays against Edgar, a very good player developed by Astro Teller.

The following tables show the average fitness measure of the population at each generation.

Result of Playing Against Edgar

The first table shows the two cases without the Elitism option:
             Random Player          Against Edgar
Generation   Avg. Fitness  Variety  Avg. Fitness  Variety
 0              369.53      1.000      464.62      1.000
 1              209.03      0.670      456.98      0.950
 2              178.84      0.480      450.74      0.940
 3              173.30      0.360      458.64      1.000
 4              127.89      0.250      447.90      0.950
 5              114.67      0.260      457.88      0.910
 6               95.56      0.200      446.70      0.930
 7               61.50      0.130      456.46      0.920
 8               57.16      0.100      453.18      0.890
 9               54.28      0.090      446.32      0.920
10               34.50      0.050      436.98      0.930
11               25.33      0.020      439.72      0.910
12               27.79      0.020      438.90      0.960
13               26.52      0.020      438.58      0.920
14               35.54      0.010      439.58      0.950
15               36.10      0.010      441.20      0.950
16               28.84      0.010      429.64      0.940
17               37.18      0.010      445.52      0.870
18               30.64      0.010      432.28      0.780
19               38.19      0.010      430.20      0.800
20               40.12      0.010      411.96      0.840
It shows that fitness improves only slowly when the fitness measure is taken by playing games against Edgar.

The next table shows the changes of average fitness and variety over generations with the Elitism option:
Case          Against Edgar          Against Edgar
Options       Elitism                Elitism & Picky
Generation    Avg. Fitness  Variety  Avg. Fitness  Variety
 0               463.10      1.000      232.92      1.000
 1               411.14      0.720      193.21      0.720
 2               363.44      0.510      171.41      0.560
 3               325.92      0.230      152.35      0.420
 4               298.04      0.170      123.30      0.265
 5               278.26      0.120       61.46      0.125
 6               263.10      0.090       20.83      0.015
 7               256.22      0.080       20.00      0.005
 8               254.68      0.040       20.00      0.005
 9               254.68      0.040       20.00      0.005
10               253.00      0.030       20.00      0.005
11               253.00      0.030       20.00      0.005
12               253.00      0.030       20.00      0.005
13               253.00      0.030       20.00      0.005
14               253.00      0.030       20.00      0.005
15               253.00      0.030       20.00      0.005
16               253.00      0.030       20.00      0.005
17               253.00      0.020       20.00      0.005
18               253.00      0.020       20.00      0.005
19               253.00      0.020       20.00      0.005
20               253.00      0.020       20.00      0.005
21               253.00      0.020       20.00      0.005

Case 1: Elitism and non-Picky Crossover; Case 2: Elitism and Picky Crossover.

The problem of crowding arises here. It reminds me of education: when a school system strictly demands good grades, students become less individual. In short, a very strict absolute standard does not allow varied standards, so individuals must grow in one specific direction, and this results in low diversity.

Playing Against Each Other

[Ferrer and Martin, 1995] study Genetic Programming in the same domain of evolving board evaluation functions. They use a tournament approach to determine individual fitness: select two individuals, let them play against each other, and only the winner advances to the next round of games.

By the nature of Othello, a good white player can also be a good black player. In addition, each individual function tree consists of functions and color-specific terminals that can be translated to the other color. Therefore, every white player in the population can be transformed to fight as a black player. A weak version of the tournament approach described above is then achieved by having two selected individuals play several games against each other and assigning the game score as their new fitness measure. (Similarly, each individual can play against Edgar as white, since the encoded Edgar cannot play as a white player.)
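The color translation mentioned above can be sketched as a terminal-renaming pass over the S-expression tree. This is an illustration only: the exact terminal names in the Othello package may differ (the paper itself writes both "white_corner" and "white_corners"), and trees are represented here as nested lists of strings.

```python
# Color-paired terminals, assumed from the terminal set described in the paper.
SWAP = {
    "white": "black",
    "white_edges": "black_edges",
    "white_corners": "black_corners",
    "white_near_corners": "black_near_corners",
}
SWAP.update({b: w for w, b in list(SWAP.items())})  # make the mapping symmetric

def as_black(tree):
    """Recolor a white-player S-expression (nested lists / strings) so the
    same board evaluation function can be used to play as black."""
    if isinstance(tree, str):
        return SWAP.get(tree, tree)  # operators and constants pass through
    return [as_black(node) for node in tree]
```

Applying `as_black` to every individual lets the whole white-playing population be reused as black opponents in the head-to-head fitness games.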

This weaker standard makes the fitness function less strict. In that sense, the "playing against each other" approach seems, conversely, to move toward randomness. On the other hand, it involves no external players, so the population grows purely, perhaps slowly, by itself, eliminating a certain uncertainty that comes from external players.

The following table shows the changes of the average population fitness in three cases, including the "playing against Edgar" case.
             Random Player          Against Each Other     Against Edgar
Generation   Avg. Fitness  Variety  Avg. Fitness  Variety  Avg. Fitness  Variety
 0              369.53      1.000      411.24      1.000      464.62      1.000
 1              209.03      0.670      299.90      0.800      456.98      0.950
 2              178.84      0.480      288.90      0.580      450.74      0.940
 3              173.30      0.360      277.22      0.430      458.64      1.000
 4              127.89      0.250      267.74      0.350      447.90      0.950
 5              114.67      0.260      225.98      0.230      457.88      0.910
 6               95.56      0.200      206.28      0.110      446.70      0.930
 7               61.50      0.130      208.32      0.110      456.46      0.920
 8               57.16      0.100      221.86      0.060      453.18      0.890
 9               54.28      0.090      204.36      0.050      446.32      0.920
10               34.50      0.050      203.82      0.030      436.98      0.930
11               25.33      0.020      198.00      0.020      439.72      0.910
12               27.79      0.020      212.30      0.020      438.90      0.960
13               26.52      0.020      205.00      0.020      438.58      0.920
14               35.54      0.010      203.90      0.020      439.58      0.950
15               36.10      0.010      203.00      0.020      441.20      0.950
16               28.84      0.010      191.20      0.020      429.64      0.940
17               37.18      0.010      206.50      0.020      445.52      0.870
18               30.64      0.010      191.20      0.020      432.28      0.780
19               38.19      0.010      204.00      0.020      430.20      0.800
20               40.12      0.010      193.60      0.020      411.96      0.840
The "playing against each other" case converges more slowly than the random-player case but faster than the Edgar case. The interesting point is the diversity: the powerful controller (Edgar) does not hurt diversity within this number of generations, and the "playing against each other" case does not hurt diversity as much as the random player does.

Approach 4: Ambiguousness on Representation

The function tree representation deployed here is based on terminal nodes and function nodes. The functions are regular arithmetic operations. The terminals count the pieces of a certain color in a certain place, for example white, black_edges, white_corners, black_near_corners; there are 8 different terminals in all. The functions are neutral enough, but these terminals are biased by prior knowledge about Othello.

In order to eliminate the uncertainty of this prior knowledge, instead of the existing terminals the representation would need constant terminals from 1 to 8 and a querying function that returns the status of the cell specified by x and y coordinates from 1 to 8.

Furthermore, in order to let a function tree do its own decision making, it also requires conditional operators, such as "if", "and", "or" and "not."
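A toy interpreter for the proposed neutral representation might look as follows. Everything here is an assumption made for illustration: trees are nested lists, `board[x][y]` encodes 0 (empty), 1 (white), -1 (black) with 1-based coordinates in the query, division is protected by returning 0, and "if" evaluates all branches eagerly.

```python
def eval_tree(tree, board):
    """Evaluate a function tree built from constants 1..8, a (query x y)
    cell primitive, arithmetic, and conditionals, with no Othello-specific
    terminals such as white_corners."""
    if isinstance(tree, int):
        return tree                       # constant terminal 1..8
    op, *args = tree
    if op == "query":                     # status of cell (x, y), 1-based
        x, y = (eval_tree(a, board) for a in args)
        return board[x - 1][y - 1]
    vals = [eval_tree(a, board) for a in args]
    if op == "+":   return vals[0] + vals[1]
    if op == "-":   return vals[0] - vals[1]
    if op == "*":   return vals[0] * vals[1]
    if op == "/":   return vals[0] / vals[1] if vals[1] else 0  # protected division
    if op == "if":  return vals[1] if vals[0] else vals[2]
    if op == "and": return int(bool(vals[0]) and bool(vals[1]))
    if op == "or":  return int(bool(vals[0]) or bool(vals[1]))
    if op == "not": return int(not vals[0])
    raise ValueError("unknown operator: %s" % op)
```

The point of the sketch is that all board knowledge now enters through "query" alone, so any positional preference (edges, corners) must be evolved rather than given.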

The modified system would be a weaker AI problem-solving method with less prior knowledge. However, learning under this representation seems likely to take a long time.

Approach 5: Random Crossover Point Selection

The next source of randomness resides in the selection of crossover points. The regular crossover selects a random subtree (crossover point) from each individual function tree.

Picky Crossover

I experimented with Elitism Evolution because the existing evolution produces offspring with unpredictable fitness measures and can still produce offspring with worse fitness. That is one "randomness" of the standard GP algorithm. Another randomness lies in the crossover operation itself, in which a random node of one parent and a random node of the other parent are swapped to produce two offspring. To tackle this randomness, my question is: why pick random nodes? It is because the GP algorithm has no information with which to evaluate individual nodes of a function tree. The next question, then, is how to evaluate each node. I introduce an intuitive (admittedly weak) node evaluation: assume each node of an individual function tree works about as well as the individual does. Of course, a single subtree may not work as well independently as it does within the whole tree, but nodes in better individuals (with better fitness measures) should tend to work better than nodes in worse individuals.

That is, each node is assigned the fitness measure of the individual tree to which it belongs; call this value the node fitness. If each node kept only one value, it would reflect only the most recent evaluation and not its performance history. So every node carries the range of its past performance: two extra values, the "best" and the "worst" fitness values the node has ever experienced, which travel with the node through the individual trees. (The node fitness is expressed as a range rather than a single value, because a single value is not expressive enough to capture a performance history.) The crossover operation, named Picky Crossover, can then select the best/worst nodes according to this evaluation, which hopefully results in better individuals.

A node may travel among many individual trees, and therefore experiences various fitness values. The best and worst node fitness values are initialized to the worst possible and the best possible values respectively, and are updated before (or possibly after) every crossover operation as follows:

AdjustNodeFitness (NewFitnessValue)

  1. If NewFitnessValue is better than the "best",
     then "best" = NewFitnessValue;
  2. If NewFitnessValue is worse than the "worst",
     then "worst" = NewFitnessValue;
Using this simple algorithm, the fitness value of a newly evaluated individual is propagated to all the nodes of its function tree. The Picky Crossover scheme then follows these steps:

  1. Select two parents, Parent 1 and Parent 2.
  2. Propagate the already-calculated fitness measure of each parent to all of its nodes.
  3. Use the AdjustNodeFitness algorithm to update the node fitness values in each node.
  4. Select the best node in Parent 1 (the node with the best "best" node fitness value).
  5. Select the worst node in Parent 2 (the node with the worst "worst" node fitness value).
  6. Swap the two subtrees to produce two offspring, Child 1 and Child 2.
This algorithm produces two offspring, which can be described as:

Child 1 = {Parent 1} - {best subtree of Parent 1} + {worst subtree of Parent 2}
Child 2 = {Parent 2} - {worst subtree of Parent 2} + {best subtree of Parent 1}
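A self-contained sketch of this operation is shown below. The tree encoding is an assumption for illustration: each node is a dict with "label", "children", "best" and "worst" keys, with lower fitness values better, so the "best" node minimizes `best` and the "worst" node maximizes `worst`.

```python
import copy

def picky_crossover(p1, p2):
    """Picky Crossover sketch: swap the best subtree of Parent 1 with the
    worst subtree of Parent 2, returning two offspring and leaving the
    parents untouched."""
    c1, c2 = copy.deepcopy(p1), copy.deepcopy(p2)

    def nodes(t):
        yield t
        for ch in t["children"]:
            yield from nodes(ch)

    best = min(nodes(c1), key=lambda n: n["best"])    # best node in Parent 1
    worst = max(nodes(c2), key=lambda n: n["worst"])  # worst node in Parent 2
    # Exchange the two subtrees in place.
    best_copy = copy.deepcopy(best)
    best.clear(); best.update(copy.deepcopy(worst))   # Child 1 receives the worst subtree
    worst.clear(); worst.update(best_copy)            # Child 2 receives the best subtree
    return c1, c2
```

Note that Child 2 is the one expected to improve (it gains the best subtree), while Child 1 is kept anyway, as the text discusses.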

Intuitively, one offspring can be expected to be much better than the other. Nevertheless, Picky Crossover keeps both offspring. It is tempting to cull the worse one, but the Elitism Evolution described above can be applied simultaneously to achieve a similar culling.

Result of Picky Crossover

The following table shows three cases, each tested with and without Picky Crossover.
Case                   1               2               3
Picky Crossover     No      Yes     No      Yes     No      Yes
Generation
 0                966.45  966.56  384.94  374.88  973.47  962.32
 1                908.80  952.70  247.44   43.90  835.99  788.86
 2                907.34  919.31   85.99   36.38  520.71  475.04
 3                874.68  871.92   91.32   38.99  316.13  307.00
 4                826.84  781.88   86.55   35.14  308.96  301.58
 5                758.25  656.66   61.36   38.20  305.54  319.95
 6                760.84  579.12   94.62   40.53  326.38  298.76
 7                708.25  550.94   49.02   39.83  314.94  302.07
 8                719.78  514.95   52.61   43.00  319.47  303.80
 9                715.27  451.15   83.09   35.84  323.52  305.10
10                652.21  415.96   82.52   41.35  311.18  309.00
11                648.17  402.75   67.97   41.47  310.81  305.08

Cases 1 through 3 were each tested with and without Picky Crossover; they differ in their other parameter settings. The Picky Crossover runs appear to reach better fitness values faster than the non-Picky runs, so Picky Crossover may improve the speed of evolution.

More Picky Crossover

Now apply a multi-crossover approach to Picky Crossover. Picky Crossover always picks the best subtree from Parent 1 and the worst subtree from Parent 2, but the reverse combination can be applied at the same time. More Picky Crossover performs the same crossover as Picky Crossover and additionally picks the worst subtree from Parent 1 and the best subtree from Parent 2, crossing them to produce two additional offspring. More precisely, More Picky Crossover follows these steps:

  1. Select two parents, Parent 1 and Parent 2.
  2. Propagate the already-calculated fitness measure of each parent to all of its nodes.
  3. Use the AdjustNodeFitness algorithm to update the node fitness values in each node.
  4. Select the best node in Parent 1 (the node with the best "best" node fitness value).
  5. Select the worst node in Parent 2 (the node with the worst "worst" node fitness value).
  6. Swap the two subtrees to produce two offspring, Child 1 and Child 2.
  7. Select the worst node in Parent 1.
  8. Select the best node in Parent 2.
  9. Swap the two subtrees to produce two offspring, Child 3 and Child 4.
  10. Pick the best two offspring from Child 1 through Child 4.
The culling in step 10 is required to keep the population size constant; otherwise the population would double every generation.
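The two-direction-plus-culling structure can be sketched abstractly as below. Both `swap` (a hypothetical crossover taking a direction argument) and `evaluate` are stand-ins, and lower evaluation scores are better, as elsewhere in the paper.

```python
def more_picky_crossover(p1, p2, swap, evaluate):
    """More Picky Crossover sketch: perform the Picky swap in both
    directions (best-of-P1 with worst-of-P2, then worst-of-P1 with
    best-of-P2), and cull to the best two of the four children so the
    population size stays constant."""
    c1, c2 = swap(p1, p2, "best_vs_worst")   # steps 4-6
    c3, c4 = swap(p1, p2, "worst_vs_best")   # steps 7-9
    # Step 10: keep only the best two offspring (lower score = better).
    return sorted([c1, c2, c3, c4], key=evaluate)[:2]
```

As the text notes, this resembles Brood Selection with a brood size factor of 2, except that the crossover points are chosen by node fitness rather than at random.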

In a sense, this is very similar to the Brood Selection introduced before, with a brood size factor of 2. However, note that Brood Selection still picks random crossover points.

Result of More Picky Crossover
Generation   Non-Picky    Picky   More Picky
 0             384.94    374.88     393.01
 1             247.44     43.90     215.27
 2              85.99     36.38     151.48
 3              91.32     38.99     127.01
 4              86.55     35.14     113.51
 5              61.36     38.20     110.23
 6              94.62     40.53      98.00
 7              49.02     39.83     111.02
 8              52.61     43.00     122.46
 9              83.09     35.84     102.45
10              82.52     41.35      99.89
11              67.97     41.47     111.90

More Picky Crossover is greedier in its selection of offspring than Picky Crossover, so I expected it to converge faster. Unfortunately, the results here do not bear out that expectation.

Analysis on Picky Crossover

When selecting the best subtree, the Picky Crossover algorithm searches the tree and takes the node with the best node fitness; that subtree is then attached to another individual. When the selected subtree has a very good node fitness and the destination individual has a worse fitness value, the subtree is very likely to be picked up again at the next Picky Crossover and moved to yet another individual. Therefore, the best subtree tends to travel among individuals until it meets an individual whose fitness is better than the subtree's node fitness. The selected subtree can be considered a "building block" or a kind of meta-level function (like an ADF) that travels through the population until it finds a good relationship between caller and callee. This "building block" can exist only once in an individual at a time, however, so there is no reuse of the block within a tree as there is with ADFs (Automatically Defined Functions).

Other Crossover Schema

[Langdon, 1996] introduces directed crossover, where each individual contains an RPB (Result Producing Branch) and several ADFs (Automatically Defined Functions), 15 trees in total. Each tree (not subtree) is evaluated by the outcomes (scores on subsequent fitness tests) of its execution. This becomes a bias for choosing which tree (RPB, ADF0, ADF1, etc.) to apply crossover to. Directed crossover uses an evaluation scheme similar to that of Picky Crossover, but its bias tends to select the right tree to improve by crossover and pays no attention to poorly performing trees.

[D'haeseleer, 1994] describes Context Preserving Crossover, which puts restrictions on crossover points rather than a bias. All nodes in an individual tree are enumerated by coordinates: for example, the coordinate (a, b, c) indicates the node reached from the root by visiting the a-th child of the root, then its b-th child, then its c-th child. The crossover allows only nodes with exactly the same coordinates to be exchanged. Good subtrees ("building blocks") can consequently never migrate to another level or to other sections of a tree, and this results in the problem of crowding.
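The coordinate scheme can be illustrated as follows, with trees encoded (as an assumption for this sketch) as nested lists of the form [label, child, child, ...]; the function names are mine, not from the cited paper.

```python
def coordinate(tree, path=()):
    """Enumerate (coordinate, subtree) pairs as in Context Preserving
    Crossover: path (a, b, c) means a-th child of the root, then b-th
    child, then c-th child. The root has coordinate ()."""
    yield path, tree
    for i, child in enumerate(tree[1:], start=1):
        yield from coordinate(child, path + (i,))

def matching_points(t1, t2):
    """Crossover points allowed under Context Preserving Crossover: only
    nodes with exactly the same coordinates in both trees may be swapped."""
    coords2 = {p for p, _ in coordinate(t2)}
    return [p for p, _ in coordinate(t1) if p in coords2]
```

Because allowed swap points must exist at the same coordinates in both parents, a subtree can never move to a different position in the tree, which is exactly the crowding limitation noted above.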

My Bias on Node Relationship

From inspecting several learned good Othello players built from the given terminal and function node types, I noticed intuitively that there are preferences in the relationships between a function node and its terminal nodes: certain combinations of function and terminal nodes imply certain preferences. This makes it possible to evaluate each node and each subtree, and consequently to produce a bias on crossover point selection. My observations are described below in terms of combinations and the preferences they imply.

Note: W = {white, white_edges, white_corner, white_near_corner}, and B = {black, black_edges, black_corner, black_near_corner}
Case Combination Preference
1, 2
   +,*           +,*
  /  \          /  \
 W    W   ,    B    B
The combinations imply a preference favoring White pieces and a preference favoring Black pieces, respectively. (Preferences A and B)
3, 4
   -,/           -,/
  /  \          /  \
 W    W   ,    B    B
The combinations imply a preference favoring White pieces in specific (but not clearly expressed) places, and a preference favoring Black pieces in specific places. (Preferences C and D)
5, 6
   +,*           +,*
  /  \          /  \
 W    B   ,    B    W
The combination implies a neutral preference in both cases. (Preference E)
7
   -,/ 
  /  \
 W    B
The combination implies a preference favoring White pieces (against Black pieces). (Preference A)
8
  -,/
  /  \
 B    W
The combination implies a preference favoring Black pieces (against White pieces). (Preference B)
9
W (terminal node)
The node implies a preference favoring White pieces. (Preference A)
10
B (terminal node)
The node implies a preference favoring Black pieces. (Preference B)
11
10 (constant terminal node)
The node implies a neutral preference. (Preference E)
In total, five different preferences appear:
No.  Preference                                                Corresponding Combinations
A    Preference favoring White pieces                          Cases 1, 7, and 9
B    Preference favoring Black pieces                          Cases 2, 8, and 10
C    Preference favoring White pieces in certain board cells   Case 3
D    Preference favoring Black pieces in certain board cells   Case 4
E    Neutral preference                                        Cases 5, 6, and 11

Preference E, as shown in Cases 5 and 6, is difficult to interpret; it might mean something to GP, but it is categorized as "Neutral Preference" here and is presumably not wanted by good players. Preferences C and D are more restrictive categories, but can be folded into Preferences A and B respectively.

These preferences in turn suggest a bias on what type of combination should be placed at a given crossover point, so as to avoid establishing the neutral (unknown) preference, that is, to avoid the combinations of Cases 5 and 6. In the following table, "?" indicates the crossover point, and each entry describes what kind of subtree combination should be inserted there.
No. Combination Preferred Combination
1, 2)
   +,*           +,*
  /  \          /  \
 W    ?   ,    ?    W
From Cases 1 and 9, the "?" location prefers a subtree implying a preference for White pieces (Preference A), such as the combinations of Cases 1, 7, 9, and possibly Case 3.
3, 4)
   +,*           +,*
  /  \          /  \
 B    ?   ,    ?    B
From Cases 2 and 10, the "?" location prefers a subtree implying a preference for Black pieces (Preference B), such as the combinations of Cases 2, 8, 10, and possibly Case 4.
5)
   -,/
  /  \
 W    ?
From Case 3, the "?" location prefers a subtree implying a preference for White pieces (Preference A), as in Cases 1, 7, 9, and possibly Case 3.
From Case 7, the "?" location prefers a subtree implying a preference for Black pieces (Preference B), as in Cases 2, 8, 10, and possibly Case 4.
6)
  -,/ 
  /  \
 ?    W
From Case 3, the "?" location prefers a subtree implying a preference for White pieces (Preference A), as in Cases 1, 7, 9, and possibly Case 3.
From Case 8, the "?" location prefers a subtree implying a preference for Black pieces (Preference B), as in Cases 2, 8, 10, and possibly Case 4.
7)
  -,/ 
  /  \
 B    ?
From Case 4, the "?" location prefers a subtree implying a preference for Black pieces (Preference B), as in Cases 2, 8, 10, and possibly Case 4.
From Case 8, the "?" location prefers a subtree implying a preference for White pieces (Preference A), as in Cases 1, 7, 9, and possibly Case 3.
8)
  -,/
  /  \
 ?    B
From Case 4, the "?" location prefers a subtree implying a preference for Black pieces (Preference B), as in Cases 2, 8, 10, and possibly Case 4.
From Case 7, the "?" location prefers a subtree implying a preference for White pieces (Preference A), as in Cases 1, 7, 9, and possibly Case 3.

These preferences could be enforced as restrictions, so that the individual function trees become strongly typed. They can also be combined with the node fitness described for Picky Crossover to give another version of node fitness. Alternatively, this bottom-up construction eventually yields the preference of a whole tree, which can be used to help the fitness function; or an individual with the unwanted (opposite) preference, say a Black preference, can be transformed into the right (White) preference by exchanging every white* terminal with the corresponding black* terminal node.

Because of the limited time allowed for this project, an implementation is not included in this paper. Even though this is a very domain- and representation-specific bias and it includes my personal intuitions, it may produce better Othello players.

Perspective

I examined various sources of randomness, ambiguity, and uncertainty, in order to make the system more deterministic and, hopefully, to improve the learning process in various ways. Unfortunately, due to the limited time for the project, some experiments were not run with a large enough number of runs or population size, and some approaches were not implemented and experimented with, especially the final "My Bias on Node Relationship" idea. It was a good chance to reconsider the fact that eliminating one source of randomness requires introducing another source of bias. I also realized that establishing such a bias tends to depend on things already given, such as the representation or an external Othello player, because of the limited prior, unbiased knowledge about the domain. Finally, the project provides continuing interest in Genetic Programming for further experiments.

References

Peter J. Angeline (1996). Two Self-Adaptive Crossover Operations for Genetic Programming. Advances in Genetic Programming 2, pp. 89-110, MIT Press, 1996.

Patrik D'haeseleer (1994). Context Preserving Crossover in Genetic Programming. Proceedings of the 1994 IEEE World Congress on Computational Intelligence, Vol. 2, pp. 256-261, IEEE Press, 27-29 June 1994.

Gabriel J. Ferrer and W. N. Martin (1995). Using Genetic Programming to Evolve Board Evaluation Functions. Proceedings of the 1995 IEEE Conference on Evolutionary Computation, 1995.

W. B. Langdon (1996). Directed Crossover within Genetic Programming. Research Note RN/95/71, University College London, September 1995.

Walter A. Tackett and Aviram Carmi (1994). The Unique Implications of Brood Selection for Genetic Programming. Proceedings of the 1994 IEEE World Congress on Computational Intelligence, IEEE Press, 27-29 June 1994.