Othello Program

Group Members:
Mohamed F. Abdelsadek

Introduction:
Genetic Programming (GP) is an effective technique for evolving computer programs. In an attempt to better understand the evolutionary process used by genetic programming, the learning strategies used to develop an Othello player are varied. This paper examines the following three categories of variations:

1. Varying the hypothesis representation by adding (or removing) primitives that can be used by the GP.
2. Varying the learning method.
3. Varying the methods of evaluating the results.

The genetic algorithm searches the hypothesis space for the Othello player with the best fitness in the population. The evolved player is then tested by having it compete against either a random player (non-deterministic) or a previously evolved player (deterministic). Edgar, the original Othello player produced by the genetic program, uses a different hypothesis representation than all the newly evolved players and is used in this paper as a benchmark "good" player.

This paper examines the different players evolved from varying the fitness measure. The fitness measure is varied by using Edgar, a previously evolved player, and a random player as the means of fitness evaluation; all three are used and the resulting evolved players are tested. To evaluate an evolved player, a technique similar to the fitness measure is used: the player is made to compete against Edgar, a previously evolved player, and a random player. The evolved player can be trained to play either white or black; this is examined by evolving two versions of the player, one trained to play white and the other black, and having them compete to determine which color has the advantage. The hypothesis space is also examined through the introduction of the new primitives white_near_edges and black_near_edges. Finally, Automatically Defined Functions (ADFs) are turned on and their effect on the evolved players is examined.

Each variation or test is examined and evaluated separately; at the end of the paper, a summary of the results of the different variations is analyzed.

Player Representation:

The hypothesis space that the genetic algorithm searches consists of all equations of any length that use the set of primitives listed below. The equation is evaluated at every move in the game to determine the best possible next move. The primitives take as input the current state of the board and the resulting next state of the board.
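As a sketch of how such an equation drives play (the helper names here are illustrative assumptions, not the actual system's API), a greedy one-ply move selector might look like:

```python
def choose_move(board, color, evaluate, legal_moves, apply_move):
    """Greedy one-ply search: score the board resulting from each legal
    move with the evolved equation and pick the highest-scoring one.
    `evaluate`, `legal_moves`, and `apply_move` are illustrative
    placeholders, not names from the actual system."""
    best_move, best_score = None, float("-inf")
    for move in legal_moves(board, color):
        next_board = apply_move(board, move, color)
        score = evaluate(next_board)  # the evolved GP expression
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```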
 
 
Primitive Description
+ The Addition Operator which takes two parameters 
- The Subtraction Operator which takes two parameters 
* The multiplication Operator which takes two parameters 
/ The Division Operator which takes two parameters 
white Number of white pieces on the board 
black Number of black pieces on the board 
white_edges Number of white pieces on the edges of the board (rows 1 or 8 or columns 1 or 8) 
black_edges Number of black pieces on the edges of the board (rows 1 or 8 or columns 1 or 8) 
white_corners Number of white pieces occupying a corner on the Othello Board 
black_corners Number of black pieces occupying a corner on the Othello Board 
white_near_corners Number of white pieces near corners (the three squares surrounding each corner -- a total of 12 locations) 
black_near_corners Number of black pieces near corners (the three squares surrounding each corner -- a total of 12 locations) 
10 The Constant 10
Current primitive functions and terminals used by the genetic program. 
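For concreteness, the board-counting terminals in the table above can be sketched as follows, assuming an illustrative board representation of eight strings of 'W', 'B', and '.' (the actual system's board representation is not shown in the paper):

```python
def count_features(board):
    """Compute the terminal primitives from an 8x8 board given as a
    list of 8 strings of 'W', 'B', or '.' (illustrative encoding)."""
    corners = {(0, 0), (0, 7), (7, 0), (7, 7)}
    # the squares adjacent to any corner (3 per corner, 12 in all)
    near = {(r + dr, c + dc) for (r, c) in corners
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0) and 0 <= r + dr < 8 and 0 <= c + dc < 8}
    feats = {k: 0 for k in ("white", "black", "white_edges", "black_edges",
                            "white_corners", "black_corners",
                            "white_near_corners", "black_near_corners")}
    for r in range(8):
        for c in range(8):
            piece = {"W": "white", "B": "black"}.get(board[r][c])
            if piece is None:
                continue
            feats[piece] += 1
            if r in (0, 7) or c in (0, 7):
                feats[piece + "_edges"] += 1
            if (r, c) in corners:
                feats[piece + "_corners"] += 1
            elif (r, c) in near:
                feats[piece + "_near_corners"] += 1
    return feats
```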
 
 

Case I:

Prior to any training modification, we began by testing the equation derived for the class by the TA.  This Othello
player was trained to play white against a random player.  The following is the evolved equation:

- + black_corners * white_edges white_corners / white_near_corners 10
(the preorder, or prefix, representation of the function tree)

which is equivalent to (black_corners + (white_edges * white_corners)) - (white_near_corners / 10)
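A small recursive evaluator makes the preorder string checkable; this is an illustrative sketch, not the GP system's actual parser, and the protected division (returning 1 on a zero denominator) is an assumed convention, common in GP systems:

```python
import operator

def eval_preorder(tokens, env):
    """Evaluate a preorder (prefix) GP expression. `tokens` is an
    iterator over symbols; `env` maps terminal names to values.
    Division is 'protected' -- it returns 1.0 on a zero denominator,
    an assumption about this system."""
    ops = {"+": operator.add, "-": operator.sub, "*": operator.mul,
           "/": lambda a, b: a / b if b != 0 else 1.0}
    tok = next(tokens)
    if tok in ops:
        left = eval_preorder(tokens, env)
        right = eval_preorder(tokens, env)
        return ops[tok](left, right)
    return float(env.get(tok, tok))  # terminal: board feature or constant

# the Case I equation as a preorder token list
expr = "- + black_corners * white_edges white_corners / white_near_corners 10".split()
```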

Edgar was also trained as a white player.  When Edgar (playing white) was made to compete with the above equation
(playing black), Edgar lost the match.  Since both Edgar and the above equation are deterministic (they produce the same
result every time they are used), if the game is played n times, Edgar and the evolved player will play n identical games.  The
evolved player was also tested against itself, using the evolved equation for both colors.  White
won the game, which was the expected result since the evolved player was trained to play white.

Case II:
This is a modification to the learning method.  Fitness was evaluated against the player examined in Case I.  After
approximately 9 hours of training the following was the resulting evolved player:

( + ( / ( * ( * black_near_corners black_near_corners ) ( / 10 black )) ( + ( - black_near_corners black ) ( / black black_near_corners ))) ( * ( * white_near_corners white_edges ) ( - white_corners black_near_corners )))

which had a fitness of 23.  This player was trained as black.  It was able to defeat Edgar, the random player, and
the Case I player when they competed.  It is interesting to note the complexity of the above equation, given that
fitness generally varies inversely with the length of the function tree.   Included in the appendix is a list of all the other
(best and worst of generation) equations derived by the GP.  Some of the equations included are also good players, but this
one was chosen because it had the highest fitness among all the players.

Case III:

The only difference between this player and the one from Case II is that this player was trained against a random player.  The
training time was approximately 12 hours.  The resulting player was a very simple but nevertheless effective equation:

 black_edges

The above player was able to defeat Edgar, the Case I player, and (on occasion) the random player.
The reason this player had such a high fitness is due in part to the short length of its tree.
There is a general bias in this genetic algorithm to favor shorter functions,
since they yield higher fitness.
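One common way such a bias arises is an explicit parsimony term in the fitness measure. The paper does not show the system's fitness code, so the following is a purely hypothetical sketch in which lower fitness is better and each tree node adds a small penalty:

```python
def standardized_fitness(losses, tree_size, parsimony=0.05):
    """Hypothetical fitness: lower is better (standardized fitness).
    The parsimony term penalizes larger trees, so of two equally
    strong players the shorter one scores better. The 0.05-per-node
    weighting is an illustrative assumption, not the system's value."""
    return losses + parsimony * tree_size
```

Under such a scheme, a one-node player like black_edges beats any large tree of equal playing strength, which matches the behavior observed in this case.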

Case IV:

The training for this case was prematurely terminated, but the hypothesis with the
highest fitness still produced good results.  Two new primitives were added to the
hypothesis representation.

white_near_edges:  The number of white pieces near the edges (columns 2 or 7, or rows 2 or 7).
black_near_edges:  The number of black pieces near the edges (columns 2 or 7, or rows 2 or 7).
 
The purpose of these primitives is to add more variations to the hypothesis space.
The player with the highest fitness is the following:

( * black black_near_edges )

The above formula is able to defeat the random player and Edgar.  However, it
loses when it plays against the Case I player.  As can be seen, the derived formula uses one of the
new primitives.  The player was trained to play black against the random player.
 

Case V:

This player incorporated the new primitives introduced in Case IV.  The player was
trained as Black against Case I evolved player.  The resulting player with
the highest fitness after 8 hours of training is:

( - black ( * 10 white_near_edges ))

This player is quite effective in that it defeats both Edgar and the Case I
evolved player by eliminating all of the white pieces from the board.  It
also has a high win rate against the random player.

Case VI:

This player was trained as white against the Case I derived player, and the
resulting player incorporated the new primitives.  The player with
the best fitness was:

( + ( - white_edges black ) ( / ( + ( * black_corners white_edges ) white_near_corners ) black_corners ))

This player was able to defeat Edgar, the random player, and the Case I player.
 
 

Case VII:

The final modification was to enable ADFs (Automatically Defined Functions) in the system.
Only one ADF, ADF0, was defined for the system.
 
 
  RPB: ( * ( ADF0 white_corners white_near_corners ) ( + white_edges white_near_corners ))
ADF0: (+ (* black_near_edges black ) (* white_near_corners white_near_edges ))
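Read as ordinary nested functions, the RPB and ADF0 listing above computes the following (a direct transcription of the listing; the dictionary of feature values is an illustrative stand-in for the board-counting primitives):

```python
def adf0(arg0, arg1, f):
    # As evolved, ADF0 ignores its two arguments and reads the board
    # features directly (exactly as in the listing above).
    return (f["black_near_edges"] * f["black"]
            + f["white_near_corners"] * f["white_near_edges"])

def rpb(f):
    # RPB: ( * ( ADF0 white_corners white_near_corners )
    #          ( + white_edges white_near_corners ))
    return (adf0(f["white_corners"], f["white_near_corners"], f)
            * (f["white_edges"] + f["white_near_corners"]))
```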

The function was produced by training as black against the Case V player.  Surprisingly, it is defeated by the very player it was specifically trained to play against.  It defeats the player from Case I and loses against the random player (more than 50% of the time).  Among the different players, this one probably has the worst performance.  The primitive sets for the main equation and for ADF0 are listed below:

Primitive Description
+ The Addition Operator which takes two parameters 
- The Subtraction Operator which takes two parameters 
* The multiplication Operator which takes two parameters 
/ The Division Operator which takes two parameters 
ADF0 The ADF0 location in the equation tree 
white Number of white pieces on the board 
black Number of black pieces on the board 
white_edges Number of white pieces on the edges of the board (rows 1 or 8 or columns 1 or 8) 
black_edges Number of black pieces on the edges of the board (rows 1 or 8 or columns 1 or 8) 
white_corners Number of white pieces occupying a corner on the Othello Board 
black_corners Number of black pieces occupying a corner on the Othello Board 
white_near_corners Number of white pieces near corners (the three squares surrounding each corner -- a total of 12 locations) 
black_near_corners Number of black pieces near corners (the three squares surrounding each corner -- a total of 12 locations) 
10 The Constant 10
The Main Equation primitives 


Primitive Description
+ The Addition Operator which takes two parameters 
- The Subtraction Operator which takes two parameters 
white Number of white pieces on the board 
black Number of black pieces on the board 
white_edges Number of white pieces on the edges of the board (rows 1 or 8 or columns 1 or 8) 
black_edges Number of black pieces on the edges of the board (rows 1 or 8 or columns 1 or 8) 
white_corners Number of white pieces occupying a corner on the Othello Board 
black_corners Number of black pieces occupying a corner on the Othello Board 
white_near_corners Number of white pieces near corners (the three squares surrounding each corner -- a total of 12 locations) 
black_near_corners Number of black pieces near corners (the three squares surrounding each corner -- a total of 12 locations) 
10 The Constant 10
ADF0 primitives 

Conclusion:

Variation in the learning techniques has a major impact on the resulting derived hypothesis. This paper focused on examining the fitness measure used to evaluate players as they evolve. The obvious observation is that when trained against a deterministic player (as opposed to a random player), the resulting player tends to evolve with a focus on that deterministic player's characteristics. This may be problematic when the evolved player is made to compete against other players.

The hypothesis representation has a major impact on how effective the evolved player becomes. This was demonstrated when the addition of black_near_edges and white_near_edges resulted in the best player (see Case V). This player was not defeated by any of the other players, and without white_near_edges it would not have been possible. For Othello there appears to be a tendency for shorter function trees to perform better. This was demonstrated in Case VII when ADFs were turned on: the resulting equation was extremely long and not very effective. There is a general tendency for ADFs to enhance the speed of evolution (the player generation time), which they did here. Overall, one must take caution when developing a genetic program to ensure that the hypothesis representation, the fitness measure, and the result-evaluation techniques are all well planned and suited to the desired learning task.