The Othello Project
Juno Suk, jsuk@lucent.com

Introduction
In this project, we are given an Othello game written in Java with a genetic programming implementation already in place. This original implementation consists of 13 primitives: + - * / 10 black black_corners black_near_corners black_edges white white_corners white_near_corners white_edges. The first four primitives are standard arithmetic operators, while the rest are terminals, all but one (the constant 10) corresponding to counts of positions on the board. All the standard genetic programming population control constructs, such as crossover, mutation, and ADFs, are included in the given Java package, gpjpp. In addition to this genetic programming setup, a computer player, EDGAR, is provided. Though not a genetically programmed player, EDGAR boasts superior playing ability, easily dispatching most Othello players in short order. The purpose of this project is to modify and extend the current genetic programming setup in hopes of discovering new approaches in implementation and representation that will facilitate the creation and evolution of GP players that can play well against both EDGAR and the general Othello-playing population.

Approach
To begin the process of creating a competent Othello player, I first surveyed the past research done on this same project to see which approaches had met with success and which had not. From this survey I came away with an interesting observation and a five-part plan to create a better Othello player.

An Interesting Observation
I observed that most of the papers had been unsuccessful in getting improved results when adding new terminal types to the primitive set. But one paper stood out from the rest because it had replaced the initial set of terminals entirely with a new set of its own and had excellent results against EDGAR as well as the general population. The paper by Chris Porcelli and Patrick Pan had results that were clearly different from the rest of the papers in the class (they were successful), so after pondering what set their approach apart, I realized there was something common to all the other approaches that had been inadvertently excluded from Porcelli and Pan's: the black and white terminals.

Taking a look at how the Othello board evaluated its primitives, I was surprised to find that it did not normalize the values but rather returned the total count of the pieces satisfying each primitive. That means the following maximum values for each primitive:
Primitive                                 Maximum value
----------------------------------------  -------------
black / white                             64
black_corners / white_corners             4
black_near_corners / white_near_corners   12
black_edges / white_edges                 30
10 (constant)                             10
As can be seen, the distribution of values among the primitives is not equal. In particular, the black and white primitives can easily overshadow all the other primitives with their maximum value of 64 against the other primitives' combined total of 4 + 12 + 30 + 10 = 56.

What does this mean? A primitive with potentially large values causes the other primitives to play a less significant role in a GP tree and, through evolution, to become extinct. This extinction is evident in many of the class papers, whose results often returned the simple single-node tree white or black, depending on which side they trained on. And even when the result was not just that single primitive, most, if not all, of the trees incorporated these two primitives somehow.

Why is this? Imagine a three-dimensional surface representing the hypothesis space. This is admittedly not a perfect model, since the hypothesis space is not spanned by only three primitives, but conceptually it should suffice. The goal of genetic programming is to come up with a player found at the lowest point of this surface: the global minimum. To find this minimum, GP creates random players scattered throughout the surface and measures their "elevation" by their fitness. The lowest are combined through crossover in hopes that the resulting progeny sit at a lower elevation still, ultimately producing the one at the global minimum. In this way the progeny, i.e. the GP hypotheses, gradually converge on the minimum. Unfortunately, local minima also occupy this hypothesis space.

Now take the primitives black and white, which by themselves are a generally good heuristic to follow but do not represent the global minimum. The results do show that these primitives reside at a minimum, just a local one. Because of the magnitude of these primitives, the valley containing this local minimum can be imagined as shallow but wide, occupying much of the hypothesis space. Most hypotheses containing the black/white primitives will tend to fall into some part of this valley and, after a few generations, will gradually converge to the hypothesis at its bottom. The global minimum lies somewhere else on the surface, but its valley is relatively narrow compared to the wide valley of the black/white primitives. Therefore, between generations, hypotheses in this narrower but deeper valley are come across infrequently, and when they are found they are not likely to survive once they are crossed over with another hypothesis, since the other hypothesis may contain the offending primitive, causing the offspring to fall back into the wide, shallow valley.

Also worth noting is that these two primitives tend to dominate later in the game, when more pieces have been placed on the board. For example, consider this scenario:

. . . . . . . ?
? w w w w w w b
w w w w w w . .
w w w w w . . .
w w w w w . . .
w w w w b b w .
b b . . . . b .
w . . . . . . w

As you can see, black is considering two choices (marked with ?). One is a very good choice (the upper right-hand corner), while the other is a terrible one (left side, second row from the top). If black takes the corner, he strategically sets himself up to claim several columns and top edge pieces. On the other hand, if he takes the left position, then even though he gains many pieces, he only makes things worse, since white can then claim the whole left edge, including the upper left corner. Now suppose the GP function were ( + ( - ( * 5 black_corners ) ( / black_near_corners 2 )) black ). The upper right-hand choice (the good choice) would give 5 - 2 + 12 = 15. The other choice (the bad choice) would give 0 - 3 + 20 = 17. So even though it seems as if the GP function takes into consideration that black gains an extra corner and fewer near-corner positions, it still ends up taking the left side, due to the overwhelming magnitude of the black primitive.
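To make this arithmetic concrete, here is a small, hypothetical Java sketch that evaluates the tree for the two candidate moves. The black counts of 12 and 20 come from the example above; the near-corner counts of 4 and 6 are assumptions chosen to reproduce the 2 and 3 in that arithmetic, and integer division is used only to match it. This is an illustration, not the project's actual evaluator.

    // Hypothetical sketch: evaluate the example GP tree
    // ( + ( - ( * 5 black_corners ) ( / black_near_corners 2 )) black )
    // for the two candidate moves discussed in the text.
    public class GpTreeArithmetic {
        static int evaluate(int blackCorners, int blackNearCorners, int black) {
            // integer division used here only to match the text's arithmetic
            return (5 * blackCorners) - (blackNearCorners / 2) + black;
        }

        public static void main(String[] args) {
            System.out.println("corner move: " + evaluate(1, 4, 12)); // 5 - 2 + 12 = 15
            System.out.println("left move:   " + evaluate(0, 6, 20)); // 0 - 3 + 20 = 17, so the tree prefers the worse move
        }
    }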

This is exactly what should be avoided: the end game is played much more effectively when the corners/edges/near_corners become the greater factor. The desired behavior, then, is for the black/white primitives to carry less weight later in the game so that the other primitives can play a role. Unfortunately, this is not what happens.

So the observation here is the classic "can't see the forest for the trees" problem. Because of the dominating white/black primitives, much of the hypothesis space is never explored, and better trees off in the distance are never discovered.

Creating A Better Othello Player: Part 1
In the first set of tests, there are two subparts.

In the first subpart, I ran the genetic programming setup as-is, training against random players. These tests serve merely as a baseline against which to compare the results of the other parts.

In the second subpart, I trained the population against the (deterministic) EDGAR opponent. There is a reason behind this.

It is important to consider that the evolutionary process may be able to see past the dominating primitives by boosting the other primitives with a multiplicative constant. For example, the GP tree ( + white ( * 10 ( + white_corners white_corners ))) can potentially avoid the dominating effect of white. The performance results from this training also serve as a baseline for the later training against EDGAR.

Creating A Better Othello Player: Part 2
In this part, I again trained against both random players and EDGAR, but with one minor change: in the board evaluation, I divide the white/black values by 8. This should lessen the dominating effect of these primitives and bring them down to the level of the others. Hopefully the population now gets a chance to evolve into other areas of the hypothesis space "forest".
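The change itself is tiny; the following hypothetical sketch illustrates the idea. The board representation and the counting loop are stand-ins, not the project's actual evaluation code.

    // Hypothetical sketch of the Part 2 change: the raw black/white piece counts
    // are divided by 8 before being handed to the GP tree, so their maximum drops
    // from 64 to 8, closer to the range of the corner and near-corner primitives.
    public class ScaledPieceCount {
        static int count(char[][] board, char color) {
            int n = 0;
            for (char[] row : board)
                for (char c : row)
                    if (c == color) n++;
            return n;
        }

        // value of the "black" terminal under the Part 2 scaling
        static int blackTerminal(char[][] board) {
            return count(board, 'b') / 8;   // at most 64 / 8 = 8
        }

        // value of the "white" terminal under the Part 2 scaling
        static int whiteTerminal(char[][] board) {
            return count(board, 'w') / 8;
        }
    }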

The only caveat now is that the black_edges/white_edges primitives are too large, with their maximum value of 30. But that is not a concern in this project, since these primitives will be replaced with a new set in the next part. The purpose of this part is merely to see whether the reduction of black/white brings any improvement.

Creating A Better Othello Player: Part 3
In this part, I introduce a new set of terminals. Inspired partly by the work of Patrick Pan and Chris Porcelli and by that of Daby Sow, this set of terminals is a sort of hybrid of their two sets.

In Pan and Porcelli's work, they introduce a set of 10 * 2 terminals making use of the radial semi-symmetry of the Othello board. Each primitive corresponds to a unique position that occurs either 4 or 8 times on the board, depending on its location within the symmetric region. See their paper for a more comprehensive treatment of their implementation.

In Sow's work, he introduces a set of terminals that depend on how far a position is from the corner. This results in a layout of primitive positions arranged in four quarter-squares, each concentric to a corner of the board. See his paper for a comprehensive description.

For my new set of terminals, I borrow Daby Sow's concept of corner-distance terminals, extending it to include center-distance terminals as well. Here are diagrams of the terminal set:



These 8 * 2 terminals (times 2 for the white terminals) allow me to be just as expressive as the 10 * 2 terminal set of Pan and Porcelli. Both terminal sets cover all the positions of the board, and the 8 primitive locations in my set can represent any one of their 10 unique positions as the intersection of one center terminal and one corner terminal. My set also has the added advantage of being smaller, leading to a smaller hypothesis space to search and, therefore, faster convergence.

Taking advantage of the smaller set of terminals, I also added a few constants, 5 4 3 2, as terminals, in hopes of encouraging evolution to assign weights to the different terminals.

All the terminals are normalized to a base total of 28 to avoid the "forest for the trees" problem.

Terminal        Tiles   Factor   Total
--------------  -----   ------   -----
black_center_a      4      7.0    28.0
black_center_b     12      2.3    27.6
black_center_c     20      1.4    28.0
black_center_d     28      1.0    28.0

black_corner_w      4      7.0    28.0
black_corner_x     12      2.3    27.6
black_corner_y     20      1.4    28.0
black_corner_z     28      1.0    28.0
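The sketch below illustrates, under assumed names and a simplified interface, how a ring terminal could be scaled by the factors in the table above; it is not the project's actual implementation, and ring membership tests are omitted.

    // Hypothetical sketch of the Part 3 normalization: each ring terminal is the
    // number of black pieces in that ring, scaled so a completely filled ring
    // totals roughly 28 (the common base total from the table above).
    public class NormalizedTerminals {
        // tiles per ring and matching factors, center rings a..d (same for corner rings w..z)
        static final int[]    RING_TILES   = { 4, 12, 20, 28 };
        static final double[] RING_FACTORS = { 7.0, 2.3, 1.4, 1.0 };

        static double ringTerminal(int ring, int blackPiecesInRing) {
            return blackPiecesInRing * RING_FACTORS[ring];
        }

        public static void main(String[] args) {
            // base total when each ring is completely filled with black pieces
            for (int r = 0; r < RING_TILES.length; r++)
                System.out.println("ring " + r + ": " + ringTerminal(r, RING_TILES[r]));
            // prints approximately 28.0, 27.6, 28.0, 28.0 (subject to floating-point rounding)
        }
    }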

Creating A Better Othello Player: Part 4
In this part, we get down to business.

First, all training is now done as a black player in preparation for playing against EDGAR.

The set of terminals is further reduced by removing unneeded terminals. Specifically, all the primitives related to the white player have been removed, since they really aren't needed. When different board positions are evaluated for black, any white pieces that are removed are matched by an increase in black pieces, and white pieces never increase when black is evaluating its options. So the black primitives' values already reflect any reductions in the white primitives' values, making the white primitives redundant. Removing them can only prove beneficial, since it not only reduces the hypothesis space but also cuts the evaluation computation time in half.

The black primitive has been reintroduced to the set, since it is still an important measure to consider. It is now known as black_total and is normalized by multiplying it by 0.4375 (0.4375 * 64 = 28).
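As a small, hypothetical illustration of this scaling (the counting loop is a stand-in for the project's board code):

    // Hypothetical sketch of the black_total normalization: the raw black piece
    // count (at most 64) is multiplied by 0.4375 so its maximum matches the
    // base total of 28 shared by the other terminals.
    public class BlackTotal {
        static double blackTotal(char[][] board) {
            int raw = 0;
            for (char[] row : board)
                for (char c : row)
                    if (c == 'b') raw++;  // raw black piece count, at most 64
            return raw * 0.4375;          // 64 * 0.4375 = 28
        }
    }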

Creating A Better Othello Player: Part 5
In this last part of the five-part plan, everything remains the same as in part 4 except for two things:
1. Train against (deterministic) EDGAR
2. Get unbiased results
In the previous parts, results were taken merely to get a general idea of how each new concept worked. But in this final part, we want to create the best possible player: not only one that can beat EDGAR, but one that can generalize and play well against other types of players, human and random, as well. Therefore, all resulting trees were run through a validation set of 50 games against random players. Unfortunately, EDGAR is deterministic and plays the same way each time, so he could not be used for validation, since he would give the same result every time. Therefore, the best GP player from each generation was validated with 50 games against the random player. The best players after validation were not run through a final testing set, since the 50 random games should already have removed most of the bias from training against EDGAR and should not have introduced much bias of their own, the validation games being all random.
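A hypothetical sketch of this validation loop is shown below; Player, blackWins, and RandomPlayer-style details are illustrative stand-ins, not the project's actual Othello or gpjpp classes.

    // Hypothetical sketch of the Part 5 validation step: the best tree from each
    // generation plays 50 games as black against a random player, and the win
    // count decides which players are kept.
    public class ValidationSketch {
        interface Player { int chooseMove(int[][] board); }

        // stand-in for one full game of Othello; returns true if black (the GP player) wins
        static boolean blackWins(Player gpBlack, Player randomWhite) {
            throw new UnsupportedOperationException("game engine omitted in this sketch");
        }

        static int validate(Player gpBlack, Player randomWhite) {
            int wins = 0;
            for (int game = 0; game < 50; game++)   // 50 validation games, as in Part 5
                if (blackWins(gpBlack, randomWhite)) wins++;
            return wins;
        }
    }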

Problems
In part 3, all results were invalidated because of a bug found in my board evaluation function.

Also, a bug was found in the OthelloGPPlayer implementation provided. Any GP function containing a right parenthesis followed by a left parenthesis, ") (", caused parsing errors, so any GP function containing this combination may have been evaluated incorrectly by the OthelloGPPlayer. The extent of the damage this had on parts 1 and 2 (and on all the previous projects using this Othello package) should be minimal, though, since parentheses are unnecessary when processing a prefix/postfix expression; they are only there for our clarity's sake. Nevertheless, I found that the parse error did affect my OthelloCache runs against EDGAR, so it may have done some damage after all. The OthelloGPPlayer was fixed during the debugging of part 3: it now strips the function of all parentheses, since they are unnecessary. So the results for parts 4 and 5 should be fine.
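The sketch below illustrates the idea behind the fix, assuming a simple whitespace tokenizer; it is not the actual OthelloGPPlayer patch.

    // Minimal sketch of the workaround: because the GP functions are prefix
    // expressions with fixed-arity operators, the parentheses carry no
    // information and can simply be stripped before tokenizing, which avoids
    // the ") (" parsing problem.
    public class ParenStripping {
        static String[] tokenize(String gpFunction) {
            String stripped = gpFunction.replace("(", " ").replace(")", " ");
            return stripped.trim().split("\\s+");
        }

        public static void main(String[] args) {
            String f = "( -  ( / white_near_corners black ) ( / black_edges 10 ))";
            System.out.println(String.join(" ", tokenize(f)));
            // prints: - / white_near_corners black / black_edges 10
        }
    }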



Results
Only the best result(s) from each training session are shown. Click on the links to view the actual output files and configurations. Or click here to view all result sets.

Part 1

GP trained against random (set 1a)
-------------------------
Best GP function:                 white

Against black random player:      47 wins, 3 losses

Against EDGAR(white)
GP Player(black):                 WHITE 49 BLACK 15



GP trained against EDGAR (set 1c)
------------------------
Best GP function:                 ( -  ( /   white_near_corners   black )
                                       ( /   black_edges   10 ))

Against black random player:      46 wins, 4 losses

Against EDGAR(white)
GP Player(black):                 WHITE 18 BLACK 46




Part 2

GP trained against random (set 1b)
-------------------------
Best GP function:                 ( * ( - white_corners
                                          ( / white_near_corners
                                              white_near_corners ))
                                      white_edges )


Against black random player:      49 wins, 1 loss

Against EDGAR(white)
GP Player(black):                 WHITE 36 BLACK 28


GP trained against EDGAR (set 2c)
------------------------
Best GP function:                 ( -   white_corners
                                        ( /   white_edges
                                              white_edges ))

Against black random player:      49 wins, 1 loss

Against EDGAR(white)
GP Player(black):                 WHITE 16 BLACK 48



Part 3

Results for this section are invalid.
To see them anyway, go to set3a or set3b or set3c


Part 4

GP trained against random (set 4a)
-------------------------
Best GP function:                 ( + black_center_d black_corner_w )

Against white random player:      49 wins, 1 loss

Against EDGAR(white)
GP Player(black):                 WHITE 48 BLACK 16



Part 5

(set 5a set 5b)

Player  GP Function                      vs random(white)   vs EDGAR(white)
------  -------------------------------  -----------------  ----------------
1       ( -  ( *   black_corner_z  black_center_d )
             ( /   black_corner_z   5 ))

                                         50 wins, 0 losses  WHITE 3 BLACK 20



2       ( /  ( *  ( -  ( / black_corner_x black_center_c ) ( +   4   5 ))
                  ( -  ( +   5 black_center_d )
                       ( * black_center_d   4 )))  4 )

                                         49 wins, 1 loss    WHITE 0 BLACK 48



3       ( +  ( *   black_corner_w   black_corner_z )
             ( *   black_total   black_center_b ))

                                         50 wins, 0 losses  WHITE 8 BLACK 23







Conclusions
This project was successful in its main goal, which was to come up with new approaches in implementation and representation for the creation of more effective genetically programmed Othello players. It was also hoped that these new approaches would facilitate the development of players that could beat EDGAR and also play well against the general population. Surprisingly, this hope was fulfilled as early as part 1, when the GP player evolved by training against EDGAR was able to beat EDGAR and still win 46 of 50 games against a random player.

But the best players were unarguably evolved in the final part, where the GP players were able to beat EDGAR by a landslide (48 to 0) and were still general enough to beat the random player 98 to 100% of the time.

But evolving such good players in the end depended on many factors along the way. The first was the training factor: who to train against. A couple of the class papers stated that the random player was too easy and EDGAR too difficult, and that for lack of mid-level players the GP could not produce any good players, there being no gradient to follow. I would contest that position and point out that in the first part of my project, training against EDGAR with the primitives left as they were evolved a perfectly capable player who could beat EDGAR and still beat random players 92% of the time. I'll concede, though, that it may have been difficult to come across such a player, since the presence of the domineering black/white primitives would have hindered the population from "seeing the forest for the trees".

Which brings me to the next factor: the equality of the primitives. As stated before, a single primitive with the potential to output large values relative to its fellow primitives can hinder the population from venturing into a better part of the hypothesis space, since that primitive overshadows the others and causes them to become a smaller and smaller part of the population, ultimately to the point of extinction.

In part 2, the effect of primitive equality was tested. The simple modification of dividing the black/white values by 8 may have made a difference, as the evolved players, both EDGAR-trained and random-trained, showed a noticeable improvement in performance against both random players and EDGAR. And if you look at their GP functions, neither black nor white appears in the evolved players. But it can also be argued that the results could have gone the other way, and that the improvement was due only to a more favorable random draw in the population creation and evolution. So, though there is an improvement, the results are only marginally better and not statistically significant enough to support a definite conclusion.

The next factor, the types of primitives used, played an indisputably larger role in improving the GP Othello player. I introduced a set of primitives that were complete in covering the board but fewer in number, yet still versatile enough to express any tile on the board. The removal of all redundant primitives, such as the white primitives in this case, also proved useful, as it further reduced the primitive set, consequently reducing the hypothesis space, which in turn reduced the evolution time needed to find a capable player. As the results show, this new set of primitives was definitely more effective: the evolved players could beat random players consistently and, when trained against EDGAR, could beat him just as easily.

Taking all these factors into account led to the development of three truly competent Othello players. But there is always room for improvement. As the final result set shows, the GP players can still occasionally lose to a random player, and losing to a random player is not very impressive. Somewhere in some hypothesis space there exists an Othello player that never loses, but I doubt the current hypotheses come anywhere near it. In fact, I very much doubt that the current set of primitives is even capable of generating a perfect Othello player. There may be an optimal hypothesis within the current space, but I doubt it could win every game it ever played.

But there are other AI constructs besides genetic programming that could create players. For example, artificial neural networks and search trees could also be used to build effective Othello players. And as for a perfect Othello player who can play the perfect game, this is indeed possible with a search tree of unlimited depth and no time restrictions. Unfortunately, even at current processor speeds, removing the time restriction is unreasonable, since the number of branches to search would be astronomical. So we are still left with the task of producing a more perfect Othello player using less exhaustive methods, i.e. GPs and ANNs.

So, in conclusion, though much progress has been made through genetic programming in this project, the search for the perfect Othello player continues.