Written by Lenny Volchok
Machine Learning COMS W4771
Spring 2000
Othello Project

Abstract

For Project Othello, we had to apply the machine learing method genetic progrmming (GP) to the game Othello. What makes Othello a perfect candidate for application of genetic programming is that an Othello Player may be represented using a simple function tree of basic terminals. This function tree is used to evaluate possible moves that the player might have. The player chooses the best move according to this evaluating function. The function trees of different players can be easily crossed-over and expanded.

Preparations

At first I wanted to see what is the deal with Edgar and whether or not he is as tough as everyone says he is. I also wanted to find out what his strategy is and what strategy I might want to use to beat him.

Just to get things started I played Edgar for 10 games. Here are the results of the games:

Game My Score (BLACK) Edgar's Score (WHITE) Comments
1 49 15  
2 48 16  
3 32 32 Edgar is showing some signs of life 
4 37 27  
5 45 19  
6 20 44 OK, I was getting tired of winning 
7 59 5 Now, Edgar got me mad 
8 45 19  
9 0 18 That was not fair! (An accident) 
10 30 34 I became really tired by the end :) 

My Strategy was simple:
Take over the edges two away from the corners, whenever possible. It was very important for me to get 4 middle edges in a row. Also, take the corners, whenever possible. Try avoiding putting pieces into edges next to corners and diagonals next to corners. Note, that "taking" opponent's pieces in those dangerous areas might also be bad. Try avoiding it as well. Also, try to be as close to the center as possible in the beginning.

After playing Edgar I made several conclusions.

Goal

To create an Othello player that can beat Edgar and that can also be a good player in general.

The Plan

Use the Genetic Programming software provided, but now use the new Random Edgar for training purposes. The problem with the Deterministic Edgar is that it might be too hard to beat and it always makes the same moves in the same situations. If I train my player against the Deterministic Edgar, my player might overlearn and learn how to beat Edgar and noone else. This is not what I want. I want to create a good player in general, and beating Edgar is just one of the objectives.

To create Random Edgar, add some randomness to the moves that Edgar makes. If Edgar has more than 1 legal move, the best move is randomly selected from the possible moves, with the best move given the highest probability.

For now I will only try to train the best possible BLACK player. We'll worry about training the white players later, after Edgar is defeated.

Advantages of training against Random Edgar:

So, with all these thoughts in mind, I implemented my idea and created the Random Edgar. Here's how Regular Edgar matches up against the Random Player :

Player Color Total Wins Average # of pieces
Edgar WHITE 50 46.9
Random Player BLACK 0 16.96


Here's how Random Edgar matches up against the Random Player :

Player Color Total Wins Average # of pieces
Random Edgar WHITE 39 39.1
Random Player BLACK 11 24.88


Just to compare with the Random Player vs. the Random Player :

Player Color Total Wins Average # of pieces
Random Player WHITE 4 23.98
Random Player BLACK 46 39.76


Black wins for some strange reason, although both players are playing randomly. My educated guess would be that black has a slight advantage, since it moves first.

Actually, this gave me the idea that there is a difference between the white and black players and thus they should be trained separately.

Even though Random Edgar might not perform as well as the our Regular Edgar against the Random Player, it is still a much better player than the Random Player. Therefore, it would be a much better idea to train against this guy.

Results

Note: the results I am using here are the intermediate results I got, since my program is still running, as it has been for over a week.

I trained three different players using three different strategies.

  1. Strategy: Play 5 games against the Random Player to calculate the fitness.
  2. Strategy: Play 5 games against the Random Edgar to calculate the fitness.
  3. Strategy: Play 4 games against the Random Edgar + 1 game against Edgar

All the other things were the same for all runs. I used the same .ini files for all training.

The last strategy is almost equivalent to adding more probability to Edgar's best move in the Random Edgar implementation, but I took the easy way out. Also, I just wanted to make sure that my player is able to beat Edgar, after it is done with the training.

Training against the Random Player
This strategy proved to be the least successful. I don't know whether or not there was a bug in the software, but when I was training my player against the Random Player, it was always easy to obtain the fitness of 0 and the function tree was the same after each run: "black_edges". I guess, this function was good enough to beat the Random Player, so my player was satisfied with it.

My Player function tree = "black_edges"

Player Color Total Wins Average # of pieces
Random Player WHITE 2 6.36
My Player BLACK 48 39.9

See the file.

Player Color Total Wins Average # of pieces
Random Edgar WHITE 38 38.88
My Player BLACK 12 24.68

See the file.

Lost to Edgar 48:16. Click here to see the game.

Training against the Random Edgar
To see the stats click here.

My Player function tree = "( / ( / 10 black ) ( + ( * black white_edges ) ( - black ( - ( + 10 ( + black_edges ( + ( + black black_near_corners ) black_near_corners ))) white ))))"

Player Color Total Wins Average # of pieces
Random Player WHITE 20 28.98
My Player BLACK 30 32.06

See the file.

Player Color Total Wins Average # of pieces
Random Edgar WHITE 36 36.16
My Player BLACK 14 24.06

See the file.

Lost to Edgar 51:13. Click here to see the game.

Training against the Random Edgar and Edgar
To see the stats click here. OK. I really wanted to beat Edgar. So I cheated a little bit and trained against him as well as against the Random Edgar. As a result, my player might have overlearned a bit just to play well against Edgar, but I did not care much, since I was able to defeat Edgar! So, I played 4 games against the Random Edgar and 1 game against Edgar, just to make sure that Edgar goes down.

My Player function tree = "( + ( + white_edges black ) black_edges )"

Player Color Total Wins Average # of pieces
Random Player WHITE 0 6.44
My Player BLACK 50 50.82

See the file.

Player Color Total Wins Average # of pieces
Random Edgar WHITE 25 29.2
My Player BLACK 25 27.12

See the file.

BEAT Edgar 36:28. Click here to see the game. The final board.

Problems

I did not experience many of the problems from the previous years, mainly because I read the students from previous years were nice enough to tell us about any problems that they experienced. For example, there is a bug in the software for reading the function trees with parenthesis, so I got rid of parenthesis when I tested my players. Also, when I ran the Genetic Programming software overnight I used the "nohup" command and did not draw the resulting function trees, so my program kept running without problems. I found out about all these things from the writeups of students from the previous years.

The problem that everyone, who does not own a PC, probably experienced is that the Genetic Programming software just takes TOO LONG to run. My program has been running for about a week now, and still it only produced a couple of runs. It is still far from being done. For my results and conclusions I actually used the intermediate results I was able to produce. I expect the final results to be only better.

Today is a very sad day, since I found out that the Random Player does not do what I thought it did. Someone told me that the Random Player is not as random as it is supposed to be and, therefore, some of the acquired data might be misleading. This might explain why the Black Random Player kept beating the White Random Player. My Random Edgar still works, but all the results I have playing against the Random Player might just be thrown out... The only thing that makes me feel a little bit better is the fact that My Player was able to BEAT EDGAR!

There were several other problems with my approach, which I would probably correct, if I were to do the project all over again. For example, playing five games for training, especially against the Random Player, is not nearly enough. A player might just get lucky and get a small fitness, just because all of the five Random Players were poor players.

Conclusion

In conclusion, I have to say that I really enjoyed this project, although I did not get as good results as I was hoping for. I am really proud of the fact that My Player was able to learn how to beat powerful Edgar with simple terminals. Actually, My Player's function was really simple. Edgar went down!

My Player might not be as good a general player as I was hoping for. It is still much better than a Random Player, but a person could easily defeat him. Since I really wanted to beat Edgar, My Player could not avoid overlearning. I let him overlearn, since the victory over Edgar just feels good!

As a whole, I consider my project to be a success :)