Project Othello
I. What is Othello?
Othello is well known board game. If you are not familiar with Othello, click
here to learn the rules of the game. Click here if you want to play Othello
against a computer player.
II. Project Purpose
Apply genetic programming to train computer programs/players to play Othello.
Specifically, we will investigate new hypothesis, new learning variants(voting),
and try to analyze the resulting function trees to see if genetic programming
is successful in capturing some of the essences of the game Othello.
III. Genetic Program Representation
Arbitrary genetic programs are represented by their parse trees. The fitness of
a given individual is evaluated by executing the program on the training data.
In project Othello, genetic programs are basically formula that use statistics
of the game board, such as number of black pieces, number of white pieces, to
calculate the next board position which calculates to the best score.
IV. Hypotheses
The original hypothesis
The original hypothesis implemented in the GPOthello program includes the following
terminals and primitives:
- operators: +, -, /, *
- terminals: 10, white, black, white_edges, black_edges, white_corners, black_corner, black_near_corner, white_near_corner
Revised hypothesis
After playing Othello(against Edgar) for a while, I discovered that certain positions on the board are strategic. So I have incorporated these positions as terminals
in the revised hypotheis:
- new terminals: black_near_near_corner, white_near_near_corner, possible_NO_move_for_opponent(at current move)
near_near_corner are positions that are separated from the four corners of the board by one square(either horizontally or vertically).
GPOthello.java, OthelloBoard.java, OthelloGPPlayer.java, OthelloPlayer.java are modified to implement the new hypothesis.
New hypothesis vs. original hypothesis
Training configuration:
- PopulationSize: 200
- Number of generations: 50
- Termination Fitness: 50
- GoodRuns: 5
Training against random players.
Best Trained Player(A) based on new hypothesis is:
fig. 1
Best Trained Player(B) based on old hypothesis is:
fig. 2
When played against random players, player A wins 47 out of 50 games.
When played against random players, player B wins 23 out of 50 games.
Training against Edgar
Training configuration is not changed.
Best Trained Player(C) based on new hypothesis is:
fig. 3
When played against Player A, C wins. The final score is WHITE 27 BLACK 37.
When played against Edgar, Edgar wins. The final score is WHITE 36 BLACK 28.
When played against random players, player C wins 50 out of 50.
Training against better player seems to bring out better performance.
V. Evolving based on voting
-
The original evolving method is based on evaluating individually each genetic program, assigning each individual a fitness score based its performance for 5 games played against random players.
- The new approach let the whole population vote on each move of a game.
But how should we assign fitness scores to individuals, now that we obviously only
get a single score for the whole population?(this puzzles me a lot)
I use the method that assign individuals scores scaled to their contributions(how many many each individual votes to the carried-out move) to the game.
Individuals who contribute more will get better scores if the population score is good.
Individuals who contribute more will get worse scores if the population score is bad.
This somewhat models the real world situation.
GPPopulation.java in the gpjpp package and GPOthello.java are modified to implement this.
Training configuration is not changed.
- Best Trained Player(D) based on evolving by voting is:
fig. 4
When played against random players, player D wins 41 out of 50.
VI. Analysis of resulting functoin trees
The functions trees of the best individuals demonstrate that genetic programming can ultimately rely on Darwin's Law of Survival by Fitness to select the fittest individual.
The function in fig. 1 is
((white_near_corners * black_near_corners) / white_near_corners)/white_near_corners) - possible_NO_move_for_opponent.
The function can be simplied to
(black_near_corners/white_near_corners) - possible_NO_move_for_opponent.
Although this function does not make obvious sense to any human Othello player, it does
demonstrate its fitness by its ability to win over random players.
If we examine the equation carefully, the ratio between the number of black_near_corners
and white_near_corners tells something critical about the board state.
Since the number of *_near_corners is the number of chances that players have to
"jump" to the corner of the board.
Also the number of possible moves for opponent
can be very critical at the finishing stage of a game.
We can see it is not by pure chance that this individual can excel among its population.
VII. Credits
The genetic programming package used is GPJPP and its demo program Othello.