Project Othello

I. What is Othello?

Othello is well known board game. If you are not familiar with Othello, click here to learn the rules of the game. Click here if you want to play Othello against a computer player.

II. Project Purpose

Apply genetic programming to train computer programs/players to play Othello. Specifically, we will investigate new hypothesis, new learning variants(voting), and try to analyze the resulting function trees to see if genetic programming is successful in capturing some of the essences of the game Othello.

III. Genetic Program Representation

Arbitrary genetic programs are represented by their parse trees. The fitness of a given individual is evaluated by executing the program on the training data. In project Othello, genetic programs are basically formula that use statistics of the game board, such as number of black pieces, number of white pieces, to calculate the next board position which calculates to the best score.

IV. Hypotheses

The original hypothesis
The original hypothesis implemented in the GPOthello program includes the following terminals and primitives:

operators: +, -, /, *
terminals: 10, white, black, white_edges, black_edges, white_corners, black_corner, black_near_corner, white_near_corner

Revised hypothesis
After playing Othello(against Edgar) for a while, I discovered that certain positions on the board are strategic. So I have incorporated these positions as terminals in the revised hypotheis:

new terminals: black_near_near_corner, white_near_near_corner, possible_NO_move_for_opponent(at current move)

New hypothesis vs. original hypothesis
Training configuration:

PopulationSize: 200
Number of generations: 50
Termination Fitness: 50
GoodRuns: 5

Training against random players.

Best Trained Player(A) based on new hypothesis is:

fig. 1

Best Trained Player(B) based on old hypothesis is:

fig. 2

When played against random players, player A wins 47 out of 50 games.
When played against random players, player B wins 23 out of 50 games.

Training against Edgar

Training configuration is not changed.
Best Trained Player(C) based on new hypothesis is:

fig. 3

When played against Player A, C wins. The final score is WHITE 27 BLACK 37.
When played against Edgar, Edgar wins. The final score is WHITE 36 BLACK 28.
When played against random players, player C wins 50 out of 50.
Training against better player seems to bring out better performance.

V. Evolving based on voting

The original evolving method is based on evaluating individually each genetic program, assigning each individual a fitness score based its performance for 5 games played against random players.
The new approach let the whole population vote on each move of a game. But how should we assign fitness scores to individuals, now that we obviously only get a single score for the whole population?(this puzzles me a lot)
I use the method that assign individuals scores scaled to their contributions(how many many each individual votes to the carried-out move) to the game.
Individuals who contribute more will get better scores if the population score is good.
Individuals who contribute more will get worse scores if the population score is bad.
This somewhat models the real world situation.
GPPopulation.java in the gpjpp package and GPOthello.java are modified to implement this.

Best Trained Player(D) based on evolving by voting is:

VI. Analysis of resulting functoin trees

The functions trees of the best individuals demonstrate that genetic programming can ultimately rely on Darwin's Law of Survival by Fitness to select the fittest individual.

The function in fig. 1 is
((white_near_corners * black_near_corners) / white_near_corners)/white_near_corners) - possible_NO_move_for_opponent.

The function can be simplied to
(black_near_corners/white_near_corners) - possible_NO_move_for_opponent.

Although this function does not make obvious sense to any human Othello player, it does demonstrate its fitness by its ability to win over random players.

If we examine the equation carefully, the ratio between the number of black_near_corners and white_near_corners tells something critical about the board state.
Since the number of *_near_corners is the number of chances that players have to "jump" to the corner of the board.

Also the number of possible moves for opponent can be very critical at the finishing stage of a game.

We can see it is not by pure chance that this individual can excel among its population.

VII. Credits

The genetic programming package used is GPJPP and its demo program Othello.