Ilia Malkovitch starring in GP-Othello
Introduction The logical conclusion, and this was the avenue that we chose to pursue in our experimentation, is to develop a method that would evolve more general players that would tend to do well against a broader range of strategies, avoiding overfitting.
In order for us to effectively pursue our genetic programming experiments, we had to speed up the execution of the package. One of the avenues we pursued in our investigations is an attempt to speed up the execution of the Othello GP package. We attempted several things along those lines. First, we noticed a glaring "// Very inefficient bubble sort" in the OthelloPlayer.java file. We valiantly proceeded to replace it with an ultra-efficient quicksort implementation. Unfortunately, this produced a very slight increase in performance. Next, we noticed that training against Edgar takes significantly longer than training against simpler players (i.e. Random). After careful observation we discovered that the Edgar class is being re-instantiated every time it is used against a particular individual. Modifying the code to avoid this we achieved a great increase in speed - almost 6-fold - as the experiment shows: Population 200 (training 5 games against Edgar) This allows us to run more tests against Edgar significantly more efficiently, and generally makes it much less annoying to do so.
Therefore, we needed to devise a method of training that would allow us to produce Othello players that would generalize their play to remain competitive against various players. The ideal approach, of course, is to allow our genetic algorithm to train against a competent human opponent that varies its play, so that the genetic algorithm can learn to play diversely. The next best thing is training versus several different automated players that use various strategies in an attempt to make the resulting individuals come up with a generalized strategy that plays well against all of them. The goal of this training is to try to produce individuals that would be able to play competitively against the testing players that we used. Thus, if an individual evolves that can consistently beat the various kinds of players, then we can assume that the strategy that it has learned is a generally good Othello strategy that is not overfitted to any particular opponent.
After testing the players evolved from this multi-training we saw that they still did not have a general strategy. Our primary goal was to make it win against all three players, and after these test, the players were able to beat at most 2 out of three.
Our key to solving this problem was to penalize players that have this disability of overspecializing to a particular player. After taking the raw fitness of the 6 games, we computed the variance of the fitness over the games, and used it as a modifier to produce the new standard fitness for each individual. To compute the variance between the three types of games we used the raw fitness from games against Edgar, raw fitness from Greedy, and the average raw fitness from Random. So, the new standard fitness for each individual became modStdF = stdF + variance*5 This causes the individuals that overspecialize to a particular opponent and do badly against another to have a higher fitness, eliminating them as viable choices for selection. The top players evolved using this method turned out to be those with low variances (as low as 2) and low unmodified fitness measures (number of opponents pieces remaining). Note: The variance-modified fitness generally turns out significantly higher than the original fitness, because the variance factor tends to change it a lot.
Here are some of the best evolved individuals (variance-modified fitness) Fitness 137.44 ( * ( - ( + ( / ( / black_edges black ) black ) black ) ( - black_near_corners black )) ( / ( / ( * ( - ( + ( * ( / black 5 ) ( / black white_edges )) black ) ( / ( + black_corners black_corners ) white_corners )) ( * black black_corners )) black ) ( + black_near_corners black )))
( - ( / ( * black_corners ( / ( * ( * black_corners white_near_corners ) ( / white_near_corners white )) ( * ( * ( - ( / white_corners white_corners ) ( / white_corners white )) ( + ( * ( * white black_corners ) ( + black white_corners )) ( + white white ))) ( / ( / ( + black_edges black_edges ) ( - black 5 )) black )))) white_near_corners ) ( - white_edges 5 ))
The individuals that were evolved using the variance-modified fitness method satisfied these conditions, decisively defeating Edgar, Greedy and Random. Unfortunately, it is very difficult to determine whether the strategies learned are can actually be generalized to any opponent. One simple test was to have the creators (presumably employing intelligent Othello strategies) play against the evolved players, and the creators were defeated soundly. This is not a very precise fitness measure, but it does show that a certain amount of "intelligence" was evolved using GP Othello.
|