A Neural Network Approach to Othello Playing

 

Federico Kattan

Columbia University

Machine Learning class, Fall '97

fkattan@cs.columbia.edu

 
 
 

Introduction

 

I tried to build an Othello player based on neural networks, so that explicit descriptions of the game are kept to a minimum. Some researchers believe that the nature of intelligent behavior resides in the ability of a system to represent perception-categorization-action behavior as a coupling between these activities rather than as a pipelined process.

The theory of situated cognition ([1] Clancey) claims that every human thought and action is adapted to the environment. That is, what people perceive, how they conceive of their activity, and what they physically do develop together. In this view, thinking is like riding a bike: every turn of the handlebars and every adjustment of body position comes not from evaluating the physics formulae of stability and dynamics, but from a recoordination of previous postures, perceptions, and motion sequences. All human action is partially improvisatory through a direct coupling of perception, conception (or interpretation), and action. This coupling involves a "self-organization with memory" in the brain that has not been duplicated in computer programs.

With regard to the symbolic approach of describing knowledge to a computer system, the theory of situated cognition claims that when researchers equate human knowledge with a set of descriptions (like rules and facts in an expert system), they can describe how the system should behave in particular situations, but they cannot capture the full flexibility of how perception, memory, and action interact in intelligent behavior. (This may be one reason why such descriptions are domain dependent and so expensive and difficult to build.)
 

Neural Darwinism

There is some work on modeling remembering as a categorized coupling of perception and motor systems ([2] Edelman, p. 93). It shows how complete structures that coordinate perception, categorization, and action are possible without explicitly stored descriptions (like production rules).

Gerald Edelman's model of learning is based on his earlier work on the immune system (for which he received the Nobel Prize in 1972). In that work he showed how recognition of bacteria is based on a competitive selection process over a population of antibodies. There are some intriguing characteristics of this approach:

 

 
In the Theory of Neuronal Group Selection (TNGS), Edelman extends this theory to a more general science of recognition. Mental categories, coordination, and conceptualizations arise as the product of a selection process over a population of neural maps that allows the organism to recognize previous interactions. The method of considering alternatives and making a decision is replaced by a selectional coupling mechanism.

TNGS explains "how multiple maps lead to integrated responses, and generalizations of perceptual responses, even in the absence of language". Edelman adds to the work of other researchers and reacts against the descriptive cognitive approach, which assumes that learning occurs when an already categorized world with the correct responses is given. He categorizes that descriptive approach as instructionism, and instead seeks to explain how categorization occurs at the neuronal level.

Darwin III is a recognition automaton based on this theory. It has a single moving eye and a four-jointed arm with touch and kinesthesia (joint sense). After experience with randomly moving objects, its eye will follow an object and its arm will try to coordinate its movements to reach it, all without pre-programmed structures describing this particular behavior.

 
 
 

Neural Networks for Othello Playing

 

A neural network maps learned input-output pairs in such a way that what is retained is not a set of independently stored descriptive units, but rather a cumulative record of relations between features of the input patterns and the corresponding outputs (or actions). Analysis of the hidden units in a neural network reveals patterns similar to those built deliberately in symbolic systems (see [3] Mitchell, p. 107 for an example).

For this project I tried to reproduce the behavior of an Othello player while keeping explicit descriptions to a minimum. Even though this is a much more modest project, I tried to stay as close as possible to the lines of the research mentioned above. To that end I used a neural network as the foundation of an Othello player; the NN is a standard back-propagation network with the following topology:

 

 

 

 
 
Training was done by presenting the network with games played by EDGAR against a random player. Learning was done on complete games, so there is no credit assignment for each particular move. Training ran for 5000 epochs, but approximately 85% of the learning happened in the first 200 epochs (see the following two graphs).
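The exact form of the training patterns is not listed here; one plausible reading of the procedure is that every board position preceding an EDGAR move becomes an input pattern and EDGAR's chosen square, encoded 1-of-64, becomes the target, with all moves in a game weighted equally. The sketch below only illustrates that reading; the encodings and class names are assumptions, not the project's actual log format.

import java.util.ArrayList;
import java.util.List;

// Illustrative conversion of one logged game into supervised training patterns.
// The board encoding, the 1-of-64 target, and the class name are assumptions.
public class GameLogToPatterns {
    public static class Pattern {
        public final double[] input;    // 64 board features
        public final double[] target;   // 64-way encoding of EDGAR's chosen square
        public Pattern(double[] input, double[] target) {
            this.input = input;
            this.target = target;
        }
    }

    // One pattern per EDGAR move; every move in the game is weighted equally,
    // i.e. there is no per-move credit assignment.
    public static List<Pattern> fromGame(List<double[]> boardsBeforeEdgarMoves,
                                         List<Integer> edgarMoves) {
        List<Pattern> patterns = new ArrayList<>();
        for (int m = 0; m < edgarMoves.size(); m++) {
            double[] target = new double[64];
            target[edgarMoves.get(m)] = 1.0;
            patterns.add(new Pattern(boardsBeforeEdgarMoves.get(m), target));
        }
        return patterns;
    }
}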
 
 
 

Learning curve for the Othello Player network

 
 
 

Detail of the first 200 epochs
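Since the exact topology is only given in the diagram above, the following is a minimal sketch of how such a network might be represented and evaluated, assuming 64 inputs (one per board square, encoded +1 for the player's own pieces, -1 for the opponent's, 0 for empty), a single hidden layer, and 64 outputs scoring each square as a candidate move. The layer sizes, the encoding, and the class name OthelloNetSketch are assumptions for illustration, not the actual SNNS network.

// Hypothetical feed-forward pass for an assumed 64-32-64 topology.
// Real weights would come from the SNNS-trained network; here they are plain fields.
public class OthelloNetSketch {
    static final int INPUTS = 64, HIDDEN = 32, OUTPUTS = 64;   // assumed layer sizes

    final double[][] inputToHidden = new double[HIDDEN][INPUTS + 1];    // +1 bias weight
    final double[][] hiddenToOutput = new double[OUTPUTS][HIDDEN + 1];  // +1 bias weight

    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    // board[i]: +1 own piece, -1 opponent piece, 0 empty (assumed encoding).
    // Returns one score per square; a higher score means a more promising move.
    double[] forward(double[] board) {
        double[] hidden = new double[HIDDEN];
        for (int j = 0; j < HIDDEN; j++) {
            double sum = inputToHidden[j][INPUTS];             // bias term
            for (int i = 0; i < INPUTS; i++) sum += inputToHidden[j][i] * board[i];
            hidden[j] = sigmoid(sum);
        }
        double[] scores = new double[OUTPUTS];
        for (int k = 0; k < OUTPUTS; k++) {
            double sum = hiddenToOutput[k][HIDDEN];            // bias term
            for (int j = 0; j < HIDDEN; j++) sum += hiddenToOutput[k][j] * hidden[j];
            scores[k] = sigmoid(sum);
        }
        return scores;
    }
}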

 

The modifications to the given Othello package involved: generation of game logs suitable for use by the neural network during training; a C function that loads the trained network and interprets its output (used during game playing); and the connection between the Java player and the C function. The neural network was defined using SNNS, which provides a graphical user interface for network definition and training, as well as a utility that generates C functions to represent and use the trained networks.
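The glue code itself is not listed here; the sketch below shows roughly what the Java side of the Java-to-C connection could look like if the SNNS-generated C function were compiled into a shared library and wrapped behind a single native method. The library name, class name, method signature, and move-selection logic are assumptions for illustration and may differ from the interface actually used in the project.

// Hypothetical JNI wrapper around the SNNS-generated C network function.
public class NativeOthelloNet {
    static {
        System.loadLibrary("othellonet");   // assumed name of the compiled C library
    }

    // Runs the trained network: 64 board inputs in, 64 move scores out (assumed shape).
    public native double[] evaluate(double[] boardInputs);

    // Picks the legal square with the highest network score; -1 means pass.
    public int pickMove(double[] boardInputs, boolean[] legal) {
        double[] scores = evaluate(boardInputs);
        int best = -1;
        for (int sq = 0; sq < scores.length; sq++) {
            if (legal[sq] && (best == -1 || scores[sq] > scores[best])) best = sq;
        }
        return best;
    }
}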

 
 
 

Results

After training for 5000 epochs, the network was able to beat a random player in all of the 50 games played. This result is similar to the one achieved using GP in the previous homework. Further experiments were not done, but it would be interesting to see whether a hill-climbing learning scheme, in which increasingly good neural networks play against each other to learn better positions, is feasible. Playing against EDGAR would also be an interesting experiment. Trying a different topology, or neural maps of the kind described by Edelman, could be future areas of research and experimentation.
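As a rough sketch of that hill-climbing idea, the loop below keeps a current champion network, perturbs a copy of its weights, and promotes the copy only if it wins a majority of games against the champion. The Network and Referee interfaces and the promotion rule are hypothetical and not part of the project code.

import java.util.Random;

// Illustrative hill-climbing self-play loop; not part of the project code.
public class HillClimbSketch {
    interface Network { Network copy(); void perturb(Random rng); }

    // Plays one game and reports whether the challenger beat the champion.
    interface Referee { boolean challengerWins(Network challenger, Network champion); }

    static Network climb(Network champion, Referee referee, int rounds, int gamesPerRound) {
        Random rng = new Random();
        for (int r = 0; r < rounds; r++) {
            Network challenger = champion.copy();
            challenger.perturb(rng);                           // small random weight change
            int wins = 0;
            for (int g = 0; g < gamesPerRound; g++) {
                if (referee.challengerWins(challenger, champion)) wins++;
            }
            if (wins * 2 > gamesPerRound) champion = challenger;   // keep the stronger player
        }
        return champion;
    }
}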

 
 

References

[1] William J. Clancey, Situated Cognition, Cambridge University Press, 1997.

[2] Gerald M. Edelman, Bright Air, Brilliant Fire, Basic Books, 1992.

[3] Tom M. Mitchell, Machine Learning, McGraw-Hill, 1997.