GP Genome Representation

Relevant papers from the Symposium Working Notes:

Language Representation Progression in Genetic Programming. Astro Teller

Context-Free Language Induction by Evolution of Deterministic Push-Down Automata Using Genetic Programming. Afra Zomorodian


  	     What's the Difference between GA and GP?
	      Unifying/Discriminating Research Demes
		 (Discussion slide by Eric Siegel)

NOTE: THERE ARE EXCEPTIONS TO EACH ITEM BELOW -- THAT IS THE POINT.

			Genetic Algorithms		Genetic Programming
			------------------		-------------------

fitness			optimization			execution

genome			string				tree

recreation		golf				rock climbing

operators		representation-blind		representation-
							specific

music			Benny Goodman			Smashing Pumpkins

crossover		location preserved		``wandering genes''

benchmark		multimodal function		Pac-man and
			optimization			Tetris

size			fixed length			variable length

favorite flick		Little Mermaid			Jurassic Park
							(or Jaws I and II)

domain			optimization			machine learning
							(avoid overfitting)

opsys			Windows 95			Solaris

tradeoff		better generalizing		less constrained

undergarment		briefs				boxers

- Two examples of work that brings the two representation styles together:
	- ADFs are a list of trees
	- Gruau (Advances II) and Whigham (Rosca's workshop) use grammars to
	  constrain GP trees 

- Variable length string crossover (e.g. some GAs, or Perkis' postfix)
  allows GP's "wandering genes", but genes are less mobile.

- GA can also have "wandering genes" with the Messy GA (pointed out by Lee
  Spector)

- To formalize what it means to "execute": GP allows primitives which are
  functions.

* Design better representation frameworks. In this context Astro [Teller] talked about explicit directed graphs which present the advantage of facilitating the representation of loops and intricate control structures. The question was: Why do we need such complex structural representation framework? What is it you can do with graphs that you can't do with trees? Are they better suited to genetic operations usually performed in GP?

* Representation frameworks should have the properties of being conveniently and directly manipulable [Afra]

* Importance of the distinction genotype-phenotype [Afra]

* Wandering genes [Eric]

* View of the GP representation as an implicit representation. This induces generalizability and conciseness [Justinian]

* Position-independent subtrees [John]

* Apparent position-independent subtrees due to the bias induced by a tree representation [Justinian, see ICGA95 paper on Causality in GP]

* GP is an open-ended tool. When using GP, 10% of effort goes into design and implementation and 90% in runs. This is different from conventional programming where the figures are 90% into effort and 10% into runs [Oackley]

* ADFs and subroutines help scale up computation. There aren't many examples where this is not true [Andre].

{Justinian Rosca}

Teller and others are moving the paradigm of GP away from trees, and towards other representations (e.g. graphs). Along with this is also the idea of an individual as a program (or algorithm or process), not just a function. {Alex Chaffee}

Teller pointed out that only one loop (eg repeated tree evaluations) gives as much power as "separate loops" (eg graphs), but may be much harder on the genetic search. {Eric Siegel}

Can we move away from a tree model of GP towards a DAG model that might exploit more parallelism? I have that problem currently with an intrusion detection system we have built. {Mark Crosbie}