Context-Free Language Induction by Evolution of Deterministic Push-Down Automata Using Genetic Programming. Afra Zomorodian
One advantage of this scheme is that it supports a type of modular code reuse: one part of the genome (e.g., a subtree) can describe a component (e.g., a subnet) that can appear multiple times in the fully grown phenome. Another potential advantage is the resulting effect of genetic crossover and mutation (i.e., the resulting transmission function [Altenberg, Advances in GP]).
Other examples include growing a DFA for language recognition from an evolved tree (Zomorodian's paper in the GP Symposium Working Notes), and growing a physical creature in a virtual environment (Karl Sims).
Is there an advantage to this separate growth phase that cannot be achieved by other means? For example, if an evolved parse tree (genome) is evaluated multiple times during the fitness measure, selected parts of the tree can be reused (phenotypic behavior). Is it necessary to physically grow a phenome? Are there actually circumstances under which we need an explicit physical "phenome", instead of allowing code reuse to emerge from "phenotypic behavior"?
If you want to grow a robot to manipulate the real world, there is a clear advantage, but my question applies to information processing applications.
"Need" isn't quite right -- as with many questions of GP technique, the issue is the *convenience*, to the evolutionary process, of the representation. In this case the issue is the convenience, to the evolutionary process, of various indirect mappings from genotype to phenotype.
"Growth" isn't quite right either -- many of us were talking about a more general ontogenetic process that is sensitive to the environment.
"Phase" isn't quite right either -- although many of the current techniques for genotype->phenotype mapping (e.g., construction of an automaton or a neural net, or the expansion of ADMs) occur in a single phase, several of us were kicking around ideas about multi-stage or continuous ontogeny.
Why is such a language appropriate for expressing problem solutions? If development helps to focus on and generate language constructs that are beneficial, then it is useful. This may be true because additional information (such as various constraints) can be used implicitly in the development function.
Development ensures that any genetic operations (crossover, mutation) make sense, generating valid structures.
An elementary form of growth is employed in the use of subroutines in GP as in ADF [Koza] or AR [Rosca]. This was outlined by Peter Angeline in his overview paper from EP-95 (Morphology in EC). [However, this growth is not explicit -- the subroutines are called on demand, instead of physically growing before measuring fitness -ed]
(a) Natural evolutionary biology does not prove that ontogeny is required to evolve successful computer programs, and I would not claim that it is. However, it does demonstrate that the combination of population (evolutionary) and individual (neural) learning is capable of great things. Simple species which have no nervous systems, but depend wholly upon genetic learning, cannot adapt as rapidly as species with nervous systems, nor do they exhibit behaviors as complex.
(b) It seems a natural way to marry the complementary strengths of several different machine learning paradigms.
a) small changes in the genome can have bigger effects, since growth will amplify the changes. [However, the same thing is true when there is no growth phase but there is some type of looping, e.g., multiple tree evaluations. -ed]
b) the growth phase can be used to bring about qualitatively different kinds of "structural changes": e.g., for a GP process that evolves neural nets, the growth phase can interpret the genome differently, i.e., make it a different *kind* of neural net, while still using the genome to control the design. Growth can thus be an "architecture creating/modifying" phase. (The relationship between the environment and the growth process is not clear, though.)
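Point a) above can be sketched concretely. Here is a minimal, hypothetical encoding (an L-system-style rewrite rule standing in for a genome) showing how a fixed number of growth steps amplifies a single-symbol mutation into a phenome of very different size:

```python
# Hypothetical sketch: the "genome" is a single rewrite rule applied
# repeatedly during a fixed number of growth steps. A one-character
# mutation in the rule is amplified by the growth process.

def grow(rule, steps, axiom="A"):
    """Expand the axiom by rewriting every 'A' with the rule, `steps` times."""
    s = axiom
    for _ in range(steps):
        s = s.replace("A", rule)
    return s

phenome1 = grow("AB", steps=6)   # rule 'AB': linear growth
phenome2 = grow("AA", steps=6)   # rule 'AA': one mutated symbol, exponential growth
print(len(phenome1), len(phenome2))   # 7 64
```

The same amplification could also emerge from repeated tree evaluation, as the editorial note observes; the sketch only makes the genotype-to-phenotype expansion explicit.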
2- The second aspect of the question is: is it simply necessary? Well, it is necessary if, for example, the neural network is recurrent and maintains internal state. The state will be different for each substructure, so there simply have to be different copies of the substructure to hold this state. This is the difference with standard GP, where a substructure is a procedure call: when the procedure call returns, there is no need to keep the state of the procedure's local variables. Another reason I find it necessary, in my work, is that the substructures are not merely copied; they are entirely redeveloped each time, and they develop in different ways depending on the boundary connections. In other words, there are context-dependent effects which cannot be modeled unless the substructures are effectively redeveloped and stored. Context dependence makes the development richer; in my approach, for example, how a subnetwork is hooked up to the whole network is context dependent.
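The state argument above can be illustrated with a toy stateful module (the `Accumulator` class is a hypothetical stand-in for a recurrent subnetwork): when the grown phenome contains two copies of the same substructure, each copy must carry its own internal state, which a single shared procedure cannot do.

```python
# Hypothetical sketch: a recurrent "subnetwork" whose output depends on
# its own history. Two physical copies grown from the same genome subtree
# must each hold separate state.

class Accumulator:
    """A minimal stateful module: output depends on all past inputs."""
    def __init__(self):
        self.state = 0.0
    def step(self, x):
        self.state += x          # internal state persists between calls
        return self.state

# Two physical copies of the same substructure:
left, right = Accumulator(), Accumulator()
left.step(1.0); left.step(1.0)   # the left copy has seen two inputs
right.step(5.0)                  # the right copy has seen one

print(left.state, right.state)   # 2.0 5.0 -- the states diverge
```

A plain procedure call would discard this state on return, which is exactly the distinction the paragraph draws with standard GP.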
3- Point 3 concerns the strong association between growth and architecture. Many of the contributions to this brainstorm group could be revised if their authors stopped thinking in terms of "pure GP", where the call of a subprocedure is an unpredictable event which may happen an unpredictable number of times, sometimes uselessly. In that framework, everything is highly dynamic: the computer memory needed to evaluate the fitness keeps growing and shrinking with the depth of nested function calls. The interest of growth is strongly correlated with the idea of handling a fixed-size architecture instead of this dynamic object that is the interpretation of a LISP program, or any program with procedure calls. In the growth approach, a large number of calls translates into a huge architecture, and you can immediately see its harmful effect in terms of memory size and speed. What you want is not only a correct program; you want to generate a nice, small architecture. You want to see your architecture, see it grow. Growth, for me, is not something to be discussed, but something to be managed.
In other words, one of the interests of growth is that you have a very precise tool to measure the time it will take to evaluate the fitness of a given individual: this is simply the size of the architecture. You know it before you have even started to evaluate the fitness; you have an idea of it even before the end of growth. As a result, you can have tight control over the fitness evaluation time, a control that is independent of any other feature of the particular algorithm you are using. I routinely abort cancerous developments (those that involve doubly recursive calls, for example), and these cancerous things do not consume any resources. The criterion is simple: if the baby grows beyond max(K1, K2*nio) cells, kill it. K1 and K2 are constants that are largely problem independent; nio is the number of input/output units. Cancer is common in early generations, but not for long. I'd say that cancer is THE illness that plagues the standard ADF approach with standard GP using program interpretation. Encapsulated ADFs are well known to take an awful amount of CPU, and people using them shamelessly use thousands of tricks to restrict the possible combinations. As a doctor, I'd say: use the architecture approach to cure that; at least you'd be able to see the illness at work, as your screen darkens with the growing number of cells!
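The abort criterion described above can be sketched as follows. The names `develop`, `grow_one_step`, `K1`, `K2`, and `nio` are hypothetical stand-ins for the author's setup; the point is only that development is killed as soon as the cell count exceeds the budget max(K1, K2*nio):

```python
# Hypothetical sketch of the "kill cancerous development" criterion:
# abort growth once the cell count exceeds max(K1, K2 * nio).

def develop(grow_one_step, genome, nio, K1=100, K2=10, max_steps=50):
    """Grow a phenome from a genome; abort if the cell count exceeds the budget."""
    limit = max(K1, K2 * nio)      # size budget, scaled by number of I/O units
    cells = [genome]               # development starts from one cell
    for _ in range(max_steps):
        new = grow_one_step(cells)
        if len(new) > limit:       # cancerous development: kill it, waste no CPU
            return None
        if new == cells:           # fixed point: development has finished
            break
        cells = new
    return cells

# A doubly-recursive (exponential) growth rule is aborted quickly...
cancer = develop(lambda cs: cs + cs, genome="g", nio=2)
# ...while a bounded rule develops normally.
healthy = develop(lambda cs: cs + ["g"] if len(cs) < 8 else cs, genome="g", nio=2)
print(cancer, len(healthy))       # None 8
```

The key property is the one the paragraph emphasizes: the check costs nothing beyond counting cells, so aborted individuals consume almost no resources.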
4- Coming back to the speed argument (the most important in my mind; all of computer science is about speed), an even more obvious interest of growth, and the "architecture" approach, is that the object is not dynamic: the memory needed to evaluate the fitness is allocated once and for all, for all the fitness cases. In many applications, each fitness case needs the phenotype to be "run" 1000 times, and this multiplies. This is an important feature if we consider that, very often, more time is spent in malloc and calloc than in actual arithmetic operations. You can also optimize the data structure that stores the architecture so as to sweep through all the nodes in a single, fully parallel loop that can obviously be pipelined or vectorized by the most stupid compiler (nowadays microprocessors do a lot of these fancy things -- the Pentium, for example, and most of the MIMD parallel dinosaurs: Paragon, CM-5...).
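The flat, preallocated layout described above might look like this. The parallel arrays and the tiny feed-forward net are hypothetical; the point is that evaluating a fitness case is one tight loop over the node arrays, with no allocation inside the loop:

```python
# Hypothetical sketch: the grown architecture stored in flat, parallel
# arrays (topological order), so fitness evaluation is a single sweep
# with no malloc -- the kind of loop a compiler can pipeline/vectorize.

src_a  = [0, 1, 2]         # index of each node's first input activation
src_b  = [1, 0, 3]         # index of each node's second input activation
weight = [0.5, -1.0, 2.0]  # one weight per node (toy model)

def evaluate(inputs, n_nodes=3):
    # activation buffer laid out once: inputs first, then one slot per node
    act = inputs + [0.0] * n_nodes
    base = len(inputs)
    for i in range(n_nodes):   # single sweep over all nodes, no allocation
        act[base + i] = weight[i] * (act[src_a[i]] + act[src_b[i]])
    return act[-1]             # output of the last node

print(evaluate([1.0, 2.0]))    # -3.0
```

In a real run the same buffer would be reused across all fitness cases, which is exactly the allocate-once-and-for-all property the paragraph argues for.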
My final conclusion is that I'd like to express my hope that more people will use vocabulary related to biology and development in a more appropriate way. I hate hearing words such as "ontogeny" used in a merely "media" kind of way, to look fashionable; it does nothing but mix everything up and make things harder to understand. I do not know what ontogeny means, and I am proud of it. As my father likes to say about some people: "why express things in a simple way, when it is possible to say it in a complex way?"