Context-Free Language Induction by Evolution of Deterministic Push-Down Automata Using Genetic Programming. Afra Zomorodian
One advantage of this scheme is that it supports a type of modular code reuse: one part of the genome (e.g., a subtree) can describe a component (e.g., a subnet) that can appear multiple times in the fully grown phenome. Another potential advantage is the resulting effect of genetic crossover and mutation (i.e., the resulting transmission function [Altenberg, Advances in GP]).
Other examples include growing a DFA for language recognition from an evolved tree (Zomorodian's paper in the GP Symposium Working Notes), and growing a physical creature in a virtual environment (Karl Sims).
Is there an advantage to this separate growth phase that cannot be achieved by other means? For example, if an evolved parse tree (genome) is evaluated multiple times during the fitness measure, selected parts of the tree can be reused (phenotypic behavior). Is it necessary to physically grow a phenome? Are there actually circumstances under which we need an explicit physical "phenome", instead of allowing code reuse to emerge from "phenotypic behavior"?
If you want to grow a robot to manipulate the real world, there is a clear advantage, but my question applies to information processing applications.
"Need" isn't quite right -- as with many questions of GP technique, the issue is the *convenience*, to the evolutionary process, of the representation. In this case the issue is the convenience, to the evolutionary process, of various indirect mappings from genotype to phenotype.
"Growth" isn't quite right either -- many of us were talking about a more general ontogenetic process that is sensitive to the environment.
"Phase" isn't quite right either -- although many of the current techniques for genotype->phenotype mapping (e.g., construction of an automaton or a neural net, or the expansion of ADMs) occur in a single phase, several of us were kicking around ideas about multi-stage or continuous ontogeny.
Why is such a language appropriate for expressing problem solutions? If development helps to focus on and generate language constructs that are beneficial, then it is useful. This may be true because additional information (such as various constraints) can be used implicitly in the development function.
Development ensures that any genetic operations (crossover, mutation) make sense, generating valid structures.
An elementary form of growth is employed in the use of subroutines in GP as in ADF [Koza] or AR [Rosca]. This was outlined by Peter Angeline in his overview paper from EP-95 (Morphology in EC). [However, this growth is not explicit -- the subroutines are called on demand, instead of physically growing before measuring fitness -ed]
(a) Natural evolutionary biology does not prove that ontogeny is required to evolve successful computer programs, and I would not claim that it is. However, it does demonstrate that the combination of population (evolutionary) and individual (neural) learning is capable of great things. Simple species which have no nervous systems, but depend wholly upon genetic learning, cannot adapt as rapidly as species with nervous systems, nor do they exhibit behaviors as complex.
(b) It seems a natural way to marry the complementary strengths of several different machine learning paradigms.
a) small changes in the genome can have bigger effects, since growth will amplify the changes. [However, the same thing is true when there is no growth phase but there is some type of looping, e.g., multiple tree evaluations. -ed]
b) the growth phase can be used to bring about qualitatively different kinds of "structural changes": e.g., for a GP process that evolves neural nets, the growth phase can interpret the genome differently, i.e., make it a different *kind* of neural net, while still using the genome to control the design. Growth can thus be an "architecture creating/modifying" phase. (The relationship between the environment and the growth process is not clear, though.)
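Point a) above can be sketched concretely. Here is a minimal, hypothetical encoding (an L-system-style rewrite rule standing in for a genome) showing how a fixed number of growth steps amplifies a single-symbol mutation into a phenome of very different size:

```python
# Hypothetical sketch: the "genome" is a single rewrite rule applied
# repeatedly during a fixed number of growth steps. A one-character
# mutation in the rule is amplified by the growth process.

def grow(rule, steps, axiom="A"):
    """Expand the axiom by rewriting every 'A' with the rule, `steps` times."""
    s = axiom
    for _ in range(steps):
        s = s.replace("A", rule)
    return s

phenome1 = grow("AB", steps=6)   # rule 'AB': linear growth
phenome2 = grow("AA", steps=6)   # rule 'AA': one mutated symbol, exponential growth
print(len(phenome1), len(phenome2))   # 7 64
```

The same amplification could also emerge from repeated tree evaluation, as the editorial note observes; the sketch only makes the genotype-to-phenotype expansion explicit.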
2- The second aspect of the question is: is it simply necessary? Well, it is necessary if, for example, the neural network is recurrent and maintains internal state. The state will be different for each substructure, so there simply have to be different copies of the substructure to hold this state. This is the difference with standard GP, where a substructure is a procedure call: when the procedure call returns, there is no need to keep the state of the procedure's local variables. Another reason I find it necessary, in my work, is that the substructures are not merely copied; they are entirely redeveloped each time, and they develop in different ways depending on the boundary connections. In other words, there are context-dependent effects which cannot be modeled unless the substructures are effectively redeveloped and stored. Context dependence makes the development richer; in my approach, for example, how a subnetwork is hooked up to the whole network is context dependent.
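The state argument above can be illustrated with a toy stateful module (the `Accumulator` class is a hypothetical stand-in for a recurrent subnetwork): when the grown phenome contains two copies of the same substructure, each copy must carry its own internal state, which a single shared procedure cannot do.

```python
# Hypothetical sketch: a recurrent "subnetwork" whose output depends on
# its own history. Two physical copies grown from the same genome subtree
# must each hold separate state.

class Accumulator:
    """A minimal stateful module: output depends on all past inputs."""
    def __init__(self):
        self.state = 0.0
    def step(self, x):
        self.state += x          # internal state persists between calls
        return self.state

# Two physical copies of the same substructure:
left, right = Accumulator(), Accumulator()
left.step(1.0); left.step(1.0)   # the left copy has seen two inputs
right.step(5.0)                  # the right copy has seen one

print(left.state, right.state)   # 2.0 5.0 -- the states diverge
```

A plain procedure call would discard this state on return, which is exactly the distinction the paragraph draws with standard GP.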
3- Point 3 concerns the strong association between growth and architecture. Many of the contributions to this brainstorm group could be revised if their authors stopped thinking in terms of "pure GP", where the call of a subprocedure is an unpredictable event which may happen an unpredictable number of times, sometimes uselessly. In that framework, everything is highly dynamic: the computer memory needed to evaluate the fitness keeps growing and shrinking with the depth of nested function calls. The interest of growth is strongly correlated with the idea of handling a fixed-size architecture instead of this dynamic object that is the interpretation of a LISP program, or any program with procedure calls. In the growth approach, a large number of calls translates into a huge architecture, and you can immediately see its harmful effect in terms of memory size and speed. What you want is not only a correct program; you want to generate a nice, small architecture. You want to see your architecture, see it grow. Growth, for me, is not something to be discussed, but something to be managed.
In other words, one of the interests of growth is that you have a very precise tool to measure the time it will take to evaluate the fitness of a given individual: this is simply the size of the architecture. You know it before you have even started to evaluate the fitness; you have an idea of it even before the end of growth. As a result, you can have tight control over the fitness evaluation time, a control that is independent of any other feature of the particular algorithm you are using. I routinely abort cancerous developments (those that involve doubly recursive calls, for example), and these cancerous things do not consume any resources. The criterion is simple: if the baby grows beyond max(K1, K2*nio) cells, kill it. K1 and K2 are constants that are largely problem independent; nio is the number of input/output units. Cancer is common in early generations, but not for long. I'd say that cancer is THE illness that plagues the standard ADF approach with standard GP using program interpretation. Encapsulated ADFs are well known to take an awful amount of CPU, and people using them shamelessly use thousands of tricks to restrict the possible combinations. As a doctor, I'd say: use the architecture approach to cure that; at least you'd be able to see the illness at work, as your screen darkens with the growing number of cells!
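The abort criterion described above can be sketched as follows. The names `develop`, `grow_one_step`, `K1`, `K2`, and `nio` are hypothetical stand-ins for the author's setup; the point is only that development is killed as soon as the cell count exceeds the budget max(K1, K2*nio):

```python
# Hypothetical sketch of the "kill cancerous development" criterion:
# abort growth once the cell count exceeds max(K1, K2 * nio).

def develop(grow_one_step, genome, nio, K1=100, K2=10, max_steps=50):
    """Grow a phenome from a genome; abort if the cell count exceeds the budget."""
    limit = max(K1, K2 * nio)      # size budget, scaled by number of I/O units
    cells = [genome]               # development starts from one cell
    for _ in range(max_steps):
        new = grow_one_step(cells)
        if len(new) > limit:       # cancerous development: kill it, waste no CPU
            return None
        if new == cells:           # fixed point: development has finished
            break
        cells = new
    return cells

# A doubly-recursive (exponential) growth rule is aborted quickly...
cancer = develop(lambda cs: cs + cs, genome="g", nio=2)
# ...while a bounded rule develops normally.
healthy = develop(lambda cs: cs + ["g"] if len(cs) < 8 else cs, genome="g", nio=2)
print(cancer, len(healthy))       # None 8
```

The key property is the one the paragraph emphasizes: the check costs nothing beyond counting cells, so aborted individuals consume almost no resources.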
4- Coming back to the speed argument (the most important in my mind; all of computer science is about speed), an even more obvious interest of growth, and the "architecture" approach, is that the object is not dynamic: the memory needed to evaluate the fitness is allocated once and for all, for all the fitness cases. In many applications, each fitness case needs the phenotype to be "run" 1000 times, and this multiplies. This is an important feature if we consider that, very often, more time is spent in malloc and calloc than in actual arithmetic operations. You can also optimize the data structure that stores the architecture so as to sweep through all the nodes in a single, fully parallel loop that can obviously be pipelined or vectorized by the most stupid compiler (nowadays microprocessors do a lot of these fancy things -- the Pentium, for example, and most of the MIMD parallel dinosaurs: Paragon, CM-5...).
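The flat, preallocated layout described above might look like this. The parallel arrays and the tiny feed-forward net are hypothetical; the point is that evaluating a fitness case is one tight loop over the node arrays, with no allocation inside the loop:

```python
# Hypothetical sketch: the grown architecture stored in flat, parallel
# arrays (topological order), so fitness evaluation is a single sweep
# with no malloc -- the kind of loop a compiler can pipeline/vectorize.

src_a  = [0, 1, 2]         # index of each node's first input activation
src_b  = [1, 0, 3]         # index of each node's second input activation
weight = [0.5, -1.0, 2.0]  # one weight per node (toy model)

def evaluate(inputs, n_nodes=3):
    # activation buffer laid out once: inputs first, then one slot per node
    act = inputs + [0.0] * n_nodes
    base = len(inputs)
    for i in range(n_nodes):   # single sweep over all nodes, no allocation
        act[base + i] = weight[i] * (act[src_a[i]] + act[src_b[i]])
    return act[-1]             # output of the last node

print(evaluate([1.0, 2.0]))    # -3.0
```

In a real run the same buffer would be reused across all fitness cases, which is exactly the allocate-once-and-for-all property the paragraph argues for.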
My final conclusion is that I'd like to express my hope that more people will use vocabulary related to biology and development in a more appropriate way. I hate hearing words such as "ontogeny" used in a merely "media" kind of way, to look fashionable; it does nothing but mix everything up and make things harder to understand. I do not know what ontogeny means, and I am proud of it. As my father likes to say about some people: "why express things in a simple way, when it is possible to say it in a complex way?"