27 May 2007

Attempt to write a GA

This is really just a note-to-self:

This week I started playing with my MacBook's XCode IDE by putting together a simple genetic algorithm. Melanie Mitchell (1997) suggests the following programming exercise: put something together that ends up evolving binary strings of all 1s (i.e. a typical chromosome might start out as "0110001010110011" and after 20 generations of fitness-proportionate selection, roulette-wheel sampling, and 70% crossover, look like "1111011111111111." What fancy terminology.)

Paraphrasing Mitchell's description of a simple GA, the steps are along the lines of:

  1. Start with a randomly generated population of n l-bit chromosomes.
  2. Calculate the fitness f(x) of each chromosome x in the population.
  3. Repeat the following until n offspring are created:
    1. Pick two "parents," with so-called roulette-wheel sampling so that it's more likely to pick fitter parents. (Where fitness is the number of ones in the chromosome.)
    2. With some probability PROB_XOVR (like, 70%) cross over at a randomly-selected junction within the two chromosomes. These are two offspring.
    3. With some probability PROB_MUT (like, 0.1%) mutate (flip) each spot in each offspring chromosome.
  4. Replace the current population with the new population.
  5. Go to step 2. (You can go 20, 100, or however long it takes to evolve a fit population.)

Anyhow, it worked! I was surprised by what an impact crossover has, but it's probably the caffeine making me think fuzzy.

With a population of 100 and a chromosome length of 20, the average fitness (the number of 1s at the beginning of the universe) was, say, 9.8. By the 48th generation the average fitness was about 17, and by the next generation chromosomes started to appear with all 1s.

Here's another run. A snippet of Generation 0 shows what you'd expect; each of the length-20 chromosomes has about 10 zeros and 10 ones (in this run, 9.86):

totalfitness = 986. avgfitness = 9.860000.
Generation 0.

By generation 23 the population had become much more 1-laden. The average fitness climbed to 17.3:

Generation 23.
Found a gene with all 1s at generation 23.
Found a gene with all 1s at generation 23.
totalfitness = 1730. avgfitness = 17.299999.
Generation 23.


I guess the next step is plotting population fitness as a function of generation. That would be an excuse to re-learn file I/O.


No comments: