I wrote a "plain-text" genetic algorithm

Using python... it's just about finished now, close enough to realize that it actually works. Pretty cool.

All other GA algorithms I've seen used some sort of binary string encoding or something similar to represent the "organism" algorithms and mutate/breed/evolve them.

That sucks if you want the organism code to be plainly readable so you know what the hell it is actually doing when it runs. So I wrote a helluva long spaghettic code Python program that can excute, rank for fitness (based on a specific goal), and then mutate and breed a folder full of other Python programs. That Python is so readable makes it a bit easier, but writing mutation rules for program code that doesn't break syntax, logic, etc. was a bitch! It's kinda cool though, sort of like human DNA being represented in plain text instead of four letters in groups of three... imagine instead of guanine a base said "if variable 05 > sin(0.5) then add lysine to the protein you are working on." Ha!

I'm a crappy hack programmer if there ever was one (this whole thing is procedural... no object oriented in my world!) but I'm quite proud of my accomplishment. Downside is that compared to "dumber" binary type GA's this one is a lot slower to search a space due to all the text parsing, but it makes it much easier to seed organisms... you can actually write real Python programs that attempt to solve a problem, and let evolution take them to the next level(s).
 
lol! :)

There's strengths and weaknesses in what I've done. On the downside the time per generation is a lot longer than 'dumb' binary manipulations, and the search space is a lot more limited by the framework I set up for mutation rules, seed organisms, etc. so I won't be solving unknown mysteries of the universe this way.

On the upside, all of the progeny are executable python programs that are within the framework I've outlined so the search of the narrow space I've defined is pretty efficient. Not all of the progeny will do stuff that works, but they'll do stuff that makes sense. IOW, if I framed the search to solving a particular equation, I won't get mutants that attempt to draw stuff on the screen. So it seems a pretty good focused way to find an optimal solution for a particular problem - however, I'm not just using the GA to find optimal coefficients for a presumed solution, but rather search the space of possible ways to go about solving the problem.
 
If the strings are getting longer over the time you should consider using counted strings because slowness actually comes from searching for the string end each time the length is changed.
 
Hmm... I'm not sure I'm following you, probably because I'm such an absolutely horrid programmer. Well, I'm not a programmer by any stretch, I just happened to cobble this together in a language that even I can make sense of.

As for the strings growing... I have very fine control over all aspects of the mutation process: what types of things are allowed to be mutated (numbers, variables, math functions, equalities, whole if/then blocks of code, larger file fragments, etc.), what kind of mutations are allowed on those things (randomly change it, swap it with another in the same/different organism, copy and insert it somewhere appropriate, delete it, etc.), and how often these mutations occur (maximum total number of mutations per offspring, maximum number of each type, whether I want to enforce a number or allow a random number up to the limit, etc.).

So I play with the balance of these 50 or so control settings until the organism program length is fairly steady state on average over many generations, or has just a slight growth at the most.

Now, because I'm such a bad programmer and couldn't understand how to possibly get the diversity and archiving I wanted in object oriented, where each organism was just a member of a class, I've structured this so that each organism is its own separate python program and file. The "main" program just launches all the organisms in a given generation, ranks them for fitness, and follows the configuration settings to mutate and breed them to produce the next generation, and repeats the cycle.

Given that background... what is a string counter, and does it apply to what I'm doing? So far as I can tell, the slowness is coming from each organism being its own file, so even if a lot of variables (well, hundreds really) are shared between organisms, each one rebuilds its own lists and such. I do a lot of list building... that's about the only thing I really understand, and they work pretty well! lol :)
 
I guess I should give a little perspective on what I mean by slow too...

The task I'm having them work towards is pretty demanding as well, forcing each organism to open the same data file, read in a few hundred items and put that into a list, work some math, and repeat 50 or so times... for one organism in one generation. That takes maybe 5 seconds depending on how much data I want them to operate on in a given generation.

But you can see how this limits the speed of evolution. The mutation process is pretty quick, I can generate a thousand new organisms in under a minute, but I can only evaluate 10 or so per minute so that is the bottleneck.
 
Back
Top