1、What is a genetic algorithm?l Methods of representation l Methods of selection l Methods of change l Other problem-solving techniques Concisely stated, a genetic algorithm (or GA for short) is a programming technique that mimics biological evolution as a problem-solving strategy. Given a specific pr
2、oblem to solve, the input to the GA is a set of potential solutions to that problem, encoded in some fashion, and a metric called a fitness function that allows each candidate to be quantitatively evaluated. These candidates may be solutions already known to work, with the aim of the GA being to imp
3、rove them, but more often they are generated at random.The GA then evaluates each candidate according to the fitness function. In a pool of randomly generated candidates, of course, most will not work at all, and these will be deleted. However, purely by chance, a few may hold promise - they may sho
4、w activity, even if only weak and imperfect activity, toward solving the problem.These promising candidates are kept and allowed to reproduce. Multiple copies are made of them, but the copies are not perfect; random changes are introduced during the copying process. These digital offspring then go o
5、n to the next generation, forming a new pool of candidate solutions, and are subjected to a second round of fitness evaluation. Those candidate solutions which were worsened, or made no better, by the changes to their code are again deleted; but again, purely by chance, the random variations introdu
6、ced into the population may have improved some individuals, making them into better, more complete or more efficient solutions to the problem at hand. Again these winning individuals are selected and copied over into the next generation with random changes, and the process repeats. The expectation i
7、s that the average fitness of the population will increase each round, and so by repeating this process for hundreds or thousands of rounds, very good solutions to the problem can be discovered.As astonishing and counterintuitive as it may seem to some, genetic algorithms have proven to be an enormo
8、usly powerful and successful problem-solving strategy, dramatically demonstrating the power of evolutionary principles. Genetic algorithms have been used in a wide variety of fields to evolve solutions to problems as difficult as or more difficult than those faced by human designers. Moreover, the s
9、olutions they come up with are often more efficient, more elegant, or more complex than anything comparable a human engineer would produce. In some cases, genetic algorithms have come up with solutions that baffle the programmers who wrote the algorithms in the first place!Methods of representationB
10、efore a genetic algorithm can be put to work on any problem, a method is needed to encode potential solutions to that problem in a form that a computer can process. One common approach is to encode solutions as binary strings: sequences of 1s and 0s, where the digit at each position represents the v
11、alue of some aspect of the solution. Another, similar approach is to encode solutions as arrays of integers or decimal numbers, with each position again representing some particular aspect of the solution. This approach allows for greater precision and complexity than the comparatively restricted me
12、thod of using binary numbers only and often is intuitively closer to the problem space (Fleming and Purshouse 2002, p. 1228).This technique was used, for example, in the work of Steffen Schulze-Kremer, who wrote a genetic algorithm to predict the three-dimensional structure of a protein based on the
13、 sequence of amino acids that go into it (Mitchell 1996, p. 62). Schulze-Kremers GA used real-valued numbers to represent the so-called torsion angles between the peptide bonds that connect amino acids. (A protein is made up of a sequence of basic building blocks called amino acids, which are joined
14、 together like the links in a chain. Once all the amino acids are linked, the protein folds up into a complex three-dimensional shape based on which amino acids attract each other and which ones repel each other. The shape of a protein determines its function.) Genetic algorithms for training neural
15、 networks often use this method of encoding also.A third approach is to represent individuals in a GA as strings of letters, where each letter again stands for a specific aspect of the solution. One example of this technique is Hiroaki Kitanos grammatical encoding approach, where a GA was put to the
16、 task of evolving a simple set of rules called a context-free grammar that was in turn used to generate neural networks for a variety of problems (Mitchell 1996, p. 74).The virtue of all three of these methods is that they make it easy to define operators that cause the random changes in the selecte
17、d candidates: flip a 0 to a 1 or vice versa, add or subtract from the value of a number by a randomly chosen amount, or change one letter to another. (See the section on Methods of change for more detail about the genetic operators.) Another strategy, developed principally by John Koza of Stanford U
18、niversity and called genetic programming, represents programs as branching data structures called trees (Koza et al. 2003, p. 35). In this approach, random changes can be brought about by changing the operator or altering the value at a given node in the tree, or replacing one subtree with another.
19、Figure 1: Three simple program trees of the kind normally used in genetic programming. The mathematical expression that each one represents is given underneath.It is important to note that evolutionary algorithms do not need to represent candidate solutions as data strings of fixed length. Some do r
20、epresent them in this way, but others do not; for example, Kitanos grammatical encoding discussed above can be efficiently scaled to create large and complex neural networks, and Kozas genetic programming trees can grow arbitrarily large as necessary to solve whatever problem they are applied to.Met
21、hods of selectionThere are many different techniques which a genetic algorithm can use to select the individuals to be copied over into the next generation, but listed below are some of the most common methods. Some of these methods are mutually exclusive, but others can be and often are used in com
22、bination.Elitist selection: The most fit members of each generation are guaranteed to be selected. (Most GAs do not use pure elitism, but instead use a modified form where the single best, or a few of the best, individuals from each generation are copied into the next generation just in case nothing
23、 better turns up.)Fitness-proportionate selection: More fit individuals are more likely, but not certain, to be selected.Roulette-wheel selection: A form of fitness-proportionate selection in which the chance of an individuals being selected is proportional to the amount by which its fitness is grea
24、ter or less than its competitors fitness. (Conceptually, this can be represented as a game of roulette - each individual gets a slice of the wheel, but more fit ones get larger slices than less fit ones. The wheel is then spun, and whichever individual owns the section on which it lands each time is
25、 chosen.)Scaling selection: As the average fitness of the population increases, the strength of the selective pressure also increases and the fitness function becomes more discriminating. This method can be helpful in making the best selection later on when all individuals have relatively high fitne
26、ss and only small differences in fitness distinguish one from another.Tournament selection: Subgroups of individuals are chosen from the larger population, and members of each subgroup compete against each other. Only one individual from each subgroup is chosen to reproduce.Rank selection: Each indi
27、vidual in the population is assigned a numerical rank based on fitness, and selection is based on this ranking rather than absolute differences in fitness. The advantage of this method is that it can prevent very fit individuals from gaining dominance early at the expense of less fit ones, which wou
28、ld reduce the populations genetic diversity and might hinder attempts to find an acceptable solution.Generational selection: The offspring of the individuals selected from each generation become the entire next generation. No individuals are retained between generations.Steady-state selection: The o
29、ffspring of the individuals selected from each generation go back into the pre-existing gene pool, replacing some of the less fit members of the previous generation. Some individuals are retained between generations.Hierarchical selection: Individuals go through multiple rounds of selection each gen
30、eration. Lower-level evaluations are faster and less discriminating, while those that survive to higher levels are evaluated more rigorously. The advantage of this method is that it reduces overall computation time by using faster, less selective evaluation to weed out the majority of individuals th
31、at show little or no promise, and only subjecting those who survive this initial test to more rigorous and more computationally expensive fitness evaluation.Methods of changeOnce selection has chosen fit individuals, they must be randomly altered in hopes of improving their fitness for the next gene
32、ration. There are two basic strategies to accomplish this. The first and simplest is called mutation. Just as mutation in living things changes one gene to another, so mutation in a genetic algorithm causes small alterations at single points in an individuals code.The second method is called crossov
33、er, and entails choosing two individuals to swap segments of their code, producing artificial offspring that are combinations of their parents. This process is intended to simulate the analogous process of recombination that occurs to chromosomes during sexual reproduction. Common forms of crossover
34、 include single-point crossover, in which a point of exchange is set at a random location in the two individuals genomes, and one individual contributes all its code from before that point and the other contributes all its code from after that point to produce an offspring, and uniform crossover, in
35、 which the value at any given location in the offsprings genome is either the value of one parents genome at that location or the value of the other parents genome at that location, chosen with 50/50 probability. Figure 2: Crossover and mutation. The above diagrams illustrate the effect of each of t
36、hese genetic operators on individuals in a population of 8-bit strings. The upper diagram shows two individuals undergoing single-point crossover; the point of exchange is set between the fifth and sixth positions in the genome, producing a new individual that is a hybrid of its progenitors. The sec
37、ond diagram shows an individual undergoing mutation at position 4, changing the 0 at that position in its genome to a 1.Other problem-solving techniquesWith the rise of artificial life computing and the development of heuristic methods, other computerized problem-solving techniques have emerged that
38、 are in some ways similar to genetic algorithms. This section explains some of these techniques, in what ways they resemble GAs and in what ways they differ. Neural networksA neural network, or neural net for short, is a problem-solving method based on a computer model of how neurons are connected i
39、n the brain. A neural network consists of layers of processing units called nodes joined by directional links: one input layer, one output layer, and zero or more hidden layers in between. An initial pattern of input is presented to the input layer of the neural network, and nodes that are stimulate
40、d then transmit a signal to the nodes of the next layer to which they are connected. If the sum of all the inputs entering one of these virtual neurons is higher than that neurons so-called activation threshold, that neuron itself activates, and passes on its own signal to neurons in the next layer.
41、 The pattern of activation therefore spreads forward until it reaches the output layer and is there returned as a solution to the presented input. Just as in the nervous system of biological organisms, neural networks learn and fine-tune their performance over time via repeated rounds of adjusting t
42、heir thresholds until the actual output matches the desired output for any given input. This process can be supervised by a human experimenter or may run automatically using a learning algorithm (Mitchell 1996, p. 52). Genetic algorithms have been used both to build and to train neural networks. Fig
43、ure 3: A simple feedforward neural network, with one input layer consisting of four neurons, one hidden layer consisting of three neurons, and one output layer consisting of four neurons. The number on each neuron represents its activation threshold: it will only fire if it receives at least that ma
44、ny inputs. The diagram shows the neural network being presented with an input string and shows how activation spreads forward through the network to produce an output. Hill-climbingSimilar to genetic algorithms, though more systematic and less random, a hill-climbing algorithm begins with one initia
45、l solution to the problem at hand, usually chosen at random. The string is then mutated, and if the mutation results in higher fitness for the new solution than for the previous one, the new solution is kept; otherwise, the current solution is retained. The algorithm is then repeated until no mutati
46、on can be found that causes an increase in the current solutions fitness, and this solution is returned as the result (Koza et al. 2003, p. 59). (To understand where the name of this technique comes from, imagine that the space of all possible solutions to a given problem is represented as a three-d
47、imensional contour landscape. A given set of coordinates on that landscape represents one particular solution. Those solutions that are better are higher in altitude, forming hills and peaks; those that are worse are lower in altitude, forming valleys. A hill-climber is then an algorithm that starts
48、 out at a given point on the landscape and moves inexorably uphill.) Hill-climbing is what is known as a greedy algorithm, meaning it always makes the best choice available at each step in the hope that the overall best result can be achieved this way. By contrast, methods such as genetic algorithms
49、 and simulated annealing, discussed below, are not greedy; these methods sometimes make suboptimal choices in the hopes that they will lead to better solutions later on. Simulated annealingAnother optimization technique similar to evolutionary algorithms is known as simulated annealing. The idea borrows its name from the industrial process of annealing in which a material is heated to above a critical point to soften it, then gradually cooled in order to erase defects in its