I read somewhere that mutation probability should be nearly 0.015 to 0.02. I wonder how this mutation rate will make any difference to the original chromosome?
Mutation causes a (random) jump in the location of the generated solutions: crossover causes a more controlled and justified move in the location of the generated solutions. So mutation is a random explore, and crossover is more like exploiting what we know. Each time a gene is considered, the mutation probability is used to determine whether to mutate. So small values ensure that not too many mutations are considered at once - though that will depend on the number of genes in each member of the population. So to conclude, it's a question of exploration (randomly, but not utterly randomly as would be the case if many genes spontaneously co-mutated) versus exploration.
It makes a difference. GA is a simulation of human generation process. In reality mutation happens once in a million/trillion(may be more) times so that new born can have some extra quality than parents. Now if we implement the same technique in algorithm then mutation rate comes like that. But we need to remember mutation process also have some process to follow which needs to be determine based on problems.
whatever I have mentioned its just a depiction of the idea and tells about why mutation rate is so small. How rate will make any difference, no doubt Les have given very short and informative comment.
I would personally suggest trying to optimize the mutation rate for your given problem, as it has been shown (e.g. in an article Optimal mutation probability for genetic algorithms) that rates as high as .2 or even higher might be beneficial in certain genetic algorithms. My own experience as well with optimisation of cellular networks showed that it is best to run tests to see which rate works best. You might start with .3, .2, .1, 0.05, 0.01, 0.001 for instance and see which rate works best when everything else is held constant.
Mutation is used for exploration of search space, more precisely to allow candidates to escape from local minima. In case of a large mutation rate the population has difficulties to converge to a (global) minimum. Simply because whenever a candidate comes close to the global minimum mutation pulls it away from it. Thus, regarding exploration of search space, larger population is preferred compared to large mutation rate.
As already suggested, you don't want to reduce the GA to a pure random search so mutation rate needs to be small; on the other hand you need to maintain diversity in the population, so not too small! A popular choice is 1/(string length) which gives an average of 1 mutation per string. In practice it depends on other questions too, such as the encoding used, type of GA (generational vs. steady-state), population size, selection intensity, crossover rate etc. There's an extended discussion and references in the book I wrote with Jon Rowe: Genetic Algorithms-Principles and Perspectives (Kluwer, 2003).
Using larger mutation rates will prevent the genetic algorithm from converging more quickly. Ideally, you want the algorithm to find the optimal solution rapidly. Using small mutation rates leads quickly to good results. Using TOO small mutation rates makes the process much slower, induces lack of genetic variety, and eventually it might even not converge correctly. Using LARGE mutation rates makes it really hard for the algorithm to converge at all, because fit solutions are easily lost.
The best approach is to change the mutation rate gradually. Start with larger mutation rates to induce high genetic variety and avoid local maxima, and as the algorithm approaches the end (last iteration) reduce the rate to allow specification of the best solution. Another algorithm that resembles this is the Simulated Annealing.
Before answering your question, let me briefly describe some basic and important concepts.
As you probably know, we should always accomplish a proper balance between exploration and exploitation ability of the searching/optimiser algorithm. Exploration (simply but not precisely) means searching search space as much as possible, while exploitation means concentrating on one point (hopefully the global optimum).
In GA, mutation operators are mostly used to provide exploration and cross-over operators are widely used to lead population to converge on one the good solutions find so far (exploitation). Consequently, while cross-over tries to converge to a specific point in landscape, mutation does its best to avoid convergence and explore more areas.
Obviously, we prefer to explore much more in the beginning of the search process (to ensure the population coverage and diversity). On the other hand, we prefer more exploitations at the end of search process to ensure the convergence of the population to the global optimum. There is just an exception; when population converges to a local optimum, we should (if we can) increase the population diversity to explore other areas.
According to the above facts, too high mutation rate increases the probability of searching more areas in search space, however, prevents population to converge to any optimum solution. On the other hand, too small mutation rate may result to premature convergence (falling to local optima instead of global optimum).In other words, too high mutation rate reduces the search ability of GA to a simple (and dummy!) random walk while a too small GA (without any other facilities such as niching or crowd-avoiding to preserve diversity) almost always fails to a local optimum.
As Larry Raisanen mentioned, the best value of mutation rate is very problem specific. You can try several values in linear or bidirectional manner. Remember, as Colin Reeves wrote, this value also depends on the nature and implementation of the algorithm.In my opinion, however, there is no constant best mutation rate for most of the real world problems. As I mentioned before, searching algorithm demands different exploration-exploitation ability in different stage of the search process. Hence, a more dynamic mutation rate, as Paulo Gaspar proposed, is more preferred. I believe you can find more complex methods which adaptively tune the mutation rate according to the problem and the state of the current population comparing with the previous ones.
A crossover operator exploits parent solutions, hence usually a higher rate is maintained with the expectation of converging faster using the already explored regions. In contrast, since a mutation operator explores the surrounding of a solution, it is not safe to make a big move in an unknown region but to move slowly (carefully).
In order to visualize our friends answers we can suppose the search space as the curve in this picture. suppose we are at point 'A', if we use a small mutation rate we can explore some points like 'C', in this case we can achieve the chance of reaching point 'B' which is a little bit better than 'A' .But what if we use a large mutation rate and transfer from 'A' to 'E'? It is obvious that we will exploit into point 'D' which is weaker than A. We can conclude that large mutation rate only throws the solution into far distances of current solution and it prevents the algorithm to converge asap and also doesn't guarantee any optimum solutions. So it is wise to use small mutation rate in order to let the algorithm search for better solutions (if exist) in places not so far from the current solution.
The mutation rate is very low in models of Darwinian genetic algorithms because there is a very low probability that a particular random mutation shown to be useful and therefore stable in the population. On the other hand, neo Lamarckian genetic algorithms, such as those using targeted mutations, the mutation rate is usually much greater. This type of genetic algorithm has proved capable of simulating real phenomena evolution of life.
The idea is to understand the backbone of modern genetics and the neo-Darwinian theory of evolution by natural selection. Here the gene mutations occur at random, independently of the environment in which the organisms find themselves. Then, mutations that happen to be ‘adaptive’ to the environment are ‘selected’, while those that are deleterious are weeded out. This is controvesial so far.
If indeed Khalid, the genetic algorithm I developed can model the evolution of living beings. This requires the use of some kind of neo-Lamarckian functions. This same program is provided with a switch to randomize said functions, when it is in "on" position, then the algorithm fails and stops modeling as if extinctions happen always.
Check out "Zen and the art of evolutionary algorithms" http://dl.acm.org/citation.cfm?id=93153 One of the koans is "Let nature be your guide". In fact, most mutations in nature are fixed (by the reverse transcriptase enzyme, due to the double strand nature of DNA) which accounts for a mutation rate that is, in fact, several orders of magnitude smaller than used in evolutionary algorithms.
In any case, it depends on what you call "mutation rate". It's true that usually individuals are mutated by a small amount, but in some cases (canonical GA, for instance), all new individuals undergo mutation, so calling mutation rate "small" depends fully on the level you are looking.
All real living organisms are trying to reproduce themselves accurately. If the mutation rate is too high - basically, if bad mutations are introduced faster than selection can remove them - then it becomes impossible to preserve a genome, i.e. to transmit information forward through generations. So the mutation rate has to be lower than some limit, or the genomes get randomized in spite of selection pressures. On the other hand, if mutations are below that limit, then information can and will be transmitted. Hubert Yockey pioneered this kind of thinking, using Shannon's channel capacity theorem; for a recent survey article see "Genetic Channel Capacity Revisited" by Felix Balado. http://www.csi.ucd.ie/files/u79/balado-bionetics11.pdf
These arguments also apply to evolutionary computation, although some of the details differ.
In a standard Genetic Algorithm mutation is only meant to foster genetic diversity, in particular reintroducing lost alleles. For example, if at a certain point of the evolution process all solutions have a '0' allele for a given gene (i.e. all solutions feature a '0' at a given position in the binary string encoding the solution), the '1' allele is lost forever. This is because crossover can only recombine alleles that exist in the population. That is, crossover can not 'create' a child solution having allele '1' if both parents have allele '0' at a given position in the binary string If a useful allele is lost early in the evolution process by 'mistake' (e.g. premature convergence, or an 'unlucky' selection round), it can be reintroduced in the population only by mutation. However, lost alleles are usually not useful ones. They are removed from the population because they contribute negatively to the fitness of individuals. That's the basic assumption behind evolution theory: good alleles are retained and spread in the population, bad ones disappear. Therefore, in standard GAs mutation is kept very low because it is more likely to do harm (reintroduce a 'bad' gene weeded out by evolution) than good (reintroduce a 'good' gene lost by mistake). It is however necessary to ensure that no allele can be lost forever. In other kinds of evolutionary algorithms (or in non-standard GAs), mutation is used as a main search operator and the mutation rate is much higher.
Great answers. Roberto Vázquez hit on the important point it depends on the problem and to extend that further the type of genetic algorithm/evolutionary algorithm you are using. GAs are great technique but there is a strong experimental aspects to them and in my own experience actual setting due vary between types of GAs and even problems.
Many fitness landscapes display a critical mutation rate (error threshold) compatible with selection and adaptation. Beyond this threshold, the evolutionary algorithm behaves like a random walk and genomic information is lost, that is, structures are destroyed more frequently that selection can reproduce them. Empirical studies by Gabriela Ochoa have verified the occurrence of error thresholds in evolving populations of bit strings using a genetic algorithm (see for example: http://red.cs.nott.ac.uk/~gxo/papers/ETGasEC06.pdf)
The value may be different for your GA problem. What I did is mutate one bit of randomly selected chromosomes, if the chromosome size is small. for large chromosomes, more chromosomes/ more bits can be passed through mutation operation.
However, the idea is to find new better solutions that cannot be found by crossover operation.
According to my experience, mutation improve convergence.
This is my personal experience, depends on the nature of your problem (whether computationally expensive, combinatorial in nature, etc.), normally for continuous-value design optimization, you want to have a small mutation rate, too large a mutation rate, and you might get yourself a GRASP (grasp greedy randomized adaptive search) instead of GA. This assumes that you have a large population. However, in a computationally expensive combinatorial optimization problem, where you limit a GA population to be much smaller so as to control the computational cost, then higher mutation rate will be justified.
It is true that the mutation is used for introducing diversity in the population. High rate of mutation lead to RANDOM search. My personal opinion is that if the population size is small high rate of mutation should be used (2%-10%) but if you are working on larger population, the smaller mutation rate should be preferred (less than 2%).
Mutation regards the meta heuristic aspect of GA. But it must be treated very carefully. Because a high rate of mutation can lead the solution to diversity. Mutation works in a random manner and its duty is to avoid the solution from trapping in local optima through exploiting the unseen areas of the search space. But a big value for it certainly diverges the solution.
All of the replies provide good information on mutation rate. I want to add that in a generic GA, since there exists randomness in other components (like crossover and selection), mutation rate is taken small not to increase the diversification too much. If a GA is designed differently, then the mutation rate could be higher to find a good balance between diversification and intensification.
GA=Exploration(mutation=something willy nilly and good at low rate, so we dont get stuck in a small region of problem space)+Exploitation(Crossover+Fitness based selection=moves us to the global answer)
In addition it must be considered that mutation rate and mutation step is dependent to the other algorithm parameters such as selection pressure, population size and elitism strategy (if used).
Thank you guys, my confusion is what is the role of mutation and crossover probability. Because in one iteration of GA we do selection, cross over and mutation and evaluation.
the role of Mutation is to solve the local minimum problem. because the heuristic and meta heuristic algorithms may drop in a local minimum and can not find a better point so with mutation we have a chance to prevent to drop in a local minimum.
If mutations are random, all of the above is possible but limited. If mutations are directed (i.e. a sort of evolutionary algorithm based on the #SDMtheory) this is another story. Please follow me: @evolutionraad
@Borhan Kazimipour First , Thank you for your detailed answer. Can you recommend a refrence that mentions (mutation rate is a very probelm-specific) . Thank you very much.
I read in G. Moghadampur (2013), "Outperforming Mutation Operator with Random Building Block Operator in genetic Algorithm" that the lower limit for mutation probabilities can be this equation min_lim = 1 / len(chromosome). For instance if the length of chromosome (i.e. candidates) is 16bit, then the lower limit for the probability is 1/16 = 0.0625 or 6.25%.
I am personally using a multiplier (i.e. 2~4) to set the upper limit. This way there is statistically less chance for mutation to occur but at the same time, there is enough chance that mutation may occur in some of the generation so that the candidate designs do not get stuck on a local minima.
Please note this multiplier trick is a hypothesis and unknown to me as of writing this post, if somebody has already tried it.
Mutation-crossover ratio is actually based on the problem scenario, as others have also rightly directed.
If someone has not checked it yet, this article gives an overview of the different instances on the selection criteria for each of them:
Hassanat, A., Almohammadi, K., Alkafaween, E., Abunawas, E., Hammouri, A. and Prasath, V.B., 2019. Choosing mutation and crossover ratios for genetic algorithms—a review with a new dynamic approach. Information, 10(12), p.390.
In my view, the mutation probability is based on your problem you can determined it based on your problem. In General , the nature of GA , the mutation is like a global search, and it can occurs less than crossover probability (global search).