GAs and other evolutionary algorithms seem to be quite popular among researchers in Computer Science and Management, but researchers in Mathematics just don't like them. Is there any particular reason for that?
We know that with these algorithms we can't prove that our solution is optimal, but in most cases we still end up with a solution that is optimal or very close to it. Apart from that, these algorithms are very easy to implement and fast.
Actually, I don't think it is the case that mathematicians don't like evolutionary algorithms. I think what mathematicians really don't like is the way that some researchers claim to come up with "new" meta-heuristics every month or two, based on things like bats, fireflies and water droplets. Very good discussions of this phenomenon can be found in the following papers.
http://onlinelibrary.wiley.com/doi/10.1111/itor.12001/abstract
http://www.iztok-jr-fister.eu/static/publications/Stu2016.pdf
http://www.sciencedirect.com/science/article/pii/S221471601500010X
Hi, you are right: since there is no mathematical proof of convergence for these methods, the mathematicians involved in optimization theory do not appreciate these approaches.
Dear Mohammad Asim Nomani,
In my opinion, the fashion for heuristic solutions will soon pass, since these solutions are not science but only craft. Suppose that there are two algorithms "A" and "B". Algorithm "A" produces a good solution A(D) for the initial data D, but the algorithm does not know anything about its quality. Algorithm "B" produces a worse solution B(D) > A(D), but this algorithm knows how good its solution is, because it has a LOWER BOUND LB(D) and can measure how bad the solution is as 100% * (B(D) - LB(D)) / LB(D), where LB(D) < A(D) < B(D).
Now I want to ask: which algorithm is valuable for science, "A" or "B"? My answer would be "B". My opinion is the following: a solution without any evaluation of its quality IS WORTHLESS, whereas a solution accompanied by such an evaluation is of practical interest.
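To make the measure above concrete, here is a minimal sketch (my own illustration, not from this post) of the gap computation for a minimisation problem, assuming a solution value and a valid positive lower bound are already known:

```python
# Sketch: the worst-case relative error 100% * (B(D) - LB(D)) / LB(D)
# for a minimisation problem with a known lower bound.

def optimality_gap_percent(solution_value: float, lower_bound: float) -> float:
    """Return 100 * (solution - LB) / LB, an upper bound on the relative error."""
    if lower_bound <= 0:
        raise ValueError("This simple formula assumes a positive lower bound.")
    return 100.0 * (solution_value - lower_bound) / lower_bound

# Example: algorithm B returns 108 and we know LB = 100, so the solution is
# provably within 8% of the (unknown) optimum.
print(optimality_gap_percent(108.0, 100.0))  # 8.0
```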
Hi all,
Very interesting discussion. I do agree that it is related to the proof of convergence, which can be guaranteed if you apply, for instance, simulated annealing. However, from my point of view, approaches in Engineering are completely different from those in Mathematics, and therefore many suboptimal solutions are sufficiently good to tackle real problems with constraints and nonlinearities in many variables.
Dear Prof. Fedulov, thank you very much for your answer. I think what I wanted to know is quite clear and understandable now from your example.
Dear Prof. Guerra, I agree. And these different approaches and problems lead people from engineering to rely on solutions which can't be proven optimal.
But is it the case that these evolutionary algorithms are faster for highly nonlinear optimization problems?
Hi,
Mathematics professors are very concerned with laws and postulates, whereas researchers from Computer Science and Management are drawn to evolutionary algorithms because the constraints can be modified to suit the user. They feel more comfortable with these algorithms because they are not required to prove them against such laws.
A first step is to accept that not all mathematicians have problems accepting EAs. Yes, there are a few, but there are also EA researchers who do not see why we are still using conventional methods, which is just as bad.
The second issue comes down to the structure of the problems being solved; not just whether the problems are of theoretical interest only, or are real problems. If the problem is unimodal and quite well behaved, then sometimes an analytical solution can be found, or classic gradient approaches work extremely well. An EA could solve the same unimodal problem, but rarely as efficiently (the few exceptions are often for very high-dimensional problems where calculating the gradient is expensive). If the problem is multi-modal, then there is not a single solution, and direct analytic solutions cannot in general be found. A gradient-based method will not guarantee finding the true optimum either; many people forget that a random-start gradient method is actually a meta-heuristic algorithm and therefore no different from an EA in terms of the arguments relating to the 'stochastic' content vs. the 'deterministic' content of the optimisation process.
The third issue is very contentious and relates to the 'reality' of convergence proofs. For the gradient methods, the proof of convergence is for UNIMODAL functions; there are no general proofs for MULTI-MODAL functions that do not require a second optimiser to be combined with the gradient method (e.g. a random restart etc.). For simulated annealing, to have guaranteed convergence as per the proof, the cooling rate needs to be so slow as to be impractical; therefore a realistic algorithm does not actually satisfy the proof. For an EA (this is where it gets *really* controversial) I have heard it said that "the lack of a convergence proof is more a statement of the current state of mathematics than a fault with the EA concept itself"; I do not fully agree with the statement, but it does make one think! Being realistic, there was a time when a proof for simulated annealing did not exist either, so a proof for EAs may well exist and would make them just as acceptable as gradient approaches; it is just that we have not found it yet.
In general, the optimiser needs to be matched to the problem. Analytic solutions and classic gradient methods should always be considered first because, if they work, they will often be the fastest and most accurate. Then meta-heuristics such as combinations of a small EA + gradient should be considered, and only when all other optimisers have failed should a full EA solution be tried. Without a proof of convergence (or at least something close to it), applying an EA is a black art and fraught with difficulties in practice. The other alternative is often to use the concept of "if you have a problem that is hard to solve, try changing the problem"; the concept is used routinely in EAs where different objective functions or chromosome structures are used. In classic optimisation, solving a least-squares formulation may be the most common approach, but it is not always the only one that can be mathematically useful.
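As a rough sketch of the "small EA + gradient" combination mentioned above (my own illustration, with arbitrary test function and parameters, not anyone's published method): a crude evolutionary search explores the multi-modal landscape, and a local gradient-based polish finishes the job.

```python
import numpy as np
from scipy.optimize import minimize

def rastrigin(x):
    # A standard multi-modal test function (many local optima, one global).
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

rng = np.random.default_rng(0)
dim, pop_size, generations = 5, 20, 30
pop = rng.uniform(-5.12, 5.12, size=(pop_size, dim))

# Very small "EA": keep the best half, mutate it to refill the population.
for _ in range(generations):
    fitness = np.array([rastrigin(ind) for ind in pop])
    parents = pop[np.argsort(fitness)[: pop_size // 2]]
    children = parents + rng.normal(scale=0.3, size=parents.shape)
    pop = np.vstack([parents, children])

best = pop[np.argmin([rastrigin(ind) for ind in pop])]

# Local gradient-based refinement started from the best EA individual.
result = minimize(rastrigin, best, method="L-BFGS-B")
print(result.x, result.fun)
```

The EA only has to land in the right basin of attraction; the gradient method then converges quickly and accurately within that basin.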
Dear Prof. Nomani,
We would like to add, based on our Millennium Prize solution of the P vs. NP problem and the invention of the Mallick-Hamburger-Mallick D-Branes String Functor Algebra Calculus (Mallick (2016, 2017, submitted)), and may we say so humbly, that both Genetic Algorithms and Particle Swarm Optimisation are, as we have discovered theoretically, mathematically and experimentally, bound by certain natural laws of systems (Computer Science) and String Theory (above references), as well as by mathematical computation methods such as the Newton-Raphson method or the trapezoidal method of numerical analysis. Therefore, at least in practice, they are both mathematical and scientific, and, as we have discovered, they are conformal with Field Theory logic and trigger a flow of information and energy, hence are relativistic. Sorry for overstating the answer, but I thought you and the other readers might be interested.
Soumitra K. Mallick
for Soumitra K Mallick, Nick Hamburger, Sandipan Mallick
A confident guide is better than many random guides, because it saves both time and energy in reaching the desired goal. A similar philosophy lies behind preferring analytical approaches over heuristics.
with best regards
Let me add my 2 cents from the aerodynamics viewpoint, a field which heavily uses optimization techniques to find the optimum shape of aerodynamic bodies.
Aerodynamic optimization is highly expensive, which hinders the use of metaheuristics for such cases unless you have access to supercomputers. Luckily for aerodynamicists, they have access to gradient information that can be obtained from the adjoint solver, at a cost roughly equal to one function evaluation for an arbitrary number of variables. When you have the gradient, you can mathematically check whether your solution is already optimal or not; by using a gradient-based algorithm, convergence will also be very fast! Here you can see that some researchers have devoted their time to developing efficient gradient computation methods. They are also happy when they know that their optimizer has reached the global optimum, since it mathematically makes sense. Also, most of them are mathematically trained, hence their approach is also mathematical.
Now, you could use metaheuristics to solve such cases. But the problem is that they are too expensive and they cannot guarantee optimality! It is much better to use a multi-start gradient-based search to find the global optimum than metaheuristics (please see this reference: http://www.tandfonline.com/doi/abs/10.3166/remn.17.103-126?needAccess=true&journalCode=tecm20).
So you see that people in aerodynamics frequently use gradient-based algorithms because of their practical (faster convergence) and mathematical advantages.
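As a tiny illustration of what "mathematically check whether your solution is already optimal" means (my own sketch with a toy analytic objective, not the adjoint workflow or the reference above): with a gradient available, first-order stationarity of an unconstrained candidate is simply a vanishing gradient norm.

```python
import numpy as np

def objective(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2

def gradient(x):
    # Here the gradient is analytic; in aerodynamic design it would come
    # from the adjoint solver at roughly the cost of one extra solve.
    return np.array([2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)])

candidate = np.array([1.0, -2.0])
is_stationary = np.linalg.norm(gradient(candidate)) < 1e-6
print(is_stationary)  # True: the first-order optimality condition is satisfied
```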
I would like to add my 2 cents to the discussion. Gradient-based deterministic methods are not only restricted to smooth problems (OK, one can use subgradients etc. if one is tough enough), but, as already said, they also converge only to the local solution closest to the initial guess. Gradient-free deterministic methods require evaluating an exponential number of possibilities, and if the problem has many variables this approach can simply be considered intractable. Some problems also cannot be dealt with by gradient-based methods at all because they have a discrete nature, and for those problems it is very easy to end up with a combinatorial kind of complexity.
Evolutionary methods were born exactly to get around those difficulties, so there IS a point in their existence and usage. Moreover, there IS a convergence theorem for evolutionary algorithms in general search spaces (Rudolph, 1996), so there is some strong mathematical justification for those as well.
From a practical point of view, a limitation of ALL algorithms (evolutionary and gradient based!) is the absence of an upper bound on the number of iterations required: all methods converge in a finite but arbitrarily large number of iterations. If time to solution is an issue, this aspect has to be carefully considered.
A final consideration: optimisation is a wonderfully vast and rich field, and one has to choose the right tools for the given problem and the means available; there is no silver bullet, thus it's not for everyone. I had the honour of participating in an extremely tough optimisation competition, the GTOC, where each team has only 1 month to solve a monstrously complex space trajectory optimisation problem (seriously, Google it). The only way to tackle those problems is with a mixture of gradient and evolutionary methods, a very smart approach to the problem and a lot of insight. Teams that didn't use both approaches either couldn't return a feasible solution or returned rather poor ones. Teams that used a mixture of approaches (ours included) managed to return feasible and very good solutions.
@Gennady Fedulov: knowing the lower bound of a function is extremely useful and can also be used in evolutionary methods.
In most cases the knowledge of a lower bound is problem-dependent, but for some problems you can refine it iteratively with an algorithm (as with Lipschitz optimisation or Branch and Bound). Those methods usually tend to perform a dense sampling of the search space, so the number of function evaluations can grow extremely fast.
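As a quick illustration of the Lipschitz idea (my own sketch with a toy 1-D objective and an assumed Lipschitz constant, not any specific published algorithm): each sample x_i gives f(x) >= f(x_i) - L*|x - x_i|, so the pointwise maximum of these cones is a valid lower-bounding function that can be refined iteratively.

```python
import numpy as np

def f(x):
    return np.sin(3 * x) + 0.5 * x          # toy objective on [0, 4]

L = 3.5                                      # assumed Lipschitz constant of f
grid = np.linspace(0.0, 4.0, 2001)           # bound evaluated on a grid for simplicity
samples = [0.0, 4.0]

for _ in range(10):
    values = np.array([f(x) for x in samples])
    # Lower envelope: max over samples of f(x_i) - L*|x - x_i|.
    cones = values[:, None] - L * np.abs(grid[None, :] - np.array(samples)[:, None])
    envelope = np.max(cones, axis=0)
    lower_bound = envelope.min()
    incumbent = min(f(x) for x in samples)
    print(f"LB = {lower_bound:.3f}, best found = {incumbent:.3f}, gap = {incumbent - lower_bound:.3f}")
    # Refine where the bound is weakest; the sampling densifies quickly.
    samples.append(grid[envelope.argmin()])
```

The printed gap shrinks as the bound is refined, which is exactly the kind of guaranteed quality measure discussed earlier in the thread, and it also shows why the number of evaluations grows fast.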
The fact that some algorithms are just "craft" and others are "science" is mostly a matter of theoretical refinement, in my opinion. There is some very serious work going on by many scientists in that field, mathematicians included.
Finally, the judgement of a solution is not something that an algorithm can really provide, so only a field expert can tell whether a solution is "worthless". An algorithm can converge to a provably optimal solution, but it can't tell whether the problem was posed in a meaningful way: only a competent person can perform that kind of judgement on the solution.
Dear friend,
Actually, it is a branch of applied mathematics, so we cannot say that mathematicians don't like it. Researchers in pure mathematics may not like it, but mathematicians in applied fields take great interest in it.
Dear He Yichao,
I completely agree with your phrase "Because GA and other evolutionary algorithms are randomized algorithms and lack of rigorous mathematical theory, most of the work of mathematicians rely on rigorous deductive reasoning, which is not consistent with randomness, so they are not very acceptable".
At the same time, I want to note that there is a very beautiful science, "Probability Theory", which is very much consistent with randomness. As for me, I really like this theory.
OK,
since there appears to be some agreement about the absence of rigorous mathematical proofs for evolutionary algorithms, I'm linking to one such proof I was mentioning:
http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=542332
The paper gives a theoretical proof of global convergence for any kind of evolutionary algorithm on any search space, provided the algorithm has some features. The demonstration uses Probability Theory and Markov chains, and it is, in my opinion, very deep and very simple at the same time.
From a practical point of view, as I already said, this proof unfortunately does not tell us how far we are from the global optimum, so we don't have a rigorous stopping criterion. That's where the "art" comes into play.
Another interesting aspect is that, for single-objective problems, almost all evolutionary algorithms I've seen so far are more or less contained within this one, so almost all "new" algorithms can be seen as rehashes of this general template. The main difference in performance between evolutionary algorithms then boils down to how efficient (tailored?) some heuristics are at solving a specific problem, but we already know from the No Free Lunch theorem that they'll perform poorly on some other problems.
Note that I'm not saying that an algorithm must necessarily follow this template, only that if it does it is backed by rigorous proofs.
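To make the template concrete, here is a minimal sketch of the kind of elitist scheme discussed above: the best solution found so far is never lost, and the mutation distribution can reach any region of the search space with non-zero probability. These are (paraphrased) the sort of conditions under which global convergence results such as Rudolph (1996) apply; this is my own toy illustration, not the paper's algorithm.

```python
import numpy as np

def sphere(x):
    return float(np.sum(x**2))

rng = np.random.default_rng(42)
dim, pop_size, generations = 3, 10, 200
pop = rng.uniform(-10, 10, size=(pop_size, dim))
best_x = min(pop, key=sphere)                 # elite individual
best_f = sphere(best_x)

for _ in range(generations):
    # Gaussian mutation has full support over R^dim: every region stays reachable.
    offspring = pop + rng.normal(scale=1.0, size=pop.shape)
    # Greedy replacement: keep whichever of parent/offspring is better.
    pop = np.array([o if sphere(o) < sphere(p) else p for o, p in zip(offspring, pop)])
    # Elitism: never discard the best solution seen so far.
    candidate = min(pop, key=sphere)
    if sphere(candidate) < best_f:
        best_x, best_f = candidate, sphere(candidate)

print(best_x, best_f)
```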
Mathematicians want formulas and rigorous demonstrations. And I cannot say that they are wrong, even if they sometimes exaggerate.
I think it's good and useful to use all the tools that help us solve a problem. These instruments (math and EC) should not eliminate each other.
Deep learning/neural networks/machine learning are similar to EAs: like a black box, they need iterations to arrive at solutions. These methods have been deemed very efficient and practical tools for research in AI, especially in the BIG DATA era. So far, I have not heard that mathematicians don't like them. What is your opinion of the tools mentioned above compared to EAs?
Hi Sf Chien,
Artificial neural networks (and therefore DNNs) are simply general-purpose mapping functions that map points in one input space over to another space, often with a different dimensionality between the input and output: they are not optimisers in their own right. The nets have many free parameters that must be set in order to generate a specific mapping action. The structure of the nets with their hidden layers and non-linear elements allows for a very rich range of mappings to be created; the ability to handle extreme non-linearities, such as a softmax layer on the output, also makes the structures very useful for classification applications.
To determine useful values for the free parameters, an optimiser is needed; if the net is trained with supervised learning then the optimisation routine is explicit, but even self-organising systems use an optimisation step to adjust the weights. The vast majority of ANNs use a gradient descent optimisation routine of some form; back-propagation being a common example. The optimisation problem for ANNs is not only multi-modal, but is also multi-global and is therefore not trivial (i.e. swap over two neurons in the same layer and you get exactly the same net behaviour, but a different vector of weights to describe the network, leading to multiple global optima). Meta-heuristics, such as back-propagation with a random restart or back-propagation with momentum terms and random restart, are practical ways to train the network, however with methods based on propagating gradient information backwards, there is a real problem with the loss of gradient information as we work backwards from the output nodes, leading to great difficulty in training the early layers; the DNN techniques such as rectified linear activation functions have improved this, but not solved it.
The other problem is determining the net structure. The more weights there are to optimise then the larger the optimisation decision variable space dimensionality and the more difficult the optimisation problem. The DNN approach is primarily to use convolutional nodes to allow weight sharing and then pooling to reduce the dimensionality further. Other methods try to build up the network complexity over time, such as with NEAT etc.
Overall, Artificial neural networks are a really good example of an application where meta-heuristics, such as random restart back-propagation, are the normal approach for producing very useful optimisation results.
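As a small sketch of that separation (my own illustration, not a practical training recipe): the network is just a parametric mapping from inputs to outputs, and a separate optimiser, here plain gradient descent with finite-difference gradients purely for brevity, adjusts its free parameters against a training set.

```python
import numpy as np

def net(params, x):
    """A tiny 1-hidden-layer mapping R -> R with 3 hidden tanh units."""
    w1, b1, w2, b2 = params[:3], params[3:6], params[6:9], params[9]
    return float(np.dot(w2, np.tanh(w1 * x + b1)) + b2)

def loss(params, xs, ys):
    return sum((net(params, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = np.linspace(-2, 2, 21)
ys = np.sin(xs)                              # data the mapping should reproduce
params = np.random.default_rng(1).normal(size=10)

# The "learning" is nothing more than an optimisation loop over the weights;
# back-propagation, random restarts or an EA could all be swapped in here.
lr, eps = 0.05, 1e-5
for _ in range(2000):
    grad = np.zeros_like(params)
    for i in range(len(params)):
        bumped = params.copy()
        bumped[i] += eps
        grad[i] = (loss(bumped, xs, ys) - loss(params, xs, ys)) / eps
    params -= lr * grad

print(loss(params, xs, ys))
```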
Dear Evan Huges,
thank you for your answer. I'm not an expert in ANN/Deep Learning, but you confirmed a personal point of view about them.
They are a very good and flexible way to generate a surrogate model, without having to explicitly define the structural function of the model.
The "learning" aspect is just an optimisation of those weights to maximise the matching of the model wrt given data, and this optimisation can be performed in several ways.
All in all, it seems to me that ANN is basically a rebranding of something we could already do decades (if not centuries) ago: regression.
Obviously, the incredible results we're getting with ANN are undeniable, but I suspect it is more a matter of having a large scale surrogate model and the computational power to train it, than some intrinsic feature of ANN itself.
Hi Lorenzo,
Yes a very simple 1 input, 1 output linear net (i.e. a single neuron) could be y=mx+c and linear regression does indeed often provide a very good solution. With many neurons in hidden layers and careful choice of the non-linearities, an ANN can be so versatile and useful, and when coupled with a good optimiser, can provide for fantastic surrogate models.
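For instance (my own quick illustration), the single linear neuron y = m*x + c trained by least squares is exactly ordinary linear regression, solvable in closed form:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.5 * x + 1.0 + rng.normal(scale=0.5, size=x.size)   # noisy line

# "Training" the one-neuron net in closed form: ordinary least squares.
m, c = np.polyfit(x, y, deg=1)
print(m, c)   # recovers roughly 2.5 and 1.0
```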
The ANN structure is also an interesting example of processes that can be optimised well with combinations of gradient methods, and also evolutionary approaches in particular. For example, instead of using the most basic meta-heuristic of a random restart for initialising back-propagation runs, a simple evolutionary programme is quite often far more effective for deciding where to start each back-propagation attempt, allowing many of the local optima to be avoided.
The Neuro-Evolution methods (i.e. NEAT etc.) can allow excellent network structures to be optimised; network structure optimisation is not a task that is well suited at all to classic gradient methods alone.
There are also interesting uses of co-evolution for optimising neural nets, in particular for developing game-playing engines. Here there is no directly quantifiable error function; all we can say is "net A plays better than net B". Despite no good gradient data being available, the co-evolutionary methods can work with the weak information to allow something useful to be achieved.
Although evolutionary and other stochastic methods can train and design neural nets, if gradient information is available and a well-formulated gradient-based algorithm can be found that addresses the design problem, then the gradient method would usually do better. For example, try back-propagation to train a fairly simple network on some training data, then try an EA to do the same job. Conversely, try gradient methods for deciding on how many neurons to use and in how many layers.
Hi Sf Chien,
The terms 'conventional' and 'unconventional' are not appropriate, and ANN/Deep learning and EAs are incomparable too as they do not do the same job: ANN/DNN are mapping functions that rely on using an optimiser; an EA is an optimiser.
ANN/DNN are (in general) highly non-linear parametric modelling functions, in comparison to other parametric models, such as Spline surfaces or Fourier Series, which are well suited to modelling specific types of data sets. Therefore saying ANN/DNN are non-linear general purpose modelling tools is appropriate, but they are not optimisers.
One of the best ways of comparing EAs to classical optimisation algorithms is to consider whether the gradient magnitude and direction are used explicitly or implicitly. In a gradient descent algorithm, we calculate the gradient explicitly at the current sample of interest, and then generate the next sample based on the gradient direction and magnitude.
An EA can make multiple uses of 'gradient' information. In the population management process that is common to all EAs, the gradient is used implicitly; i.e. the gradient is not actually calculated and only an approximate gradient direction is generated. For example, an EA may use a logical comparison to either rank solutions so that the best can be selected, or perform pair-wise comparisons to decide whether a solution should replace its parent, etc. The second common use of implicit gradient information may be in crossover operations; selection of parents may be based on their performance, and in techniques such as Differential Evolution, difference vectors are generated between multiple parents and used to create new trial points.
Therefore the terms 'explicit gradient' and 'implicit gradient' are pretty good at an approximate classification of the more classical optimisers and the EA approaches.
We must be very careful, however, because for multi-modal optimisation we must re-start a gradient-based optimiser. Therefore we fundamentally have a meta-heuristic that uses, say, a random search to start the gradient optimiser; we are using the explicit gradient to find the local peaks, then implicit gradient information to select the best overall solution after our random search process.
A random search is probably the oldest optimiser (and therefore also the most conventional), followed by a 'trial and error' approach of trying a new version of the thing being optimised based on our last attempt. A random search of N points is equivalent to an Evolutionary Programme with a population of N and 1 generation, and the 'trial and error' approach could be modelled by an EP with a population of 1 and N generations. Therefore making comparisons of algorithms is not a simple process.
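The Differential Evolution point above is perhaps the clearest example of an 'implicit gradient'; here is a minimal sketch (my own, the classic DE/rand/1/bin step on a toy function, with arbitrary parameter values): no derivative is ever computed, yet the difference vectors between population members point in directions the population has already found to be productive, and the pair-wise replacement is another implicit use of ranking information.

```python
import numpy as np

def sphere(x):
    return float(np.sum(x**2))

rng = np.random.default_rng(3)
pop_size, dim, F, CR = 15, 4, 0.8, 0.9
pop = rng.uniform(-5, 5, size=(pop_size, dim))

for _ in range(100):
    for i in range(pop_size):
        # Pick three distinct members other than i.
        idx = rng.choice([j for j in range(pop_size) if j != i], 3, replace=False)
        a, b, c = pop[idx]
        mutant = a + F * (b - c)                 # implicit 'gradient' step
        cross = rng.random(dim) < CR
        trial = np.where(cross, mutant, pop[i])
        # Pair-wise comparison: replace the parent only if the trial is no worse.
        if sphere(trial) <= sphere(pop[i]):
            pop[i] = trial

print(min(pop, key=sphere))
```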
Visit the links below for the MATLAB code:
https://youtu.be/OQ3T575sbI8
http://dg-algorithm.blogspot.com/
I think they simply want a mathematical proof of the method's claims.
A short answer would be that metaheuristics don't perform well in benchmarks, compared to other methods. (I agree that the issue of proofs is a bit contentious.)
This is especially true for small function evaluation budgets (not thousands or tens of thousands, but hundreds), which is relevant for costly scenarios such as CFD or other simulations. If function evaluation budgets are very large, algorithmic performance matters less (naturally).
See my own work (e.g., Model-based Optimization for Architectural Design—Optimizing...
), as well as black-box optimization competitions, and Rios LM, Sahinidis NV (2013) Derivative-free optimization: a review of algorithms and comparison of software implementations. Journal of Global Optimization 56(3), pp. 1247–1293.
Maybe in the past there was, in general, a nonconformity between metaheuristics and mathematics. But I see that there are now a number of works that use mathematics to analyse, for instance, the convergence and the behaviour of metaheuristics.
Not to mention Flying Elephants
Editor’s Note on the MIC 2013 Special Issue of the Journal o...
This is a very interesting topic, similar to: why do control engineers not like to use fuzzy logic or neural networks? Mathematicians and engineers are educated to implement algorithms for which they can demonstrate/certify some properties (local convergence, for example, or robustness), which enables a level of confidence and trust. Similarly, control engineers are required to determine features like stability margins and robustness. Many algorithms can be implemented that are successful in some specific contexts, but nobody is capable of demonstrating any general property; you are asked to trust them only on the basis of their success in some application. This is against our culture and should be done carefully: in certain conditions these algorithms could produce wrong solutions, and in critical applications they could be extremely dangerous. I am a user of genetic algorithms, simple and fast to use, but I would never sell a solution using them, because I cannot certify the results.
In fact, there is already a verdict regarding Evolutionary Algorithms from https://www.researchgate.net/post/Can_anyone_tell_me_the_formula_to_compute_average_standard_deviation_from_optimal_solution_for_the_RCPSP_problem_using_metaheuristics
Michael Patriksson: "In fact - within a few days I will give an oral presentation about my experience of research work since 1988. In my presentation I will provide a series of examples of fraudulent science - or pseudo-science, among which I sometimes count metaheuristics. They promise so much, and deliver so meagerly."
Therefore, there is no need to pay any attention to these algorithms, as this is a road to nowhere.