I have found that some mathematicians disagree with meta-heuristic and heuristic algorithms. From a pragmatic point of view, however, such algorithms can often find high-quality solutions (better than traditional algorithms) when tackling an optimization problem, although their success usually depends on tuning the algorithms' parameters. Why are some mathematicians against these algorithms? Is it because the algorithms do not possess a convergence theory?
I am looking for different arguments on the situation.
Thank You!
This question is similar to another one that I have seen. My response to that one was basically this:
1. I don't know any mathematicians who are prejudiced against heuristics per se. Many of them (myself included) use them regularly.
2. I do know a lot of mathematicians who are fed up with people claiming to invent dozens of "new" meta-heuristics every year (like harmony search or the bat algorithm), when really they are just old ideas expressed in fancy new words. Actually, many people in the heuristics community are unhappy with it as well.
3. I also know people working in combinatorial and/or global optimisation who are fed up with people saying "problem X is NP-hard, and therefore one must use a heuristic". This shows a breathtaking ignorance of the (vast) literature on exact methods (and approximation algorithms) for NP-hard problems.
(By the way, I agree with Michael's comment about "matheuristics" being a very interesting research direction.)
I think they see that these algorithms are very simple compared to their methodologies.
Because they are stochastic and not exact, and much of the time they produce infeasible solutions. But that does not prevent them from being effective, provided you tune them well.
You may argue that because of the No Free Lunch Theorem, you are doomed from Day 1. And indeed, in order to attack a new problem you may need several trials in several "departments":
* You may need to write down your optimization problem not just once but in several attempts, so that you can decide how variables as well as constraints should be defined. (Many students I meet think that there is only one way to "write down" the model, but for many actual problems the decision variables can be defined in different ways, and a definition that is profitable for some problems is not for others. Experience with lots of variations of model definitions will give you a better understanding of what works best for your type of problem.) Perhaps you can run software on smaller problem instances and draw some conclusions, but if you need to solve a problem with very many variables and/or constraints, then you need to investigate, in more detail, possible representations of the problem, such that natural algorithms from the world of mathematical optimization can be applied.
* Nothing should stop you from combining methods, of course. Branch & Bound, Branch & Cut, Lagrangian relaxation, Benders decomposition - and heuristic versions thereof - may work for you in some cases, in some cases not.
* Remember that the "NP=P?" question is not yet resolved.
Mohammed Qais Well, they are indeed simple in the sense that they lack rigor, but this is not, I think, the reason some mathematicians argue against this class of algorithms. These algorithms are sometimes useful and can even do better in practice than traditional algorithms, but there is always a trade-off involved. Like anything in life!
So the final judgement is rather 'subjective'. Some, for instance, would argue against them because they are not deterministic in nature, and I actually agree with these mathematicians. They are more concerned with "perfection" than practicality, which is not so strange: true mathematicians are artists at heart.
Well, mathematicians tend to be theoretical. As applied mathematicians and optimization experts, we just want solutions, very good ones at that; we don't care how they are found. I have noticed that in CPLEX, for example, solution polishing, which uses evolutionary algorithms to obtain new solutions from incumbent solutions, can sometimes be better at finding optimal or near-optimal solutions where branch and bound is still struggling.
With all due respect, being a mathematician, theory is what we do. We make abstractions of situations that occur in life. Engineers use theoretical results developed by mathematicians, and they don't question their validity. But we mathematicians always question facts, and even theories.
Maybe it is the intervention into the field of optimization of non-mathematicians, who tend to develop black-box solution methods. Mathematicians love anything that can be proved.
Vijaya: we certainly do not always develop black-box methods. Have you been sleeping in the lecture room again? :-)
Dear Sir, I do not mean that all non-mathematicians always give black-box methods. This is just a communication gap.
We do not love just anything; we always try to derive something worthwhile, sometimes even approximate stuff if the problem is stiff. We are not only theorists: we also do lots of work in industry collaborations, which means that we need to derive practical solutions. And we do, and we do it well. And we both publish and implement in the workplace.
In my opinion, metaheuristics are nowadays just an easy route to publication. To publish one, select any phenomenon from nature, relate it somehow to a few simple existing equations, get the results, and prove by whatever means that it works better than the existing ones. Definitely not a fan of those! However, in the last decade some researchers have proposed some powerful ones. Some of them do not require any tuning parameters and can provide good-quality solutions for complex engineering problems for which estimating accurate derivatives is almost impossible.
NOW I think I see what you mean! :-) Yes, I interpreted you wrong - sorry! Communication gap, indeed.
Jia Yao: Wrong. As we work on very complex combinatorial optimization problems together with industrial partners, we work on very practical things, too. Wherever you got the notion that mathematicians only worry about proofs, it was terribly wrong. I guess you are not a mathematician. Don't spread a false rumour, then.
We combine, with great success, practical models dedicated to the exact problem at hand, theory that enables us to create combinatorial models with nice properties regarding their practical complexity, and smart implementations that run very fast and provide near-optimal solutions. That's what industry needs, together with a valid estimate of near-optimality. And it's fun!
We also receive awards - if that can be seen as a positive thing: one in industrial maintenance, regarding aircraft engine components; and one on our work in transportation research.
And we have been running such projects for 20 years, to this day.
Ergo: don't tell me that we do not care about practice. End of.
And I should add - based on the headline of this thread, "Why some mathematicians are against meta-heuristic and heuristic algorithms?":
I do not dislike metaheuristics per se; they have their pros and cons just like every other methodology has. What I do find troubling is that too many get stuck in their bubble. It would be beneficial for all if the two "communities", if we can call them that, could work together. Perhaps that's what is called matheuristics?
We could definitely gain by talking more, providing valid estimates from the math side, and practical "short-cuts" and other tools from the other.
In my opinion, mathematicians are not opposed to such algorithms as such, since these algorithms are able to solve very interesting problems accurately. They are opposed to people who, when faced with any problem, quickly turn to meta-heuristic algorithms without first understanding the problem.
For example, I know a professor who asked his student to implement a meta-heuristic algorithm to solve a particular problem. The student found the optimum point for his teacher with a pencil, because the total search space for that problem was only 7 cases. It ended with the poor student's work being rejected, and it became a cautionary lesson for others about solving any problem with meta-heuristic algorithms regardless of its structure. In my opinion, such behaviour could be the source of opposition to these valuable algorithms.
When an exact or analytic solution of a given problem takes hours, days, or even months, heuristic and meta-heuristic methods are efficient in both time and accuracy; without a doubt they are the best alternative when the problem is NP-hard. I do not think any mathematician would oppose the use of heuristic and meta-heuristic algorithms in that case.
It's a delicate matter, this one. I do not think that you are quite correct. If you were to look up
https://en.wikipedia.org/wiki/Knapsack_problem#Computational_complexity
you would find a rather nice description of ways to approximately solve the knapsack problem, for example. These methods are both fast and provide solutions that are proven to be near-optimal. I have not seen any metaheuristic that has a worst-case guarantee. Have you?
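To make the contrast concrete, here is a minimal sketch (the function name and the instance are my own illustration, not taken from the Wikipedia page) of the classic modified greedy for the 0/1 knapsack problem: pack items by value density, then compare the result with the single most valuable item that fits on its own. The better of the two is provably at least half the optimal value, a worst-case guarantee of exactly the kind metaheuristics lack.

```python
def knapsack_half_approx(values, weights, capacity):
    # Greedy phase: consider items in order of decreasing value density.
    items = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    greedy_value, remaining = 0, capacity
    for i in items:
        if weights[i] <= remaining:
            greedy_value += values[i]
            remaining -= weights[i]
    # Safeguard: the single most valuable item that fits by itself.
    best_single = max((values[i] for i in range(len(values))
                       if weights[i] <= capacity), default=0)
    # max(greedy, best single) is provably >= OPT / 2.
    return max(greedy_value, best_single)
```

For instance, on values (60, 100, 120), weights (10, 20, 30) and capacity 50, the sketch returns 160, while the optimum is 220; the guarantee of at least half the optimum holds.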
Not only mathematicians, but also some engineers will not recommend using evolutionary algorithms and meta-heuristics in most cases, especially for large-scale problems such as topology optimization of continuum structures. For example, the paper "Sigmund, O. (2011). On the usefulness of non-gradient approaches in topology optimization. Structural and Multidisciplinary Optimization, 43(5), 589-596." discusses the use of evolutionary algorithms in topology optimization. It demonstrates that most evolutionary approaches and meta-heuristics are hopelessly inefficient at solving problems with many variables, such as topology optimization, and that the final designs are not satisfactory (for familiar objectives such as compliance minimization).
In other words, non-gradient methods should be reserved for special and costly objective functions, such as crashworthiness design, in which there is computational noise in the gradient information and analytical derivatives are unavailable (Bujny, M., Aulig, N., Olhofer, M., & Duddeck, F. (2018). Identification of optimal topologies for crashworthiness with the evolutionary level set method. International Journal of Crashworthiness, 23(4), 395-416.).
Metaheuristics are one tool among many others in the box of optimization professionals and researchers. Like any other method, they are useful in specific application contexts, typically in conditions where other methods (e.g., mathematical-programming based methods, dynamic programming, approximation algorithms, randomized algorithms...) fail due to excessive computational time. This situation, unfortunately, occurs often when facing practical large-scale problems combining multiple decision sets. Learning when to use metaheuristics is an important prerequisite before even applying any metaheuristic. You could check the following introductory slides (slides 29-35) for example:
http://www-di.inf.puc-rio.br/~vidalt/courses/Slides/1-Introduction.pdf
Regarding convergence properties, there are some for specific types of genetic algorithms and simulated annealing approaches, but these results are mainly theoretical and suppose an arbitrarily high number of iterations.
Regarding the "no free lunch" argument, keep in mind that the same argument applies to all search methods (not only metaheuristics). It says that for any arbitrary function (which arbitrarily associates INPUTS with OUTPUTS), no search method is better, on average, than any other. However, when you do practical optimization for useful problems, there is always some structure that can be exploited, i.e., you are not picking an arbitrary objective function in the space of all possible objectives. Otherwise, research on this problem would be pointless.
I have nothing to add to the answers of Michael Patriksson, because they are completely to the point: there is no single good model, and there is no single good solution method. Study, play around with different models and methods!
As has been mentioned, not all problems are the same, and it is the characteristics of the problem that should guide the choice of optimisation algorithm, not personal preference for a particular genre of solution. For example, a problem may be NP-hard with huge numbers of decision variables, yet essentially unimodal; there are likely to be excellent classical optimisers that can be applied. In contrast, there are also many problems with small decision spaces that are highly multi-modal, where many classical methods can struggle, but that is not to say that all classical methods will fail, as there are good algorithms for multi-modal work too. Techniques such as Evolutionary Algorithms can tackle both unimodal and multi-modal problems with few changes to the algorithm structure, but it is in multi-modal problem spaces that they can sometimes perform as well as, or better than, more classical approaches. On unimodal problems where an analytic gradient is available, their performance will be poor in comparison to the classical methods.
Yes, a key issue is 'tuning' the algorithm to give the best performance, but often, when used in anger, an EA may only be used at first to explore how a new problem behaves and locate a first few approximate optima, so that an appropriate classical method can then be applied to find a more refined solution. Solving problems in engineering where optimisers really make a difference is very often about finding a good solution to the problem once, but there are also times when a problem must be solved repeatedly, and then an efficient optimiser is super important (e.g. on-line adaptive systems often have to perform repeated optimisations).
I very much agree with the other responders to the question in that as there are no formal convergence proofs for many of the heuristic algorithms, it means that 'new' algorithms (which are often either a re-discovery of an existing method, or a minor incremental change) have been developed and allowed to be published quite readily in engineering and optimisation communities. Where the problems being tackled are truly difficult and a heuristic technique is a good choice, then the publication is often useful, but for a large quantity of publications, sadly the problems chosen are 'test functions' that can be solved efficiently with other means.
I also agree that the number of mathematicians who are against metaheuristics is small, although I have worked with a few. Many of those mathematicians, however, had very limited exposure to real engineering problems. Once provided with a real problem to solve that was highly multi-modal and had no analytic gradient available, they soon changed their opinion! It must be remembered that a gradient-descent optimiser, when combined with a random restart to allow it to tackle a previously unseen multi-modal problem (i.e. where we have no a-priori knowledge about how the local optima are arranged in the objective space), is now a meta-heuristic process. A very simple Evolutionary Algorithm, coupled with the gradient-based search, will usually out-perform the random-restart metaheuristic alternative, unless the multiple optima are distributed with random depths across the optimisation space, in which case the random restart would be best.
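To illustrate that point, here is a toy sketch (the objective, step sizes, and function names are my own choices, purely for illustration) of how wrapping a deterministic gradient-descent routine in random restarts turns it into a simple stochastic meta-heuristic for a multi-modal function:

```python
import random

def gradient_descent(grad, x0, lr=0.01, steps=500):
    # Plain, deterministic local descent: follows the gradient downhill.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def random_restart_minimise(f, grad, lo, hi, restarts=20, seed=0):
    # The restart wrapper is the stochastic, "meta-heuristic" layer:
    # launch local descents from random points and keep the best result.
    rng = random.Random(seed)
    best_x = None
    for _ in range(restarts):
        x = gradient_descent(grad, rng.uniform(lo, hi))
        if best_x is None or f(x) < f(best_x):
            best_x = x
    return best_x
```

On a small multi-modal objective such as f(x) = x^4 - 3x^2 + x over [-2, 2], a single descent may get stuck near the shallow local minimum, while the restart wrapper reliably locates the deeper basin near x ≈ -1.3.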
In my opinion they are too strictly bound to Boolean-algebra reasoning. Physics also does this in most cases. An exception in physics is the calculus of noncommuting operators, which I pursue in my MINT-Wigris model and research project. Orthomodular logic does not use the two-valued Boolean algebra, listed as yes-no, 0/1, or +1/-1 valued semantics. The syntax is different, and the semantics replaces 0/1 by an orthomodular or modular lattice. You then have no deduction theorem for quantum mechanics, for example, and the modus ponens has to be replaced. It holds only for sets of commuting operators, not, for instance, for position-momentum or the other Heisenberg uncertainty pairs time-energy and angle-angular momentum.
Your opinion is wrong; the orthomodular logic in my article from the early 1980s is theoretically confirmed. If you apply it you can see how wrong the logical quantum paradoxes of physics are. And mathematicians join in not using their brains for my quantum logic: operators are not commuting, and this is theoretically confirmed and gives, for instance, the accepted Heisenberg uncertainties. The queries about not understanding the gyromagnetic relation are of the same kind. It is helicity for electrically charged leptons; helicity of neutral leptons was accepted a long time ago, but only recently has the nonorientable version for spin up-down changes been applied. I apply it to the superposition of two kinds of Gleason frames GF for leptons: the spin GF and the helicity GF. The reason is logical, and GF are well founded theoretically, but mathematicians and physicists will not listen to me. For helicity, the orientation, clockwise cw or counterclockwise mpo, changes, and also the left- or right-hand screw orientation in space. In reversing mpo for positron + charges to electron cw - charges, the gyromagnetic relation reverses its superposition GF direction towards the spin GF. That is a noncommutative operator logic like the Heisenberg one. But you just do not want to recognize my high-quality research on orthomodular noncommutative logic since the 1980s. For Hopf and his geometry, physics has done so much longer. Why? In this case mathematicians also show a lazy brain. The Hopf map is due to projecting 4-dimensional spacetime down to the Hopf Riemannian sphere in space, deleting time as a Hopf fiber in the Hopf fiber bundle. And applying, after the Hopf map, the stereographic projection of the Riemannian sphere onto a plane E is not studied by them; all this is only in my papers, not quoted by them or by Wikipedia. A new control system for quoting research results is required, from my point of view, for Wikipedia at least. This is not honest but politics.
The plane E for electromagnetic field rotation, as a loop crossed by a magnetic field vector such as the magnetic momentum, gives macroscopically, as a cross product, the technically used angular momentum for your bike's dynamo. The loop starts rotating and produces light as current for your bulb.
Hi Yu Chen. The number of mathematicians who will not accept the use of heuristic methods because they do not have a formal proof of convergence is in reality very, very small (although sadly not zero; I know more than one). Having a formal proof is not always as useful as many may initially perceive anyway: consider how simulated annealing is known to converge to the global optimum given the right conditions. Sadly, the conditions needed to make the proof valid are usually impractical to achieve when real problems are being solved (rather than contrived test problems), but that does not make simulated annealing any less useful as an optimiser. What we do gain from such a proof is some insight into how to structure the 'control parameters' that govern how the algorithm searches the optimisation domain, improving our chances of developing a good algorithm quickly, rather than having to spend time adjusting the algorithm configuration blindly. In consideration of the no-free-lunch theorem, a proof may also indicate the class of problems that an algorithm is well suited to solving, and also what it is not great at tackling.
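As a concrete illustration (the objective and all parameter values below are my own toy choices, not from any post), here is a minimal simulated-annealing sketch for a one-dimensional problem. It uses the geometric cooling that practitioners actually use; the logarithmically slow schedule demanded by the convergence proof is replaced by a fast one, trading the guarantee for practicality:

```python
import math
import random

def simulated_annealing(f, lo, hi, t0=1.0, alpha=0.995, iters=5000, seed=1):
    rng = random.Random(seed)
    x = rng.uniform(lo, hi)   # random starting point in [lo, hi]
    best = x                  # best point seen so far
    t = t0                    # temperature (control parameter)
    for _ in range(iters):
        # Propose a small random move, clamped to the search interval.
        cand = min(hi, max(lo, x + rng.gauss(0, 0.1)))
        delta = f(cand) - f(x)
        # Always accept improvements; accept worsenings with
        # probability exp(-delta / t), which shrinks as t cools.
        if delta < 0 or rng.random() < math.exp(-delta / t):
            x = cand
        if f(x) < f(best):
            best = x
        t *= alpha  # geometric cooling: fast, but forfeits the proof
    return best
```

The cooling rate alpha and the proposal width are exactly the kind of control parameters the convergence theory helps one reason about, even when the theoretically required schedule itself is not used.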
Having a proof of convergence for something like Evolutionary Algorithms (that is, for a practical and workable EA, rather than a 'toy' algorithm) would also, hopefully, reduce the number of junk publications in the field that are aimed solely at adding to an author's list of papers, rather than adding productively to the field of optimisation.
The optimum solution is not guaranteed within a given finite execution time. This is the major issue with heuristics. Additionally, all heuristic methods primarily suffer from the following two issues:
* Premature convergence: converging too early, resulting in a suboptimal solution.
* Stagnation: the solution stops improving.
However, I recommend using them when the problem is not solvable by analytical optimization methods.
It is interesting to see the developments over the years at RG, regarding metaheuristics: I've been around for quite a while, and I see more and more that the questions are becoming more focused, and technical. Is it perhaps a sign that the communities are converging, or at least more often borrow from each other in more fruitful ways? That is then a welcome event!
In my opinion, meta-heuristic techniques rely on randomness and need further statistical reasoning, e.g. mean, variance, and standard-deviation calculations over repeated trials, to justify the optimality of the solution found.
I understand mathematicians' concerns that metaheuristics are generally presented without deep mathematical analysis, and there is also the No Free Lunch theorem. However, from an engineering point of view, these metaheuristics provide a bridge for engineers who want to optimize their systems, models, etc. without dealing with complex math (headache :)), and these methods generally provide good results. I think this is why they become more and more popular each day. They are easy to understand and implement, and anyone can use them to make their system or model better; so do I.
Great, that engineers are able to design safe aeroplanes without headaches, and only by using heuristics! A bit of statistics and you will fly safely!
J. W. Nieuwenhuis Of course, in safety-critical areas of engineering one should accept the headache. I am not claiming that they should be applied or implemented without proper statistical analysis. I just wanted to say that these methods are quite popular, especially among engineers, because they are easy to apply and implement even though mathematicians are against them.
I am surprised not to see many more attempts to combine mathematical-programming-type approaches that guarantee an optimal outcome (perhaps after a long time, though) with metaheuristics and other heuristics.
I have seen such things here and there, but as I am not performing research in this area,
I do not in fact know whether there are successful attempts.
Most researchers' choices are guided at the outset by their approach, i.e. whether they adopt a "problem seeks solution" or a "solution seeks problem" attitude in their endeavour. This has a critical influence on the choice of optimisation method. On the engineering side of research, the "problem seeks solution" approach tends to be favoured.
I like Michael's idea of exploring hybrid approaches that combine mathematical programming with metaheuristics; it certainly feels like an exciting area of research.
Both have merits, and like anything, without research we only know what we know.
There are a number of areas in which heuristic and meta-heuristic algorithms are used; to cite a few: job-scheduling problems and other production/operations resource-optimization problems. However, researchers need to understand their limitations when deploying them. It is ethical practice for researchers to declare the input conditions that lead to poor-quality solutions when they apply heuristic or meta-heuristic methods. It would not be fair to say that, in general, mathematicians are not in favour of these algorithms. They only point out the algorithms' inherent weaknesses. If one is able to appreciate these, then things will be in proper perspective.
I do not mind if a (meta-)heuristic is used for a very complex problem, as some insight might come out, such as providing a starting point for more advanced methods.
My concern is based on the fact that I see even polynomially solvable problems being attacked with metaheuristics. To do THAT is to be extremely lazy!
Deterministic and metaheuristic algorithms both have their strengths and weaknesses; the problem arises when one tries to use one method where the other is clearly more beneficial. For large problems, as Michael Patriksson pointed out, a metaheuristic can provide a very good starting solution. Indeed, in solving large MIP problems, CPLEX internally uses branch and cut and, at a set frequency, applies heuristics to find good solutions, which helps prune the search tree.
I agree with Michael Patriksson that laziness is not called for, but at the same time I have the impression that too many people suffer from a lack of knowledge of Operations Research and the basics of the well-known optimization algorithms, and all you need to remedy this is the willingness to study the basics of optimization theory and practice! And practice a lot; use various models and methods! Only this way do you get a feeling for what your problem at hand is really about!
Indeed, J. W. Nieuwenhuis. Also you have seen that there are very many who will not, or cannot, embrace mathematical optimisation, even though the chances in the race against the optimum are typically much better if one uses mathematical optimisation tools - and with the still quick development of OR/math programming the advantages will be even greater tomorrow.
Check out the proceedings of the recent "matheuristics" conferences. I think it's a promising direction.
That sounds promising! :-) But I am not critical against matheuristics - only the meta variety.
Actually, MP solvers like CPLEX or Gurobi have heuristics inside, to find feasible solutions very fast, to show the model is reasonable, and they can sometimes obtain better feasible solutions than the branching incumbents.
Combining mathematical programming with heuristics usually helps enterprise applications get good-enough solutions in a short time; it also gives a lower bound from the LP relaxation, which helps stakeholders get insights.
Because you cannot prove that the point obtained by a heuristic algorithm is certainly the optimal point.
Because heuristic algorithms do not provide an analytical solution; the output is just a bunch of data that you somehow interpret. The solution may be valid at time t, but it may not give you a meaningful solution at time t+1.
The following table has an answer to this question:
Table I. Local search heuristic vs. metaheuristic strategies
To see this table, refer to the following recent article:
Article: Sea Lion Optimization Algorithm for Solving the Maximum Flow Problem
I agree with the other answers. Mathematics is an exact science, but many exact mathematical results are the product of a heuristic approach.
Nidhal Kamel Taha El-Omari I commented on your paper as well. But I don't see why one needs to introduce a meta-heuristic for the maximum flow problem!
Nidhal Kamel Taha El-Omari : Please can you explain why you used a meta-heuristic for the max-flow problem, when dozens of very good exact algorithms exist? And why did you only compare your heuristic with the oldest exact algorithm (Ford and Fulkerson), rather than one of the newer ones (preflow-push, etc.)?
Already in October I had a conversation with Nidhal about the sea lion thing. I guess my comments were a bit too much for him ...
I have had very many chats with meta-enthusiasts, and have pointed them to mathematical optimisation fora, but lo and behold - they are quite stubborn!! Their excuses are that they have bad computers, no guidance, and several other not very convincing arguments. I do tell them that they are shooting themselves in the foot, but you see - they have outlets that we do not use, and there they are queens and kings.
Michael Patriksson , just to take a contrarian position to your claim of "shooting themselves in the foot": take a look at the "most cited" list in this journal, for example: https://academic.oup.com/jcde :
I certainly do not think that putting one's (ahem) research output onto such lists necessarily constitutes "shooting themselves in the foot", academically speaking, at least in the short term. Promotion decisions, which are often not made by one's peers and are often based on such easily quantifiable factors, reward stuff like this.
Then I will write a paper on the spastic eagle algorithm for the TSP. I will be done tomorrow - that's how fast one can produce a paper. :-)
So now you plan to stop the flood of pseudoscience by DDOS-ing the journals with your own?
Certainly not. But I might again write a journal paper in my favourite journal - Journal of Irreproducible Results! I have one already accepted, but not yet published.
Anton Evgrafov In fact, the journal that published that awful paper is issued by Oxford University Press! Junk can apparently be published anywhere!
Speaking about (meta)heuristic algorithms: once you develop an algorithm, it would be nice if you proved that it is correct. In optimization, this frequently requires fine and sharp tools, mathematically speaking. It is difficult to imagine computer scientists producing complete, mathematically correct proofs, unless they are mathematicians. But that is not the reason some algorithms are called meta-heuristic. It is related to the state of the art in algorithm theory, which reaches the deepest layers of the foundations of mathematics (think of the P=NP problem and its relatives). We can solve random TSP instances very fast, but we should be careful: it may happen that some such instances take a very long time to solve, and we do not yet know in advance which ones.
Let me make an analogy with medicine: we have a patient who does not want to be operated on, but the surgery will, with all likelihood, save his life, and he denies his OK on the paperwork. What are the doctors doing? They are waiting until the patient is on the verge of losing his consciousness and start operating.
The moral of the story is: use as much common sense as possible!
Roman Sznajder This may surprise you, but it must be stressed that the norm in the algorithms theory circles is in fact exactly what you stated here, there's no need to imagine it: " It is difficult to imagine computer scientists producing complete, mathematically correct proofs, unless they are mathematicians. But it is not a reason why some algorithms are called meta-heuristic. It is related to the state-of-the art knowledge of algorithm theory, which reaches to the deepest layers of the foundations of mathematics (think of P=NP problem and its relatives). "
Papers like this get published all the time by computer scientists. This is how standard theoretical computer science work is done in the area of algorithms & complexity. The folks not doing this aren't meeting this bar (and there are a lot of them, this must not be downplayed), and instead are accepting a different standard (not that this is wrong, it depends on the kinds of claims they are making). Different people have different goals and standards: For example, when I study algorithms I want to actually learn things about (the nature of) computation itself by intrinsically learning how the problem itself has fundamental properties that yield an algorithm's discovery; not just make something "that works" (which is a goal of a lot of researchers, I don't know if they'd share a similar goal as me), that's a byproduct of doing solid CS research.
Indeed, some techniques are very hard to pin down and prove claims about, and I think there is a place for work of that sort, where we accept weaker forms of evidence (e.g., empirical evidence). There is a healthy medium; some research communities understand this (and it is appropriate), others do not. For example, if the work is motivated by an application, or people "just want something", one can make justifications akin to your example. However, I think the concern of some here is that certain communities are not doing this (I have seen it myself, and not only in this particular community; several CS-related research communities have this issue with standards). It must be said that there are excellent researchers in heuristics and related areas, some of whom do excellent algorithms work and have contributed greatly to my field (especially theory); these comments do not concern them. It is more the gamification of publishing papers that seem to blatantly ignore developments in theoretical computer science, or even in combinatorial optimization itself. Given the rich theory we have in approximation algorithms and other algorithmic paradigms, I think these researchers could provide more and better justification for their research these days, depending on how novel the problem is (if applicable).
Some researchers have their favourite hammer and go around looking for nails, even when the things they find are not nails. I mean, it's exciting to try things out, but does it make sense? Does it actually tell us anything about the problem's computational nature? Does it actually do what is claimed? I think many of these questions are best answered by sitting down and studying the problem itself.
I guess these conflicts could be ended by separating "discovering" from "optimization".
Consider the case in which metaheuristic enthusiasts drop the title "optimization" and use the exact meaning of "heuristic", which is "to discover". They would also no longer use phrases such as "optimal design", "optimal solution", and "global optimum"; instead, they could use phrases like "design identified by search" or "better design".
In this way, people like Michael Patriksson keep "optimization" in its pure meaning, namely mathematical programming, while people interested in "discovering" can continue their work under keywords like "evolutionary computation" or "swarm intelligence" instead of optimization.
Maybe someone is not so interested in the "global optimum". Maybe someone hates using sensitivity analysis, mathematical proofs, and optimality conditions (that person is not me!). You cannot make him disappear, but you can separate him out quite easily!
Even the journals publishing these two streams could be divided. At the moment, journals like the Journal of Optimization Theory and Applications publish mathematical programming, while journals like Swarm and Evolutionary Computation publish metaheuristics.
But generally, I agree that the community must gradually move toward mathematical programming. That is what I am doing myself.
Many world-leading computer scientists work on approximation algorithms, inapproximability, and parametrised complexity. Just check out the proceedings of FOCS, STOC and SODA over the past ten years. It's the weak ones who work on "zombie bat" heuristics and the like.
Dear Pooya Rostami ,
I am not sure that your proposal of a newspeak for meta-heuristics addresses anything. As Adam N. Letchford points out, one can do very novel and/or substantial work with all kinds of algorithms, or one can submit and (unfortunately) publish papers of the type "We present a novel algorithm in which we replace ants with zombies and a factor 0.5 at some random place with 0.31. We test it on two problems. Enjoy."
Dear Anton Evgrafov
Someone does not need to invent "the novel tired panda algorithm" to be "weak". Even the present or older metaheuristics cannot be categorized as optimization. They are methods of discovering by search, and they do not follow the scientific path of finding an "optimum".
I am saying that you cannot stop someone from inventing or using such work. So it is better to isolate it under a keyword other than optimization. Gradually, when these people face large-scale problems, such as topology optimization with up to one million design variables, they will abandon metaheuristics themselves. They will soon be disappointed. This is what happened to me!
Dear Anton Evgrafov
You can easily show people that metaheuristics are inefficient. I have attached a simple 3D topology optimization Matlab code, borrowed from the following website:
https://www.top3d.app/
This code uses an optimality-criteria solver to update the design variables. If you run:
top3d(50, 20, 4, 0.5, 3.0, 1.2)
you will have 4,000 design variables, and you can see how efficiently the algorithm leads to the optimum. Now hold a competition and ask people to replace the optimality criteria with heuristics. Soon they will all fail!
All metaheuristics and evolutionary algorithms fail in high dimensions, and there is no difference between the genetic algorithm (which is quite old) and the "spastic eagle algorithm" that was invented last night!
:-)
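The dimensional-scaling claim in the post above can be illustrated with a toy sketch. This is not the top3d problem itself; the quadratic objective, the evaluation budget, and the step size are all made-up assumptions, chosen only to show how a zero-order random search degrades with dimension while a first-order method does not:

```python
import numpy as np

def f(x):
    # Simple smooth test objective: f(x) = ||x||^2, with minimum 0 at the origin.
    return float(np.dot(x, x))

def random_search(n, budget, rng):
    # Zero-order "heuristic": sample uniformly in [-1, 1]^n, keep the best value.
    best = np.inf
    for _ in range(budget):
        best = min(best, f(rng.uniform(-1.0, 1.0, n)))
    return best

def gradient_descent(n, budget, rng):
    # First-order method: grad f(x) = 2x, fixed step size 0.25.
    x = rng.uniform(-1.0, 1.0, n)
    for _ in range(budget):
        x = x - 0.25 * (2.0 * x)
    return f(x)

rng = np.random.default_rng(0)
for n in (10, 1000):
    rs = random_search(n, 200, rng)
    gd = gradient_descent(n, 200, rng)
    print(f"n={n:5d}  random search: {rs:10.4f}  gradient descent: {gd:.2e}")
```

With the same budget of 200 evaluations, random search stays near the average value of a random point (which grows linearly with n), while gradient descent reaches the optimum to machine precision regardless of dimension. On a nonconvex problem the comparison is less one-sided, but the scaling effect is real.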
Pooya Rostami , e.g. Chapter 9 on derivative-free methods in Nocedal & Wright's book contains material on direct search methods. Nelder–Mead's simplex is a classic example of direct search. From this perspective, I see no dichotomy to be built between "search" and "optimization", as you suggest.
More generally, plenty of great ideas have been considered unscientific by some throughout history. Recent examples in optimization specifically include using filters to globalize algorithmic convergence; using non-linear solvers for linear programming problems, ultimately leading to primal–dual interior-point methods; randomization of the simplex method; etc. If someone just took the time to really show that their "novel tired panda algorithm" is (1) novel and (2) works, either by extensive empirical evidence or by theoretical analysis, I would have no issue with it at all.
Finally, I am familiar with top3d, please see https://www.top3d.app/errata
Anton Evgrafov
OK, I will replace "search" with "discovering", which is the exact meaning of "heuristic". Now your reference no longer applies!
Yes, but I have given you a very simple example to show that they are not efficient. These algorithms work only by searching. When I increase the search space, they fail very easily. It is like trying to find a lost key somewhere in all of New York with a group of your friends! :-)
Besides, what merit do you see in a metaheuristic that would qualify it as "a scientific method" in the future? Any theoretical background? Any nice idea?
If you want to evaluate a metaheuristic, you need to make at least 30 independent runs and average them, and you will see a heavy variation in the resulting box plot. It shows that the algorithm is not reliable! So it cannot be a scientific approach!
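The 30-run evaluation protocol mentioned above can be sketched in a few lines. The "heuristic" here is a deliberately crude stand-in (random search on a one-dimensional toy objective with a small budget); the point is only the protocol itself: independent seeded runs, then summary statistics of the final objective values, which are exactly the data behind a box plot:

```python
import random
import statistics

def noisy_heuristic_run(seed):
    # Stand-in for one run of a stochastic heuristic: random search on
    # f(x) = (x - 3)^2 over [0, 10] with a budget of 50 evaluations.
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(50):
        x = rng.uniform(0.0, 10.0)
        best = min(best, (x - 3.0) ** 2)
    return best

# The 30-run protocol: repeat with independent seeds, then summarize the
# spread of final objective values (mean, standard deviation, range).
results = [noisy_heuristic_run(seed) for seed in range(30)]
print(f"mean  = {statistics.mean(results):.4f}")
print(f"stdev = {statistics.stdev(results):.4f}")
print(f"min   = {min(results):.4f}, max = {max(results):.4f}")
```

A nonzero standard deviation across runs is precisely the "heavy variation" being criticized; a deterministic gradient-based solver would produce the same value on every run.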
Dear Pooya Rostami , I for one do not know the future, but things change quickly as other means of computation become available (e.g., who knows what comes after quantum computers?). Just think about the algorithms which were state of the art before computers (e.g., long division, multiplication, the simplex method) and after (Newton's method, FFT + convolution, interior-point algorithms = Newton's method). Personally, I found this paper: Article An Optical Solution For The Traveling Salesman Problem
(which is also summarised here: https://spectrum.ieee.org/computing/hardware/to-crack-the-toughest-optimization-problems-just-add-lasers ) rather inspirational.
Dear Anton Evgrafov
Yes, who knows what will happen in the future! But at the moment, there are not enough reasons to defend metaheuristics.
I myself was badly disappointed with metaheuristics and evolutionary algorithms in structural optimization problems, due to their large scale. I know that you have ample experience and have proposed many developments in this field; I am sure you know better than anybody what I am talking about! But I cannot understand why you are standing on the side of metaheuristics this time!
I can refer others to this paper:
“Sigmund, O. (2011). On the usefulness of non-gradient approaches in topology optimization. Structural and Multidisciplinary Optimization, 43(5), 589-596.”
Dear Pooya Rostami ,
Let us actually re-examine what I am defending. You state that you are "badly disappointed with the metaheuristics and evolutionary algorithms", that in your experience metaheuristics are inefficient for the large-scale problems you are solving, and that there is literature supporting the latter claim. Based on this, you declare that metaheuristics are "not following the scientific path of finding 'optimum'", and you even question whether metaheuristics will ever become a "scientific method". I fail to see the connection from your observations and experiences to your conclusions. The presence of such a connection is, in fact, the very definition of the scientific method.
Let us celebrate the 20-year anniversary of Wikipedia by reading its page on metaheuristics: https://en.wikipedia.org/wiki/Metaheuristic .
I quote selectively:
"...Compared to optimization algorithms and iterative methods, metaheuristics do not guarantee that a globally optimal solution can be found on some class of problems... Most literature on metaheuristics is experimental in nature, describing empirical results based on computer experiments with the algorithms... Many metaheuristic methods have been published with claims of novelty and practical efficacy. While the field also features high-quality research, many of the publications have been of poor quality; flaws include vagueness, lack of conceptual elaboration, poor experiments, and ignorance of previous literature."
My issue with metaheuristics is summarized in the last two sentences I quoted. They clearly describe poor science, but the fault lies with the scientists, reviewers, and journal editors, not with the idea of metaheuristics itself.
Dear Anton Evgrafov
My description of the inefficiency of metaheuristics is something different from their being non-scientific. Please don't conflate the two.
Let me give you another example. The attached figure is the box plot comparing two metaheuristics on an MBB problem. I used moving morphable components (MMC) for the parameterization instead of SIMP, which gives very few design variables. The box plots show the final compliance of the identified topologies after 30 independent runs. You can see the very high variation of the final outputs in the box plot! And I cannot say what will happen in run number 31. :-)
By increasing the dimensions (using SIMP), the unreliability increases! This clearly shows that the algorithm is not reliable! People working in metaheuristics are all too familiar with such outputs!
Meanwhile, an algorithm from 1987 such as the method of moving asymptotes (MMA) is robust and gives the optimal design.
I believe that something unreliable cannot be a scientific approach! This is not optimization; it is discovering! It has no theoretical background and is just based on ideas! This is why the "spastic eagle algorithm" can be created so easily. By that standard, you should respect "GA" and the "spastic eagle algorithm" equally, because they are both based on nice ideas!!! You have no theoretical reason to say which one is "weak", "poor", etc.
Dear Pooya Rostami ,
You do not help to advance your cause in a scientific debate by relying on arguments which are not 100% accurate.
See Article A globally convergent version of the method of moving asymptotes
or Article A Class of Globally Convergent Optimization Methods Based on...
for details. I imagine that "an algorithm from 1987 same as the method of moving asymptotes (MMA)" behaves in the same way. More generally, a gradient-based algorithm relying only on first-order information is not guaranteed to converge to optima unless some specific circumstances external to the algorithm (such as convexity of the problem, SOSC, etc.) hold. This is not very likely, but stranger things have happened; see for example Article Skeleton Alleged in the Stealth Bomber's Closet: A 40-year-o...
Dear Dr. Anton Evgrafov
Thanks for continuing the discussion. With your corrections, you help me come back stronger each time! Regarding your description, I should say:
Yes, that is true about MMA. Maybe I can give better examples, like global optimization solvers or other methods that give global optima.
I do not agree with the second description! Because I gave you an example that starts from a single starting point. Please have a look at the attached figure. I borrowed it from the following paper (I have some similar outputs):
“Bujny, M., Aulig, N., Olhofer, M., & Duddeck, F. (2016, July). A hybrid evolutionary approach for level set topology optimization. In 2016 IEEE Congress on Evolutionary Computation (CEC) (pp. 5092-5099). IEEE.”
The authors used MMC coupled with an evolutionary strategy (ES) and the covariance matrix adaptation evolutionary strategy (CMA-ES). They started from a diagonal layout of components. But you can see that CMA-ES reached the global optimum in 0% of cases, and ES in only 7%! The other local optima also had very low probabilities. You would not see such outputs when using gradient-based solvers, which are scientific tools with strong mathematical backgrounds. Would you like your car or house to be designed by unreliable heuristic algorithms?
If we accept unreliable methods as science, we can extend this to other areas. For example, we could remove exact and analytical solutions of differential equations and replace them with new algorithms producing random, false, and unreliable solutions. Soon we would see what terrible outputs are produced! This is what people are doing in max-flow problems by using heuristics instead of exact solvers, as mentioned by Adam N. Letchford !
Generally, I mostly agree with Michael Patriksson, who says "I'd like to throw them out the window...".
Personally, I think that it would be very unwise for people who work on exact methods to dismiss all work in metaheuristics. Some of the early metaheuristics, like simulated annealing and tabu search, were major advances. They enabled one to obtain solutions to large-scale problems that were better than those available with any other existing method, in reasonable computing times. They are still very useful today. For example, the famous CONCORDE package, which solves large-scale TSP instances to proven optimality, begins by running an iterated local search heuristic to get a good initial upper bound.
In my opinion, the real issue is that the vast majority of the more recent metaheuristics are just minor changes to existing ones. Just change the animal from bees to whales, frogs, or bats, and pretend it's new. The authors of these papers are being dishonest, and they are damaging the reputation of the field. There are people out there doing excellent work on metaheuristics, and this situation is not helping them at all.
Dear Prof. Adam N. Letchford
I am not very familiar with the TSP, but for my very large-scale problems in structural topology optimization, hill-climbing heuristics such as tabu search and simulated annealing fail. I could only find one paper hybridizing an optimality-criteria update with simulated annealing, in which the use of the optimality criteria (which is sensitivity-based) made it efficient:
“Garcia-Lopez, N. P., Sanchez-Silva, M., Medaglia, A. L., & Chateauneuf, A. (2010, June). An improved hybrid topology optimization approach coupling Simulated Annealing and SIMP (SA-SIMP). In IOP Conference Series: Materials Science and Engineering (Vol. 10, No. 1, p. 012183). IOP Publishing.”
Can you present a mathematical proof that "GA" is better than "tired panda"? You can only discuss the ideas. It is like discussing which color is more beautiful! I have friends developing those useless "tired panda algorithms" and insisting that they are much better than GA.
However, in the example I gave in my previous reply, the older algorithm, the evolutionary strategy (ES), reaches the global optimum with a probability of 7% using 10,000 function evaluations (repeated 30 times!), while a gradient-based method can find the global optimum in 120 evaluations. That is a big failure! I see no difference between old and new heuristics and evolutionary algorithms.
I find the debate arising from the question posed here interesting.
I have experimented with a number of metaheuristics in order to assess their applicability and accuracy in determining the discrete optimal shunt capacitor sizes to be installed at defined locations in a given electrical system.
https://www.researchgate.net/publication/344954880_Performance_Analysis_of_Six_Search_Algorithms_in_Minimizing_the_Cost_of_Real_Power_Losses_and_Shunt_Capacitors_Purchase_through_Optimal_Shunt_Capacitors_Sizing
I compared the results obtained by these algorithms with exhaustive search (an approach which, unlike metaheuristics, guarantees attainment of the global optimum). Most of the metaheuristics were able to find the global optimum in considerably less computation time than the exhaustive search.
I have come to believe that exact optimization algorithms are time-consuming and inappropriate for instances where a solution is required quickly (e.g., power system dynamics and power system transients optimization problems). I believe these are the instances where metaheuristics come in handy.
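For readers unfamiliar with the exhaustive-search baseline mentioned above, a minimal sketch follows. The cost function here is a made-up stand-in (NOT a real power-flow or loss model), and the candidate sizes, locations, and coefficients are all illustrative assumptions; only the enumeration idea is the point:

```python
from itertools import product

# Hypothetical discrete sizes (kvar) available at each of three locations.
SIZES = [0, 150, 300, 450, 600]
LOCATIONS = 3

def total_cost(sizes):
    # Illustrative stand-in for loss cost plus purchase cost; not a real
    # power-system model. Losses fall as installed capacity approaches a
    # fictitious 900 kvar target; purchase cost grows with capacity.
    loss_cost = 0.01 * (900 - sum(sizes)) ** 2
    purchase_cost = 0.5 * sum(sizes)
    return loss_cost + purchase_cost

# Exhaustive search: evaluate every combination, so the global optimum over
# this discrete grid is guaranteed (5^3 = 125 evaluations here). The cost
# is exponential in the number of locations, which is why metaheuristics
# become attractive as the instance grows.
best = min(product(SIZES, repeat=LOCATIONS), key=total_cost)
print("best sizing:", best, "cost:", total_cost(best))
```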
Dear Pooya Rostami,
Based on your previous posts, you are drawing a distinction between first-order methods and metaheuristics, and you seem unaware that many metaheuristics use gradient descent or other forms of local search as a subcomponent.
Metaheuristics are often designed with a *problem decomposition* in mind, for example, focusing the search on the combinatorial aspects (e.g., choice of regions in a very discontinuous problem as in your case), while using any efficient tool at hand to handle the easiest parts (LP or continuous optimization for some sub-regions, or efficient algorithms known for some specific subproblems).
Such decompositions are used, for example, in the following studies:
1) Article A simple and effective hybrid genetic search for the job seq...
2) Article HG-means: A scalable hybrid genetic algorithm for minimum su...
3) Article Heuristics for vehicle routing problems: Sequence or set optimization?
4) Article A Hybrid Genetic Algorithm with Decomposition Phases for the...
5) Article Two-Dimensional Phase Unwrapping via Balanced Spanning Forests
• In (1), the metaheuristic is designed to select a good order for the jobs, whereas an (exact) dynamic programming algorithm is used for tool selection in each solution generated by the algorithm. This algorithm can solve practical industrial cases; in comparison, exact methods are unable to solve problems with more than 20 jobs consistently, despite over 30 years of research on the topic.
• In (2), the metaheuristic is designed to look for promising starting points for an underlying local search using the K-means algorithm, widely used in clustering. K-means is known to be a natural and efficient local search for this problem. In this case, the main purpose of the metaheuristic is to guide the selection of a good starting point. We tried a random selection of starting points (see the experiments with multi-start K-means), and it was far worse. Again, exact algorithms for this problem are not applicable at scale, so good heuristics are needed.
• In (3), the metaheuristic is applied to find the best solution of a partition problem in which each partition's cost is found by solving a TSP (the underlying subproblem). It's a fun study, as it compares different possible problem decompositions.
• In (4), the metaheuristic is designed to make the general (combinatorial) layout decisions, whereas a linear program is used for continuous placement and aspect-ratio optimization for each solution.
• In (5), the metaheuristic is used to resolve the issues raised by residual points, whereas a simple method (path integration) is used to recover the complete solutions once the configuration of these residual points (their mutual connections) is determined.
The list is long and encompasses many applications that would not otherwise be solved effectively. In several of these papers, we jointly developed exact methods for comparison, or at least relied on comparisons with existing exact works in the literature (for problem scales that can be solved).
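The decomposition pattern described above can be sketched on a toy problem. This is not any of the cited methods; the data, the single-flip move, and the restart count are all illustrative assumptions. A stochastic outer search handles the combinatorial part (assigning points to two groups), while the continuous subproblem (the best center of each group) is solved exactly in closed form, since the mean minimizes squared error:

```python
import random

# Toy 1-D data: two well-separated clusters.
POINTS = [0.9, 1.1, 1.0, 4.9, 5.2, 5.0]

def cost(assignment):
    # Exact inner solve: for a fixed assignment, the optimal center of each
    # group is its mean; return the resulting sum of squared errors.
    total = 0.0
    for group in (0, 1):
        members = [p for p, g in zip(POINTS, assignment) if g == group]
        if members:
            center = sum(members) / len(members)
            total += sum((p - center) ** 2 for p in members)
    return total

def local_search(seed):
    # Outer combinatorial search: random start, then greedy single-label flips.
    rng = random.Random(seed)
    assign = [rng.randint(0, 1) for _ in POINTS]
    improved = True
    while improved:
        improved = False
        for i in range(len(POINTS)):
            flipped = assign.copy()
            flipped[i] ^= 1
            if cost(flipped) < cost(assign):
                assign, improved = flipped, True
    return assign, cost(assign)

best_assign, best_cost = min((local_search(s) for s in range(5)), key=lambda r: r[1])
print("assignment:", best_assign, "cost:", round(best_cost, 4))
```

The division of labor is the point: the heuristic never touches the continuous decisions, so every candidate it evaluates is already optimal in that respect.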
Finally, you seemed pleased with the idea of "repeating gradient descent" from different starting points in one of your previous discussions (https://www.researchgate.net/post/Is_it_easy_to_handle_discontinuities_in_objective_function_with_mathematical_optimization). By doing so, you are effectively applying the simplest metaheuristic that ever existed: multi-start local search. In that case, you are either randomly optimizing the upper-level decisions (the starting points) or enumerating them (which may be impractical in large-scale settings with an exponential number of regions)... in such a situation, you may be much better off with a CMA-ES that effectively learns directions of improvement instead of doing a random walk.
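Multi-start local search, mentioned above, can be written in a few lines. The nonconvex test function, the step size, and the number of restarts below are all made-up assumptions for illustration; the structure is the point: random upper-level decisions (the starting points) combined with a deterministic lower-level refinement (gradient descent):

```python
import random

def f(x):
    # Nonconvex 1-D test function with two local minima; the global one is
    # near x ≈ -1.30. Purely illustrative.
    return x**4 - 3.0 * x**2 + x

def grad_f(x):
    return 4.0 * x**3 - 6.0 * x + 1.0

def gradient_descent(x0, step=0.02, iters=500):
    # Plain fixed-step gradient descent: converges to whichever local
    # minimum lies in the basin of the starting point x0.
    x = x0
    for _ in range(iters):
        x = x - step * grad_f(x)
    return x

def multistart(n_starts, seed=0):
    # The "simplest metaheuristic": randomize the starting points, refine
    # each with local search, keep the best final point.
    rng = random.Random(seed)
    starts = [rng.uniform(-2.0, 2.0) for _ in range(n_starts)]
    return min((gradient_descent(x0) for x0 in starts), key=f)

x_best = multistart(10)
print(f"best x = {x_best:.4f}, f(x) = {f(x_best):.4f}")
```

A single descent from an unlucky start ends in the shallow minimum near x ≈ 1.14; the restarts are exactly the randomized upper-level search being discussed.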
As a general recommendation, do not reject (or try to discredit) an entire field based on your experience on a single and very specific problem.
Good luck!
Good post, Thibault! I am geared towards math opt, but even I have had a lot of fun using a few (meta-)heuristics, typically at the invitation of Iranian scholars. They seemed rather relaxed about whether the output would yield a guaranteed optimum; I sensed from the start an understanding that it was first of all a modelling exercise, and there was no pressure to produce optima. In the papers that were produced, I do not recall that we ever tried to find the optimum; the exercise was good enough! It was always a joy to work with them, and while I have not had contact for a while, I look back on it as a very nice series of projects.
Dear Thibaut Vidal
No, I am aware of those methods. The last two references I gave in my previous replies are hybrid methods (heuristics + gradient-based)! But this discussion is about general zero-order metaheuristics. For hybrids, some things change: the hybridization helps to increase efficiency, and thanks to the gradient information, the unreliability decreases considerably. We have had this discussion before, and you contributed to it!
(https://www.researchgate.net/post/What_do_you_think_about_improving_metaheuristics_using_mathematical_programming)
Although hybridization helps a heuristic become more reliable, it is still unreliable! So why don't we use strong and reliable gradient-based global optimization solvers instead?! I am not sure that unreliable methods can be called "science"!
Regarding your comment about my question on finding the proper mathematical tool for large-scale non-linear, non-smooth, non-convex programming, I should say: thanks for the recommendation, but the method I was pleased with was based on discretizing the smooth and non-smooth domains, applying gradient-based global optimizers in the continuous domain, and perhaps using a local optimizer to search the regions around the discontinuities. Due to the very large dimensions of the problems and the unreliability of heuristics, I prefer a mathematical tool with a strong theoretical background! Hybridization is itself an option, but the mathematical solver is always superior! CMA-ES will easily fail due to the very large dimensions of the problem, and it is unreliable.
Also, I insist on the superiority of mathematical methods specifically in continuous optimization, and in particular structural optimization, not in other areas!
Finally, thank you very much for reading the comments and giving interesting references!
Let me first send my regards to Adam Letchford, who made me mindful of the ASTSPP!
To emphasize the importance of meta-heuristics, I might only refer to my latest result on the "Asymmetric Steiner Traveling Salesman Path Problem (ASTSPP) - Open and Closed Tours' Efficient Determination in General Directed Graphs" (being published). For the ASTSPP, we are given a graph G with asymmetric arc weights, a start point s ∈ V(G), a target point t ∈ V(G), and a subset S ⊆ V(G). The objective is to find a minimal-length route from s to t in G visiting all nodes of S at least once.
The basic heuristic (nothing staggering) relies on a special scan of an approximate Steiner tree T ⊂ G spanning S. But only the two meta-heuristics, being local optimizations, brought the impressive result improvements:
The tested and intensively verified deterministic solution approach (C++), comprising TSA and CCE, provides a maximal sample standard deviation qmax ≤ 1.86% (referring to the optima, available for |S| < 15 due to the time effort) and remains real-time capable (≤ 2 sec) for |S| < 150.
Thus, this approach complies with the demands of using graphs that need not be complete (traffic maps), that have generally asymmetric arc weights, and that need not satisfy the asymmetric triangle inequality. Further, it satisfies the request to compute, even-handedly, near-optimal round trips (s = t) as well as general routes (s ≠ t) without any special precaution.
Have you ever used an exact solver like CPLEX or Gurobi (there are many others)? Did you know that they inherently utilize heuristics? If you have a separation routine for cuts: which cut is going to be used first, which one second, ..., and how many are used before moving on? Maybe it is not wise to exclude certain ideas per se, as others might be able to use them in a very beneficial way, even advancing the methods of your choice.
Sorry, your comment is not impressive at all! Operations researchers would hardly have achieved such formidable success in their field (as they have to date) if they always looked for existing general-purpose tools (like IBM's CPLEX Optimizer) before pursuing any promising new idea!! And to be clear: an efficient algorithm adapted to a special task (insofar as its objective is expressible in CPLEX) substantially outperforms CPLEX.
I want to thank all colleagues who have contributed to this interesting discussion, which has been enriched with very pertinent references. From my experience with highly non-linear problems in chemical engineering: when the search space is very wide, the solutions of heuristic methods are often superior, although the convergence is tedious; but if the search space has been reduced, as in a re-optimization within an RTO problem, gradient-based methods have proven to be the best option. So, in my opinion, both strategies have their applications, and hybrid strategies can often be the most efficient.
There is nothing wrong with using heuristics to guide the optimization process (i.e., to find or strengthen bounds). But using them to obtain optimal solutions is a different story: unless there are theoretically proven performance bounds, it is hard to judge how good the solutions are. Also, there are a lot of metaheuristics that are "inspired" by nature, even though the metaphor they use is usually a stretch: migrating birds, flowing water, swarming bees, musicians playing together, etc. See this: Article Metaheuristics -- the metaphor exposed
Thanks. Let me emphasize that I do not want or need to be "impressive." I just want to provide some guidance. You can take it or leave it.
The success story of operations research relies largely on the simplex method and subsequent developments. With the use of systems (in the sense of "no systems, no impact"), the success story has been extended and has reached new heights. It is not about using these systems first; it is about whom we reach and how we can support, on a larger scale, the use of the undoubtedly high quality of research in operations research. In different threads, we have argued that "the metaphor exposed" offers criticism in a profound way, and there are additional works to support it, including those by Sörensen, Weyland and others; see, e.g., Article Similarity in metaheuristics: a gentle step towards a compar...
Some ideas are "auspicious", others are not. As long as you know where you are and what you are doing, your work can and will eventually be appreciated. Expediting an exact approach by using heuristics and metaheuristics is possible, and it can even be done within theoretical bounds, without destroying them.
Maybe some "mathematicians are against meta-heuristic and heuristic algorithms" because they cannot properly differentiate between the good and the ugly.
Although I tend to be strongly biased toward siding with Professor Peter Richter, I would add that several comments also have important implications, especially regarding the way we formulate, or reformulate into a more linearized geometry, a problem to ease the work of a hybrid algorithm, as long as the exact physical problem remains mathematically intact. We did a great deal of such architectural equivalencing while conceptualizing, designing, engineering, and building software (C++ and Java J2EE standards) to best reach the exact optimal solution. We cast the mathematical reformulation in a very special form amenable to solution by our home-made, customized optimal control algorithm constructed from continuous control theory in electrical engineering, then integerized the continuous control output to solve and manage labor/workforce planning and scheduling for CRM/ERP at SAP Walldorf, Germany, and SAP Chicago Labs, all in service of solving the exact physical model of the problem at hand.
In my experience with a number of mathematicians who do not hold these methods in high regard, the following reasons can be mentioned:
1) Unlike many exact methods, which are difficult to apply to real problems, these methods can be applied with less difficulty. However, tuning them to find high-quality solutions is difficult.
2) These methods usually rely on randomness, which is not acceptable to some mathematicians.
3) In most cases, there is no guarantee of the quality of the solutions obtained by these methods.
4) A lack of mastery of these methods, and a lack of awareness of how difficult it is to advance them.