My answer is "No Free Lunch". The best algorithm depends on the characteristics of each problem and on the metric used to evaluate performance.
Alejandro's answer is short and direct :), but it is indeed a summary of what has been observed in the literature. There is no dominant stochastic optimization or genetic scheme (applied to multi-objective or mono-objective problems). I personally prefer to use deterministic schemes whenever possible.
People tend to think that the choice of a stochastic method is highly case-dependent, some algorithms being potentially more flexible and, for example, allowing constraints to be implemented more easily. What is also missing is a set of rules that would let the user draw conclusions of the form: "I am facing this situation, so I should use this method." There is a review paper, "Stochastic Optimization: A Review" by Fouskakis and Draper (2002), which you should perhaps read.
I agree with Alejandro and Gregory; let me share my two cents. As far as I know, and leaving the NFL theorems aside, each algorithm has particular characteristics that make it more convenient for a certain kind of problem. For instance, NSGA-II and PESA are both Pareto-based multi-objective optimization algorithms, but NSGA-II scales better than PESA because of its internal architecture (see, for instance, "Multi-objective optimization using genetic algorithms: A tutorial" [Konak et al. 2006]).
I measure "best" as the lowest number of function evaluations, the highest probability of finding the global optimum, and robustness to surface aberrations (such as flat spots, stochastic aspects, decision-variable discretization, and discontinuities). Consider leapfrogging: Rhinehart, R. R., M. Su, and U. Manimegalai-Sridhar, “Leapfrogging and Synoptic Leapfrogging: A New Optimization Approach”, Computers & Chemical Engineering, Vol. 40, 11 May 2012, pp. 67-81.
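For readers unfamiliar with the method, here is a minimal sketch of the leap-over move as I understand it from the paper: the worst player leaps over the best into the reflected window on the other side of the best. The population size, leap count, and stopping rule below are illustrative choices, not the paper's recommendations.

```python
import numpy as np

def leapfrog_minimize(f, lo, hi, n_players=20, max_leaps=5000, seed=0):
    """Sketch of the leapfrogging idea (Rhinehart et al., 2012):
    the worst player repeatedly 'leaps over' the best player into the
    window mirrored through the best. Parameters here are illustrative."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    x = lo + rng.random((n_players, lo.size)) * (hi - lo)   # random players
    fx = np.array([f(p) for p in x])
    for _ in range(max_leaps):
        b, w = np.argmin(fx), np.argmax(fx)                  # best and worst
        # leap-over: random point in the region reflected through the best
        x[w] = x[b] + rng.random(lo.size) * (x[b] - x[w])
        x[w] = np.clip(x[w], lo, hi)                         # stay in bounds
        fx[w] = f(x[w])
    return x[np.argmin(fx)], fx.min()

# usage: 2-D sphere function
xbest, fbest = leapfrog_minimize(lambda v: float(np.sum(v**2)),
                                 lo=[-5, -5], hi=[5, 5])
```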
> size of this set for it to be usable by the user?
Indeed, most MOEA implementations already include such a limit on the maximum size of the Pareto set. I work with SPEA (which is quite similar to NSGA), and that is how it is handled there.
Fixing an upper bound on the number of points in the Pareto front is required, also for numerical reasons, but attention has to be paid to the distribution of the points. As far as I remember, if all the cost functions are convex there is no problem; if they are not, the distribution of points can become irregular and, loosely speaking, "holes" can appear in the plot of the Pareto front, which makes the decision-making step more difficult. There are methods (by Filip Logist and co-workers from Leuven, Belgium) which can be of help.
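To make the "bounded but well-spread" point concrete, here is a small sketch of one common way to truncate an archive of non-dominated points: the NSGA-II crowding distance. This is an illustration of the general technique, not code from any of the methods cited above.

```python
import numpy as np

def crowding_truncate(front, max_size):
    """Keep at most `max_size` points from an (n, m) array of mutually
    non-dominated objective vectors, preferring the most isolated points
    (NSGA-II crowding distance). Illustrative sketch only."""
    front = np.asarray(front, float)
    n, m = front.shape
    if n <= max_size:
        return front
    dist = np.zeros(n)
    for j in range(m):
        order = np.argsort(front[:, j])
        span = front[order[-1], j] - front[order[0], j]
        if span == 0:
            span = 1.0                                # avoid divide-by-zero
        dist[order[[0, -1]]] = np.inf                 # always keep extremes
        dist[order[1:-1]] += (front[order[2:], j] - front[order[:-2], j]) / span
    keep = np.argsort(dist)[::-1][:max_size]          # most isolated first
    return front[keep]
```

Truncating by crowding distance keeps the extreme points and discards points in densely sampled regions, which is exactly the kind of spread control that helps avoid the "holes" mentioned above (although it cannot create points where the algorithm never found any).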
I agree with Alejandro's comment. Each metaheuristic algorithm has unique advantages with respect to robustness and performance in noisy environments, in the presence of uncertain parameters, and in different problem spaces. However, in line with the "no-free-lunch" theorem, no single algorithm can optimally solve all optimization problems: some methods perform better on some problems and worse on others.
NSGA-II (and SPEA2) are often used for comparison purposes, but they have been outperformed by many multi-objective algorithms on a wide range of optimization problems.
My experience is that SA (on its own or hybridized with other approaches) works well on many NP-hard optimization problems, while PESA and DE also work well on many problems.
The main advantage of SA, in my opinion, is that you can implement population-based versions in which each individual/solution uses a different annealing schedule. If you apply parallel processing, you can obtain good results.
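A minimal sketch of that idea, assuming a simple Gaussian move and geometric cooling (the schedule values and step size below are illustrative): each chain runs independent Metropolis steps with its own cooling schedule, so the chains parallelize trivially.

```python
import math, random

def population_sa(f, init, schedules, n_steps=10000, step=0.1, seed=0):
    """Population-based SA sketch: one independent annealing chain per
    individual, each with its own (T0, alpha) geometric cooling schedule."""
    rng = random.Random(seed)
    pop = [list(x) for x in init]                      # one solution per chain
    cost = [f(x) for x in pop]
    for k in range(n_steps):
        for i, (T0, alpha) in enumerate(schedules):
            T = T0 * alpha ** k                        # chain-specific temperature
            cand = [x + rng.gauss(0.0, step) for x in pop[i]]
            c = f(cand)
            # Metropolis rule: accept improvements, sometimes accept worse moves
            if c < cost[i] or rng.random() < math.exp(-(c - cost[i]) / max(T, 1e-12)):
                pop[i], cost[i] = cand, c
    best = min(range(len(pop)), key=lambda i: cost[i])
    return pop[best], cost[best]

# usage: three chains started from the same point, different cooling rates
sol, val = population_sa(lambda v: sum(x * x for x in v),
                         init=[[2.0, -3.0]] * 3,
                         schedules=[(1.0, 0.999), (5.0, 0.995), (10.0, 0.99)])
```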