The "no free lunch" theorem, in a very broad sense, states that when averaged over all possible problems, no algorithm will perform better than all others. For optimization, there appears to be some "almost no free lunch theorems" that would imply that no optimizer is the best for all possible problems, and that seems rather convincing for me: it is common sense in the optimization community that no algorithm is better than all others, and you need to use as much problem-specific knowledge as possible.

However, this fact (that no known algorithm is better than all others) may be due to limitations in our current understanding of optimization, and not to a provable theoretical result. One thing is saying that we don't have an all-in-one optimizer, and another very different thing is to say that it cannot exist.

For what I've seen, the NFL theorem for optimization only considers deterministic algorithms, while almost all good optimizers have some degree of randomness. Also, the NFL considers the set of all functions, most of which are un-compressible in the Kolmogorov sense, while most of the useful functions in practice are very compressible: expressed either as compositions of elementary functions, or by some probabilistic model, or by simulation, but I don't know of any useful function that is strictly un-compressible in the Kolmogorov sense.

So, my question is, for the set of practically interesting functions, which are very compressible, and for the set of currently used optimization algorithms, most of which are stochastic to some degree, does the NFL holds at all? Are there any theoretical results that disregard the existence of a "perfect" optimizer for this restricted set of functions and algorithms? Or should we strive for the design of an all-in-one optimizer that can beat all currently available technique.

After Göedel and NP-Completeness, I would rather think that the NFL holds for optimization as well, since it seems that there is no such thing as a free lunch in almost all areas of CS. However, a good theoretical result would be nice.

More Alejandro Piad Morffis's questions See All
Similar questions and discussions