Hi, for derivative-free methods, efficiency can be measured by the number of function evaluations required to converge to the optimum, but also by the capacity of the method to converge to the global minimum instead of a local one. For methods based on gradient evaluations (Newton, quasi-Newton, SQP for constrained problems), the number of function evaluations cannot serve as the sole criterion, since gradient evaluations also contribute to the cost.
Thanks for the detailed explanation. I forgot to mention that this question is about metaheuristics. I would be glad if you could share your experience with metaheuristics and function evaluations.
Dear Ghanshyam, for metaheuristic (derivative-free) methods such as genetic algorithms, ant colony methods, particle swarm techniques, etc., the number of function evaluations can be a good criterion. I have used some of these methods and I think that efficiency strongly depends on the problem to be solved and also on the tuning of each method (number of particles, etc.). Thus, it is difficult to say that one method is more efficient than another.
Function evaluations are one of the most important criteria for comparison, along with statistical analysis such as standard deviation, Friedman's test and the Wilcoxon test (due to the stochastic nature of metaheuristics), minimum fitness, and average fitness. Function evaluations matter because several metaheuristics have two or more stages in their optimization process (for example, TLBO has a teacher phase and a student phase, GWO has a single phase (searching and hunting), ABC has two loops, Jaya has one, BFO has five, and so on). The number of iterations may be the same for all of them, but that does not reflect the true computational requirement of the algorithm, i.e., how much effort it actually expended. Therefore, by using function evaluations, a more meaningful comparison can be drawn, since we are aiming for a common ground: essentially, the number of times the fitness function was called.
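As a concrete illustration of the statistical side mentioned above, the sketch below compares the final fitness of three algorithms over ten runs using SciPy's Wilcoxon and Friedman tests; the fitness arrays are made-up placeholder values, not real results:

```python
# Minimal sketch: statistical comparison of final fitness values across runs.
# The arrays below are hypothetical placeholders, not real benchmark data.
import numpy as np
from scipy import stats

fitness_a = np.array([0.012, 0.009, 0.015, 0.011, 0.010, 0.013, 0.008, 0.014, 0.012, 0.010])
fitness_b = np.array([0.021, 0.018, 0.025, 0.020, 0.019, 0.022, 0.017, 0.024, 0.023, 0.020])
fitness_c = np.array([0.016, 0.014, 0.019, 0.015, 0.013, 0.018, 0.012, 0.020, 0.017, 0.015])

# Summary statistics commonly reported alongside the tests.
for name, runs in [("A", fitness_a), ("B", fitness_b), ("C", fitness_c)]:
    print(f"Algorithm {name}: mean={runs.mean():.4f}, std={runs.std(ddof=1):.4f}, min={runs.min():.4f}")

# Pairwise Wilcoxon signed-rank test (paired runs on the same problem).
wilcoxon_res = stats.wilcoxon(fitness_a, fitness_b)
print(f"Wilcoxon A vs B: statistic={wilcoxon_res.statistic:.3f}, p-value={wilcoxon_res.pvalue:.4f}")

# Friedman test across all three algorithms.
friedman_res = stats.friedmanchisquare(fitness_a, fitness_b, fitness_c)
print(f"Friedman test: statistic={friedman_res.statistic:.3f}, p-value={friedman_res.pvalue:.4f}")
```

For a fair comparison, all three algorithms should of course have been given the same budget of function evaluations before these final fitness values were recorded.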
The number of function evaluations is independent of the specific algorithm (as already mentioned by Dhruv), the speed of the computer, the particularities of certain implementations etc.
It's not really a question of advantages: it's just the only general measure there is. And yes, statistical measures are important as well, but they should also be relative to the number of function evaluations in some way.
By graphing the median performance of several runs against the number of function evaluations, one can compare algorithmic performance for different function evaluation budgets.
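A quick sketch of how such a median-convergence plot could be generated, using NumPy and Matplotlib on best-so-far histories; the data here is purely synthetic and only meant to show the mechanics:

```python
# Minimal sketch: median performance over runs plotted against function evaluations.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_runs, budget = 20, 500

# Synthetic "best-so-far" histories: one row per run, one column per evaluation.
histories = np.minimum.accumulate(rng.random((n_runs, budget)) * np.linspace(1.0, 0.1, budget), axis=1)

evaluations = np.arange(1, budget + 1)
median_curve = np.median(histories, axis=0)

plt.plot(evaluations, median_curve, label="Algorithm A (median of 20 runs)")
plt.xlabel("Number of function evaluations")
plt.ylabel("Best fitness found so far")
plt.legend()
plt.show()
```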
A variant is to count the number of simplex gradients, where one simplex gradient costs n + 1 function evaluations (with n the number of variables of the problem).
The only "disadvantage" is that the algorithmic complexity of the optimization algorithms itself is not considered. But for most simulation-based problems, this complexity is negligible.
When this complexity is not negligible, one could measure performance in time, but that makes it impossible to untangle the effects of algorithmic complexity, particular implementations, computer speed etc.
The attached figure graphs median performance against function evaluations (on the left) and shows the deviations after 200 function evaluations (on the right). In this way, one can see performance in terms of different function evaluation budgets (i.e., speed of convergence), as well as robustness.
Function evaluations really matter, especially because the computational budget is limited in many real-world applications. Therefore, it is a good idea to consider various limited-budget scenarios when benchmarking optimization methods; for example, analysing optimization performance at budgets of 1,000, 5,000, 10,000 function evaluations, and so on.
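A small sketch of how such budget checkpoints could be reported, assuming a best-so-far history has been logged for each run (the histories below are synthetic placeholders):

```python
# Minimal sketch: report performance at several function-evaluation budgets.
import numpy as np

rng = np.random.default_rng(1)
# 30 runs, 10,000 evaluations each; best-so-far fitness per evaluation (synthetic).
histories = np.minimum.accumulate(rng.random((30, 10000)), axis=1)

for budget in (1000, 5000, 10000):
    best_at_budget = histories[:, budget - 1]  # best fitness after `budget` evaluations
    print(f"Budget {budget:>6}: median={np.median(best_at_budget):.4f}, "
          f"std={best_at_budget.std(ddof=1):.4f}")
```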
The number of function evaluations (FEN) does play a role in reaching optimal solutions. In the old days, the computational effort demanded of the machines was very high, so they took a long time to reach the optimal solution irrespective of the FEN. Nowadays the required computational effort is handled much faster, so we obtain the optimal solution within less time. In my opinion, it is therefore not the number of function evaluations that matters but the time taken.
A rough rule is to count the number of times the objective function is evaluated by the optimization technique, which is given by the number of individuals evaluated per iteration multiplied by the total number of iterations.
For example, take the simple case of PSO: the population is evaluated only once in each iteration, therefore total NFEs = 1 × number of individuals × number of iterations. TLBO, which has two stages, evaluates the population twice in each iteration, therefore NFEs = 2 × number of individuals × number of iterations. You can easily gauge the number of stages by understanding the mechanism of each method. If not, you can look at the pseudocode of the technique and count how many times "evaluate fitness of population" appears in one loop, or look at the actual code to see how many times the objective function is called by the main algorithm.
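As a rough illustration, here is a small Python sketch of both approaches: the closed-form estimate (stages × population size × iterations) and a wrapper that simply counts how often the objective is actually called. All names are illustrative, not taken from any particular library:

```python
# Minimal sketch: two ways to obtain the number of function evaluations (NFE).

def estimated_nfe(stages_per_iteration, population_size, iterations):
    """Rough NFE estimate for population-based metaheuristics."""
    return stages_per_iteration * population_size * iterations

print(estimated_nfe(1, 30, 100))  # PSO-like:  1 x 30 x 100 = 3000
print(estimated_nfe(2, 30, 100))  # TLBO-like: 2 x 30 x 100 = 6000


class CountingObjective:
    """Wraps an objective function and counts every call, so the true NFE
    can be read off regardless of how many phases the algorithm has."""

    def __init__(self, func):
        self.func = func
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return self.func(x)


# Usage: pass `counted` to the optimizer instead of the raw objective.
counted = CountingObjective(lambda x: sum(xi ** 2 for xi in x))
counted([1.0, 2.0])
counted([0.5, 0.5])
print(counted.calls)  # -> 2
```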
I find the above discussion puzzling: you are comparing the number of function evaluations for methodologies that will find an optimal solution only with very low probability, and in the unlikely case that you do stumble upon an optimal solution, you will not know that you have done so, as the termination criteria are not built from any optimality conditions. What are you hoping to achieve?
To clarify: I have a problem discussing benchmarking methods that are not even designed to terminate at an optimal solution. Indeed, I have been involved in several practical projects where the objective function is a black box (i.e., somewhere inside the box is a simulation, and we do not know what goes on therein - we just get an output, just like a delirious oracle on speed), and it is quite hard to eventually decide the optimal value, and indeed to find an optimal solution. We can stumble on it, and yet we do not know it. We have occasionally built an RBF (radial basis function) to represent the box, and have in fact had some success, but we cannot be sure that the output from the exercise is what we want. On the other hand, we tend not to be interrogated by the math police - they know that this is not an exact science, and that engineers will adjust what we give them, anyway. :-)
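For readers who have not seen the surrogate idea in practice, here is a minimal sketch of fitting an RBF model to a handful of black-box samples and minimizing the cheap surrogate instead of the box itself; the black_box function below is only a stand-in for a real simulation, and SciPy's RBFInterpolator is assumed to be available:

```python
# Minimal sketch: RBF surrogate of an expensive black box, minimized cheaply.
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.optimize import minimize

def black_box(x):
    """Placeholder for the expensive simulation (in reality, unknown inside)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2

# Sample the black box at a small design of experiments.
rng = np.random.default_rng(0)
samples = rng.uniform(-2.0, 2.0, size=(30, 2))
values = np.array([black_box(x) for x in samples])

# Fit an RBF surrogate and minimize the cheap surrogate instead of the box.
surrogate = RBFInterpolator(samples, values)
result = minimize(lambda x: surrogate(x.reshape(1, -1))[0], x0=np.zeros(2))

print("surrogate minimizer:", result.x)
print("black-box value there:", black_box(result.x))
```

Whether the surrogate minimizer is anywhere near the true optimum of the box is, as noted above, something one can rarely verify in practice.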
The context of this question is optimization where the objective is evaluated via a (structural engineering) simulation. In my field (Architectural Design Optimization) I also consider for example building energy and daylighting simulations. In these kinds of situations, the optimum is typically unknown, as you say.
I agree that it's probably futile to consider methods that do not have convergence guarantees beyond randomness, such as most metaheuristics.
But many people in my field, including the questioner, don't necessarily agree. It's therefore worth showing that, typically, methods that do have convergence guarantees indeed perform better.
Btw, I really enjoy your comments on this platform, Michael. They tend to be among the most precise I come across.
There are machine-learning-based optimization techniques such as Bayesian optimization, which are mainly used when the cost of a function evaluation is high. But I think they may not perform well on higher-dimensional problems.
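A minimal sketch of what that looks like in practice, assuming the scikit-optimize package is available; the objective here is a cheap placeholder standing in for an expensive evaluation:

```python
# Minimal sketch: Bayesian optimization with a small evaluation budget.
from skopt import gp_minimize

def expensive_objective(params):
    # Placeholder for a costly simulation or experiment.
    x, y = params
    return (x - 2.0) ** 2 + (y + 1.0) ** 2

result = gp_minimize(
    expensive_objective,
    dimensions=[(-5.0, 5.0), (-5.0, 5.0)],  # box bounds for the two variables
    n_calls=30,                              # total budget of function evaluations
    random_state=0,
)

print("best parameters:", result.x)
print("best value:", result.fun)
```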
Well, I would say that no method, in this problem setting, can be guaranteed to be better than any other over a large enough problem set. That's the snag with problems that do not satisfy, for example, a Lipschitz continuity assumption on the objective function. I have been involved in rather many simulation-based optimization problems in industry (water jets, engine optimization, you name it), and it almost always boils down to trying to mimic the behaviour of the implicit objective function with a concrete, explicit model that one can minimize fairly quickly. But there is not one method that fits all such problems, of course.