Hi, it all depends on the kind of optimization algorithm. For example, in nonlinear programming the SQP method is based on duality. Thus, in addition to the satisfaction of the constraints, the norm of the gradient of the Lagrangian must be lower than a given epsilon.
Stopping criteria are typically based on the necessary optimality conditions for the problem you are solving. If you have an unconstrained problem, the criterion is that the gradient vector ∇f(x) equals zero. This is not a very practical criterion as stated, of course, since you cannot compare with zero exactly. But there are other criteria you can use, provided you have scaled the problem somehow. For example, you can compare the gradient norm ‖∇f(x)‖ at the current point with the norm of x. But it all depends on whether the scaling of the problem is good or not. That is also one reason why Newton's method is better: a scaling is built into the method itself.
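As a concrete illustration of the scaled gradient test described above, here is a minimal sketch (the function names, step size, and tolerance are my own choices, not taken from any particular solver):

```python
import numpy as np

def minimize_gd(grad, x0, eps=1e-6, step=0.1, max_iter=10_000):
    """Gradient descent with a scaled stopping test:
    stop when ||grad(x)|| <= eps * max(1, ||x||),
    so the tolerance adapts to the size of the iterate."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps * max(1.0, np.linalg.norm(x)):
            break
        x = x - step * g
    return x

# Toy quadratic f(x) = ||x - c||^2 with minimizer c; grad f(x) = 2 (x - c).
c = np.array([1.0, -2.0])
x_star = minimize_gd(lambda x: 2.0 * (x - c), x0=np.zeros(2))
```

The fixed step size is only adequate for this well-scaled toy problem; a line search or a (quasi-)Newton step would be used in practice.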
For constrained optimization problems, you have more options. If you have a constrained problem over a convex set, and all functions are differentiable, you can check the Karush-Kuhn-Tucker ("KKT") conditions. Again, because of tolerances, you perhaps need to look at a least-squares solution of these conditions, and compare the error with the size of the variables x or with ∇f(x).
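A least-squares KKT check of the kind mentioned above can be sketched as follows. This is a toy illustration with made-up names, assuming inequality constraints whose active gradients are known at the candidate point:

```python
import numpy as np

def kkt_residual(grad_f, grad_g_active, x):
    """Least-squares check of KKT stationarity at x: find multipliers
    lam >= 0 approximately solving grad f(x) + A^T lam = 0, where the
    rows of A are the gradients of the active constraints. Returns the
    residual norm, scaled by max(1, ||grad f(x)||)."""
    g = grad_f(x)
    A = np.array([gg(x) for gg in grad_g_active])   # active-constraint gradients
    lam, *_ = np.linalg.lstsq(A.T, -g, rcond=None)  # least-squares multipliers
    lam = np.maximum(lam, 0.0)                      # keep multipliers nonnegative
    res = np.linalg.norm(g + A.T @ lam)
    return res / max(1.0, np.linalg.norm(g))

# min f(x) = x1 + x2  s.t.  x1^2 + x2^2 - 2 <= 0.
# The constraint is active at the optimum x* = (-1, -1), with multiplier 1/2.
x_star = np.array([-1.0, -1.0])
r = kkt_residual(lambda x: np.array([1.0, 1.0]),
                 [lambda x: 2.0 * x],
                 x_star)
# r is (numerically) zero, so x_star passes the stationarity test.
```

A full KKT check would also verify primal feasibility and complementarity; this sketch covers only the stationarity residual.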
It's not an "exact" science, in the sense that it actually depends what you want to be measured, because it could be an approximation of the objective value, the error in the KKT conditions, or something else. Most solvers have several criteria that have to be fulfilled, in order for a solver to not be "fooled" by the problem being badly scaled.
In my personal experience with problems solved by metaheuristics, or with local minima sought by nonlinear optimization methods, the criterion of reaching a predetermined number of consecutive iterations without improving the objective function works very well in most cases. If you are looking for a single minimum by nonlinear methods, a very small distance between two consecutive solutions, or between two consecutive values of the objective function, also works very well.
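The no-improvement criterion described above can be sketched as follows, with plain random search standing in for any metaheuristic (names and parameter values are illustrative only):

```python
import random

def random_search(f, lower, upper, patience=200, seed=0):
    """Random search that stops after `patience` consecutive
    iterations without improving the best objective value."""
    rng = random.Random(seed)
    best_x = rng.uniform(lower, upper)
    best_f = f(best_x)
    stall = 0
    while stall < patience:
        x = rng.uniform(lower, upper)
        fx = f(x)
        if fx < best_f:          # improvement: reset the counter
            best_x, best_f = x, fx
            stall = 0
        else:                    # no improvement: count it
            stall += 1
    return best_x, best_f

# Minimize a toy 1-D function with minimum at x = 3.
best_x, best_f = random_search(lambda x: (x - 3.0) ** 2, 0.0, 10.0)
```

The larger `patience` is, the closer the returned point tends to be to the minimizer, at the cost of more function evaluations.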
Ah, but there is a big catch, as we say: the scaling of the problem! If you have modelled the problem badly by mistake, or you simply have a badly scaled problem - for example, the Hessian has many small and several large eigenvalues (which also happens by design when you use penalty methods) - then your favourite algorithms will not work very well. So here the scaling matters a lot. Regardless of whether you are using metaheuristics or any theoretically valid non-Newton method, you will end up with such a scaling problem - it cannot be avoided.
José: There are algorithms (of the zeroth or first order, typically) that are very slow. One classic example is the steepest descent method, which has notoriously bad convergence properties, since it takes nothing but gradient information into account. In such a case you should not use a predetermined number of iterations; one should rather switch to a quasi-Newton method.
I'd argue the question is extremely vague. You cannot state the stopping criteria of an arbitrary optimization algorithm up front, because they depend entirely on the algorithm itself and the properties it exploits. I suggest making the question slightly more precise: what types of "optimization algorithms", for instance? Even narrowing to a specific type of algorithm (and the problem it is designed to solve) can dramatically change the nature of the answer. I have seen my fair share of algorithms employing LP methods and local search techniques, which all stop at very different asymptotic rates, for very different reasons. Many of these points have already been covered excellently by Michael and some of the other answers.
A couple of things I could suggest: point to some papers or a more specific type of optimization technique being employed, and maybe the type of problem being tackled ("optimization problem" is incredibly vague, as there is rarely, if ever, a magic one-size-fits-all solution).
An initially correct guess has perhaps a 20% chance of leading to a good answer, but a good optimization algorithm will help very much in reaching the result faster. One should have a good basic knowledge of genetic algorithms; many good e-books are available.
Hi, it all depends on the kind of algorithm and problem (derivative-free, the SQP method for nonlinear programming, linear programming, etc.).
For example, in linear programming convergence is obtained after a finite number of steps via the simplex method.
In nonlinear programming (the SQP method), convergence is achieved when the norm of the gradient of the Lagrangian is lower than epsilon and when the constraints and complementarity conditions are satisfied.
In unconstrained optimisation, it all depends on the kind of algorithm. In a GA, for example, or another heuristic method, you can decide to stop the algorithm after a maximum number of evaluations of the objective function. In gradient methods, the norm of the gradient must be lower than a given epsilon.
Agree with Richard Epenoy. A stopping condition is a condition that forces your algorithm to terminate; it can be any meaningful condition you wish, such as a number of iterations, the quality of solutions, statistical values, an external or internal condition, maybe a random termination key, etc. In order to obtain reliable results for an optimization algorithm, you should judge based on the "average performance" over multiple independent runs with random initialization. When you run your optimizer 100 times, the average fitness value over all runs shows the capability of your algorithm on that problem.
R1. Stopping criteria refers to conditions that must be reached in order to stop the execution of the algorithm. Some of the most common stopping conditions are: execution time, total number of iterations, number of non-improving iterations, an optimal solution found (a lower bound reached for minimization, an upper bound for maximization), etc.
R2. Common sense usually leads to good results (the principle behind heuristic algorithms); however, there is no guarantee of optimality. As mentioned in other answers, it depends on the logic behind the algorithm.
R3. The most important thing is the logic of the algorithm. Most algorithms need an initial solution, which in many cases is randomly constructed. A good algorithm should improve any initial solution in a short time; however, good initial solutions need less time to converge to near-optimal solutions.
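The common stopping conditions listed in R1 (time, iteration budget, non-improving iterations, a bound reached) can be combined into a single test. A minimal sketch with hypothetical names:

```python
import time

def make_stopper(max_seconds=None, max_iters=None, max_stall=None, target=None):
    """Combine several common stopping conditions into one callable:
    stop(iteration, best_value, stall) -> bool. Any condition left as
    None is simply skipped. `target` assumes minimization."""
    t0 = time.monotonic()
    def stop(iteration, best_value, stall):
        if max_seconds is not None and time.monotonic() - t0 >= max_seconds:
            return True   # wall-clock budget exhausted
        if max_iters is not None and iteration >= max_iters:
            return True   # iteration budget exhausted
        if max_stall is not None and stall >= max_stall:
            return True   # too many non-improving iterations
        if target is not None and best_value <= target:
            return True   # a known bound/target has been reached
        return False
    return stop

# Stop after 100 iterations, or as soon as the objective reaches 1e-3.
stop = make_stopper(max_iters=100, target=1e-3)
```

The optimizer's main loop would then simply call `stop(...)` once per iteration.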
Many of the stopping criteria were already mentioned by others. If you use population-based algorithms, such as a GA or PSO, you can compute the mean and variance of any variable, or of the objective function, over the population at every iteration. The closer the algorithm gets to the global optimum, the more uniform the population becomes, because all individuals are trying to be as optimal as possible. This also means that the computed variance gets smaller, so you can set a stopping condition for when the variance is smaller than a specified epsilon.
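The population-variance criterion can be sketched as follows (a toy check, not tied to any particular GA/PSO implementation):

```python
import numpy as np

def variance_stop(population, eps=1e-6):
    """Stop a population-based method (GA, PSO, ...) when the variance
    of every decision variable across the population falls below eps,
    i.e. the population has collapsed onto (nearly) one point."""
    var = np.var(np.asarray(population), axis=0)  # per-variable variance
    return bool(np.all(var < eps))

rng = np.random.default_rng(0)
diverse   = rng.normal(size=(30, 2))               # early, spread-out population
collapsed = 0.5 + 1e-5 * rng.normal(size=(30, 2))  # late, nearly uniform population
```

As Michael's follow-up notes, a collapsed population only signals that the search has settled, not that the settled point is a global optimum.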
Jacek: for every rule that you can construct in a non-convex (or combinatorial) world, I can construct an example where your stopping criterion WILL fail to identify an optimal solution. That is the nature of non-convex optimization. (Complete enumeration is the only guaranteed method, I'm afraid - at least in the WORST case.)
With an algorithmic approach it is hard to justify the optimality of the obtained solutions, but we can talk about acceptable solutions.
There are many situations where it is difficult to obtain optimal solutions, for several reasons. In such situations, depending on the limit of acceptance, we can set the search boundary to obtain a solution that satisfies our purpose.
I think you need to start the analysis based on whether the problem is convex (unlikely) or differentiable. In the convex case there are stopping criteria; general nonlinear programs are hard, even if they are differentiable.
Meenakshi Kollati That the objective value does not change does not necessarily mean that you have found an optimum, but just the bottom of a local valley. You may be far away from an optimal solution.
If you are keen on solving nonlinear programs, I would not ask questions here at RG, but actually take a course on nonlinear mathematical optimization - that way you receive the building blocks of the theory of optimality and descent directions, and see how those two build up an array of techniques - some better than others, depending on the properties of the sets and functions involved.
Note that convex optimization is a "friendly" subject within nonlinear optimization, as it is much easier to test whether a feasible vector may be optimal or not, while the whole universe of nonlinear optimization problems includes both easy problems AND terribly hard ones. I suggest that the first step is to learn what "optimum" means, and to read about stationary points as well as the Karush-Kuhn-Tucker conditions - they are fundamental.
Heuristics will certainly not guarantee an optimum. The only methodologies that provide such a guarantee are iterative methods that, step by step, yield a descending path towards a minimum (or at least a stationary point). Among very many, I can refer to basic ones like steepest descent (with a variety of line searches) and quasi-Newton methods for unconstrained problems, Frank-Wolfe for differentiable problems over polyhedra, and many other more modern and specialised ones. You should definitely learn these, and try them, before you fiddle with any metaheuristics.
Richard Epenoy: The simplex method has a reputation for being untrustworthy because of possible degeneracy. Yes, it can be avoided, at a cost, but how many can say that they have thought about it?
To summarize, a stopping condition is what forces the algorithm to terminate. Meaningful conditions include statistical values, the quality of the solutions, the number of iterations, etc.
I always advocate using methods that have a known convergence criterion, as in "when the method stops, we know that the current vector of decision values is a point from which the method cannot proceed" - which in the case of continuous domains will be a stationarity criterion. Not all methods have that "free" criterion, and we also know that some problems have no upper bound on how long we need to proceed.
In a practical implementation of optimization algorithms, the algorithm is run multiple times, each time with different random initial conditions. The final result can then be reported as the average and best results over the multiple runs.
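A minimal sketch of this multiple-run protocol (random search stands in for the actual optimizer; all names are illustrative):

```python
import random
import statistics

def objective(x):
    return (x - 3.0) ** 2  # toy objective with minimum 0 at x = 3

def one_run(seed, n_samples=500):
    """One optimizer run with its own random initialization
    (here: plain random search, as a stand-in for any stochastic method)."""
    rng = random.Random(seed)
    return min(objective(rng.uniform(0.0, 10.0)) for _ in range(n_samples))

# 100 independent runs; report the average and best objective values.
results = [one_run(seed) for seed in range(100)]
avg_f  = statistics.mean(results)
best_f = min(results)
```

Reporting the standard deviation alongside the average is also common, so readers can judge the reliability of the method and not just its best-case behaviour.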