I'm looking for methods for optimizing stochastic functions. By stochastic functions I mean functions whose output is a random variable: every time F(x) is evaluated with the same value of x, the result is a random draw from some (probably unknown) distribution. In a sense it is like trying to optimize a deterministic function f(x), except that you cannot evaluate f directly; instead you observe some noisy function F(x) = f(x) + e, where e ~ Y is a random variable with unknown properties (I can relax this assumption if necessary). Examples of this kind of function are simulations, and real-world measurements subject to error.
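To make the setup concrete, here is a minimal sketch. The particular f and the Gaussian noise model are my own illustrative assumptions; in my actual problem both are unknown:

```python
import math
import random

def f(x):
    # Hypothetical deterministic, multi-modal objective we would like to optimize.
    # In my real problem this is hidden inside a simulation.
    return math.sin(3.0 * x) + 0.1 * x * x

def F(x, noise_sd=0.5):
    # The only thing we can actually evaluate: F(x) = f(x) + e.
    # Gaussian noise is just a stand-in; the true distribution of e is unknown.
    return f(x) + random.gauss(0.0, noise_sd)
```

Repeated calls to F(x) with the same x return different values, while their mean tracks f(x).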
So far I've found what are called stochastic approximation algorithms, which attempt to solve this problem but seem to work well only for functions f(x) with nice properties, such as global convexity. I'm dealing with stochastic functions that are multi-modal and lack global structure, so standard gradient-descent methods won't do, even if the noise were removed.
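For reference, a representative stochastic approximation method is SPSA (simultaneous perturbation stochastic approximation), which estimates a gradient from just two noisy evaluations per step. This is a rough sketch under my own assumptions (toy convex objective, illustrative gain sequences), and it shows exactly the limitation I mean: it behaves like noisy gradient descent, so on a multi-modal f it would only find a local optimum:

```python
import random

def spsa_minimize(F, x0, iters=200, a=0.5, c=0.1):
    """Minimize a noisy objective F using SPSA: a two-evaluation
    simultaneous-perturbation gradient estimate with decaying gains."""
    x = list(x0)
    n = len(x)
    for k in range(1, iters + 1):
        ak = a / (k + 10.0) ** 0.602  # decaying step size (illustrative constants)
        ck = c / k ** 0.101           # slowly decaying perturbation size
        delta = [random.choice([-1.0, 1.0]) for _ in range(n)]
        x_plus = [xi + ck * di for xi, di in zip(x, delta)]
        x_minus = [xi - ck * di for xi, di in zip(x, delta)]
        diff = F(x_plus) - F(x_minus)
        # Gradient estimate from only two noisy evaluations, regardless of n.
        x = [xi - ak * diff / (2.0 * ck * di) for xi, di in zip(x, delta)]
    return x

# Toy noisy quadratic with true optimum at (1, -2); convex, which is exactly
# the regime where stochastic approximation behaves well.
def F_noisy(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2 + random.gauss(0.0, 0.1)
```

On this convex toy problem the iterates drift toward the optimum despite never averaging out the noise; on a multi-modal landscape the same scheme just descends into whichever basin it starts in.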
One alternative that's been suggested to me is to simply run a metaheuristic, but perform 30 or more evaluations of F(x) at each step to approximate f(x) as well as possible, and then treat the noise as removed. Even though I think this could work, I also suspect there must be a better method that doesn't require running 30 or 40 times the number of function evaluations, especially because each evaluation is a simulation (hence the stochastic nature), which is already a lengthy process.
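The brute-force averaging idea amounts to something like the sketch below (the function name and the choice of n are mine, just for illustration). Reporting the standard error makes the cost explicit: the residual noise shrinks only as 1/sqrt(n), so halving it costs four times the evaluations:

```python
import math
import random

def mean_and_se(F, x, n=30):
    """Estimate f(x) as the sample mean of n noisy evaluations F(x),
    along with the standard error of that estimate."""
    samples = [F(x) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)  # unbiased variance
    return mean, math.sqrt(var / n)  # standard error ~ sigma / sqrt(n)
```

A metaheuristic would call this once per candidate, which is exactly the 30x blow-up in simulation runs I'd like to avoid.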
My guess is that by making some modifications to standard multi-modal optimization techniques (metaheuristics and the like), removing the effective sampling bias introduced by mechanisms such as elitism, we could expect the algorithm, even with noisy samples, to converge in the long run to a good-enough local optimum, without needing to evaluate each candidate 30 or more times. However, I have no theoretical or practical proof of this assumption.
Can anyone supply links or resources on this topic?