When more than one model is calibrated, the question arises whether the calibration scheme should be the same for all models. Consider the following example: four models are calibrated for one catchment. The first model is lumped and has 4 parameters, the second is lumped and has 6 parameters, the third is semi-distributed and has 14 parameters, and the fourth is semi-distributed and has 18 parameters. As a modeller you now have several options for calibrating those models. Let's just mention a few:
I have seen all of those schemes used in papers (though option 1 seems to be the most common), and I am sure there are plenty of others. All come with certain drawbacks. For example, option 1 treats the more complex models unfairly, as they have more parameter interactions and therefore a larger parameter space to explore, while options 2 and 3 suffer from a rather arbitrary choice of the number of runs per parameter or per model type.
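To make the trade-off concrete, here is a minimal Python sketch that compares how many runs each of the four example models would get under three allocation rules. Only the parameter counts come from the example; the three rules (equal budget, budget per parameter, budget per model type), the function names, and all run counts are illustrative assumptions and are not necessarily the exact options meant above.

```python
# Parameter counts come from the example; model names are hypothetical labels.
MODELS = {
    "lumped_4": {"n_params": 4, "kind": "lumped"},
    "lumped_6": {"n_params": 6, "kind": "lumped"},
    "semi_distributed_14": {"n_params": 14, "kind": "semi_distributed"},
    "semi_distributed_18": {"n_params": 18, "kind": "semi_distributed"},
}


def equal_budget(models, runs_per_model=10_000):
    """Assumed rule A: every model gets the same number of runs."""
    return {name: runs_per_model for name in models}


def per_parameter_budget(models, runs_per_param=1_000):
    """Assumed rule B: runs scale linearly with the number of parameters."""
    return {name: m["n_params"] * runs_per_param for name, m in models.items()}


def per_kind_budget(models, runs_by_kind):
    """Assumed rule C: a fixed number of runs per model type."""
    return {name: runs_by_kind[m["kind"]] for name, m in models.items()}


def runs_per_parameter(models, budget):
    """Runs available per parameter, a rough measure of search density."""
    return {name: budget[name] / models[name]["n_params"] for name in models}


# Arbitrary illustrative values, which is exactly the weakness of rules B and C.
RUNS_BY_KIND = {"lumped": 5_000, "semi_distributed": 20_000}

schemes = {
    "equal": equal_budget(MODELS),
    "per_parameter": per_parameter_budget(MODELS),
    "per_kind": per_kind_budget(MODELS, RUNS_BY_KIND),
}

for scheme_name, budget in schemes.items():
    density = runs_per_parameter(MODELS, budget)
    for model_name in MODELS:
        print(f"{scheme_name:>13} | {model_name:<20} | "
              f"runs={budget[model_name]:>6} | "
              f"runs/param={density[model_name]:.1f}")
```

Under the equal budget, the runs per parameter drop as the model gets more complex, which is the unfairness described above. The other two rules remove or reduce that effect, but only by introducing constants (`runs_per_param`, `RUNS_BY_KIND`) that have to be chosen somehow. Note that runs per parameter is itself a crude measure, since the volume of the parameter space grows much faster than linearly with the number of parameters.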
So how do you decide how many calibration runs a model should get relative to its number of parameters? Is there a paper that discusses this topic and proposes a framework?