The following link presents a general framework for understanding the uncertainty associated with the assumptions made in representing reality by numerical simulation. The IPCC climate change attribution research is illustrated as an example.
The measured output of the real process is the real output, including the effect of uncertainties. The modelled part of the system gives an output obtained computationally from the stated nominal model, which differs from the measured one. The error between the two is the uncertainty contribution. So, if we have a nominal model and a real system including uncertainties, we can compute one output from the first while we measure another (real) output from the real system.
Another situation is to have a nominal model (for instance, a nominal transfer function G0 in linear control, or a linear system in general) and a worst-case real model with transfer function G: the nominal transfer function plus a perturbation, which can be additive, multiplicative or mixed. Typical cases are G = G0 + DeltaG (additive, absolute perturbation model) and G = G0(1 + DeltaG) (relative, multiplicative perturbation model). The "Deltas" can be characterized for a concrete case (the parametric perturbation is known), or, in a worst-case context, they are unknown but we at least know a "class" or a "family" to which the current model perturbation belongs.
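As a minimal illustration of these two perturbation models (the nominal plant G0(s) = 1/(s+1) and the bounds on the "Deltas" below are my own assumed placeholders, not values from the discussion), one can compute the worst-case gain envelopes implied by a magnitude bound on DeltaG:

```python
# Minimal sketch: worst-case gain envelopes for the additive model G = G0 + DeltaG
# and the relative model G = G0*(1 + DeltaG), given only a bound on |DeltaG|.
# Nominal plant and bounds are assumed for illustration.
import numpy as np

w = np.logspace(-2, 2, 200)            # frequency grid [rad/s]
s = 1j * w
G0 = 1.0 / (s + 1.0)                   # assumed nominal transfer function G0(s) = 1/(s+1)

Wa = 0.05                              # assumed bound on the additive perturbation:  |DeltaG| <= Wa
Wm = 0.20                              # assumed bound on the relative perturbation: |DeltaG| <= Wm

mag0 = np.abs(G0)
add_upper, add_lower = mag0 + Wa, np.maximum(mag0 - Wa, 0.0)   # envelope for G0 + DeltaG
mul_upper, mul_lower = mag0 * (1 + Wm), mag0 * (1 - Wm)        # envelope for G0*(1 + DeltaG)

i = w.searchsorted(1.0)                # index closest to w = 1 rad/s
print("worst-case additive gain spread at w=1: [%.3f, %.3f]" % (add_lower[i], add_upper[i]))
print("worst-case relative gain spread at w=1: [%.3f, %.3f]" % (mul_lower[i], mul_upper[i]))
```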
The framework is typical of model validation approaches (which should be referenced), where numerical model validation attempts to answer the question "how accurately does the numerical model represent reality (with all of its complexity)?" Given the many uncertainties inherent in the model and in reality, the demonstrated accuracy may be exceedingly low. Climate Change is an excellent example: the temperature anomaly charts seem unscientific and deceptive because they hide the data variability and uncertainty. Scientific conclusions cannot be drawn from unscientific data.
In my view, uncertainty within the model itself can be extracted from the sensitivity of the model calculations to the model parameters, which may have a range of 'realistic' values. There is also uncertainty related to the reliability of the model, which can be evaluated by comparing the model outcomes to those of other 'reliable' models.
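A minimal sketch of that sensitivity idea, assuming a hypothetical toy model and hypothetical 'realistic' parameter ranges, might look like this:

```python
# One-at-a-time (local) sensitivity analysis sketch; the model f(params) and the
# 'realistic' parameter ranges are hypothetical placeholders.
import numpy as np

def model(params):
    """Toy model output; stands in for a real simulation."""
    a, b = params
    return a * np.exp(-b)

nominal = np.array([2.0, 0.5])                 # assumed nominal parameter values
ranges  = np.array([[1.5, 2.5], [0.3, 0.8]])   # assumed 'realistic' ranges per parameter

y0 = model(nominal)
for i, (lo, hi) in enumerate(ranges):
    # perturb one parameter over its realistic range, others held at nominal
    lo_p, hi_p = nominal.copy(), nominal.copy()
    lo_p[i], hi_p[i] = lo, hi
    spread = abs(model(hi_p) - model(lo_p))
    print(f"parameter {i}: output changes by {spread:.3f} over its realistic range "
          f"({100 * spread / abs(y0):.1f}% of nominal output {y0:.3f})")
```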
As your schematic of the stages that connect reality and numerical solutions indicates, there are many degrees of freedom along this path. The most important of these, in particular those whose effects are not understood, should be varied alongside one's favourite implementation. What is learned from the observed variations of the solution triggers new variations of the model's implementation. In this way an evolution of the models and their results is set in motion, which should finally result in some stable understanding of the part of reality under consideration.
Since running non-trivial models of non-trivial parts of reality may require heavy computational resources, there may be strict limits to the number and nature of model variations that can be tested. Therefore it may turn out to be necessary to devise submodels which need fewer computational resources for studying the influence of method variations.
I once had the opportunity to run through an industrial modelling effort (for the toning process in electrophotographic copiers and printers employing magnetic brush technology) for some years. We started on PCs and ended up at the parallel computing facility of Cornell University. Nobody involved in this effort would have come up, by mere thinking or by physical experimentation, with the insights that we were able to distill from computed 2D and 3D movies of particle flows.
Thank you all for your feedback. I have updated the post with a recent reference to model validation. However, if any of you are aware of the most appropriate foundation reference in this area please let me know.
Hi Vassili. This is a big question. Uncertainty in the model parameters can be dealt with (given a good prior) by sensitivity analysis. Uncertainty in the observation/state, particularly for chaotic systems, is usually accounted for with multiple simulations. In a general sense these model errors can be either observational or dynamic, and model uncertainty can mean either mis-specification of the model or simply parameter mismatch. The effect of each is different. As you can imagine there is a big literature on the subject, but for my own bias have a look at: Physica D (2001) 151: 125-141; Physica D (2004) 196: 224-242; Quarterly Journal of the Royal Meteorological Society (2007) 133: 1309-1325 - and the references therein.
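As a rough sketch of the "multiple simulations" approach for a chaotic system (the Lorenz-63 equations, ensemble size, perturbation size and step size below are all illustrative choices, not anything prescribed in the references):

```python
# Ensemble of slightly perturbed initial states propagated through the Lorenz-63
# equations; the spread of the ensemble summarizes the state uncertainty.
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return state + dt * np.array([dx, dy, dz])   # forward Euler, for brevity

rng = np.random.default_rng(0)
ensemble = np.array([1.0, 1.0, 1.0]) + 1e-3 * rng.standard_normal((50, 3))  # 50 perturbed members

for _ in range(2000):                             # integrate each member for 20 time units
    ensemble = np.array([lorenz_step(s) for s in ensemble])

print("ensemble spread (std per component):", ensemble.std(axis=0))
```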
It is fairly easy to take the model, even a very complicated one, and see what the uncertainties of simulations based on it are, when the uncertainties of the model parameters are known in advance. Interval computation is the tool to solve this task in a reliable manner (google it, if you haven't heard about interval calculations).
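For illustration only, here is a toy version of interval propagation; in practice one would use a dedicated, rounding-controlled interval-arithmetic library rather than this minimal class:

```python
# Parameter uncertainties carried through a computation as guaranteed enclosures.
# The tiny class below only supports + and * and ignores outward rounding.
class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))
    def __repr__(self):
        return f"[{self.lo:.4f}, {self.hi:.4f}]"

# assumed parameter uncertainties for a toy model y = a*x + b
a = Interval(1.9, 2.1)
b = Interval(-0.1, 0.1)
x = Interval(3.0, 3.0)          # exactly known input
print("enclosure of y = a*x + b:", a * x + b)    # -> [5.6000, 6.4000]
```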
The reverse problem, i.e. finding the model parameters together with their uncertainties in such a way as to replicate true observations, is more complicated. I'm not sure which one you had in mind when asking your question, so I won't say anything more at the moment - except one thing: we solve this (reverse) problem under the assumption that the model itself is correct. But is it? You can never be sure, never 100% convinced.
It is difficult to identify a foundation reference because the modeling communities have established many inconsistent and misguided practices. If you are focusing on the simulation of reality and its uncertainties, your objective is "validation." The different communities routinely group "verification" with "validation" and proceed to mix the two as if they were interchangeable. They are not. Another common practice is to apply software verification and validation to a simulation implemented in software; software validation is part of simulation verification only. If something does not contribute directly to an assessment of the real-world accuracy and uncertainty of your simulation, it is not scientific validation.
It's necessary to understand the systematic effects that the model introduces in the description of the process one wants to model. Once more, the expression "complicated" doesn't mean anything useful without further qualification. Every simulation method has systematic effects; for instance, when solving differential equations a major systematic effect is how well the symmetries of the original equations are taken into account by the algorithm, and how the finite step size and the finite simulation time affect the precision with which they are realized. So the algorithm has to correct for that in some way.
Such effects are distinct from *statistical* uncertainties that involve trying to extrapolate to infinite time, or infinite number of samples.
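A small sketch of such a systematic effect, assuming the harmonic oscillator as the toy problem: the explicit Euler scheme ignores the conservation structure of the equations and its energy drifts, while a symplectic variant keeps the energy bounded at the same step size:

```python
# Systematic effect of the integration scheme on a conserved quantity.
# Step size and integration length are arbitrary illustrative choices.
import numpy as np

def energy(q, p):
    return 0.5 * (p**2 + q**2)

dt, n = 0.05, 2000
q_e, p_e = 1.0, 0.0      # explicit Euler state
q_s, p_s = 1.0, 0.0      # symplectic Euler state

for _ in range(n):
    # explicit Euler: both updates use the old state -> energy grows systematically
    q_e, p_e = q_e + dt * p_e, p_e - dt * q_e
    # symplectic Euler: momentum updated first, position uses the new momentum
    p_s = p_s - dt * q_s
    q_s = q_s + dt * p_s

print("initial energy         :", energy(1.0, 0.0))
print("explicit Euler energy  :", energy(q_e, p_e))     # drifts far above 0.5
print("symplectic Euler energy:", energy(q_s, p_s))     # stays close to 0.5
```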
In case you wish to provide an, in some sense, "optimal" design of a structure subject to uncertainties in, say, the material properties, then there are several frameworks for modelling and solution. One that I was involved in establishing is SMPEC - Stochastic Mathematical Program with Equilibrium Constraints - in which the sources of uncertainty lead to a stochastic type of hierarchical ("bilevel") model. Check these links to see if this may be of relevance to your research topic:
(In)validation is a better terminology, based on epistemology (see K. Popper), because in science you can ONLY invalidate a model (or theory) given a data set; validation is impossible because you do not know the future data (see, for example, what happened to Newton's theory with the Michelson-Morley experiment, which led to special relativity). A good reference for (in)validation, based on Roy Smith's dissertation (Caltech), is: Smith and Doyle, "Model validation: a connection between robust control and identification", IEEE Trans. Automatic Control, vol. 37, no. 7, 1992.
Terminology is difficult to apply, because "validation" implies an endorsement or stamp-of-approval (i.e., "validated") to the general public. In computational sciences, however, model validation is carefully defined as a process to determine the degree of accuracy of the model. In other words, validation in computational sciences should result in neither a valid/validated nor invalid/invalidated model. The demonstrated degree of accuracy may be sufficient for one application, yet far from sufficient for another.
An interesting paper in Journal of Physics G: Nucl. Part. Phys., vol. 41 (2014) p. 074001, which is attached, addresses this question in the nuclear physics context. Hope this is useful.
Vassili, as Marek Gutowsky also answered, in my opinion we should distinguish between the scope and the meaning of "model validation" and "model sensitivity/uncertainty".
In general, in inverse problems (Tarantola, 2005), we assume the existence of a "true and validated" theoretical parametric relationship (with a fixed number of selected parameters) between the model parameter space and the observable data space. In this case our scope is parameter estimation, and in this framework we should distinguish between the model appraisal (the measure of the parameters and their errors) and the model precision (the measure of the prediction capacity of the model, i.e. the residuals).
The appraisal and the precision of a reconstructed model are influenced, respectively, by the imperfections with which we are able to reconstruct the model parameters and by the measurement errors with which we are able to perform the measurements. Obviously there is a trade-off, and the non-uniqueness of the solution is an intrinsic feature.
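A minimal sketch that keeps these two measures separate for a toy linear inverse problem (the forward operator, noise level and "true" parameters below are invented for illustration): the parameter covariance speaks to the appraisal, the residuals to the precision:

```python
# Linear inverse problem d = G m + noise: parameter covariance (appraisal,
# how well the parameters are resolved) versus residuals (precision,
# how well the data are predicted).
import numpy as np

rng = np.random.default_rng(1)
G = rng.standard_normal((30, 2))              # assumed forward operator (30 data, 2 parameters)
m_true = np.array([1.0, -0.5])                # assumed "true" parameters
sigma = 0.1                                   # assumed measurement error
d = G @ m_true + sigma * rng.standard_normal(30)

# least-squares estimate and its covariance (appraisal)
m_est, *_ = np.linalg.lstsq(G, d, rcond=None)
cov_m = sigma**2 * np.linalg.inv(G.T @ G)

# residuals (precision / prediction capacity)
residuals = d - G @ m_est

print("estimated parameters :", m_est)
print("parameter std errors :", np.sqrt(np.diag(cov_m)))
print("rms residual         :", np.sqrt(np.mean(residuals**2)))
```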
If our scope is the validation or (in)validation of the theoretical relationship (Ricardo S. Sánchez-Peña's answer), fundamentally we should have and use a best-practice protocol based on a simple and transparent operative logic (references reported in Alexis Diaz-Torres's and Hans U. Mair's answers).
Finally, if our scope - and I think this is probably your case - is to estimate the prediction power, or the sensitivity and uncertainty, of a simulation, I agree with Michael Small's answer. In this case Occam's razor (model simplification) should be applied, in order to reduce the number of parameters and initial conditions to those which are sensitive with respect to the process that you want to simulate. Basic references are probably the works of Henri Poincaré, Edward Lorenz and David Ruelle.
In my opinion your question is very important and constitutes the central point when we want to pass from "parameter inversion" to what we can define as "process inversion", with an approach based on a multi-parametric inverse problem in an evolutionary sense. A difficult task if the process is characterized by a high degree of nonlinearity and sensitivity to initial conditions (complexity).
Be very careful when assuming "the existence of a "true and validated" theoretical parametric relationship (with a fixed number of selected parameters) between the model parameter space and observable data space". This assumption can only be a step in the prediction process, because a more fundamental realization is that all models are wrong (i.e., approximations and uncertainties). The scientific objective in making predictions is to understand "how wrong?" Only after understanding "how wrong" can one estimate the accuracy of the prediction; only pseudo-scientific estimates of accuracy could be derived from ignoring the approximations and uncertainties, assuming that the model is accurate enough.
Hans, I agree with you; in fact, I have distinguished the inverse problem issue from the theoretical relationship validation issue. In the inverse problem this assumption is based on a coherence principle, grounded in the measurement procedure and the degree of approximation of the theory (parametrization). In your comment you should distinguish between "how wrong is the model" and "how wrong is the assumed theoretical parametric relationship".
A simple example: in seismic tomography it is common practice to invert the first arrivals either with a theory of continuously refracted waves (diving waves -> velocity gradient) or with purely refracted waves (head waves -> velocity discontinuity), and to each assumption corresponds a fixed (sometimes approximate) parametrization. It is clear that each assumed theory yields a model characterized by the corresponding parameters and errors, perhaps with the same precision (i.e. residuals). The only way is to select the assumed theoretical relationship on the basis of the wave propagation process observed in the data (coherence principle), if the measurement errors allow us to do it. The ideal condition is an inversion procedure which switches from one assumption to the other, changing the corresponding parametrization of the model, and which respects the coherence principle.
You can use random sampling of all the uncertain model parameters and study the resulting distribution of your output. This approach requires knowledge of what realistic distributions for your model parameters would be. It also requires that you have identified all model parameters that significantly affect your output. This is, however, not the same thing as model validation, as has been pointed out earlier.
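A minimal sketch of this random-sampling approach, with an invented toy model and assumed parameter distributions standing in for a real simulation:

```python
# Monte Carlo propagation: draw every uncertain parameter from an assumed
# 'realistic' distribution, run the model for each draw, and summarize the
# distribution of the output.
import numpy as np

rng = np.random.default_rng(42)
n_samples = 10_000

# assumed realistic distributions for two uncertain parameters
a = rng.normal(loc=2.0, scale=0.1, size=n_samples)       # e.g. a rate constant
b = rng.uniform(low=0.3, high=0.8, size=n_samples)       # e.g. a geometry factor

output = a * np.exp(-b)                                   # stand-in for the simulation output

print("output mean               :", output.mean())
print("output std                :", output.std())
print("95% interval (2.5%-97.5%) :", np.percentile(output, [2.5, 97.5]))
```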