We want to estimate the parameters of a given model from data. We have the choice of using the frequentist approach and minimizing an estimation criterion built from the model, or of maximizing a probability according to the Bayesian approach. Any comments?
It seems to me that frequentists want to start an analysis from scratch. When estimating parameters, they refuse to use any a priori knowledge of the situation. In throwing away all a priori information, they do worse when the prior information was useful, and they do better when the prior information was systematically biased. Supporters of clinical researchers are deeply afraid of rewarding bias, and accordingly clinical researchers will probably remain frequentists. (There are lots of other issues involved, but I am fond of trying to cut to the chase.)
As for the difference between "randomness" and "uncertainty", the difficulty is that the definitions are not universally agreed upon by leading scholars, let alone the general population. Some use the terms interchangeably. Others consider the terms to have very different meaning. In order to be understood, scholars need to define these terms at the beginning of a conversation. Our disagreements in definitions lead us to become annoyed when others use the words differently than we do. I predict that we will not resolve this by revising the terms so that there is more agreement. Rather, we will continue to use the terms even though we have not agreed upon the meanings. I give it 80% odds, but I am uncertain.
If you have no special prior knowledge, both approaches are similar. If you do have prior knowledge that you want to bring into the analysis, then you must use the Bayesian approach.
There are some differences in the interpretation of the interval obtained for the parameter's value, due to differences in the hypotheses made to construct the estimator.
> Jochen: it seems there are ways to introduce prior, external information into a frequentist approach, but I cannot say more for the moment since I am just discovering this field. If that is indeed the case, the main difference between the two approaches would be in the interpretation of the results...
Frequentist: the estimated parameter is a constant; confidence intervals are built using a method that ensures they contain this constant in 95% [conventional] of the cases.
Bayesian: the estimated parameter is random, having a probability distribution that is estimated from the data using Bayes' theorem; a credible interval is an interval built from this distribution. I am not quite clear on the exact meaning of the 95% associated with it (I never had the opportunity to think about Bayesian approaches in detail, despite being interested in them ;): 95% of this distribution?
There are different ways to construct 95% credible intervals. They all refer to the posterior distribution. In principle, one can draw a horizontal line through the posterior so that the area under the posterior between the intersection points is 95% (the highest posterior density interval); or one can "cut off" 2.5% of the posterior area at both tails (similar to the construction of confidence intervals, which is based on the sampling distribution instead). They give the same results for symmetric posteriors.
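For illustration, a minimal sketch in Python of these two constructions, with made-up numbers (a flat Beta(1,1) prior and 7 successes in 20 trials, giving a Beta(8,14) posterior; nothing from this discussion):

import numpy as np
from scipy import stats

posterior = stats.beta(1 + 7, 1 + 13)          # Beta(8, 14) posterior

# Equal-tailed interval: cut off 2.5% of the posterior area in each tail.
equal_tailed = posterior.ppf([0.025, 0.975])

# Highest-density interval: the shortest interval containing 95% of the area,
# found here by scanning candidate intervals of equal probability content.
def hpd_interval(dist, mass=0.95, grid=10000):
    lower_tail = np.linspace(0, 1 - mass, grid)
    lo = dist.ppf(lower_tail)
    hi = dist.ppf(lower_tail + mass)
    k = np.argmin(hi - lo)                     # shortest such interval
    return lo[k], hi[k]

print("equal-tailed:", np.round(equal_tailed, 3))
print("HPD:         ", np.round(hpd_interval(posterior), 3))
# For a skewed posterior like this one the two intervals differ slightly;
# for a symmetric (unimodal) posterior they coincide.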
For a "flat prior", the likelihood (which is a representation of the sampling distribution) and the posterior are the same thing (the only difference is a scaling factor making the area under the posterior density equal to unity). Hence, the flat-prior credible interval and the confidence interval are numerically the same.
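A minimal sketch of this numerical agreement, assuming for illustration a normal mean with known sigma and made-up data (nothing from this discussion):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sigma, n = 2.0, 25
x = rng.normal(10.0, sigma, size=n)           # hypothetical data
xbar, se = x.mean(), sigma / np.sqrt(n)

# Frequentist 95% CI from the sampling distribution of the mean:
ci = stats.norm(xbar, se).ppf([0.025, 0.975])

# Flat-prior posterior on a grid: proportional to the likelihood, rescaled to area 1.
mu = np.linspace(xbar - 6 * se, xbar + 6 * se, 20001)
post = stats.norm(mu, se).pdf(xbar)           # likelihood as a function of mu
post /= post.sum() * (mu[1] - mu[0])          # normalise to unit area
cdf = np.cumsum(post) * (mu[1] - mu[0])
cred = np.interp([0.025, 0.975], cdf, mu)     # central 95% credible interval

print(np.round(ci, 3), np.round(cred, 3))     # essentially identical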
I personally have a problem with the interpretation in the "frequentist sense". The parameter may well be a constant, but this is not at all relevant to the statistical problem we have. What is relevant is: we DON'T KNOW the (actual or constant) value of this parameter. In other words, we have uncertainty about its value. And "uncertainty" (or incomplete knowledge) is expressed in terms of probability distributions. So it is natural and essentially required to assign a probability distribution to the unknown parameter (no matter whether this parameter is a constant or a variable or whatsoever). We don't know its value -> we describe our "lack of knowledge" by a probability distribution.

Since the frequentists do *not* do this, the result can only be interpreted in a statistical way, that is, it does *not* tell us anything about *this* particular result/experiment/finding. This is exactly *not* what a researcher usually wants. It is OK for screenings, as an automated sorting-out of results that won't be further considered. It is relevant wherever "false-positive rates" (and "false-negative rates") are themselves of scientific or - more often! - of economical interest. In (basic) research, however, we think long before performing a well-designed experiment, based on a lot of experience and expertise, models and hypotheses. Then we have only this data from this experiment, and we want to learn something about our models and hypotheses (how good are they? do we have to refine them?).

In the overwhelming majority of cases the main question in research is "how BIG is an effect (change, difference, ...)?" Instead, being trapped in the "school of frequentist null-hypothesis tests", most researchers reduce it to "IS there an effect?". Firstly, this question is rarely sensible in real-life experiments (there will always be negligible/irrelevant effects). Secondly, the answer from the hypothesis test is *NOT* the answer to even this simpler question.
Hence, the "frequentist interpretation" is:
"The result is statistically significant (we reject the null hypothesis at the 5% level of significance; the confidence interval for the effect did not include the null value). We still do not know whether or not we have an effect here (a "true effect"). We can in any case never know whether we have a true effect. Based on our experimental results we now just decide to state that there is an effect. We cannot say how confident we can be that we are right with this statement in this particular case. However, since we stick to this testing procedure all the time, in the worst case there will be no more than 5% wrong decisions."
The "Bayesian interpretation" is:
"We do not know the value of the parameter. Based on the data from this experiment (and on prior knowledge, findings, data, experiments,...) we almost certainly expect this value to be in the range given by the credible interval. If you would repeat this experiment, we expect you to get similar results; your best guess for the parameter value will likely be in this credible interval, and our best guess will likely be in your credible interval (when you use the same prior)."
As a researcher, I can barely see any advantage/usability from the "frequentist interpretation", whereas the "Bayesian interpretation" is what I intuitively would expect from the analysis of an experiment.
But, following the Bayesian "tradition", I am willing to change/adapt my perspective based on new information ;-)
Hi Jochen,
I think I disagree with several points in your answer, so would be very happy to start discussion...
First of all, I don't think that the « frequentist approach » is responsible for the abusive use of p-values and tests instead of estimations. You can make tests with the Bayesian approach also... and frequentist approaches can give confidence intervals that will answer the problem of estimation. So the discussion about the abusive use of tests is, in my opinion, another topic not related to this one (however, I completely agree with you on this special point, regardless of the kind of approach used). I would also add that likelihood ratio tests are based exactly on this idea of learning from the experiment: are the experimental results more likely with this model or with this other model? (Remember previous discussions with Jeffrey Welge...)
Second, I think you mix "randomness" and "uncertainty". I mean, the confidence interval is exactly that: a tool to get an idea of the uncertainty on the estimated parameter, without the need to assume it is random... We do not know what the true value is, but we know it should be in this interval.
Reading your construction of the credible interval, it means that you have a 95% chance that the realisation of the random parameter will be in this interval (or was in that interval). But if you make another experiment to measure the same parameter, this suggests that the parameter value has changed, since it is a realisation of a random variable... Or do you think there was, at the beginning of the world, one realisation of this parameter and since then not a single other one? But why call it random then? And if not, is it realistic to think that the parameter will change its value from one experiment to another?
Assuming that the estimated parameter is itself random is, in my mind, often counter-intuitive and not realistic. Let's take two simple examples...
First, you try to measure the speed of light in a vacuum, or the height of a given tree in a forest at a given time. This quantity is a given one: what would be the meaning of saying that it is random itself? To model the imprecision of the measurement, you will make the assumption that the _measurement_ is a random variable, with (let's say) the true value as its mean, but not that the _true_ value is random. Why add an additional, counter-intuitive hypothesis to the model? So in metrology, the idea of the parameter itself being random seems very strange to me. The _parameter_ is what it is; it is not the result of a random experiment!
Second, you try to measure an average blood pressure in a population. Then the blood pressure of a given patient is a random variable whose mean is this average blood pressure. This random variable contains the variability between subjects and all the other things you want. But what about this purely conceptual average blood pressure of the population? How can you justify that it is itself random? You can assume it is, to build a Bayesian model, or assume it is constant; these are just hypotheses. In both cases, you will get insights into the uncertainty on its value from the intervals you build...
So, I do not understand why you say that assuming it is constant is not statistically relevant...
However, I agree that the Bayesian approach can be a fruitful model to give answers to some questions, by leading to easier-to-compute methods... But I would not say that the "frequentist" method is useless... and that it does not answer the question of uncertainty on the parameters!
>> I think I disagree with several points in your answer, so would be very happy to start discussion...
I would be glad to get a constructive discussion. This way I always learn a lot.
>> First of all, I don't think that the « frequentist approach » is responsible for the abusive use of p-values and tests instead of estimations. You can make tests with the Bayesian approach also... and frequentist approaches can give confidence intervals that will answer the problem of estimation.
As I understood, a (95%) confidence interval is "an interval around a point estimate, constructed so that 95% of such intervals constructed the same way will contain the true value in the long run". It explicitly does not claim that this particular true value has a 95% chance of being in this interval. The frequentist approach does not allow getting a measure of (un)certainty of a particular result. The interval either contains the true value or it doesn't. There is no probability value assigned to this statement. I think this is often mistaken. To interpret it as a probability is a Bayesian point of view.
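A minimal simulation sketch of exactly this long-run reading (made-up "true" mean and data, just to show the mechanics):

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_mu, sigma, n, n_rep = 5.0, 1.0, 30, 10000

z = stats.norm.ppf(0.975)
hits = 0
for _ in range(n_rep):
    x = rng.normal(true_mu, sigma, size=n)
    half = z * sigma / np.sqrt(n)
    hits += (x.mean() - half <= true_mu <= x.mean() + half)

print(hits / n_rep)   # close to 0.95: the coverage is a property of the procedure,
                      # while each single interval either contains true_mu or not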
>> Second, I think you mix "randomness" and "uncertainty". I mean, the confidence interval is exactly that: a tool to get an idea of the uncertainty on the estimated parameter, without the need to assume it is random... We do not know what the true value is, but we know it should be in this interval.
This is just not what a confidence interval actually says. But it is what many people think it should be. This, in turn, is a Bayesian view, at least to my understanding.
On mixing "uncertainty" and "randomness": this is a crucial point. I tried hard to make a distinction between these two words, but I can't see it. In statistics, a process (or variable) is said to be "random" when the exact outcome (value) is not known a priori (before actually doing the experiment/observation/measurement). Philosophically (!), randomness is seen as the opposite of determinism – but this is not the topic of statistics. Now, of those things we call "random" (in statistics) we express that we lack the knowledge to give an exact prediction. This "lack of knowledge" is called "uncertainty". Hence, "randomness" refers more to the variable of which we do not have complete knowledge (of dependencies, interactions, relations, ...), whereas "uncertainty" refers more to our (incomplete) state of knowledge about a variable. In essence, both words are about just the same thing. Maybe I am wrong here. I have to admit that I never understood the distinction made by some people. It would be great if you found a way of explaining it that I will understand. So far, all further arguments from my side will be based on this equality of randomness and uncertainty.
I know that frequentists say that a parameter is not random. It is just a fixed thing, only that we do not know its (exact) value. Here my personal problem arises since I do not understand this subtle thing that something is not random but unknown (i.e. we are uncertain about its value).
>> Reading your construction of the credible interval, it means that you have a 95% chance that the realisation of the random parameter will be in this interval (or was in that interval). But if you make another experiment to measure the same parameter, this suggests that the parameter value has changed, since it is a realisation of a random variable... Or do you think there was, at the beginning of the world, one realisation of this parameter and since then not a single other one?
Here I see a conceptual mistake: The estimate is *not* a realization of the random parameter. It is a guess about the value of the parameter, based on some data. The parameter does not "realize". There does not even need to be any physical basis of the parameter. For instance the mean (average) is nothing that physically exists and tries to emerge in "realizations". The sample mean is not an offspring of a "population mean" existing in the subspace and as such popping up in reality. The only things really existing are the observed data. There is no change in the parameter value just because we use different data to get "best guesses" of a reasonable value of the parameter.
>> But why call it random then? And if not, is it realistic to think that the parameter will change its value from one experiment to another?
We call it random because we do not know its value. Changing: yes, why not? But! If we talk about a fixed, defined population that is sampled, the parameter value obtained for the data from the whole population is fixed, for sure. Here, we do have something like a "population parameter" that is estimated by a "sample statistic". This is just a special case. Estimates can still refer to parameters that may change (over time, for instance). Clearly, the frequentistic claim that the long-run average will approximate the (population) parameter value does not hold anymore (because then there is no such thing as a population and a population parameter).
>> So, I do not understand why you say that assuming it is constant is not statistically relevant...
I hope this is clearer now (this may still be wrong, but then you can explain more precisely where and how…).
>> However, I agree that the Bayesian approach can be a fruitful model to give answers to some questions, by leading to easier-to-compute methods... But I would not say that the "frequentist" method is useless... and that it does not answer the question of uncertainty on the parameters!
Exactly your last statement is a Bayesian statement, not a frequentist one.
Looking forward to revising my perspective :-)
Hi Jochen,
I'll try to give answers (representing my own conception) on your points.
First, you give in your (correct) description of the confidence interval a perfect illustration of the difference between "uncertain" and "random": as you said, the sentence « The true value is in the (observed) confidence interval » is either true or not, but one does not know it --- this is uncertainty. « There is no probability associated with it » --- this is not random.
So, for me, uncertainty and randomness are two different things, the second one being related to probability theory and being a useful tool to evaluate the first one in some cases, not in all.
« Uncertainty » means you do not know, or do not know with the precision you require, what the « truth » is --- for instance, what is the real value of the speed of light in the vacuum, reusing this example.
« Randomness » means that the result of your observation/experiment (in a broad sense) is not completely controlled, hence cannot be predicted without uncertainty. It can come from « real » randomness (atomic radioactive disintegration) or from « convenience » randomness, just because you do not want to (or practically cannot) consider all the conditions that will lead to the result even though it is deterministic (biological variability for instance).
Probability theory can be used to try to model the random part of the prediction of the result of an experiment, to « quantify » the uncertainty. But it cannot handle all kinds of uncertainties, or at least is not always required.
For instance, imagine you say « tonight, I'll go either to the theatre or to see a movie ». There is uncertainty, but there is no randomness in the result itself. You can however use a probabilistic approach to try to quantify the odds of each of these choices... But unless you really choose by throwing a die, there is no randomness in the process itself, even though there is uncertainty at the moment of speaking...
Assuming these examples are convincing, I continue making the distinction between uncertainty and randomness.
*** On the confidence interval interpretation ***
I think your interpretation of the confidence interval is, in a way, too frequentist. For me, the approach is (still using the speed of light example)
c is the real speed of light in the vacuum. I will make an experiment to measure it; however, due to various reasons I do not want to explore yet, the result of the experiment will not be c, but something "close" to it: the result is somehow uncertain. To model this, I will say that the observed result is the consequence of a lot of random events, hence the observation of a random variable X, whose distribution function depends on the real value c. Assuming the expression of this law, I can derive a random interval [A, B] that satisfies p( c \in [A, B] ) = 0.95: the confidence interval. Then, I do the experiment: I get the realisation of X, then of A and B, and thus a realisation of the confidence interval, [a, b]: the « observed » confidence interval. Note that there is no notion of « if I were to repeat the experiment » or things like that: I think this last part is just a convenience to illustrate the probability, using the old, historical definition of probabilities (this is why I said « too frequentist » before, and I don't think the « frequentist » term is really well adapted now, but that's a personal idea).
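To make this construction concrete, assume for illustration (this model is not specified anywhere above) that the n measurements are X_i ~ N(c, \sigma^2) with \sigma known. The pivot (\bar{X} - c) / (\sigma / \sqrt{n}) ~ N(0, 1) does not involve c, so

p( \bar{X} - 1.96 \sigma/\sqrt{n} \le c \le \bar{X} + 1.96 \sigma/\sqrt{n} ) = 0.95,

i.e. A = \bar{X} - 1.96 \sigma/\sqrt{n} and B = \bar{X} + 1.96 \sigma/\sqrt{n} are the random endpoints. Once the data are observed, [a, b] is a fixed interval that either contains c or does not.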
I do not know whether c is in it or not; however, because of the way I built it, I am confident it should be in it. There is no probability associated with that last sentence, as you noticed; however, I still believe it (or not, but in that case you just don't do the experiment and the model this way...): I have an idea of the uncertainty on c. Even though no randomness is introduced on c. Nor any Bayesian notion.
*** On the Bayesian version of this ***
(at least as I understand the Bayesian framework)
In a Bayesian framework, you will say that the speed of light in the vacuum itself is modeled by a random variable C (or is one, but that seems strange for such a physical parameter). But this means that each new « need » of the speed of light is a new realisation of this random variable, hence it will change (unless it has a null variance).
Now, assuming this, and also a model for the observation, you will say that p(X = x | C = c) has a given law, and, assuming a prior law for C, you will use Bayes' theorem to get p(C = c | X = x), the conditional law of C knowing your data - right?
Then, you build an interval I such that p(C in I | X = x) = 0.95: the credible interval, still right? But what you would really like is p(C in I) alone to completely answer your question, no?
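For concreteness, a small Python sketch of this machinery under purely illustrative assumptions (a normal measurement model and a normal prior, neither of which comes from this discussion):

import numpy as np
from scipy import stats

# Hypothetical numbers: one measurement x with known measurement spread sigma,
# and a normal prior on the parameter C.
x, sigma = 3.2, 0.5
prior_mean, prior_sd = 3.0, 1.0

# Conjugate normal-normal update: the posterior p(C | X = x) is again normal.
post_var = 1.0 / (1.0 / prior_sd**2 + 1.0 / sigma**2)
post_mean = post_var * (prior_mean / prior_sd**2 + x / sigma**2)
posterior = stats.norm(post_mean, np.sqrt(post_var))

# 95% credible interval I with p(C in I | X = x) = 0.95:
print(np.round(posterior.ppf([0.025, 0.975]), 3))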
Now, you believe the « right » value of your parameter is somewhere in I. This also quantifies your uncertainty. But once again, when you say this, it is either true or not. And if c is not random, what is the exact meaning of the probability you constructed? Since this probability means « the probability that the realisation of the random parameter C is in I is 0.95 », and not « the probability that c is in I », because « c is in I » is either true or not, once again, and you do not know this, once again...
For such well-defined physical parameters, the Bayesian hypothesis of using a random variable instead of a constant seems strange to me (I recognize it can be useful, but the interpretation of the result is not as straightforward as it may seem, I think).
However, change « speed of light » to « normal blood pressure in a population » and I will be more convinced, because, defined as such, blood pressure is not a physical parameter but just a simple way to describe a distribution of values in a population, which is well represented by a random variable... And in that case, I agree that the Bayesian approach may be more natural or give a more interesting result. --- Note that changing it to « measure the instantaneous blood pressure of a given patient at a given moment », I will still stick to the frequentist approach instead...
But (and I hope the parallel between the two examples will be convincing) I think saying that Bayesian is better than frequentist is abusive. Both have advantages and drawbacks, and since their assumptions are different, different problems may be treated more naturally with either one or the other.
As for a parameter changing with time/other reasons: this is the reason for covariates, so it does not settle the point between Bayesian and frequentist approaches, unless you basically assume that your quantity of interest is intrinsically random (as the normal blood pressure above) and not constant (as the speed of light in vacuum).
I think our main difference is in fact the interpretation of randomness and uncertainty.
You talk about "truth" to distinguish both expressions. This is a very philosophical aspect. Practically, we only have data (observations, measurements), and we try to find patterns, rules, structures, to somehow organize the data in a *useful* way. What is considered "useful" is another issue. Maybe we can come together in saying that a way of data analysis is useful if it allows us to make predictions that are precise enough and reliable enough to reach some goal (this goal can be just to talk about something where both of us have a similar understanding of what we are actually talking about; it can also be something to ensure some outcome, for example a functional machine, a successful treatment of a disease, a monetary profit, ...).

To give an example: We gather a lot of data (with all our senses) and find it useful to attribute certain collections of these data to objects. This allows us to distinguish different kinds of objects and also to generalize objects as instances of "classes". One of these objects may be you, as a person sitting somewhere in front of a monitor reading this text. We generalize you as a part of the class "human being" (hopefully :-)) because of similar patterns in the data we get from other objects. This is surely "useful" - but it is neither clear what precisely is "you" nor what precisely is a "human being". For instance: Do the bacteria on your skin and in your intestines belong to "you" or are these other objects? Are the cells you lose still part of "you"? At what point are the food, water, and air you take in part of "you"? Is it still "you" in a couple of years when most atoms in your body have been exchanged? Is it possible to define a human being in isolation from the environment?...
In my opinion, it is practically very useful to talk about "things", "humans", "persons" and so on, but in essence, all these "structures" are just concepts of our brains, they are models. Nothing more and nothing less. It is cumbersome to speculate about a philosophical "truth" of the existence of any of these things.
So we get to your example with the (vacuum) speed of light: you use it to have something of indisputable "truth". But the whole frame of the definition of the speed of light, and the definition of light itself, is just a model. Space, time, energy - they are nothing else but our ideas, our models to structure data in a useful way.
In our physical models, "speed of light" is taken to be constant. Unfortunately, we defined its exact value. But this (exact) value is defined, not measured. Actual measurements give (more or less) different results. Statistically, "speed of light" is thus a random variable. We could well think/imagine that the speed is *not* constant, that there are in fact "quantum fluctuations" in the speed. The speed may be considerably different in each attometer, "averaging out" when a distance of many of such attometers is used as reference. Also, it may be imagined that time and space are varying in such a way that - by all means we have to measure - the speed of light is constant, just because the variability is cancelled out.
What I try to say: a concept of "truth" and "constancy" is (scientifically!) as needless and as problematic as the concept of "god" or an "author of our being" (note that such concepts may well be useful/helpful personally in some circumstances for some people). We have data, that is all. Whatever we construct are models. And we do not measure parameters of distributions. We measure data, and we derive parameter values based on this data. There is no such thing as a "true parameter" existing. Neither does the "unmeasured data" exist.
I think that we should first find a common position on this before we can discuss the differences between randomness and uncertainty. If you insist on the concept of "truth" - that's perfectly fine - then I think we won't be able to come to a conclusion here, unless you convince me that such a "truth" truly exists... which I can't imagine to be possible...
Hello Jochen,
I think I agree with your first sentence: we seem to disagree about the difference between uncertainty and randomness, so maybe it will be difficult to reach a consensus, but let's try, it's always interesting to have to think about our bases ;)
Broadly speaking, I also agree with you that all we can do is build some model of reality, so I will not go back over all of your post, but only give my personal opinion about a few points on which we seem to disagree.
First, about « truth ». I think, if I say « the Earth is a nice cube », or « a man has 3 lungs », you will say without difficulty that it is wrong, no? But this means that you implicitly class sentences (statements) into (more or less) wrong categories. And if you accept that some are wrong, then some must be true, as the opposite. If you do not like the words, you may say instead « contradictory to our experiments/data » and « not in contradiction », but that does not change the idea behind this, I think.
However, I definitely agree that all we can do are things that are more or less close to reality, and what I call « Truth » is just an ideal impossible to attain, so some of my affirmations may be too strong. The aim is to have a model as close to reality as needed, but not necessarily closer if that adds too much complexity and makes it too difficult to handle.
Despite this, I think most of my arguments are still valid after removing the idea of truth and seeing « uncertainty » as a way to define how big the amount of unknown things in our result is, and « randomness » as a (theoretical also) way to define a kind of process generating results. The second is a good tool to quantify the first, but not a necessary one.
For instance, imagine you have a straight piece of wood to measure [its length, I mean], and you have a graduated ruler. Imagine the piece of wood ends just in between two graduations. You will say « the length of my piece of wood is between 12 and 13 mm »: you have uncertainty on your result, usually sufficient for all practical purposes, but not a single need for randomness in quantifying it or defining it. You can even use it for prediction purposes, like « if I make a square whose side is this piece of wood, its surface will be between ... and ... » or things like this (using the error propagation theorem or even direct calculation for most simple cases), with no need for probabilities and statistics. Of course, you can add probabilities afterwards if needed or wanted, like « the length I measured is a uniform random variable on [12, 13] » or things like that, but that's not required in the process of defining uncertainty.
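As a worked version of that square example (just propagating the stated bounds, nothing more): if the side length lies between 12 and 13 mm, then the surface lies between 12^2 = 144 mm^2 and 13^2 = 169 mm^2 -- an uncertainty statement obtained without any probabilities.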
Randomness now... This is much harder to describe outside the mathematical framework of probabilities, but that framework is, like all mathematics, just an abstract thing and so has no absolute concrete interpretation. But I think something practical works, like: « you say that the result of something is random when you cannot predict which of the possible results you will obtain, just because there are too many and too complex causes to be able to make this prediction using deterministic methods. To overcome this, you try to imagine all possible results and "class" them as more or less probable ». I think this describes quite well most of the situations in which we use random variables, and it corresponds also quite well to the idea of a space of events and functions associating to them 1) a measure of how likely they are to be observed [the probability law on the space] and 2) an observable result [the random variable from this space to the space of observable results]. For the classical die, the space of events would include various possibilities of strength, direction, the die's exact form and weight... and the space of observable results would be {1, ..., 6}.
As such, in randomness, the uncertainty is about « which result will we observe », but not about the result itself --- when you throw the die, you do not know what number will appear, but if a 6 appears, there is no uncertainty that it is a 6. But, in a way, I would agree that randomness implies uncertainty --- this is not an equivalence however, as (tentatively) shown by the previous example.
Now, going back to your post. There is the data, but not only: there is a reality outside it, that we try to model and understand. I agree with the problem of defining objects in all details, but as before I would say it also depends on what these objects are needed for and, anyway, it seems to me another debate, so I will not discuss it further. When making your experiment, you have an idea of what should describe this reality. For instance, for the speed of light, that a constant is enough [at least, as far as I know]. So interpreting the results with a model that is not consistent with this idea of the reality seems strange (I did not say wrong, however), especially if it is not needed to answer the question asked.
I also think there is a confusion in your post between what we measure and the tool we define to measure it. The randomness may come from the measurement process, not from the object itself (think of the piece of wood: at a given time, for given physical conditions, it has a given length for all practical purposes, unless you want to go to the atomic scale, but that would really be an overcomplex model if you just want to see whether it fits in your new furniture...). Despite this, we have uncertainty about the object itself, because of the imperfections of the measurement process (not all of them random...). Does this mean that we _must_ model the object's length itself by a random value?
« Frequentists » would say « no, it is enough to model the measurement we make of it as a random variable »; Bayesians (if I understood correctly) say « yes ».
My opinion would be that the Bayesian approach is certainly useful in some circumstances [I think in biology this may be particularly true because the values of interest are often not as physically well defined as a piece of wood], but not all, and the « frequentist » confidence interval is also a suited, albeit different, response to the question « what is the uncertainty of our estimation? ». Simply rejecting the frequentist approach as inappropriate is, I think, exaggerated.
Last, I think my examples above are still valid without the word « truth » in mind, using only the definitions of "randomness" and "uncertainty" as introduced here, so I would be interested if you have comments on them (or on this last point)...
Hi Emmanuel,
it is fundamental for learning to think about our bases. This is what we do here, and this is a very good thing. It might be that we won't be able to come to a conclusion, but we might both get new impulses to direct our thinking. Believe me, even if you have the impression that I am ignorant of your arguments – they do influence my thinking. It may take much longer than this discussion to open up really new insights for me. I also find it quite challenging to discuss a philosophical topic though I am not a philosopher – and on top of all this in a foreign language…
The fact that some data (strongly) contradict some hypotheses does not imply that there must be "true hypotheses". Surely, there may be other (wrong) hypotheses which are not so much contradicted by data. Scientifically, we *should* not say that a statement like "men have 3 lungs" is wrong. Experience shows that most have two lungs, a few have only one. If having 2 lungs was the "truth", what is a man with one lung then? Untruth? Nonexistent? Further, we just do not know if there is some strange mutated phenotype with three lungs, too (at least I don't know). In short: having a theory/hypothesis that does not fit the data does not imply that there must exist another hypothesis that fits the data better.
Moving "reality" and "truth" to the border (or beyond it) of practical perception does not solve the problem either. Just because some experiences are very reliable and highly precise does not imply that these experiences themselves are related to a reality or truth. We can determine the half-life of some radioactive isotope very precisely and reproducibly, but the concept of "time" itself, and also the concept of "atoms" decaying and sending out other quantum particles and/or electromagnetic waves, are all models. Time is what we measure with clocks. It is the measuring instruction that "creates" time as a physical quantity. The same is the case for all other things. The fact that most of these models are very, very good does not by itself imply that they describe a reality/truth. All these things might be "matterless shadows" of something very different. Or something entirely different (we generally have a very limited imagination).
I agree on "uncertainty" as a measure of "lack of knowledge", but not of knowing the reality/truth, rather of knowing what to expect when we do a measurement (prediction). I also agree on "random" as a process generating results. A random variable is a random process translated by a measurement instruction (i.e. operationalized). I do not see how the latter can be a tool to quantify the first. In my opinion, uncertainty is related to random variables. If there was no uncertainty associated with the variable, we would be absolutely sure about each value ("realization") and in this way the variable would not be random. The problem with the word "random" is its double meaning as a) "not knowing what happens next" and as b) "not deterministic". Definition b) implies a), but a) does not imply b). Definition b) is not the subject of statistics (be it metaphysics or philosophy). Thus when I talk about "random" I mean a process or variable about which I do not have enough information/knowledge (i.e. I am uncertain) to predict an observation/measurement.
To the ruler example: The fact that the piece of wood can in principle have different possible lengths makes it random. If its length were given, then there would be no need to measure it. In contrast, in such a scenario the length of the wooden piece would be used to calibrate the ruler. Further: surely, I do not need a mathematical description of a probability distribution to define uncertainty. Probability theory just allows a better quantification (and propagation!) of uncertainty.
Here I may cite you: "But I think something practical like « you say that the result of something is random when you cannot predict which of the possible result you will obtain, just because there are too many and too complex causes to be able to make this prediction using deterministic methods. To overcome this, you try to imagine all possible results and "class" them as more or less probable »". I absolutely agree with this. The rest of your paragraph I did not really understand. Data is not random. Data is constant. It is the variable that is random, the space of possible data, not actual data. There is no uncertainty associated with data. The data at hand is the only thing that is certain. The big question is what this data might tell us about these things we have not observed yet.
On the confusion regarding the measurement that you suspected: essentially, it is not at all relevant (from the statistical point of view) for what reasons observations are variable. The wooden piece you are talking about is also just part of the measuring instrument/instruction/process leading to the data. We may consider and model a lot of covariables to improve our predictions/expectations about the length of a wooden piece, as we can consider a lot of environmental influences on a physical device we use to measure something. For instance, if we measure the blood pressure, we can measure it with different instruments. Each kind of instrument reacts more or less sensitively to some environmental factors. We can model these factors (and measure them, too) or we can try to keep them constant. Still the results will depend on the mood (and activity state and health state…) of the patient, which may change between measurements. We can also try to keep this more or less constant. If we consider different patients, more factors introducing variance come into play that we may not be able to control. But using different patients is nothing principally different from using different instruments. The combination of instrument/patient/environment is always the inextricable entity leading to the data. Varying the patients gives us information about what to expect in unseen patients. To answer your question: there is no law telling us to model anything as a random variable. It is convenient to model things as being known if the margin of error (or range of uncertainty; I don't know how to say it) is negligible for the problem at hand. The random variable is an operationalized variate, so there is no such distinction between the "measurement process" and the "thing in itself".
I am not rejecting the frequentist approach as inappropriate. It is very useful for a couple of problems (for instance controlling error rates). Any exaggerated formulation from my side is intended to be provocative. Nevertheless, it is the very heart of frequentistic reasoning that the confidence interval is again a random variable that is explicitly not estimating a precision. I am still of the opinion that the interpretation of the CI as a measure of precision (or uncertainty) is a "Bayesian view on a frequentistic result". Get me right: this interpretation is useful! But it is not the frequentistic interpretation; rather it is the interpretation of a Bayesian credible interval with a flat prior.
Looking forward to reading your response :-)
PS: There should be an award for most text-intensive discussions...
Hi Jochen
OK for the extra award ;) And I was wondering, by the way, if we could continue by e-mail, because I find RGate not very practical for long texts, and for reading the message we are answering to at the same time as writing the response. For such a discussion, I find this annoying because I always think I forget something in your messages and so answer "on the side" because of that...
As for truth, I think that as soon as you introduce the idea of a measure of wrongness, with propositions more wrong than others, the idea of truth is necessary as an ultimate goal, or as the limit when this measure tends towards 0... which, as with all limits, does not mean it can be achieved (like, in physics, 0 K, the absolute zero of temperature, which cannot be achieved but still is the goal... or many limits in mathematics).
A question about your sentence "A random variable is a random process translated by a measurement instruction (i.e. operationalized)": do you mean "random process" as the well-defined mathematical object? Or as the idea of a process in the physical meaning, which is (seen as) random itself? In the first case, I do not really agree, because it is not a mathematical definition. In the second case, I do not have the same idea: it can be, but the random variable can also be the result of the measurement, and not of the thing measured itself: the measurement process introduces the randomness, because of all its limitations and imperfections. But maybe this is the point we have been differing on since the beginning, hence our different conceptions of what uncertainty is and what randomness is.
As for the ruler example: I would agree with your idea of a population of sticks from which I take one if I were interested in something like an "average" length of sticks, as I can define an "average normal blood pressure" in a population. However, this is not what I am interested in in this example: I just have this stick, and want to know its own length (and not the length of any population) --- this is why assuming the length itself to be random does not make me happy. The "only" reason why I cannot answer this question precisely enough (assuming the stick has a well-defined length and we agree on what it is --- see it as a perfect cylinder, both ends plane and parallel, so no ambiguity on this, even if this is a model of the stick itself) is the fact that my measurement method is not perfect. That does not mean either that it is all random: if you repeat the experiment carefully, the result will always be between the two graduations. So the answer to my question "what is the length of this precise stick?" is "it is between 12.5 and 13 mm": uncertainty, no randomness.
For data: "data is not random", I would say, is the same as saying "the result of a die throw is not random". The result of the measurement is strictly speaking not random, as it is a number; I agree with you. However, whatever the approach you use, if you do statistics you see the data as realisations of random variables (the big question sometimes being "are they really random", but that is another debate), no?
In my opinion, in most cases (but not all: see the ruler example), at least a part of the data is well seen as realisations of random variables, either because they are measured on individuals that were randomly selected from a population in which each individual has (or can have) a different value, or because the measurement process is inherently imperfect and well modeled by a random contribution. But some part of the data can be non-random --- sex for instance, if you _choose_ that half the people will be men and half women. In fact, knowing which part is random and which part is not may be important to select appropriate tools for the analysis...
Last, the frequentist CI: I agree that the confidence interval is not estimating a precision (which is again something different from uncertainty, at least to me, since it is very much related to a measure of dispersion like the variance; precision is also a measure of uncertainty however, at least in my opinion). However, I don't agree with your consequences of a Bayesian interpretation of the CI: as you said, interpreting it as "there is a 95% chance that the value is in [1.25; 2.45]" (let's say) is wrong --- so it should not be done, even if it is easy and tempting.
But that does not mean that [1.25; 2.45] does not give you an idea of the uncertainty on your parameter: if you trust your method (and you should, otherwise why use it?) and your data, then the parameter value should be somewhere in this interval ==> this is a measure of uncertainty (but not a probabilistic/random one, I agree). The 0.95 probability used in the construction of the interval is just there, I think, so you can faithfully trust the method: you know before doing the experiment that it has a high probability of giving a correct answer.
Hope I forget nothing ;)
There are a thousand philosophical differences. But, in practice, there are three:
1) Bayesian methods like MCMC can estimate models with hundreds or thousands of parameters, even highly nonlinear ones, in a 'reasonable' amount of time (a minimal sketch of such a sampler follows after this list). This is especially valuable when there is only a small amount of data per unit (e.g., information on each individual).
2) The Bayesian can treat ANYTHING unknown as a "parameter". This means missing data, or even sampling from unidentified models and worrying about identification later.
3) Most important: the frequentist needs to make parameter distributions look like a known distribution, usually z, t, chi2, or F. The Bayesian just samples, and doesn't care whether it comes out looking pretty. The Bayesian also doesn't need to worry about reparameterizations, since substantive results don't change under them.
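As announced in point 1, a minimal Python sketch with a deliberately tiny, purely illustrative model (a random-walk Metropolis sampler for the mean of normal data under a flat prior; none of this comes from the thread):

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(1.5, 1.0, size=20)        # hypothetical data, sd assumed known (= 1)

def log_posterior(mu):
    # flat prior on mu, so the log-posterior is the log-likelihood up to a constant
    return stats.norm(mu, 1.0).logpdf(x).sum()

samples, mu = [], 0.0
for _ in range(20000):
    prop = mu + rng.normal(0, 0.5)       # random-walk proposal
    if np.log(rng.uniform()) < log_posterior(prop) - log_posterior(mu):
        mu = prop                        # accept
    samples.append(mu)

post = np.array(samples[5000:])          # drop burn-in
print(post.mean(), np.quantile(post, [0.025, 0.975]))

The same loop carries over, in principle, to models with many parameters for which no closed-form sampling distribution is available.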
FF
One more thing to consider: who is your audience? While I am currently trying to learn BUGS and the Bayesian approach for my own edification, I don't see clinical research letting go of the frequentist approach, especially p-values, any time soon.
See the following Springer (2010) book, which answers this question well from both philosophical and statistical/technical perspectives:
Francisco J. Samaniego, A Comparison of the Bayesian and Frequentist Approaches to Estimation, Springer, 2010.
Hi Jochen:
Thank you very much for the detailed answers; they shed light on the critical distinction between the Bayesian and frequentist approaches.
From my limited knowledge, Bayesian statistics would be the more natural tool for uncertainty quantification. First of all, the widely used confidence interval in the frequentist framework is actually often misinterpreted in a Bayesian manner - i.e., as "there is a 95% probability that the true value of the parameter of interest falls within the estimated interval". I prefer to use Bayesian approaches because of the more intuitive and convenient interpretation of the posterior results. In particular, the Bayesian posterior always provides a full probability distribution, from which one can conveniently get any statistic about that parameter, not just the mean.
This Bayesian feature is a huge strength in the field of uncertainty quantification, for easy interpretation of uncertainty. Also, this full probabilistic characterization of parameters allows us to investigate the problem of uncertainty propagation within a system, although it may be extremely computationally expensive, e.g., the robust design problem.
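As an illustration of that propagation idea (with a made-up posterior and a made-up system response, nothing specific to this discussion): once posterior draws of a parameter are available, propagating uncertainty through a system just means pushing every draw through the system model.

import numpy as np

rng = np.random.default_rng(3)
# Pretend these are MCMC draws from the posterior of a parameter theta:
theta = rng.normal(2.0, 0.3, size=50000)

# Push every draw through an (arbitrary, illustrative) system model y = f(theta):
y = 10.0 * np.exp(-0.5 * theta)

# Full distribution of the output, from which any statistic can be read off:
print(y.mean(), np.quantile(y, [0.025, 0.5, 0.975]), (y > 4.0).mean())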
In the frequentist framework, I was wondering what else can be used to quantify uncertainty other than the confidence interval (which has an awkward interpretation)? I don't think the frequentist approaches are able to produce a full probability distribution for parameters.
I would appreciate it if you could kindly brief me on how frequentist approaches are used in uncertainty characterization.
YF
Dear Yu,
I am not sure how to answer your question.
Frequentists restrict uncertainty statements to data (observations, events) and thus all interpretations (p-values, CIs) are about data (and not, as we might hope, about our models or the parameters within such models). The good thing is: given a correctly specified model, the uncertainty about data is objective. If probability estimates are derived from relative frequencies, then all probability statements derived using such probabilities can again be interpreted as estimates of relative frequencies. The problem is that it usually is quite a matter of subjective argument which model should be considered "correct". And at no point is "our knowledge" involved.* The results of a frequentist analysis never allow us to say what we know and what we don't know. Knowledge is not part of the frequentist philosophy. We can only say: assuming this parameter has that value, this data would be so-and-so likely. And assuming it has another value, the data will have a different likelihood. There is no way to judge the quality/correctness of the different assumptions. Interpreting the assumption giving the maximum likelihood of some observed data as the most plausible one is actually beyond any frequentist possibility (as you already mentioned).
Bayesians extend the application of uncertainty beyond data to models and parameters therein. These things don't "replicate" in any way, so probability statements about such things cannot mean anything related to frequency. This view is not only used for models/parameters but also for data. Although such probabilities may be "adjusted" or "calibrated" by relative frequencies of observations, the meaning remains different from a relative frequency, and the actual probability value can be arbitrarily far off from a relative frequency one may observe. Such probabilities don't refer to frequencies, they don't refer to data (observations/events) - instead they refer to what we know about such things. This knowledge may be good or bad, it may be severely biased. There is no way to judge this. The posterior simply describes what one ("we" as a community, or a single person) knows about the phenomenon. This knowledge is subject to revision by incoming data.
*I personally think that the "subjective" choice of a model essentially renders frequentist probabilities into something subjective. If we say that the probability of a tossed coin landing "heads" is 0.5, we mean that - by all we assume about the tossing experiment - we expect "heads" as much as we expect "tails" to come up in any particular toss. If the assumptions about the experiment change, our expectation will change. That way, "probability" is actually necessarily a measure of our expectations, thus in a way representing our knowledge, even in a frequentist setting. Why else should we need the concept of "probability" if we could otherwise call it (estimated) relative frequency?