Fisher introduced the concept of fiducial inference in his paper on inverse probability (1930), as a new mode of reasoning from observations to their hypothetical causes without any a priori probability. Unfortunately, as Zabell said in 1992: "Unlike Fisher's many original and important contributions to statistical methodology and theory, it had never gained widespread acceptance, despite the importance that Fisher himself attached to the idea. Instead, it was the subject of a long, bitter and acrimonious debate within the statistical community, and while Fisher's impassioned advocacy gave it viability during his own lifetime, it quickly exited the theoretical mainstream after his death".
Later in the 20th century, however, Fraser (1961, 1968) proposed a structural approach which follows the fiducial one closely but avoids some of its complications. Similarly, Dempster proposed direct probability statements (1963), which may be considered fiducial statements, and he believed that Fisher's arguments could be made more consistent through modification into a direct probability argument. And Efron, in his lecture on Fisher (1998), said of the fiducial distribution: "Maybe Fisher's biggest blunder will become a big hit in the 21st century!"
It was mainly during the 21st century that the statistical community began to recognise its importance. In his 2009 paper, Hannig extended Fisher's fiducial argument and obtained a generalised fiducial recipe that greatly expands the applicability of fiducial ideas. In their 2013 paper, Xie and Singh proposed a confidence distribution function to estimate a parameter in frequentist inference in the style of a Bayesian posterior. They said that this approach may provide a potential conciliation point for the Bayesian-fiducial-frequentist controversies of the past.
I have already discussed these points with other researchers, and I think that a more general discussion would be of interest to ResearchGate members.
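For readers less familiar with the fiducial argument itself, here is a minimal computational sketch (my own illustration with invented data, not taken from any of the papers cited) of Fisher's pivot-inversion idea in the simplest case of a normal mean:

# A minimal sketch of Fisher's fiducial argument for a normal mean
# (illustrative only; the sample below is invented).
import numpy as np
from scipy import stats

x = np.array([4.2, 5.1, 3.8, 4.9, 4.5])   # hypothetical data, X_i ~ N(theta, 1)
n, xbar = len(x), x.mean()

# The pivot Q = sqrt(n) * (xbar - theta) ~ N(0, 1) whatever theta is.
# Inverting the pivot for the fixed observed xbar gives the fiducial
# distribution of theta: N(xbar, 1/n).
fiducial = stats.norm(loc=xbar, scale=1 / np.sqrt(n))

# A 95% fiducial interval, numerically identical to the usual
# 95% confidence interval in this model:
print(fiducial.interval(0.95))

In this location model the fiducial distribution coincides with the Bayesian posterior under a flat prior; the contested cases discussed below are the less tidy, multiparameter ones.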
References
Dempster, A.P. (1963). On direct probabilities. Journal of the Royal Statistical Society. Series B, 25 (1), 100-110.
Fisher, R.A. (1930). Inverse probability. Proceedings of the Cambridge Philosophical Society, 26, 528-535.
Fraser, D. (1961). The fiducial method and invariance. Biometrika, 48, 261-280.
Fraser, D. (1968). The structure of inference. John Wiley & Sons, New York-London-Sydney.
Hannig, J. (2009). On generalized fiducial inference. Statistica Sinica, 19, 491-544.
Xie, M., Singh, K. (2013). Confidence distribution, the frequentist distribution estimator of a parameter: A review. International Statistical Review, 81 (1), 3-77.
Zabell, S.L. (1992). R.A. Fisher and the fiducial argument. Statistical Science, 7 (3), 369-387.
Dear George,
Thank you for your reference to the psychologist Michael Bradley's work on inferential statistics, and I entirely agree with him when he writes in 2014: "It is a fallacious and misleading exercise to imply measurement accuracy by assuming set errors rates from the specific statistical samples presented in any particular exploratory study". This paper, however, was investigating only part of a wider cleft between frequentists, as Savage put it in 1961, a cleft linked to Fisher's fiducial inference among other points. For example, Neyman said in 1941: "the theory of fiducial inference is simply non-existent in the same way as, for example, a theory of numbers defined by mutually contradictory definitions", while Fisher accused Neyman of using his own work without reference. But their disagreements were also about refutation and confirmation, which was the topic of Bradley's work, and about experiments.
So while Bradley's point of view is interesting, I do not think that it entirely answers my more general question.
References
Bradley, M.T., Brand, A. (2014). The Neyman and Pearson versus Fisher controversy is informative in current practice in inferential statistics. Conference paper to the Canadian Psychological Association.
Neyman, J. (1941). Fiducial Arguments and the theory of confidence intervals. Biometrika, 32, 128-150.
Savage, L. (1961). The foundations of statistics reconsidered. In Neyman, J., ed., Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, 575-586.
See http://www.rug.nl/research/portal/publications/statistical-inference-via-fiducial-methods%28c274b103-2b18-4783-9dc2-0937e761b33e%29.html for a PhD thesis from 1998 with a late-20th-century view on the applicability of fiducial inference. (Several chapters of this thesis have also been published as stand-alone papers.)
Dear Casper,
Thank you very much for drawing my attention to the PhD thesis of Salomé Diemer on statistical inference via fiducial methods. As she so pleasantly said: "This thesis is not only the result of four years of blood, sweat, and tears, but it also marks the end of four pleasant years at the department of mathematics at the University of Groningen". In fact it is a very interesting discussion of fiducial inference, but also of the Neyman-Pearson interpretation of this inference and, more generally, of the statistical controversies, which are clearly described in this thesis.
However, as it was written in 1994-1998, it could not take into account the newer arguments on generalized fiducial inference (see Hannig's 2009 paper on this topic), and my question was devoted more specifically to this kind of statistical inference: will generalized fiducial inference become a big hit in the 21st century?
Dr Courgeau, you may have a look at this post by Christian Robert about work by Abhishek Pal Majumder and Jan Hannig on generalized fiducial inference based on matching priors!
https://xianblog.wordpress.com/2014/08/08/jsm-2014-boston-3/
Dear Jean-Louis,
Thank you for your information on the JSM 2014 conference in Boston and on the talk by Abhishek Pal Majumder and Jan Hannig. I am not surprised by what Christian Robert said about it on his Xi'an's Og blog: "this is yet another area where I remain puzzled by the very notion. I mean the notion of fiducial distribution". I had already read Robert's comments on another paper on a similar topic, the Xie and Singh paper (2013) cited in the comments to my question, and they were very negative: "I fear that the authors have not made a proper case in favour of confidence distributions". However, when he learned that Singh had most sadly passed away, he wrote: "I think he would have appreciated the intellectual challenge raised in this intellectual dispute and responded accordingly". Xie answered his main comments and said: "From his discussion, it is not difficult to tell that Professor Robert is very passionate about Bayesian inference. I appreciate the passion and conviction exhibited in Professor Robert's discussion, even though I am surprised by many of his comments". However, he avoided making comments "prolonging the existing patriotic debate between different philosophical points of view".
Will it be possible to have here a less passionate debate, so as to see more clearly what the pros and cons of generalized fiducial inference are?
It is good to see the term "fiducial" as a subject of debate. When I was young, in the century before this one, I was a surveyor. A member of the crew was from England and used the term fiducial as a synonym for benchmark. Years later I read Fisher and assumed the "benchmark" meaning. A benchmark is a relative point appropriate for local measurement, but until it is tied into a concept such as sea level it remains valuable only as a local reference. Fisher in his agricultural research used measured plots, and I naturally assumed that in the simplest case he was referencing one untreated plot against the treated plot. The untreated plot was the benchmark, and if the treated plot differed somehow he could conclude with this reference that, say, "fertilizer" worked. Of course, he might expect different results in different years and locations, but in theory he always had a benchmark as a portable reference to see if anything is there. I think a theory anchoring "benchmarks" or "fiducial distributions" to their purpose would be valuable. From what I read of Fisher, he thought the concept of fiducial was self-evident. Joan Box (née Fisher), in her book on her father, says somewhere that her father could be impatient. He thought that those who refused to understand him were dumb or, if not, were teasing him (Neyman, perhaps), so he did not take pains to explain everything.
Dear Michael,
I am very happy to see that this debate is not only between statisticians but among researchers in different domains. For over six decades the life and social sciences have been dominated by NHST, since the APA Publication Manual, which sets the editorial standards for over 1,000 journals in these domains (Fidler, 2010), proposed only the ritual of null hypothesis testing until 2002; and even afterwards, by 2006, only 23 journals in psychology had policies warning of the pitfalls of NHST (Fidler et al., 2006). I think that the attached paper by Lecoutre (2006) shows clearly that the use of NHST "is so an integral part of scientist's behavior that its use cannot be discontinued by flying it out the window. Faced with this situation, the suggested strategy for training students and researchers in statistical inference methods for experimental data analysis involves a smooth transition towards the Bayesian paradigm." Perhaps an even more general fiducial inference may provide a potential conciliation point for the Bayesian-fiducial-frequentist controversies of the past?
References
Fidler, F. (2010). The American Psychological Association Publication Manual sixth edition: implications for statistics education. In Reading, C., ed., Data and context in statistics education: Towards an evidence-based society. Proceedings of the Eighth International Conference on Teaching Statistics, Ljubljana, Slovenia. Voorburg.
Fidler, F., Burgman, M., Cumming, G., Thomason, N. (2006). Impact of criticism of null-hypothesis testing on statistical reporting practices in conservation biology. Conservation Biology, 20 (5), 1539-1544.
Lecoutre, B. (2006). Training students and researchers in Bayesian method for experimental data analysis. Journal of Data Science, 4 (2), 207-232.
Thanks, Daniel, for your last comments. You argue for a "smooth transition towards the Bayesian paradigm". A nice wish, but I am not sure that the available intermediate alternatives can be more easily grasped and understood by the vast majority of statisticians (especially the applied ones) than Bayesian statistics. I had a look at the 2013 review paper by Xie & Singh you mentioned in your reference list. This is a very comprehensive review of one of these alternatives. They put forward the concept of confidence distributions (CDs) (inverse functions of upper confidence limits), which relies completely on frequentist interpretations, contrary to fiducial distributions (FDs). For instance, on page 13, they define "a CD random variable", which is not a random parameter but "a randomized estimator" of theta zero, the true value of the parameter theta. Not so easy to digest, or to see the difference between CDs and FDs. This recalls what Savage said about all these attempts by statisticians "to make the Bayesian omelette without breaking the Bayesian eggs". Do we really need to go through these steps, which might not be so smooth?
Dear Daniel,
Thank you for the reply. This year I am presenting the various statistical recommendations from the APA, from 1929 until 2009. The APA still runs the danger of confusing the blunt instrument of inferential statistics with the accuracy and precision associated with measurement. They have improved, though, by including effect-size calculations with non-significant results. I think, however, that by the wording, whether ns results are featured or not will be at the discretion (bias) of an editor and author. Your idea in reference to Bayes, which I had not thought about, is what I am trying to do with our three categories of analysis. In pure exploration (fertilizer, for Fisher) there are no prior probabilities. Any given year may vary in rain, wind, sun, temperature, etc., even location. But over a few studies some priors emerge, and this I think is what Neyman and Pearson were trying to capture. And they succeeded in quality control, where an unexpected deviation could mean that something is wrong. The success came through the mechanical control of all factors associated with the process. The fascinating aspect for me then becomes the middle ground, where we have some idea that a drug has a measurable beneficial effect but we have subtle indications of side effects. I don't have the sophistication, other than replication (essentially a frequentist approach), to capture those side effects, but if prior probabilities could be assigned to evidence of side effects given the beneficial effect, that would be valuable to know. It would require an outline of fiducial assumptions, because the baseline or benchmark would shift given different populations, but the question would be: given a beneficial effect, what is the probability of a side effect?
Dear Jean-Louis,
I am glad you had a look at the Xie & Singh review, and your reminder of Savage's sentence is good to recall. When Henri Caussinus and I wrote the attached joint paper (2010), we had to reject earlier methods referred to as Bayesian but in fact "making a Bayesian omelette without breaking Bayesian eggs". However, we had some problems with the choice of a prior distribution for the parameters we had to estimate. This led us to consider different possibilities, such as Dirichlet distributions, which are not entirely satisfactory, as they do not take into account the correlations existing between neighbouring age classes. More recently we used a smoothing technique in order to take these correlations into account, which led to better credible intervals, but we do not have a satisfactory statistical argument to support this choice, even if there are some demographic arguments for it. Would it be possible to solve this problem using other approaches?
Reference
Caussinus, H., Courgeau, D. (2010). Estimating age without measuring it: a new method in paleodemography. Population-E, 65 (1), 117-144.
I cannot find your paper in the "Contributions" section. Would it be possible to have a copy of it?
I sent you an attached copy in my previous message, but you can also find it in the "Articles" section of my "Contributions".
Dear Michael,
I will be very interested to read your presentation of the various recommendations from the APA over the period 1929-2009, when you have finished writing it. I think that the APA was very powerful and unfortunately badly informed about the different ways of formalizing the intuitive notion of probability, so important in this domain, into at least three broad types: objective, subjective and logical probability. The APA mainly promoted objective probability, but without saying so, over a long period of time. Perhaps a reunification of these approaches is possible, as I tried to show in the conclusion of Part I of my book on probability and social science (2012). You will find it enclosed here, if you are interested in reading it.
Reference
Courgeau, D. (2012). Probability and social science. Methodological relationships between the two approaches. Methodos Series, vol. 10, Dordrecht Heidelberg London New York: Springer.
Yes, I will be very happy to send you a copy of the presentation. I agree that the APA has conceptual problems in this area. And I will be very happy to read the book you reference.
I have a fairly simple minded (and unoriginal) approach to such questions.
From a decision-theoretic perspective, it is well known that admissible procedures are either Bayes or almost Bayes (limits of Bayes, epsilon-Bayes, or Bayes with an improper prior). So if a fiducial distribution is not almost Bayes, it is inadmissible for some loss function. As the loss function is rarely known exactly, perhaps this is not very important. Fisher objected to decision theory (at least for questions of science) for this reason.
Here's an alternative argument. The laws of probability were motivated by gambling problems where the probabilities could apply to repeatable trials. But words such as 'likely' and 'probable' were already applied informally where repetition is not meaningful. So it is not surprising that Bayes and Laplace sought to give degrees of belief for the parameters of their models.
Suppose Laplace had a model p(x | \theta) and reported p(\theta | x) as his degrees of belief about the parameters. (I'm using 'p' as a generic probability density, not always the same function. Also x and \theta could be vectors.) If his result is based on the laws of probability, we would be entitled to assume that he had a joint probability density p(x, \theta) in mind. Then
p(x, \theta) = p(\theta | x) p(x) = p(x | \theta) p(\theta).
As a kind of converse of Bayes' theorem,
p(\theta) = p(\theta | x) p(x) / p(x | \theta).
As the left-hand side does not depend on x, we can substitute an arbitrary value, x0, for it. Then we see that p(\theta) is proportional to
p(\theta | x0) / p(x0 | \theta),
p(x0) being the constant of proportionality. If this is a proper distribution, p(x0) may be determined by setting the total probability to 1; if not, p(x0) may be set arbitrarily to 1 and p(\theta) is improper. Choosing p(x, \theta) directly is difficult; we would assume that Laplace would have chosen p(\theta), which we can determine from his published results.
Therefore, any method that purports to output a probability distribution for the parameters can be expressed in Bayesian terms. If some other (possibly more compelling) principle is used, such as the fiducial method, one can think of it as a way to justify a choice of prior.
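As a concrete check of this "converse Bayes" step, here is a small numerical sketch (my own example, with an invented binomial setup and a made-up reported posterior Beta(4, 8)) recovering the prior implied by a reported posterior:

# Given a model p(x | theta) and a reported posterior p(theta | x0),
# the implied prior is proportional to p(theta | x0) / p(x0 | theta).
# Hypothetical setup: binomial model, n = 10 trials, x0 = 3 successes,
# and a reported posterior Beta(4, 8).
import numpy as np
from scipy import stats

n, x0 = 10, 3
theta = np.linspace(0.01, 0.99, 99)

posterior = stats.beta(4, 8).pdf(theta)        # p(theta | x0), as reported
likelihood = stats.binom.pmf(x0, n, theta)     # p(x0 | theta)

implied_prior = posterior / likelihood         # proportional to p(theta)
implied_prior /= implied_prior.max()

# The ratio is constant in theta: this posterior could only have come
# from the uniform prior Beta(1, 1).
print(implied_prior.min(), implied_prior.max())  # both ~1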
My argument above seems to contradict Lindley's result that the fiducial distribution is only a Bayes posterior in special cases.
In summary, my argument is that p(x | \theta) (defined for all x and \theta), together with p(\theta | x0) (defined for all \theta, and any x0, especially x0 = the specific sample obtained), comes from a unique prior distribution p(\theta). This then allows us to define p(\theta | x) for all x.
Contradicting Lindley, this seems to imply that any method that gives rise to a distribution for \theta for given x0 (including a fiducial method), coincides with a Bayesian posterior.
Perhaps someone can resolve this conflict.
Terry, I don't see your conflict.
As I understood it, the fiducial distribution is only one special/particular Bayes posterior. Or, formulated the other way around: there is one "special" Bayes posterior that is the fiducial distribution.
But I can be wrong here (in which case I would be happy to get corrected).
Dear Terry,
Thank you for your interesting answer. I know very well the decision theoretical perspective and the excellent book by Robert (2007). In this book, he wrote about Fisher’s fiducial inference:
“Fisher, who moved away from the Bayesian approach (Fisher (1912)) to the definition of the likelihood function (Fisher (1922)), then to fiducial Statistics (Fisher (1930)), but never revised his opinion on Bayesian Statistics. This is slightly paradoxical, since fiducial Statistics was, in a sense, an attempt to overcome the difficulty of selecting the prior distribution by deriving it from the likelihood function (Seidenfeld (1992)), in the spirit of the noninformative approaches of Jeffreys (1939) and Bernardo (1979).
For instance, considering the relation O = P + ε where ε is an error term, fiducial Statistics argues that, if P (the cause) is known, O (the effect) is distributed according to the above relation. Conversely, if O is known, P = O − ε is distributed according to the symmetric distribution. In this perspective, observations and parameters play a symmetric role, depending on the way the model is analyzed, i.e., depending on what is known and what is unknown.
More generally, the fiducial approach consists of renormalizing the likelihood (1.2.1) so that it becomes a density in θ when
∫ l(θ|x) dθ < +∞,
thus truly inverting the roles of x and θ. As can be seen in the above example, the argument underlying the causal inversion is totally conditional: conditional upon P, O = P + ε while, conditional upon O, P = O − ε. Obviously, this argument does not hold from a probabilistic point of view: if O is a random variable and P is a (constant) parameter, to write P = O − ε does not imply that P becomes a random variable. Moreover, the transformation of l(θ|x) into a density is not always possible. The fiducial approach was progressively abandoned after the exposure of fundamental paradoxes (see Stein (1959), Wilkinson (1977) and the references in Zabell (1992))”.
However, I do not agree that Fisher never revised his opinion of Bayesian statistics: he wrote in a letter to Barnard in 1958, "In fact the more I consider it, the more clearly it would appear that I have been doing almost exactly what Bayes had done in the 18th century." As Jochen just said, "there is one special posterior that is the fiducial distribution". And, as I said previously, the fiducial approach was not abandoned, contrary to what Robert concludes.
As I already said in a previous answer to Jean-Louis Foulley's comments, Robert's discussion of the Xie and Singh paper was also very negative, for the same reason. I am evidently not against Robert's decision perspective, as I have used it in a number of papers. But I wonder whether the other perspectives on probability have to be thrown out, or whether it is possible to attempt a more general, synthetic approach to probability?
Reference
Robert, C. (2007). The Bayesian choice. From decision theoretic foundations to computational implementation. Springer.
Jochen: I think I see how to resolve the conflict. Given a distribution, p(\theta | x0) for \theta given the particular sample x0, found by method F (whatever F is), we can find the equivalent prior that would have given the same p(\theta | x0). This prior would then give p(\theta | x1) for any other sample x1. But, according to Lindley's result, there is no guarantee that method F = fiducial would give the same p(\theta | x1) unless \theta is a location parameter.
Daniel: your discussion is in terms of a location parameter where Lindley's result does agree with the Bayesian approach. More generally, it is a special case of Fraser's transformation group approach.
I intended to mention the confidence approach. Generations of students have been taught (and have struggled to understand) that the coverage probability of a confidence interval is not the probability that the parameter lies in that particular interval. But why not interpret it that way? If I toss a coin and don't tell you the result, is the probability of a head still 1/2 (from your point of view)? You would be a fool to bet with me on it, because I know the result, but you could bet with a third party. Similarly, if you don't know whether or not \theta lies in the confidence interval, then why not regard the coverage probability as the probability that \theta lies in the interval? With this point of view, upper confidence limits define a probability distribution. It seems to me that this gives a generalisation of the fiducial distribution. There is a difference: Fisher insisted on using a sufficient statistic to remove any arbitrariness. I believe he used the word 'fiducial' to indicate that the distribution was true to the data and used all the information known (I don't know whether or not he knew the use of the term in surveying). On the other hand, upper or lower confidence limits are not unique, and Fisher didn't want subjectivity to appear. However, if we specify a loss function, we can often obtain unique optimal confidence limits, and sometimes there are uniformly most accurate limits (that, for upper limits, minimise the probability of coverage for values greater than \theta; this corresponds to UMP tests). These criteria remove some of the arbitrariness for confidence intervals.
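To make the coverage idea in the preceding paragraph concrete, here is a small simulation sketch (the model N(theta, 1) and the numbers are my own arbitrary choices, not from the thread):

# Simulated coverage of the usual 95% interval for a normal mean.
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 2.0, 25, 10_000
covered = 0
for _ in range(reps):
    x = rng.normal(theta, 1.0, size=n)
    half = 1.96 / np.sqrt(n)               # 95% interval half-width
    lo, hi = x.mean() - half, x.mean() + half
    covered += (lo <= theta <= hi)

# Before the data are seen, the interval covers theta about 95% of the
# time; the question above is whether, after seeing one interval, 0.95
# may be read as the probability that theta lies in *that* interval.
print(covered / reps)   # ~0.95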
I believe there is now some interest in confidence distributions. I wonder if this finesses fiducial theory?
Dear Terry,
Yes, I clearly agree with you that the objectivists say that the coverage probability of a confidence interval is not the probability that the parameter lies in that particular interval. It is only if we carried out similar procedures time after time that the unknown parameter would lie in this confidence interval 95% of the time, if this is a 95% confidence interval. It is difficult for a social scientist, like me, to agree with this procedure, as generally we are not able to make a great number of surveys, and the observed population may change between surveys.
So a Bayesian subjectivist point of view is easier for us to understand. We make a survey. What is the chance that a non-observed individual will follow the pattern observed in the survey? A Bayesian credible interval tells us exactly what we want to know.
However, this result depends on the prior you take, and a logical probability, if it can give you the right prior using all the information known, may seem better than a subjectivist one. Many attempts have been made to determine unambiguously what we may understand by the term "all the information known", but in my opinion none has entirely succeeded.
I wonder if our two ways of reasoning about fiducial probability may lead to a more complex theory which can take into account our common interest in confidence or credible distributions.
Daniel
Daniel: as I said, I tend to be fairly simple-minded about these things. I can reconcile frequentist inference with probabilities for parameters by reinterpreting the frequentist confidence coefficient as the probability for the parameter. But this is not unique when the confidence interval is not unique. Nevertheless, even diehard frequentists probably think about confidence intervals as probabilities for parameters, at least subconsciously. If you look at official statistics websites, you will probably find this interpretation in the technical notes pages even though it is incorrect from a frequentist point of view. (Actually, official statisticians are starting to use Bayesian methods for small area estimation.)
Fisher's fiducial probability is limited to when there is a sufficient statistic. This means that it only works in the exponential family of distributions. And in multi-parameter cases, different versions don't agree with each other. But that doesn't stop us from reinterpreting the intervals in the way I described above, but we'd better give up on uniqueness.
An attempt to reconcile such interpretations with Bayesian methods works so long as you don't mind the prior depending on the actual data. (Actually, empirical Bayes methods, which Bayesians regard as sinful, do just this.)
Most modern Bayesians use subjective priors, but some hope to be more objective by using priors that represent ignorance. Unfortunately there is no such thing. If, for example, you are ignorant about a probability \theta, you are also ignorant about \theta(1-\theta) and about its square root, both of which are important functions of \theta. No prior works for all such functions.
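A quick simulation sketch of this non-invariance (my own illustration, using the function \theta(1-\theta) named above):

# A prior that is "ignorant" (uniform) about theta is not
# uniform about theta*(1 - theta).
import numpy as np

rng = np.random.default_rng(0)
theta = rng.uniform(0, 1, size=100_000)    # flat prior on theta
g = theta * (1 - theta)                    # an important function of theta

# g piles up near its maximum 0.25 rather than being flat on (0, 0.25):
hist, _ = np.histogram(g, bins=5, range=(0, 0.25))
print(hist / hist.sum())   # far from [0.2, 0.2, 0.2, 0.2, 0.2]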
One approach is to let the prior depend on the model, or on the way the model is parametrised. Using Shannon's information E[ln(p)] (that is, ∫ p ln(p)) as a measure of the information in a distribution, we should let the data add as much information to the prior as possible; in other words, the information in the prior should be minimised relative to the posterior. I don't know enough about this to know if it always avoids paradoxes.
Perhaps sometime I'll write some thoughts about hypothesis testing, but not now. I'll just say that point hypotheses are never true. In fact they are meaningless: if H0 is that the mean difference equals 0, how do you define the mean difference? If the data are yields of a crop, what is the population whose means the parameters describe? It is entirely fictional, and so are the means.
Dear Terry,
Thank you for your thoughts about hypothesis testing, which are quite clear to me. I think that you may agree with Xie's rejoinder to his paper (written with Singh) on confidence distributions (2013): "any approach, regardless of being frequentist, fiducial or Bayesian, can potentially be unified under the concept of confidence distributions, as long as it can be used to build confidence intervals of all levels, exactly or asymptotically".
However, I do not think that such a point of view can restore the unity of classical probability, whose loss Shafer regrets (1990). Perhaps some more recent approaches, such as the one developed by Knuth and Skilling (2012), which tries to propose a synthesis of probability theory, information theory and entropy, may be able to give a more solid basis for inference?
References
Knuth, K.H., Skilling, J. (2012). Foundations of inference. Axioms, 1 (1), 38-73.
Shafer, G. (1990). The unity and diversity of probability. Statistical Science, 5 (4), 435-462.
You mention the "confidence distribution": what, in principle, is so different between a confidence distribution and the likelihood function (except that the confidence distribution is scaled differently)? Is there any more information in the confidence distribution than there is in the likelihood function? Or is my point of view too narrow, or in a completely wrong direction?
Daniel, I think this thread is very important, and I am particularly enjoying the back and forth between yourself, Terry Moore, and Jochen Wilhelm. I feel the lack of skill that I bring to the thread when I read the offerings. In my naïve way, I focus on measurement as a hallmark of science. Inferential statistics is a good way to discover worthwhile things to measure. I interpreted Fisher as saying exactly this, and he took great pains to point out that inferential statistics is not measurement. In part, he emphasized this because basic probability theory emerged from situations such as card playing where probabilities could be expressed exactly (Neyman, 1941, mentioned production processes as a highly determined situation definitely appropriate for his model, and in this he is correct). That is not the case in most scientific exploration. All that is available are samples from the population to which a scientist wishes to generalize. Fisher called the population of ultimate interest the population of the "imagination" to emphasize this indefiniteness (I have to pin down that reference). Given indefiniteness, empirical distributions had to be of some value for a scientist. So if he fertilized one plot and compared it to the natural plot, a rich harvest that was improbable for the natural plot told him he had a finding of value (I am leaving out the Latin square design to stick with a simple explanation). The natural plot served as a benchmark for the fertilized plot, but he wanted to explain that his manipulations were of greater generalizability than on the experimental farm he worked on. By the way, the term fiducial as a benchmark is or was used in physics, astronomy and biology (work with a microscope) as well as in surveying. Thus in my reading of Fisher, he used his local samples to conjecture roughly about samples in other years, areas, or conditions. Exactness would come later and would not come through statistics. Inferential statistics and measurement are two different processes. The analogy I use is: a geologist picks up a rock, looks for mineralization (this is the statistical test), and if there is something, he grinds it up for measurement (a second, distinct process).
Jochen: according to Bayesian dogma, all the information is in the likelihood function, so nothing else has more information. Contrast this with frequentist methods, in which there is also information in the sampling methodology. E.g. unbiased estimators for the binomial and negative binomial disagree even though they have the same likelihood (Bayesians don't accept unbiasedness, or any integration over the sample space, as a valid criterion, though).
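A small worked sketch of that binomial versus negative binomial point (my own example; the data, 3 successes in 12 trials, are invented):

# The two sampling plans give proportional likelihoods for the same
# data, yet the standard unbiased estimators of theta disagree.
import numpy as np
from scipy.special import comb

n, x = 12, 3
theta = np.linspace(0.05, 0.95, 19)

lik_binom = comb(n, x) * theta**x * (1 - theta)**(n - x)           # n fixed
lik_negbin = comb(n - 1, x - 1) * theta**x * (1 - theta)**(n - x)  # x fixed

print(np.allclose(lik_binom / lik_negbin,
                  comb(n, x) / comb(n - 1, x - 1)))  # True: same shape

print(x / n)              # unbiased under binomial sampling: 0.25
print((x - 1) / (n - 1))  # unbiased under negative binomial: ~0.18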
Rather than adding information, I think confidence distributions might help us to interpret the information. In particular it gives a probability interpretation, which is, unfortunately, not unique except in those one parameter cases when the fiducial method is unique. However, confidence distributions are sinful from a Bayesian point of view as they do involve integration over the sample space.
One problem, maybe the main problem, with the likelihood function for continuous distributions is that it represents a probability density rather than a probability. In pathological cases, the density can have narrow peaks which the data only has a small chance to hit (admittedly, this is unlikely in practice as soon as there are more than two observations, but a sound method should not have exceptions). Integrating the density smoothes it.
In the book 'Likelihood', A.W.F. Edwards says that he has trouble explaining that his approach to inference using the likelihood function alone (no frequentist or Bayesian probabilities) is not just 'maximum likelihood'. With one observation from the exponential distribution, the maximum likelihood estimator for the mean is zero regardless of the value of the observation. Looking at the whole likelihood function tells us how concentrated it is near zero. Confidence distributions do the same thing but also allow a probability interpretation. (On the other hand, Edwards was trying to avoid both frequentist and Bayesian approaches, so this is not an interpretation he would like.)
I should say that I'm not trying to push for any particular approach. I just want to understand.
This part I don't understand:
With one observation from the exponential distribution, the maximum likelihood estimator for the mean is zero regardless of the value of the observation.
The density of the exponential distribution with rate parameter "r" is
r*exp(-r*x)
So this is also the likelihood of a single observation. The log-likelihood is
log ( r*exp(-r*x) ) = log(r) -r*x
Its derivative for the parameter r is
1/r - x
Setting this equal to zero and solving for r (to get the r with maximum likelihood):
1/r = x, i.e. r = 1/x
So 1/x is the maximum likelihood estimate of r from a single value. It depends on x and is greater than 0 for any x.
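For completeness, the same derivation can be checked symbolically (a quick sketch, assuming sympy is available):

# Symbolic check of the exponential MLE derivation above.
import sympy as sp

r, x = sp.symbols('r x', positive=True)
loglik = sp.log(r * sp.exp(-r * x))        # log-likelihood of one observation

rhat = sp.solve(sp.diff(loglik, r), r)     # set the derivative to zero
print(rhat)                                # [1/x], as derived above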
Of course, you are right, the exponential distribution is well behaved. Although I was talking about the mean, 1/r, the same applies. I visualised the density without bothering to do the algebra. There are some cases where ML estimators are bad, but the exponential isn't one of them. Sorry.
Dear Jochen,
Yes, we can say that likelihood functions are as closely related to confidence distributions as fiducial distributions are, and that a confidence distribution may be derived asymptotically from a likelihood function.
However, the likelihood function is generally different from the confidence distribution and has a different shape. Moreover, a given confidence distribution may relate to many different likelihood functions, depending on the sampling distribution behind the confidence distribution, and a sensible confidence distribution may not always exist. And finally, a simultaneous confidence distribution for multiple parameters can be difficult to define.
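To make the comparison concrete, here is a minimal sketch of a confidence distribution in the simplest case, the normal mean with known variance (my own illustration with invented data; in this location model the CD density, the normalized likelihood and the fiducial distribution all coincide, which is exactly why the differences listed above only show up in less tidy models):

# Confidence distribution for a normal mean, sigma = 1 assumed known.
import numpy as np
from scipy import stats

x = np.array([9.8, 10.4, 10.1, 9.6, 10.6])   # hypothetical sample
n, xbar = len(x), x.mean()

def cd(theta):
    # C(theta) = Phi(sqrt(n) * (theta - xbar)): the confidence level at
    # which theta is the upper limit of a one-sided interval.
    return stats.norm.cdf(np.sqrt(n) * (theta - xbar))

# Its quantiles reproduce the usual confidence limits, e.g. the
# 97.5% upper limit:
upper = xbar + 1.96 / np.sqrt(n)
print(cd(upper))   # ~0.975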
All this debate is more or less about "objective" (or default) priors viewed through the eyes of a frequentist. From what I understood, confidence distributions, like fiducial ones, have serious drawbacks, as do matching priors, especially due to the difficulties encountered in high dimensions. What to do? One may tackle the problem within the Bayesian framework via e.g. Jeffreys, reference, Haar etc. priors (see the catalog by Yang & Berger).
In that respect, one can have a look at the special issue (2011, No 26, 2) of Statistical Science entitled "On Bayesian Methods that Frequentists should know", with a very comprehensive review on "Objective priors: an introduction for frequentists" by Malay Ghosh and rejoinders by J.M. Bernardo and T. Sweeting.
Dear Jean-Louis,
Thank you for your answer. When I asked this question on fiducial inference, I did not want a debate centred on frequentism but a more general one on statistical inference. I am very happy to have your point of view in this debate.
I appreciated the issue of Statistical Science (2011) on Bayesian methods that frequentists should know. As I already told you, I am currently using Bayesian methods in demography, and I agree with many points of this special issue. However, I do not know how it may solve our question: can Fisher's controversial idea of fiducial inference, from the 20th century, be accepted by the statistical community in the 21st century? Ghosh's paper on objective priors did not even cite Fisher. So do you conclude, from a Bayesian point of view, that such a question is without interest in the 21st century, even if Fisher said that he was doing the same thing that Bayes had done in the 18th century?
Dear Michael,
Yes, I also appreciate very much this exchange of ideas on fiducial inference among researchers working in different fields and having different ideas on scientific issues. I also agree with you that the situation was clearer when, working on card playing, probabilists could express exact probabilities; in most scientific explanations this is no longer the case. As a demographer, when I use exhaustive census data I can use frequentist probabilities without problems, but when I work on paleodemography with few observed skulls I have to use Bayesian probabilities.
Re Terry Moore's comment, for examples of "bad" (e.g. inconsistent) maximum likelihood estimators, see
http://www.stat.washington.edu/jaw/RESEARCH/TALKS/talk-gent.pdf
(J.A. Wellner, University of Washington)
Thank you Michael, the examples are excellent. I was only thinking of MLEs that were silly for small samples, but inconsistency is far worse.
Daniel: I remember a seminar John Tukey gave in 1975 on 'gathering strength'. This has since become a common term for using information from sources other than the data at hand. He mentioned an archaeology textbook that discussed inferences about a species from a sample of size 1 (e.g. a single skull).
In general, a subjective Bayesian prior is essentially gathering strength, but frequentists can do the same; for example, they might guess the variability of a measurement on a skull of a species from the variability in other species.
I'm not sure that we need a probabilistic interpretation for specific intervals. A confidence coefficient, c, is a measure of the strength of our knowledge that the parameter lies in the specific confidence set found. There is nothing wrong with interpreting c as the probability that the parameter lies in the set, so long as 'probability' means 'degree of belief based on the data'. The main problem with this is that confidence sets are not unique without further conditions.
Terry,
but frequentists can do the same, for example, they might guess the variability of a measurement on a skull of a species from variablity in other species.
To my understanding this is deeply Bayesian. A prior belief is required to assume that the variability will be similar, that the problem is transferable. A Bayesian might try to formalize these beliefs, making them explicit. But a person who uses such beliefs implicitly is not acting in a frequentist sense.
The problem here might be that almost all people/scientists who are not even aware of the different philosophies of probability and the controversy discussed here would act just like that. Every person with a bit of common sense would do so. This is the inherited and natural way of thinking and of using information. But it is not "frequentist", unlike the use of p-values to judge effects and to control error rates (the things most of those people are concerned with most of the time when they try to publish their results).
You make a good point, Jochen: judgements made about the parameters of a model are not frequentist. But the same could be said of the model itself. Apart from people who use the normal distribution because they think it's the proper thing to do, we base our models on past experience of similar situations (or those we judge to be similar). This is 'gathering strength'. (Sometimes there are also theoretical models, but it is a matter of judgement how well the theory applies to the situation at hand.) I reserve the term 'Bayesian' for choosing a prior distribution for the parameters rather than for some order-of-magnitude judgement about a parameter. But that's just terminology; what we do is more important than what we call it. With this definition, empirical Bayes is not Bayesian because it leaves parameters in the prior to be estimated from the data.
It seems that the division between the various philosophies is not as strict as people pretend. Even Lindley remarked that frequentist methods (which he considered logically unsound) usually gave sensible answers, not too different from Bayesian methods.
Dear all,
From all the previous discussion it seems to me that the main question is to find the best prior for the problem studied or, more precisely, a prior-free formulation of the problem. As I said in the text presenting this question in more detail, Fisher's fiducial inference was an effort to develop prior-free probabilistic inference, as were the Dempster-Shafer theory, Fraser's structural inference, Chiang's and Weerahandi's generalized p-values, Hannig's generalized inference, Xie and Singh's confidence distributions, and so on.
Ryan Martin and Chuanhai Liu have recently shown that fiducial inference and its variants are in fact not prior-free, and they are proposing a new paradigm called inferential models (IMs). See the references given below, which may be found on ResearchGate under these authors. Do you think, as I do, that their proposal seems very promising for solving our problem, and do you have any comments on their papers?
References
Martin, R., Liu, C. (2014). Discussion: foundations of statistical inference, revisited. Statistical Science, 29 (2), 247-251.
Liu, C., Martin, R. (2014). Frameworks for prior-free probabilistic inference. arXiv: 1407.8255v1 (math.ST).
Liu, C., Martin, R. (2015). Inferential models: Reasoning with uncertainty. Chapman & Hall. In preparation.
I have too little time to promptly read (and understand) these papers. So for now I can only give a quick prejudice (my subjective Bayesian prior, so to say); maybe it is helpful:
There ain't no such thing as a free lunch
(ref. http://en.wikipedia.org/wiki/There_ain%27t_no_such_thing_as_a_free_lunch)
I think that data is like a force*, whereas opinion** is like a position. A force will never tell you where you are; it only indicates how you will move, relative to your current position.
* A better allegory would be a vectorized force field. The more data you have, the more of this field is known. Knowing very much about the entire field leaves little space for really different opinions. The better you know the field, the less relevant your initial position becomes.
** We can only have opinions about things we do not observe. Factual knowledge exists only about facts, that is, about data. However, opinions can be more or less founded on data, so knowing the same data will cause very similar opinions. The key aspects of these different opinions that are essentially identical could be termed (inter-subjective or quasi-factual) knowledge.
Our connection to the world (reality, existence, whatever you call it) is necessarily based on opinions, on models we make to structure "data". There is no way around it, no way out. We can "know" only data; all the rest is necessarily only model and opinion. If the way data modifies opinions is based on some consistent rule (like "reasonability", which would need to be further defined...), then a growing amount of knowledge (data) will inevitably lead to a relative convergence of opinions.
As Terry said: already the selection of the model (on which any further frequentist analysis is based) is a "Bayesian act", and finally just another opinion that enters the mathematical world to be exposed to data...
[Edit: I confused "Terry" and "Bruce" in the last paragraph; dunno how it could happen. Sorry Terry, I corrected it]
Dear Jochen,
I don't think that the inferential models proposed by Martin and Liu offer us a free lunch; on the contrary, they offer a very thorough reflection on the questions of probabilistic inference we are discussing in this group. I think that you will have to go through these papers, even if you have little time to do so, in order to see how they tackle these problems, particularly the distinction between frequentist and Bayesian probability.
Your distinction between "data" and "opinion" is interesting, and I would like to add a little more about it. You mainly consider how opinions can be more or less founded on data and how data modifies opinions. While I agree with this interpretation, I think that it may lead as well to Baconian induction as to Baconian idols: see our recent paper on the Baconian idols (2014). So something more must be added to "data" in order to induce scientific thinking. You have to infer from these data the formal structure which is implied by their properties. This is no longer "data" nor "opinion" but a way of being able to infer a theory.
Reference
Courgeau D., Bijak J., Franck R., Silverman E. (2014). Are the four Baconian idols still alive in demography? Quetelet Journal, 2 (2), 31-59.
Thanks very much, Daniel, for your detailed reply to my comments and to other ones. You said: "So do you conclude, from a Bayesian point of view, that such a question is without interest in the 21st century, even if Fisher said that he was doing the same thing that Bayes had done in the 18th century?"
I must have missed something, but I do not see how Fisher could have done the same thing as Bayes. Bayes did not directly choose a uniform prior on p (the probability of the event). If you read his scholium (just after Proposition 9), you see that Bayes relied on assuming that "in a certain number of trials (n), it (the event) should rather happen any possible number of times than another". In modern words, he based his reasoning on what we now call the prior predictive distribution, assuming that the probability of the number of successes Y = k in n trials is the same for k = 0, 1, ..., n. This is exactly what happens if the prior on p is uniform, i.e. Pr(Y = k) = 1/(n+1), the reciprocal being true. So Bayes was doing a kind of thought experiment based on hypothetical data that could be observed using a certain design and machinery. The assumption of equiprobability at this level is not free of criticism (see e.g. Stigler) but is definitively different from what most people think and do, probably including Fisher.
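A quick simulation sketch of this scholium equivalence (my own check; n = 10 is an arbitrary choice): with a uniform prior on p, the prior predictive distribution of Y is itself uniform on {0, 1, ..., n}.

# Prior predictive of Y under a uniform prior on p.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 10, 200_000

p = rng.uniform(0, 1, size=reps)        # p drawn from the uniform prior
y = rng.binomial(n, p)                  # Y | p ~ Binomial(n, p)

# Each count 0..n appears with probability ~ 1/(n+1) ~ 0.0909:
print(np.bincount(y, minlength=n + 1) / reps)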
Dear Jean-Louis,
Thank you for your last answer to my comments. I agree with you that there are some points of disagreement between Fisher and the Bayesian approach. However, as Jeffreys stated in 1939: “The apparent differences have been much exaggerated owing to a rather unfortunate discussion some years ago, which was full of misunderstanding on both sides. Fisher thought that a prior probability based on ignorance was meant to be a statement of a known frequency, whereas it was meant merely to be a formal way of stating that ignorance, and I had been insisting for several years that no probability is simply a frequency. […] I have in fact been struck repeatedly in my own work, after being led on general principles to a solution of a problem, to find that Fisher had already grasped the essential by some brilliant piece of common sense, and that his results would be either identical with mine or would differ only in cases where we should both be doubtful”. Similarly, speaking about Fisher’s fiducial argument, he wrote: “My only criticism of both his arguments and ‘Student’s’ is that they omit important steps, which need considerable elaboration, and that when these are given the arguments are much longer than those got by introducing the prior probability to express previous ignorance at the start.”
And Barnard wrote about Bayes and Fisher (1987): “As to the identity of usage of the term ‘probability’ by Bayes and by Fisher, of course all that any one can show is, that in the texts as we have them there is no clear difference. […] Nowadays the term ‘Bayesian’ is being applied to many who deny the possibility of any probabilistic assertions other than of a purely personal kind. It is clear that neither Bayes nor Fisher took this view”.
Can we conclude that Bayes, Fisher, Jeffreys and Barnard were all adherents of a 'logical probability', and against a 'subjective probability' as proposed by de Finetti or Savage?
References
Barnard G.A. (1987). R.A. Fisher – a true Bayesian? International Statistical Review, 55 (2), 183-189.
Jeffreys H. (1939). Theory of probability. Oxford University Press.
Dear Daniel, thanks for your reply and additional comments on the Bayes/Fisher controversy, which seems to have been exaggerated, at least from the point of view of the "objective" Bayesians. Maybe you could add to this camp Edwin Jaynes, with his principle of maximum entropy, and Bernardo & Berger, with their theory of reference priors.
At a more practical level, I would act using both types of priors (objective and subjective). It depends on the kind of problem addressed, the structure of the data, what we know, what kinds of questions we are faced with and what types of models we could implement. I am inclined to use historical data as much as possible to build priors, especially in hierarchical models at their first stages, and default priors for the hyperparameters of the last stages.
Coming back to my previous comments on Bayes vs Fisher, I view predictive distributions, either prior or posterior ones, as major tools offered by Bayesian statistics to practitioners, compared with those of the classical school. This is especially true in model comparison and validation: see e.g. the works by Gelman, Plummer and Watanabe.
Dear Jean-Louis,
I am happy to see that we agree that there are two major Bayesian points of view. The first one, which you call "objective" and which I prefer to call "logical", was followed by Laplace, Jeffreys, Jaynes, etc. The second one, which we call "subjective", was followed by de Finetti, Savage, etc. The "classical school", which I call "frequentist" or "objectivist", was followed by Kolmogorov, von Mises, etc. As Efron (1998) said: "Fisher's philosophy is characterized as a series of shrewd compromises between the Bayesian and frequentist viewpoints…".
However, as you say that at a more practical level you would act using both "objective" and "subjective" priors, you seem to think that the differences between the two Bayesian schools are not important. For my part, however, I agree with Jaynes (2003) when he said: "if any rules were found to possess the property of coherence in the sense of de Finetti, but not the property of consistency in the sense of Cox, they would be clearly unacceptable – indeed functionally unusable – as rules for logical inference".
References:
Efron B. (1998). R.A. Fisher in the 21st century. Statistical Science, 13 (2), 95-122.
Jaynes E.T. (2003). Probability theory. The logic of science. Cambridge University Press.
I have been quite uncertain about joining this discussion among highly competent professional statisticians, because I am not one. I am (or have been) a metrologist, now retired, for whom statistics (in the broad sense) is vital for high-precision data treatment.
However, since I am presently assembling a paper about the different meanings of probability in exercises like throwing dice (or playing cards) and in experimental science, I hope to have more light shed on the issue by this debate.
First, be aware that a comparison between the frequentist, Bayesian and fiducial approaches was made a few years ago by NIST (USA), in a chapter of the multiauthor book that you will find in the references below. It was later also transformed into an ISO Technical Report.
The issue is that many people are uncomfortable with the dispute between frequentists and Bayesians, where both try to prevail as the single frame, good for all seasons.
Personally, from my long experience in a field where we try to get the maximum information out of a relatively small number of (very costly) experiments, but "without torturing measurement results until they confess" (see De Bièvre in the references), I have formed an opinion against the possibility of true objectivity (see Pavese & De Bièvre below); you may also look at my most popular paper on ResearchGate in the references below.
This does not mean that there is no difference between data and opinions (Jochen): however, data not only are obviously all uncertain, but are also 'occasional' and not necessarily 'representative' in many respects.
One thing that I can state, as basically an experimentalist, is that data can be biased in many respects, and that they are almost never directly the instrumental indication; before the analyses, the data undergo many steps of 'torturing' that are basically subjective, even in non-Bayesian treatments. Good measurement skill is still an art.
So data are not necessarily a fair sample of anything: we use them out of necessity, as the only available evidence. This is why it is important to avoid omitting any piece of information. In this respect, I certainly do not agree with frequentists, should they disagree about also taking into account previous knowledge on the same or a similar issue. This is vital at least in metrology, where the back history of any standard is valuable, and the population is assumed to be the same until evidence to the contrary is gained. However, I do not consider this as necessarily being Bayesian behaviour, nor as needing a Bayesian treatment: in most cases, a mixed-effects analysis is sufficient for the purpose of analysing pooled data. Analysing pooled data is different from assuming an a priori probability distribution (incidentally, why only the probability frame?).
Saying that the simple modelling of the experiment is already a Bayesian feature is, in my opinion, an abuse. I may admit that Bayes first indicated the need to take all current knowledge into account. However, after the choice of the prior, the full Bayesian method is a standard engine leading to the posterior, 'too simple to be always true' ("there ain't no such thing as a free lunch" - Jochen).
We really need to come out of a biased dispute in which there are no winners.
Fisher was supposed to be proposing one of these new routes: others would be welcome, and not only about the stochastic part of measurement uncertainty. Even more, we need, in my opinion, a rethinking of that part, often prevailing, concerning systematic effects leading to systematic errors. That part is sometimes labelled epistemic uncertainty or, better, epistemic and ontological uncertainty. In my opinion, this is only partially sufficient. There is still another part of systematic effects: the one where the expectations of random variables are what is called 'bias', which in experimental science must be 'corrected' according to a universally adopted procedure dating back to Gauss - not necessarily the best way out.
References
W. F. Guthrie, H-K Liu, A. L. Rukhin, B. Toman, J. C. M. Wang, Nien-fan Zhang, Three Statistical Paradigms for the Assessment and Interpretation of Measurement Uncertainty, in Advances in data modeling for measurements in metrology and testing (F. Pavese and A.B. Forbes, Editors), Series Modeling and Simulation in Science, Editor N. Bellomo, Birkhauser-Springer, Boston, 2009. ISBN: 978-0-8176-4592-2 (print) 978-0-8176-4804-6 (ebook), with 1,5 GB additional material in DVD, pp. 71–116
P. De Bièvre, Measurement results should not be tortured until they confess, Accred Qual Assur (2010) 15:601–602
F. Pavese and P. De Bièvre: “Fostering diversity of thought in measurement”, in Advanced Mathematical and Computational Tools in Metrology and Testing X, vol.10 (F. Pavese, W. Bremser, A.G. Chunovkina, N. Fischer, A.B. Forbes, Eds.), Series on Advances in Mathematics for Applied Sciences vol 86, World Scientific, Singapore, 2015, pp 1–8. ISBN: 978-981-4678-61-2, ISBN: 978-981-4678-62-9(ebook) (slides available for downloading)
F. Pavese, Subjectively vs objectively-based uncertainty evaluation in metrology and testing, IMEKO TC1-TC7 Workshop, 2008 (available for downloading)
Dear Franco,
Sorry for this delayed answer to your very interesting contribution; it was due to some urgent work I had to finish and to my wish to read the papers you cited before answering you.
First, I am very happy to have the contribution of a metrologist to this important debate. Second, I see many convergent points between our thoughts.
In the paper by Guthrie et al., I am interested in their use of the term "paradigm" in statistical thinking. These paradigms are taken in a sense different from Kuhn's (1962), for which Masterman (1970) identified 21 different meanings, but, I think, in the sense of Granger (1994), who addresses the following question: how does one move from the experienced phenomena to the scientific object? For him, "the complex life experience grasped in sensitive things has become the object of mechanics and physics, for example, when the idea was conceived of reducing it to an abstract model, initially comprising only spatiality, time and resistance to motion." And he recognizes that the content of this object is not explicitly and broadly defined at the outset. I took such a definition for the different paradigms in demography and probability (Courgeau, 2012), and it seems to me that Guthrie et al. take a similar definition for their statistical paradigms.
Similarly, in your paper with De Bièvre you said that "Also in science, 'diversity' is not always synonym of 'confusion', a popular term used to contrast it, rather is an invaluable additional resource leading to a better understanding". This recalls another sentence by Granger: "True, the human fact can indeed be scientifically understood only through multiple angles of vision, but on condition that we discover the controllable operation that uses these angles to recreate the fact stereoscopically". This is also consistent with the conclusion of Guthrie et al.: "The existence of different paradigms for uncertainty assessment that do not always agree might be seen as an unfortunate complication by some. However, we feel it is better seen as an indication of further opportunity. It is only by continually working together to appreciate the features of different paradigms that we will arrive at methods for uncertainty assessment that meet all of our scientific and economic needs".
References
Courgeau, D. (2012). Probability and social science. Springer.
Granger, G.-G. (1994). Formes, opérations, objets. Librairie Philosophique Vrin.
Kuhn, T. (1962). The structure of scientific revolutions. The University of Chicago Press.
Masterman, M. (1970). The nature of a paradigm. In Lakatos and Musgrave, eds., Criticism and the growth of knowledge, Cambridge University Press.
To all the followers of this question,
I have just read the book by Tore Schweder and Nils Lid Hjort, Confidence, Likelihood, Probability: Statistical Inference with Confidence Distributions (2016), published by Cambridge University Press, and I think that it answers this question on fiducial probability in a very interesting way.
They propose an epistemic probability understood as confidence distributions. They have developed and investigated such an approach free from the philosophical, mathematical and practical difficulties of putting up prior probabilities for unknown parameters. They give a detailed view of Fisher's fiducial argument and discuss the big debate over it: "Despite the potential problems with multivariate fiducial distributions, their marginals are often exact or approximate confidence distributions."
However, they do not give an axiomatic theory for epistemic probability understood as confidence distribution, and they say that the fiducial debate showed that Fisher was wrong to assume that Kolmogorov's axioms apply to fiducial probability.
As the authors of this book are ResearchGate members, I will be happy to have their advice on these questions, and evidently I will be happy to have the reactions of the followers of this question to this very interesting book.