According to Fisher [1], “… probability and likelihood are quantities of an entirely different nature.” Edwards [2] stated, “… this [likelihood] function in no sense gives rise to a statistical distribution.” According to Edwards [2], the likelihood function supplies a natural order of preference among the possibilities under consideration. Consequently, the mode of a likelihood function corresponds to the most preferred parameter value for a given dataset. Therefore, Edwards’ Method of Support, or the method of maximum likelihood, is a likelihood-based inference procedure that uses only the mode for point estimation of unknown parameters; it does not use the entire likelihood curve [3]. In contrast, probability-based inference, whether frequentist or Bayesian, requires using the entire curve of the probability density function [3].
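As a small illustration of the point above (my own sketch, not from the references): for binomial data, maximum-likelihood estimation extracts only the mode of the likelihood curve and discards the rest of its shape.

```python
import numpy as np

# Illustrative sketch: for k = 7 successes in n = 10 trials, the likelihood
# of p is L(p) ∝ p^k (1-p)^(n-k).  The method of maximum likelihood uses only
# the mode of this curve (here p-hat = k/n = 0.7), not the entire curve.
k, n = 7, 10
p_grid = np.linspace(0.001, 0.999, 999)
likelihood = p_grid**k * (1 - p_grid)**(n - k)  # binomial coefficient omitted (constant in p)

p_mle = p_grid[np.argmax(likelihood)]  # the mode = maximum-likelihood estimate
print(round(p_mle, 2))  # → 0.7
```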

Bayes’ Theorem in continuous form combines the likelihood function and the prior distribution (PDF) to form the posterior distribution (PDF). That is,

posterior PDF ∝ likelihood function × prior PDF    (1)

In the absence of prior information, a flat prior should be used according to Jaynes’ maximum entropy principle. Equation (1) reduces to:

posterior PDF = standardized likelihood function (2)
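The numerical content of Eq. (2) can be checked directly (an illustrative sketch of my own, using a binomial likelihood on a grid): with a flat prior, the normalized posterior coincides with the standardized likelihood function.

```python
import numpy as np

# Illustrative sketch of Eq. (2): with a flat prior on a grid, the
# normalized posterior is numerically identical to the standardized
# (normalized) likelihood function.
k, n = 7, 10
p = np.linspace(0.001, 0.999, 999)
dp = p[1] - p[0]

likelihood = p**k * (1 - p)**(n - k)
flat_prior = np.ones_like(p)

posterior = likelihood * flat_prior
posterior /= posterior.sum() * dp                    # normalize to integrate to 1
std_likelihood = likelihood / (likelihood.sum() * dp)

print(np.allclose(posterior, std_likelihood))  # → True
```

Whether this numerical identity is a *valid* inference is precisely the question raised below.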

However, “… probability and likelihood are quantities of an entirely different nature” [1] and “… this [likelihood] function in no sense gives rise to a statistical distribution” [2]. Thus, Eq. (2) is invalid.

In fact, Eq. (1) is not the original Bayes Theorem in continuous form. It is called the “reformulated” Bayes Theorem by some authors in measurement science. According to Box and Tiao [4], the original Bayes Theorem in continuous form is merely a statement of conditional probability distribution, similar to the Bayes Theorem in discrete form. Furthermore, Eq. (1) violates “the principle of self-consistent operation” [3]. In my opinion, likelihood functions should not be mixed with probability density functions for statistical inference. A likelihood function is a distorted mirror of its probability density function counterpart; its use in Bayes Theorem may be the root cause of biased or incorrect inferences of the traditional Bayesian method [3]. I hope this discussion gets people thinking about this fundamental issue in Bayesian approaches.

References

[1] Fisher R A 1921 On the “Probable Error” of a coefficient of correlation deduced from a small sample Metron 1(4) 3–32

[2] Edwards A W F 1992 Likelihood (expanded edition) Johns Hopkins University Press, Baltimore

[3] Huang H 2022 A new modified Bayesian method for measurement uncertainty analysis and the unification of frequentist and Bayesian inference Journal of Probability and Statistical Science 20(1) 52–79 https://journals.uregina.ca/jpss/article/view/515

[4] Box G E P and Tiao G C 1992 Bayesian Inference in Statistical Analysis Wiley, New York
