Let me rephrase this.

My central question is: what is de Finetti's theorem, as opposed to the associated convergence results, and which convergence result is deemed the central "de Finetti law of large numbers"?

I presume that de Finetti's theorem is merely the claim that an exchangeable subjective prior probability distribution over sequences of outcomes, expressed as

P(x_1, x_2, x_3, ...),

that is not independent can be expressed as a probability over IID probability hypotheses, that is, hypotheses on which the outcomes are conditionally independent. It is this, and not the associated convergence results, that makes up what is called the de Finetti representation theorem.

For instance, for a particular sequence with r heads and n - r tails,

P = integral (from 0 to 1) { K^r * (1 - K)^(n - r) dQ(K) },

where Q(K) is the distribution function over the parameter K (the "probability of probabilities": the probability assigned to any given value of K). Is this the content of the representation theorem, namely that the prior over a sequence, or rather the joint distribution, can be given this decomposition in terms of the integral? Is this correct?
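To make sure I am reading the decomposition correctly, here is a minimal numerical sketch in Python; the coin example and the uniform choice of Q are my own illustrative assumptions, not anything in de Finetti or Gillies. It computes P(sequence) as a mixture of IID Bernoulli(K) laws and checks that the answer depends only on the number of heads, i.e., that the mixture is exchangeable.

import math

def p_sequence(seq, grid=10000):
    # P(seq) = integral_0^1 K^r (1-K)^(n-r) dQ(K), with Q uniform on [0, 1]
    r, n = sum(seq), len(seq)
    total = 0.0
    for i in range(grid):            # midpoint rule for the mixing integral
        k = (i + 0.5) / grid
        total += k**r * (1 - k)**(n - r)
    return total / grid

print(p_sequence([1, 1, 0]))   # H H T
print(p_sequence([0, 1, 1]))   # T H H: same value, as exchangeability requires
# With uniform Q the exact value is r!(n-r)!/(n+1)! = 2!*1!/4! = 1/12
print(math.factorial(2) * math.factorial(1) / math.factorial(4))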

Is the whole point of this enterprise simply that a subjective prior that is exchangeable can be expressed as a probability over objective probabilities on which the data are conditionally independent? And is this so because, in order to use Bayes' rule, a probability of probabilities is required, not just a prior probability for the outcome (if we have only a prior probability for the outcome, we can run into the problem of old evidence, or of independence, where no data alters the probability)? By allowing an exchangeable prior, one that is not independent and that can be decomposed as a probability of probabilities, the subjective Bayesian of de Finetti's sort, who rejects the concept of probabilities over probabilities, can make use of the results of classical statistics, that is, can act as if there were such a probability over probabilities.

In the subjective Bayesian paradigm, probabilities of probabilities are not really part of the picture, yet subjectivists need some way of expressing their prior so that it can learn from experience, i.e., via probabilities of probabilities in Bayes' theorem.
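To pin down what "learning via probabilities of probabilities" means here (this is just standard Bayesian updating of the mixture, stated in the notation above): conditioning on a sequence with r heads in n tosses updates the mixing measure itself,

Q(K | data) proportional to K^r * (1 - K)^(n - r) * Q(K),

and the predictive probability for the next toss is then the expected value of K under this updated Q. It is Q, the probability of probabilities, that does the learning.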

This, I presume, is de Finetti's theorem proper, as opposed to his strong-law result. The convergence-of-opinion results are often called "de Finetti's theorem" in the literature, but in reality they should only be called corollaries of this result (the result being the decomposition into a probability of probabilities). Is this correct? Unfortunately, both the decomposition result and the convergence results that follow from it are often given the same name.

One such convergence result is (A): if the limiting relative frequency in the data is r/n, then the posterior distribution will converge to Q(K = r/n) = 1. The problem here is that K is a chance hypothesis, so I presume the subjective Bayesian must read K as itself a subjective credence, and one can then derive that the posterior credence equals r/n by the law of total probability, with K read as a subjective posterior credence. (Otherwise one has to invoke David Lewis's Principal Principle, which forges a link between chances and credences, and the purely subjective Bayesian then loses the ability to express events purely in terms of credences, having to make use of the notion of chance, which they typically reject.) Is this what the subjective Bayesian has in mind: that a prior probability can be expressed as a subjective credence over subjective credences, so that with subjective credence 1 the posterior subjective credence is r/n, which simply collapses to posterior subjective credence = r/n?
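To illustrate reading (A) numerically, here is a minimal sketch; the simulated coin with bias 0.7 and the uniform grid prior over K are my own illustrative assumptions. The posterior over K concentrates around the sample relative frequency r/n as n grows.

import math, random

random.seed(0)
TRUE_K = 0.7                     # bias of the simulated coin (illustration only)
data = [1 if random.random() < TRUE_K else 0 for _ in range(5000)]

GRID = [(i + 0.5) / 1000 for i in range(1000)]   # discretised K, uniform prior Q

def posterior(obs):
    # Normalised posterior over the grid of K values, computed in log space
    r, n = sum(obs), len(obs)
    logw = [r * math.log(k) + (n - r) * math.log(1 - k) for k in GRID]
    m = max(logw)
    w = [math.exp(l - m) for l in logw]
    s = sum(w)
    return [x / s for x in w]

for n in (10, 100, 5000):
    post = posterior(data[:n])
    mean = sum(k * p for k, p in zip(GRID, post))
    r = sum(data[:n])
    print(n, round(r / n, 4), round(mean, 4))
# The posterior mass piles up at the sample frequency r/n, as in reading (A).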

OR

Another result, (B), that I see listed (see the discussion in Gillies, 'Philosophical Theories of Probability', starting at page 70), is that one can prove directly from the exchangeable prior that P(x_{n+1} = heads) = (r + 1)/(n + 1) as n goes to infinity (and so P(x_{n+1} = heads) tends to r/n), assuming that the prior probabilities P'(any sequence with r heads in n trials) = x and P'(a sequence with r + 1 heads in n + 1 trials) = y are such that C = x/y tends to 1 as n goes to infinity.
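Since this bears directly on whether (B) follows from (A), here is a minimal sketch comparing the two routes, with a uniform Q as my own illustrative assumption (for concreteness the exchangeable joint is generated from that Q, but route (B) only ever consults the joint sequence probabilities themselves). Both routes return the same predictive value, which for uniform Q comes out as (r + 1)/(n + 2) (Laplace's rule of succession) and tends to the relative frequency r/n.

from math import factorial

def p_seq(r, n):
    # P of one specific sequence with r heads in n tosses, uniform mixing Q:
    # integral_0^1 K^r (1-K)^(n-r) dK = r!(n-r)!/(n+1)!
    return factorial(r) * factorial(n - r) / factorial(n + 1)

def predictive_via_B(r, n):
    # Route (B): a ratio of exchangeable joint probabilities; K never appears
    return p_seq(r + 1, n + 1) / p_seq(r, n)

def predictive_via_A(r, n, grid=100000):
    # Route (A): the posterior expectation of K given r heads in n tosses
    num = den = 0.0
    for i in range(grid):
        k = (i + 0.5) / grid
        w = k**r * (1 - k)**(n - r)      # posterior weight, up to normalisation
        num += k * w
        den += w
    return num / den

for r, n in [(3, 10), (30, 100), (300, 1000)]:
    print(n, r / n, predictive_via_B(r, n), round(predictive_via_A(r, n), 6))
# Both columns agree: the predictive value is (r+1)/(n+2) and tends to r/n.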

Is this result (B) distinct from (A)? Does it follow from (A), or the other way around? And is there a reason that de Finetti convergence is rarely expressed in format (B)? Is it because of assumption C above, or because it does not follow from the de Finetti representation theorem? That is, are the representation theorem and the result (A) that follows from it supposed to put on solid ground the more cursory result (B), which is not as formally secure given assumptions such as C?

In other words, is (B) a different result from the convergence method (A) above? (B) makes no use of probabilities of probabilities (the representation theorem), yet it is the one that is rarely cited. It would appear to be de Finetti's own preferred reading, and perhaps (B) follows as a consequence of (A). I presume this would be the subjective Bayesian's preferred view (otherwise the subjective Bayesian would have to interpret the probability of probabilities as a subjective probability of subjective probabilities), and it would serve the entire point of not needing probabilities of probabilities. It delivers the posterior probability for the outcome directly, without using a probability of probabilities, simply by using the exchangeable prior (so one merely needs a prior probability for each outcome sequence, not a probability over probability hypotheses, let alone the extra assumption that these hypotheses are themselves subjectivist in order to make them mesh in Bayesian terms).

However, (B) does not phrase convergence in terms of IID random variables (or conditionally independent random variables) or hypotheses, and perhaps this is why it is not favoured. By this I mean that it tells us the posterior probability P(x_{n+1} = heads) must tend to r/n as n goes to infinity, but it gives no general result for P(x = heads). I think this is a martingale result, presumably showing that for every later trial one gets the same posterior (a small check of this appears below), and that might be its advantage; otherwise it assigns a different probability to each outcome variable (for any event that has already occurred, x_{n-4} for instance, if it came up heads, then P(x_{n-4} = heads) = 1). Is this one of the problems, and why formulation (A) is often preferred? Or is formulation (A) just a rigorous proof of result (B)?

I presume that result (B) is only to be used for x_{n+1} when the evidence concerns all and only the trials up to x_n, and that the posterior probability it gives, P(x_{n+1} | x_n, x_{n-1}, ...), equals the expected value of K under the posterior in (A) above after n trials; likewise the posterior probability it gives for P(x_3 | x_2, x_1) will be the expected posterior value of K in (A) given trials x_1 and x_2. Is this correct? (By expected value I mean the weighted average across all values of K, weighted by the posterior probability of each such K given the evidence at that point.)
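On that martingale conjecture, here is a minimal exact check, again under my illustrative uniform choice of Q (where the predictive probability is (r + 1)/(n + 2)): the expected value of the next predictive probability, computed under the current one, equals the current predictive probability, which is exactly the martingale property of the sequence of predictive credences.

from fractions import Fraction

def pred(r, n):
    # P(next toss = heads | r heads in n tosses), uniform mixing Q (Laplace's rule)
    return Fraction(r + 1, n + 2)

def expected_next_pred(r, n):
    # E[next predictive | current data]: heads occurs with probability pred(r, n)
    p = pred(r, n)
    return p * pred(r + 1, n + 1) + (1 - p) * pred(r, n + 1)

for r, n in [(0, 0), (3, 10), (42, 100)]:
    assert expected_next_pred(r, n) == pred(r, n)   # exact, via Fraction arithmetic
print("the predictive probabilities form a martingale under the exchangeable joint")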
If this is the intended result, then why bother with the de Finetti mixture interpretation to begin with, and its associated convergence result (A), given that it invokes chances? And if it is not, then what does de Finetti's theorem prove, except that it almost defeats itself by presuming that there are objective probability hypotheses? Or is (A) supposed to be interpreted using a credence mixture of IID subjective credence distributions, filling in the gaps in the proof of (B) and removing some of (B)'s problems, including assumption (C) mentioned above, as well as the further problem that in (B) each variable has its own distinct credence value (i.e., Cr(x_{n+1} = a) is assigned a credence possibly distinct from Cr(x_n = a) and from Cr(x_{n-1} = a), insofar as there is no general hypothesis for Cr(x = a) in approach (B))?

Please tell me if this is correct, and what the motivation is for preferring one formulation over the other. And which of the two formulations is usually being referred to when people speak of de Finetti's convergence theorem, or de Finetti's law of large numbers?
