Hi again.

I have the following problem:

Let's say I have two discrete random variables X and Y, each with three possible values: a, b and c.

I would like to test whether, in a sample of 500 objects, the two variables X and Y are independent in a special way:

H_0:  Pr(X=x ^ Y=y) [...] Expected_ij.

In order to calculate p-values, I calculate the distribution of s(T') over a million random tables T', each with the same marginal frequencies as the observed table T.

To do so, I generate a random permutation of the values of X on the 500 objects and another random vector with the observed frequencies of Y, then calculate the contingency table T' and s(T'). I repeat this for a million tables T'.

Then, the distribution of s(T') is used to calculate p-values.
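
To make the procedure concrete, here is a minimal sketch in Python with NumPy. It assumes X and Y are coded as integers 0, 1, 2 (for a, b, c). The statistic is passed in as a function, and `positive_excess` is only a hypothetical stand-in for s(T), not necessarily the statistic I actually use. The function names and the default number of tables are also just illustrative.

```python
import numpy as np

def contingency_table(x, y, k=3):
    """k-by-k table of counts for integer-coded samples x and y (codes 0..k-1)."""
    table = np.zeros((k, k), dtype=int)
    np.add.at(table, (x, y), 1)  # add 1 to cell (x[i], y[i]) for every object i
    return table

def positive_excess(table):
    """Hypothetical stand-in for s(T): total excess of observed over
    expected counts, summed over cells where observed > expected."""
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    diff = table - expected
    return diff[diff > 0].sum()

def monte_carlo_p_value(x, y, stat, n_tables=10_000, seed=0):
    """Estimate Pr(s(T') >= s(T)) over random tables T' that keep the
    marginal frequencies of the observed table T (i.e. under H'_0)."""
    rng = np.random.default_rng(seed)
    r = stat(contingency_table(x, y))
    hits = 0
    for _ in range(n_tables):
        # Shuffling x and y independently keeps both margins fixed
        # and breaks any association between the two variables.
        t_prime = contingency_table(rng.permutation(x), rng.permutation(y))
        if stat(t_prime) >= r:
            hits += 1
    return hits / n_tables
```

Calling `monte_carlo_p_value(x, y, positive_excess)` on two integer arrays of length 500 returns the estimated p-value under H'_0. Permuting only one of the two vectors would give the same distribution of tables; both are permuted here only to mirror the description above.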

But I realize that the null hypothesis H'_0 for the Monte Carlo procedure is that X and Y are independent, which is not exactly what I need.

Fortunately, for a fixed r, the number of tables T' with s(T') >= r under H'_0: Pr(X=x ^ Y=y) = Pr(X=x) Pr(Y=y) is at least as large as the number of tables T' with s(T') >= r under H_0. That is, Pr(s(T') >= r | H'_0) >= Pr(s(T') >= r | H_0).

So, if I have a table T and I calculate s(T) = r, I know that if Pr(s(T') >= r | H'_0) is below a given threshold, then Pr(s(T') >= r | H_0) is below it too.
