I was getting some unexpected p-values when performing statistical tests in MATLAB (very small values such as p = 1e-40), so I decided to check the tests' behaviour on a synthetic dataset.
I ran the tests 100 times, each time generating a new pair of samples from N(0,1), as follows:
pd1 = makedist('Normal', 'mu', 0, 'sigma', 1); % standard normal N(0,1)
sample1 = random(pd1,100,1); % 100 draws from N(0,1)
sample2 = random(pd1,100,1); % independent second sample
p = kruskalwallis([sample1, sample2], [], 'off'); % Kruskal-Wallis test, display off
p = ranksum(sample1, sample2); % Wilcoxon rank-sum (Mann-Whitney U) test
p = signrank(sample1, sample2); % Wilcoxon signed-rank test (paired)
[h, pValue] = kstest2(sample1, sample2); % two-sample Kolmogorov-Smirnov test
sample1zero = sample1 - median(sample1); % centre on median for dispersion test
sample2zero = sample2 - median(sample2); % centre on median for dispersion test
[h, pValue] = ansaribradley(sample1zero, sample2zero); % Ansari-Bradley dispersion test
[H, pValue, SWstatistic] = swtest(sample1); % Shapiro-Wilk test (File Exchange function)
[H, pValue] = lillietest(sample1, 'MCTol',0.1); % Lilliefors test with Monte Carlo p-value
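For reference, the repetition over 100 runs can be sketched like this (the names nRuns, pKW and pRS are illustrative placeholders, and only two of the tests are shown; the others are collected the same way):

nRuns = 100;                                   % number of repetitions
pKW = zeros(nRuns, 1);                         % p-values from kruskalwallis
pRS = zeros(nRuns, 1);                         % p-values from ranksum
pd  = makedist('Normal', 'mu', 0, 'sigma', 1); % standard normal N(0,1)
for k = 1:nRuns
    s1 = random(pd, 100, 1);                   % fresh pair of samples each run
    s2 = random(pd, 100, 1);
    pKW(k) = kruskalwallis([s1, s2], [], 'off');
    pRS(k) = ranksum(s1, s2);
    % ...remaining tests collected the same way
end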
The results are shown in the graph below (image).
I expected the p-values to be high and consistent, since both samples come from the same N(0,1) distribution. However, for every test the p-values span the full range from 0 to 1, and the medians are around 0.5.
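These summaries can be reproduced from the collected vectors, e.g. for the rank-sum p-values (assuming the pRS vector from the sketch above):

median(pRS)            % median of the collected p-values, around 0.5 here
[min(pRS) max(pRS)]    % spans close to the full [0, 1] interval
histogram(pRS)         % inspect how the p-values are distributed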
Are these results correct? Is my methodology faulty, or are my expectations incorrect? What am I missing?