I want to ask about the use of parametric and non-parametric tests when the sample size is enormous.

Let me describe a case for discussion:

- I have two groups of measurements of a continuous variable (say, pulse pressure, i.e. the difference between systolic and diastolic pressure at a given time): a) healthy individuals (50 subjects) and b) patients with hypertension (also 50 subjects).

- There are approx. 1000 samples of the measured variable from each subject; thus, we have 50 * 1000 = 50000 samples for group a) and the same number for group b).

My null hypothesis is that there is no difference in the distribution of the measured variable between the analysed groups.

I tried two different approaches, each giving me a p-value:

Option A:

- I took all samples from group a) and b) (so, 50000 samples vs 50000 samples),

- I checked normality in both groups using the Shapiro-Wilk test; neither distribution was normal,

- I used the Mann-Whitney test and found significant differences between distributions (p
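For concreteness, the steps of Option A can be sketched in Python with NumPy and SciPy roughly as follows. This is only an illustration on simulated data, not the real measurements: the gamma-shaped noise, the group centres (40 and 42 mmHg), the per-subject spread and the helper `simulate_group` are all placeholder assumptions chosen to produce skewed, subject-clustered values.

```python
import warnings

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N_SUBJECTS, N_SAMPLES = 50, 1000  # as in the case described above


def simulate_group(center):
    """Hypothetical data generator: each subject gets their own baseline,
    and the samples scatter around it with skewed (gamma) noise."""
    subject_baseline = rng.normal(center, 5.0, size=N_SUBJECTS)
    noise = rng.gamma(shape=2.0, scale=3.0, size=(N_SUBJECTS, N_SAMPLES))
    return subject_baseline[:, None] + noise


# Pool all samples of each group, ignoring which subject they came from.
healthy = simulate_group(40.0).ravel()        # 50000 values (assumed centre)
hypertensive = simulate_group(42.0).ravel()   # 50000 values (assumed centre)

# Shapiro-Wilk on each pooled group. SciPy warns that the p-value may be
# inaccurate for N > 5000, so the warning is silenced here on purpose.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    p_norm_healthy = stats.shapiro(healthy).pvalue
    p_norm_hypertensive = stats.shapiro(hypertensive).pvalue

# Two-sided Mann-Whitney U test on the pooled samples.
res = stats.mannwhitneyu(healthy, hypertensive, alternative="two-sided")

print("Shapiro-Wilk p (healthy):      ", p_norm_healthy)
print("Shapiro-Wilk p (hypertensive): ", p_norm_hypertensive)
print("Mann-Whitney U:", res.statistic, " p:", res.pvalue)
```

Note that pooling in this way treats the 1000 samples from one subject as if they were independent observations, which is exactly the point I would like to discuss.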
