Sometimes, in large samples, the data look well behaved, yet normality tests still reject normality. Some argue that parametric tests work fine in these cases because, with a large sample, the sampling distribution of the mean is approximately normal (the central limit theorem), so non-normality in the raw data matters less. Do you know of any studies that support this? What do you think?
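
For anyone who wants to probe the claim directly, here is a minimal simulation sketch. It is an illustration under my own assumptions, not evidence from a study: the data are drawn from a mildly skewed gamma distribution (shape 20, an arbitrary choice), the normality test is a hand-rolled Jarque-Bera test, and the one-sample t-test uses the normal critical value 1.96, which is only reasonable because n is large. The null hypothesis for the t-test is true by construction, so its rejection rate estimates the Type I error. The function names (`jarque_bera_pvalue`, `t_test_rejects`, `simulate`) are hypothetical helpers.

```python
import math
import random


def jarque_bera_pvalue(x):
    """p-value of the Jarque-Bera normality test.

    Under normality the statistic is asymptotically chi-square with 2 df,
    whose survival function is exp(-jb / 2).
    """
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return math.exp(-jb / 2.0)


def t_test_rejects(x, mu0, z_crit=1.96):
    """Two-sided one-sample t-test at the 5% level.

    Assumption: n is large, so the normal critical value stands in for
    the t critical value.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    t = (mean - mu0) / math.sqrt(var / n)
    return abs(t) > z_crit


def simulate(n=2000, reps=500, shape=20.0, seed=0):
    """Return (normality-test rejection rate, t-test rejection rate)."""
    rng = random.Random(seed)
    true_mean = shape * 1.0  # gamma mean = shape * scale, with scale = 1
    jb_rejections = 0
    t_rejections = 0
    for _ in range(reps):
        x = [rng.gammavariate(shape, 1.0) for _ in range(n)]
        jb_rejections += jarque_bera_pvalue(x) < 0.05
        t_rejections += t_test_rejects(x, true_mean)
    return jb_rejections / reps, t_rejections / reps


if __name__ == "__main__":
    jb_rate, t_rate = simulate()
    print(f"Normality test rejection rate: {jb_rate:.3f}")
    print(f"t-test Type I error rate:      {t_rate:.3f}")
```

If the argument holds, the normality test should reject in nearly every replication while the t-test rejection rate stays close to the nominal 5%. Changing `shape` (smaller values mean more skew) or `n` shows how far the argument stretches; a single simulated distribution does not settle the question for heavy-tailed or strongly skewed data.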