Because the LHS consists of a single random variable, Y, whereas on the RHS each regression coefficient as well as the error term involves a random variable, the LHS distribution doesn't have to be the same as that of the RHS. Thus we look at the residual distribution to see the overall effect, not at y by itself. Best wishes, David Booth
It makes no sense to do any formal test of normality (like Shapiro) to justify the use of ANOVA. In short because:
either you can reject H0 or you cannot reject H0. If you can, the test merely states that the data are sufficient to conclude that reality is not identical to an idealized distribution model, but not whether the actual discrepancy is of any relevance (-> useless). If you can't, it states that the data are not enough to be conclusive regarding H0, and absence of evidence is not evidence of absence (-> again useless).
Rasch D, Kubinger KD, Moder K (2011): The two-sample t test: pre-testing its assumptions does not pay off. Statistical Papers 52(1): 219-231
Rochon J, Kieser M (2010): A closer look at the effect of preliminary goodness‐of‐fit testing for normality for the one‐sample t‐test. Br J Math Stat Psychol 64: 410-426
Rochon J, Gondan M, Kieser M (2012): To test or not to test: Preliminary assessment of normality when comparing two independent samples. BMC Med Res Methodol 12: 81
Schoder V, Himmelmann A, Wilhelm KP (2006): Preliminary testing for normality: some statistical aspects of a common concept. Clin Exp Dermatol 31: 757-761
Williams, M. N., Grajales, C. A. G., & Kurkiewicz, D. (2013). Assumptions of multiple regression: Correcting two misconceptions. Practical Assessment, Research & Evaluation, 18(11). http://www.pareonline.net/getvn.asp?v=18&n=11
Take an example of two groups in which variation around the group means is actually normally distributed, and the distance between the group means is large. Applied to the observed variable itself, the Shapiro-Wilk (SW) test will detect the bimodality. But of course this is not quite the intention of running a test of normality. Looking at the residuals instead shows how far the assumption of normality can still be plausible.
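The two-group example can be sketched in a few lines of Python. This is only an illustration under assumed values: the group size, the means (0 and 10), the common SD of 1, and the seed are all made up, and NumPy and SciPy are assumed to be available. Shapiro-Wilk is used here only to show why y by itself is the wrong thing to examine, not as a recommendation to run formal normality tests.

```python
import numpy as np
from scipy import stats

# Hypothetical setup: two groups, normal variation around each group mean,
# group means far apart (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
n = 50
group_a = rng.normal(loc=0.0, scale=1.0, size=n)
group_b = rng.normal(loc=10.0, scale=1.0, size=n)

# The observed variable pooled over both groups is bimodal.
y = np.concatenate([group_a, group_b])

# Residuals of the one-way model: each value minus its own group mean.
residuals = np.concatenate([
    group_a - group_a.mean(),
    group_b - group_b.mean(),
])

# Shapiro-Wilk on the raw variable versus on the residuals.
w_raw, p_raw = stats.shapiro(y)
w_resid, p_resid = stats.shapiro(residuals)

print(f"raw y:     W = {w_raw:.3f}, p = {p_raw:.3g}")
print(f"residuals: W = {w_resid:.3f}, p = {p_resid:.3g}")
```

With groups this far apart, the test on the raw y should reject decisively, simply because it picks up the two modes. The result on the residuals depends on the particular sample, since they are normal by construction.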
Should you? Per @Jochen, this is not a convincing use of any test of normality and regardless of your result you will then be in the position of deciding whether your primary model is appropriate and acceptable. It does, however, sometimes make reviewers happy.
@Bruce Thanks for your question. Jochen and I once had a discussion about this, and we agree on the answer. He was simply taking a different path to the same place, I believe. In my years of publishing research I found that satisfying reviewers is oftentimes a good idea, but not always. Hope this helps. Best wishes, David Booth