This managed to throw me for a loop. When assessing outliers before running my moderation analysis, I use SPSS to split the data file by the dichotomous moderator. I then test my assumptions separately and run a simple linear regression of the DV on the IV at each level of the moderator to generate studentized deleted residuals (SDRs) and Cook's distances. So far, so good!

The confusion I'm running into is that Cohen et al. (2003) recommend different cutoffs for flagging potential outliers: SDRs beyond ±2 for small sample sizes and beyond ±3 for large sample sizes. That makes a lot of sense for a standard multiple regression, but when dealing with a dichotomous moderator whose levels have their assumptions tested separately, I just want to do a bit of a sanity check.

I've assumed that if one level of the moderator has a small group size, I should apply these SDR cutoffs based on the size of that group, NOT the overall sample size.

Similarly, I prefer using the 4/n cutoff for Cook's distance when flagging potentially influential cases, rather than just looking for values greater than 1 (I could swear those only occur in artificial datasets).

All the articles I've come across express n as the total sample size, so again I run into slight confusion about whether it's correct to use two different Cook's distance cutoffs, each specific to its group size. After all, the regression line, the residuals, and the resulting Cook's distances are specific to that level of the moderator, so each group should arguably have its own cutoff.
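To make the bookkeeping concrete, here's a rough sketch in Python (not SPSS, and the data and variable names are made up) of what I mean by group-specific cutoffs: fit the simple regression within each level of the moderator, then compare each group's SDRs and Cook's distances against thresholds derived from that group's own n.

```python
import numpy as np

def screen_group(x, y):
    """Fit y ~ x within one moderator level; return externally
    studentized (deleted) residuals and Cook's distances."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])            # intercept + IV
    p = X.shape[1]                                  # parameters in the model (2)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta                                # raw residuals
    h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)   # leverages
    mse = (e @ e) / (n - p)
    # MSE with case i deleted, then the studentized deleted residual
    s2_del = ((e @ e) - e**2 / (1 - h)) / (n - p - 1)
    sdr = e / np.sqrt(s2_del * (1 - h))
    cooks = e**2 / (p * mse) * h / (1 - h) ** 2
    return sdr, cooks

# Hypothetical data: unequal group sizes at the two moderator levels
rng = np.random.default_rng(1)
group_sizes = {0: 25, 1: 150}
for level, n_g in group_sizes.items():
    x = rng.normal(size=n_g)
    y = 0.5 * x + rng.normal(size=n_g)
    sdr, cooks = screen_group(x, y)
    # "small" vs "large" is a judgment call; 100 is an arbitrary choice here
    sdr_cut = 2.0 if n_g < 100 else 3.0             # per-group SDR cutoff
    cook_cut = 4.0 / n_g                            # per-group 4/n cutoff
    flagged = (np.abs(sdr) > sdr_cut) | (cooks > cook_cut)
    print(f"level {level}: n = {n_g}, flagged {flagged.sum()} case(s)")
```

(If you'd rather not hand-roll the formulas, I believe statsmodels' `OLSInfluence` exposes the same quantities as `resid_studentized_external` and `cooks_distance`.)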

Thanks in advance for taking the time to read and weigh in!

----

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
