Hello!
I have a non-normally distributed variable (income). I tried transforming it toward normality, but its skewness and kurtosis are still very high and it has many outliers. I can't delete the outliers because they reflect the nature of the income variable, so I haven't removed a single one (N = 9918, and I am not sure it would be acceptable to delete 200 or 300 cases). I have read that if the residuals are normally distributed after fitting OLS, it is acceptable to use the OLS results, but I couldn't find an academic source or strong reference for this.
If my residuals are normally distributed, can I use the OLS results even though the variable has outliers and high skewness and kurtosis? If this is an acceptable way to conduct the analysis, can you suggest an academic source that I can cite to support it?
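To make clear what I mean by checking the residuals, here is a minimal sketch in Python. It is not my real analysis: the data are synthetic, the single predictor `x` is made up, and I am assuming for illustration that income is the dependent variable and that the transformation is a log. The helper names (`ols_residuals`, `skewness`, `excess_kurtosis`) are hypothetical.

```python
import numpy as np

def ols_residuals(x, y):
    """Fit y = b0 + b1*x by ordinary least squares and return the residuals."""
    X = np.column_stack([np.ones_like(x), x])  # add an intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def skewness(r):
    """Sample skewness (third standardized moment); about 0 for normal data."""
    z = (r - r.mean()) / r.std()
    return float(np.mean(z ** 3))

def excess_kurtosis(r):
    """Sample kurtosis minus 3; about 0 for normal data."""
    z = (r - r.mean()) / r.std()
    return float(np.mean(z ** 4) - 3.0)

# Synthetic stand-in for my data (assumption): a right-skewed income variable
# whose logarithm depends linearly on one made-up predictor.
rng = np.random.default_rng(0)
n = 9918
x = rng.normal(size=n)
income = np.exp(10.0 + 0.5 * x + rng.normal(scale=0.8, size=n))

res_raw = ols_residuals(x, income)          # OLS on raw income
res_log = ols_residuals(x, np.log(income))  # OLS on log income

for name, res in [("raw income", res_raw), ("log income", res_log)]:
    print(name, "residual skewness:", round(skewness(res), 3),
          "excess kurtosis:", round(excess_kurtosis(res), 3))
```

In this toy setup the variable itself is heavily skewed, yet the residuals of the log model are close to normal. That is the situation I am asking about.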
Thank you in advance.