12 July 2016

Hi,

My data come from a sample of children and adolescents aged 10 to 14. When I checked my IVs (scores on a few different tests) and my dependent variable (score on a personality test) for normality, they all appeared non-normal according to the Shapiro-Wilk test, although the skewness and kurtosis values were acceptable for most of them (smaller than 1 and close to 0.5). The data include relatively many zero values, which I thought were behind the non-normality, so I tried to normalize them using log10, inverse and square-root transformations (after adding 1 to all values). None of them brought the data to normality. I didn't try reflection because I fail to see how it would help, and I didn't try Box-Cox because it only applies to the DV.
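The checks described above can be sketched roughly as follows. This is only a minimal illustration in Python, assuming NumPy and SciPy are available; the `scores` array is simulated placeholder data with many zeros, not the actual sample, and the variable names are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical zero-heavy scores standing in for one test variable
# (simulated for illustration only, not the real data).
scores = rng.poisson(lam=1.5, size=300).astype(float)

# Add 1 to every value first so log10 and the inverse are defined at zero.
shifted = scores + 1.0
transforms = {
    "raw": scores,
    "log10": np.log10(shifted),
    "inverse": 1.0 / shifted,
    "sqrt": np.sqrt(shifted),
}

results = {}
for name, values in transforms.items():
    w_stat, p_value = stats.shapiro(values)  # Shapiro-Wilk test
    results[name] = {
        "W": w_stat,
        "p": p_value,
        "skew": stats.skew(values),
        "kurtosis": stats.kurtosis(values),  # excess kurtosis (normal = 0)
    }
    print(f"{name:8s} W={w_stat:.3f} p={p_value:.4f} "
          f"skew={results[name]['skew']:.2f} "
          f"kurt={results[name]['kurtosis']:.2f}")
```

With a few hundred observations, the Shapiro-Wilk test can reject normality even when skewness and kurtosis look modest, which matches the pattern described above.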

I then looked at the distributions after splitting the sample by age and by sex, and the non-normality disappeared. Age seemed to have the strongest effect. In other words, my sample actually includes three different populations. I didn't plan for that; such significant differences by age and sex were not predicted.
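The per-group check described above might look something like this. Again, this is a sketch under assumptions: the data are simulated, and the three age-group labels and their means are hypothetical placeholders chosen only to show how a mixture of groups can look non-normal when pooled.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 300

# Hypothetical grouping variables; the actual age bands are not given here.
age_labels = np.array(["younger", "middle", "older"])
sex_labels = np.array(["F", "M"])
age_group = rng.choice(age_labels, size=n)
sex = rng.choice(sex_labels, size=n)

# Simulated scores: normal within each age group, but with different means.
group_means = {"younger": 20.0, "middle": 30.0, "older": 40.0}
score = np.array([group_means[a] for a in age_group]) + rng.normal(0, 3, size=n)

# Shapiro-Wilk on the pooled sample.
pooled_w, pooled_p = stats.shapiro(score)
print(f"pooled: n={n} W={pooled_w:.3f} p={pooled_p:.4f}")

# Shapiro-Wilk within each age-by-sex cell.
per_group = []
for a in age_labels:
    for s in sex_labels:
        mask = (age_group == a) & (sex == s)
        w_stat, p_value = stats.shapiro(score[mask])
        per_group.append({"age_group": a, "sex": s, "n": int(mask.sum()),
                          "W": w_stat, "p": p_value})
        print(f"{a:8s} {s}: n={mask.sum():3d} W={w_stat:.3f} p={p_value:.4f}")
```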

The problem is that I really need to use regression, and my sample is not big enough to use only part of the data for it (I have nearly 300 participants, but splitting them by age group and sex gives six small groups of different sizes, and with 3 IVs that is not a good idea). Is there any way to normalize such data without dividing it into groups?
