Hello,

I have a large unbalanced panel dataset from a company that provides private lessons for students. My goal is to fit an lmer model that predicts the number of monthly hours taken by a student from a fairly large set of variables, both categorical and quantitative; most of them are counts or relative percentiles of a certain factor.

The random effect is based on a dummy variable that separates students who had only one teacher during period t (about 63% of all records) from students who had more than one.

I have attached a barplot showing the distribution of my dependent variable: it is strongly asymmetrical, with a long right tail.

I have already tried a log-transformation, chosen by studying a Box-Cox fit, and it does a really good job for the consistency of the parameters and of the model itself: the R2 values are positive and the residuals QQ-plot looks good. However, when I test normality with both the Kolmogorov-Smirnov (K-S) and Jarque-Bera tests, I get p-values near zero. Is there anything appropriate I could do to fix this issue?
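
To make the normality check described above concrete, here is a minimal sketch of the Jarque-Bera computation before and after a log-transformation. It is only an illustration under stated assumptions: the data are simulated from a lognormal distribution as a stand-in for a right-skewed "monthly hours" variable (not the actual dataset or the lmer residuals), the names `hours` and `log_hours` are hypothetical, and the test is written out from the textbook formula using its asymptotic chi-square distribution with 2 degrees of freedom.

```python
import math
import random


def jarque_bera(x):
    """Return (JB statistic, asymptotic p-value) for a sample x."""
    n = len(x)
    mean = sum(x) / n
    # Central moments (population form, dividing by n).
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square with 2 df is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


random.seed(0)
# Assumption: simulated right-skewed data, NOT the real monthly hours.
hours = [random.lognormvariate(1.5, 0.8) for _ in range(5000)]
log_hours = [math.log(h) for h in hours]

jb_raw, p_raw = jarque_bera(hours)
jb_log, p_log = jarque_bera(log_hours)
print(f"raw: JB={jb_raw:.2f}, p={p_raw:.4g}")
print(f"log: JB={jb_log:.2f}, p={p_log:.4g}")
```

In the real setting the same function would be applied to the model residuals rather than to the raw response.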
