I work with contextual effects, such as the effect of place of residence, and use several software packages that fit multilevel models (R, Stata, MLwiN, Mplus). Nowadays almost any statistical package can run this analysis (SAS, SPSS and HLM as well), and they all provide similar estimates for the coefficients, especially for linear models. I have noticed, however, some differences in the variances (i.e. the second-level variance), and I am aware that the packages use different estimators (IGLS, REML, MLR, and so on). What are the advantages and disadvantages of the main packages? Is there any published paper comparing them for discrete variables and nonlinear models (binomial, Poisson, negative binomial, zero-inflated, etc.)?