Why do the residuals need to be normal when carrying out multi level modeling?

Hi Alex,

When the residuals are not normally distributed, the hypothesis that they are purely random noise has to be rejected.

This means that in that case your (regression) model does not explain all trends in the dataset. I guess you don't want unknown trends to remain in your dataset. I would feel uncomfortable with that, because it would mean that your model is not fully explaining the behaviour of your system.

The only solution is to find a model that fully explains the behaviour of your system. That means you have to find a model whose residuals are, yes indeed, normally distributed.

Cheers and good hunting,

Frank

Brendan J. Morse

Hi Alex, one of the big problems with non-normality in the residuals and heteroscedasticity is that the amount of error in your model is not consistent across the full range of your observed data. When you think about your predictor variables, this means that the amount of predictive ability they have (i.e., as calculated in their beta weights) is not the same across the full range of the dependent variable. Thus, your predictors technically mean different things at different levels of the dependent variable. Not so good for interpretation.
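As a rough illustration of what "error that is not consistent across the range of the data" looks like, here is a minimal sketch using only the Python standard library. The data are simulated (an assumption, not Alex's data), and the helper names `fit_line` and `residual_sd_by_half` are made up for this example.

```python
import random
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def residual_sd_by_half(x, y):
    """Residual SD for the lower and the upper half of the fitted values."""
    a, b = fit_line(x, y)
    # Pair each fitted value with its residual, then sort by fitted value
    pairs = sorted((a + b * xi, yi - (a + b * xi)) for xi, yi in zip(x, y))
    mid = len(pairs) // 2
    low = [r for _, r in pairs[:mid]]
    high = [r for _, r in pairs[mid:]]
    return statistics.pstdev(low), statistics.pstdev(high)

rng = random.Random(1)
x = [rng.uniform(1, 10) for _ in range(2000)]
# Simulated outcome (an assumption): the error SD grows with x
y = [2 + 0.5 * xi + rng.gauss(0, 0.3 * xi) for xi in x]

sd_low, sd_high = residual_sd_by_half(x, y)
print(f"residual SD, lower half: {sd_low:.2f}; upper half: {sd_high:.2f}")
```

If the two standard deviations differ markedly, the model predicts more precisely in one part of the range than in the other, which is the interpretation problem described above.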

Transforming the dependent variable can help to correct for this - but at the same time makes the interpretation of the overall model a little bit more opaque. You have to make the trade-off on what you are comfortable with here.

If the square-root transformation did not fully normalize your data you can also try an inverse transformation. The strength of the common transformations tends to increase in the order 1. Square Root, 2. Logarithmic, 3. Inverse (1/x). See if that helps.
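To compare how strongly each of these transformations pulls in a right-skewed variable, one can look at the skewness before and after. The sketch below uses simulated lognormal data (an assumption) and a hand-written `skewness` helper; the inverse is negated so that the ordering of the values is preserved.

```python
import math
import random
import statistics

def skewness(values):
    """Sample skewness: third central moment divided by the SD cubed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

rng = random.Random(42)
# Simulated right-skewed, strictly positive outcome (an assumption)
y = [rng.lognormvariate(0, 0.75) for _ in range(5000)]

transforms = {
    "raw": lambda v: v,
    "sqrt": math.sqrt,
    "log": math.log,
    "neg_inverse": lambda v: -1 / v,  # negated to keep the original ordering
}
skews = {name: skewness([f(v) for v in y]) for name, f in transforms.items()}
for name, s in skews.items():
    print(f"{name:12s} skewness = {s:+.2f}")
```

For this particular simulated variable the log happens to be the right strength and the inverse over-corrects into negative skew. With real data you would apply the same check to the model residuals after each transformation rather than to the raw outcome.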

Andreas B. Neubauer

Hey Alex,

From what I understand, normally distributed residuals are required since you are estimating the parameters of your model via maximum-likelihood estimation. To obtain these estimates, you have to make assumptions about the distribution of your residuals, and this assumption is (in linear multilevel modeling) that the residuals are normally distributed. The logic behind this is the same as in (single-level) regression analysis.
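For reference, the assumption under discussion can be written out for a two-level random-intercept model. The notation is the usual textbook form and is not taken from this thread: i indexes level-1 units within group j, and the variance symbols are chosen here for illustration.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}, \qquad
u_j \sim N(0, \sigma_u^2), \qquad
e_{ij} \sim N(0, \sigma_e^2)
```

In this form the normality assumption applies at both levels: to the level-1 residuals e_ij and to the group-level residuals u_j.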

But just as you can make other assumptions in a single-level analysis (take logistic regression, for example), you can also change the assumption about your residuals in multilevel modeling. You "only" have to define an alternative distribution of the residuals via generalized linear mixed models (for an applied example of these methods see the attached reference).

As to your question whether this is a problem: strictly speaking yes, because you violate a basic assumption of the model and your parameters might be biased. From an applied perspective it very likely depends on the degree of the violation.

Best,

Andreas

Article Reactivity to Stressor Pile-Up in Adulthood: Effects on Dail...

Thom Baguley

I don't believe it is generally true that the residuals need to be normal (it certainly doesn't follow from maximum likelihood estimation). If you have a multilevel generalized linear model then it depends on how you have set it up. The common setup is a normal response with variance parameters that are also assumed to be normal. For other models the response might be assumed binomial or Poisson, but typically the variance parameters would still be modeled as a normal distribution. However, other models are possible - it just isn't very easy without a bit of extra work (e.g., switching to Bayesian software that has more flexibility in, say, using t distributions).

I'm not aware of work that suggests that the normality assumption is particularly important for multilevel models (as opposed to other regression models). You certainly want to avoid marked skew or kurtosis and consider transformations.

Kelvyn Jones

A few points

1 As others have stated it is quite common to model non-Normal distributions at level 1 using a discrete outcome model such as a Probit, Logit, Poisson or NBD (negative binomial) model.

2 So I presume you are talking about higher level residuals, which are often assumed to be Normal so as to be summarized in a variance term. If they are not Normal, this estimate could be poor.

3 As usual this is the assumption of conditional normality - so that the assumption is that level 2 residuals are Normal taking account of what is in the fixed part - a well specified fixed part often works wonders.

4 (Consequently) if there is a notable outlier (or indeed a set of outliers) it is possible to include fixed part dummies and thereby assume (and often achieve) that the rest of the higher-level residuals follow a Normal distribution - see this manual on Rgate for an extended example

https://www.researchgate.net/publication/260771330_Developing_multilevel_models_for_analysing_contextuality_heterogeneity_and_change_using_MLwiN_Volume_1_%28updated_June_2014%29_Volume_2_is_also_on_RGate

5 There is some literature that suggests (unless there are marked outliers) you do not need to get worked up about the Normality assumption

eg Misspecifying the Shape of a Random Effects Distribution: Why Getting It Wrong May Not Matter, Statistical Science, 2011, Vol. 26, No. 3, 388–402

6 There is software - eg WinBUGS - that allows different distributions for the higher level - eg a t distribution with fattened tails.

7 Finally it is possible to fit non-parametric distributions at higher level eg GLLAMM (in Stata) has the possibility to put in mass points; see these two papers for the use of this idea to get at latent trajectories in a growth model- that is discrete random effects and not a continuous distribution

https://www.researchgate.net/publication/46541418_INTERNATIONAL_VARIATIONS_IN_LIFE_EXPECTANCY_A_SPATIO-TEMPORAL_ANALYSIS

https://www.researchgate.net/publication/24046536_Regional_variations_in_voting_at_British_general_elections_1950__2001_group-based_latent_trajectory_analysis


Rumi Masih

You need to quantify the degree of non-normality in the set of residuals and then diagnose the problem. Non-normality is not all that much of an issue for estimation if you have a large data set, since your estimates will be unbiased. However, the problem usually arises with the inferences you draw from the model you have estimated: since most of the tests assume asymptotic normality, you run the risk of making errors at the inference stage. If you have a fairly high degree of non-normality, I would assume that there is some flaw in the underlying specification of your model, or that you have misspecified or even omitted some factor(s) pertinent to the relationship you are trying to model. In addition, if you wish to optimize using your estimates, you may also run into issues if your underlying residuals are non-normal. In that case, I would model the relationship using estimators designed for data that are explicitly non-normal, for example from the Weibull distribution, just as a suggestion (from the family of non-normal distributions).
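One common way to put a number on the degree of non-normality is to compute the skewness and excess kurtosis of the residuals and combine them into the Jarque-Bera statistic. The sketch below is only an illustration with simulated residuals (an assumption); the function name `normality_summary` is made up for this example.

```python
import random
import statistics

def normality_summary(residuals):
    """Return (skewness, excess kurtosis, Jarque-Bera statistic)."""
    n = len(residuals)
    m = statistics.fmean(residuals)
    m2 = sum((r - m) ** 2 for r in residuals) / n  # second central moment
    m3 = sum((r - m) ** 3 for r in residuals) / n  # third central moment
    m4 = sum((r - m) ** 4 for r in residuals) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3
    jb = n / 6 * (skew ** 2 + excess_kurt ** 2 / 4)
    return skew, excess_kurt, jb

rng = random.Random(7)
# Simulated residuals (assumptions): one normal set, one right-skewed set
normal_resid = [rng.gauss(0, 1) for _ in range(2000)]
skewed_resid = [rng.expovariate(1.0) - 1 for _ in range(2000)]

for label, resid in [("normal", normal_resid), ("skewed", skewed_resid)]:
    skew, kurt, jb = normality_summary(resid)
    print(f"{label}: skew={skew:+.2f}, excess kurtosis={kurt:+.2f}, JB={jb:.1f}")
```

Under normality the Jarque-Bera statistic is approximately chi-squared with 2 degrees of freedom in large samples, so values far above about 6 point to non-normal residuals. With very large samples even trivial departures become "significant", so the skewness and kurtosis values themselves are usually the more informative part.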
