18 December 2018

Hello,

I am struggling to understand how R's lmer function handles missing data. I couldn't find an exact description in the documentation of the package. I also tried to play with some data, but still couldn't figure it out. Here is what I did:

I constructed a full data set with reaction times (RT) to words of 5-7 letters and fitted the following model to the data:

RT ~ Length + (1|Word).

The intercept and the fixed-effect estimates of the fitted model reproduced the observed mean RTs in each length condition.

Then I changed the RT value of a single observation (a response to a 7-letter word) to NA and refitted the model, using either na.action="na.omit" or na.action="na.exclude". The estimates for 5 and 6 letters were still correct, but the estimate for 7 letters (and also the grand mean, when using sum coding) was a bit off. I tried to work out how the estimate for 7 letters was calculated, but failed. It was not equal to the weighted mean over responses to the different 7-letter words, as I would have expected, but a slightly lower value.
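To make the comparison concrete, here is a minimal sketch in plain Python (not lmer itself) of the averages one could compute for the 7-letter condition once a response is missing. All RT values, word labels, and the variance components `tau2` (between-word) and `sigma2` (residual) are made up for illustration. The third quantity is the precision-weighted mean that a random-intercept model would use if the variance components were known, with each word mean weighted by 1 / (tau2 + sigma2 / n). Whether this reproduces the lmer output exactly is an assumption I have not verified, since lmer also estimates the variance components from the data.

```python
# Hypothetical RTs (ms) for three 7-letter words, four responses each.
rts = {
    "w1": [600.0, 620.0, 640.0, 660.0],
    "w2": [700.0, 720.0, 740.0, 760.0],
    "w3": [800.0, 820.0, 840.0, None],  # one response set to missing
}


def mean(xs):
    return sum(xs) / len(xs)


# Drop missing responses row-wise, mimicking what na.omit does to the data.
observed = {w: [x for x in xs if x is not None] for w, xs in rts.items()}

# (a) Mean over all remaining responses: each word is weighted by its count.
pooled_mean = mean([x for xs in observed.values() for x in xs])

# (b) Unweighted mean of the per-word means: every word counts equally.
mean_of_word_means = mean([mean(xs) for xs in observed.values()])


def precision_weighted_mean(groups, tau2, sigma2):
    """(c) Weight each word mean by 1 / (tau2 + sigma2 / n_word).

    tau2 and sigma2 are assumed, known variance components (illustrative).
    """
    weights = {w: 1.0 / (tau2 + sigma2 / len(xs)) for w, xs in groups.items()}
    total = sum(weights.values())
    return sum(weights[w] * mean(xs) for w, xs in groups.items()) / total


# Assumed variance components, chosen only for illustration.
gls_mean = precision_weighted_mean(observed, tau2=5000.0, sigma2=500.0)

print(pooled_mean, mean_of_word_means, gls_mean)
```

With balanced data all three averages coincide, which would explain why the full data set gave the expected means. Once one word has fewer responses they differ, and the precision-weighted value falls between the other two: it approaches (a) as `tau2` goes to 0 and (b) as `sigma2` goes to 0.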

Any idea how the estimate is calculated when some data are missing?

Thanks,

Chen
