Hello,
I am struggling to understand how R's lmer function (from the lme4 package) handles missing data. I couldn't find an exact description in the package documentation. I also experimented with some data, but still couldn't figure it out. Here is what I did:
I constructed a full data set with reaction times (RT) to words of 5-7 letters and fitted the following model to the data:
RT ~ Length + (1|Word).
The intercept and the fixed-effect estimates of the fitted model reproduced the correct mean RTs in the various length conditions.
Then I changed the RT value of a single observation (for a 7-letter word) to NA and refitted the model, using either na.action="na.omit" or na.action="na.exclude". The estimates for 5 and 6 letters were still correct, but the estimate for 7 letters (and also the grand mean, when I tried sum coding) was a bit off. I tried to work out how the estimate for 7 letters was calculated, but failed. It was not equal to the weighted mean over responses to the different 7-letter words, as I would have expected, but slightly lower.
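For concreteness, here is a minimal sketch of the steps above. It uses simulated data as a stand-in for my actual data set, so the design (4 words per length, 10 responses per word), the effect sizes, and the names d, d_na, full_fit and na_fit are all assumptions made for illustration only.

```r
library(lme4)

set.seed(1)  # simulated data, NOT the real data set

# Hypothetical balanced design: 3 lengths x 4 words per length x 10 responses per word
word_ids <- paste0("w", 1:12)
words <- data.frame(
  Word   = factor(word_ids, levels = word_ids),
  Length = factor(rep(c(5, 6, 7), each = 4))
)
d <- words[rep(seq_len(nrow(words)), each = 10), ]

# Assumed generating process: length effect + random word intercept + noise
word_effect <- rnorm(12, sd = 30)
d$RT <- 500 + 20 * (as.integer(d$Length) - 1) +
  word_effect[as.integer(d$Word)] + rnorm(nrow(d), sd = 50)

# Step 1: fit on the full data and compare with the raw condition means
full_fit <- lmer(RT ~ Length + (1 | Word), data = d)
fixef(full_fit)               # intercept = 5 letters; other terms are differences from it
tapply(d$RT, d$Length, mean)  # raw mean RT per length condition

# Step 2: set one RT of a 7-letter word to NA and refit
d_na <- d
d_na$RT[which(d_na$Length == "7")[1]] <- NA
na_fit <- lmer(RT ~ Length + (1 | Word), data = d_na, na.action = na.omit)
nobs(na_fit)                  # number of rows actually used in the fit
fixef(na_fit)

# Two candidate "means" for the 7-letter condition to compare against the estimate
seven <- droplevels(d_na[d_na$Length == "7", ])
mean(seven$RT, na.rm = TRUE)                                # mean over all remaining responses
mean(tapply(seven$RT, seven$Word, mean, na.rm = TRUE))      # unweighted mean of the word means
```

Comparing the refitted intercept plus the Length7 term with the last two numbers is how I checked the estimate; in my data it matched neither exactly.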
Any idea how the estimate is calculated when some data are missing?
Thanks,
Chen