I have a dataset with a lot of missing values, so I've been experimenting with various imputation methods. To check that my imputed datasets are of reasonable quality, I've been running cross-validations scored with MAE, RMSE and Pearson's R², and comparing density plots and summary statistics (means, medians, standard deviations).
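A minimal sketch of the kind of masked cross-validation described here, using only the Python standard library. The names `mean_impute` and `masked_errors`, the toy `data` list, and the choice of mean imputation are all illustrative assumptions, not the methods actually used on my dataset: the idea is just to hide some known values, impute them, and score the imputed values against the hidden truth.

```python
import math
import random
import statistics

def mean_impute(values):
    """Placeholder imputer (assumption): fill None with the mean of observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]

def masked_errors(values, impute, frac=0.2, seed=0):
    """Hide a fraction of the observed entries, impute, and score the hidden ones."""
    rng = random.Random(seed)
    observed_idx = [i for i, v in enumerate(values) if v is not None]
    n_hidden = max(1, int(frac * len(observed_idx)))
    hidden = rng.sample(observed_idx, n_hidden)

    masked = list(values)
    for i in hidden:
        masked[i] = None  # pretend these known values are missing

    imputed = impute(masked)
    errors = [imputed[i] - values[i] for i in hidden]
    mae = statistics.fmean([abs(e) for e in errors])
    rmse = math.sqrt(statistics.fmean([e * e for e in errors]))
    return mae, rmse

# Toy data (made up for illustration); None marks a missing value.
data = [2.0, None, 3.5, 4.0, None, 5.5, 6.0, 7.5, None, 8.0]
mae, rmse = masked_errors(data, mean_impute)
print(f"MAE={mae:.3f} RMSE={rmse:.3f}")
```

Swapping `mean_impute` for another imputer with the same list-in, list-out shape would let the same scoring loop compare methods on equal footing.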

The next step, I feel, is to compare the imputed data with the original data. It occurs to me that a Mann-Whitney U test might be a good way of comparing the two directly, the idea being that a non-significant difference here would be a good thing.
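To make the proposed comparison concrete, here is a rough standard-library sketch of a two-sided Mann-Whitney U test. It is a simplification under stated assumptions: it uses the normal approximation with no tie or continuity correction, and the function name `mann_whitney_u` and the sample values are made up for illustration. For real work, a library routine such as `scipy.stats.mannwhitneyu` would be the safer choice.

```python
import math
from statistics import NormalDist

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U via the normal approximation (no tie correction)."""
    # Tag each value with its group (0 for x, 1 for y) and sort by value.
    combined = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)

    rank_sum_x = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        # Tied values share the average of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j

    n1, n2 = len(x), len(y)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = min(1.0, 2 * NormalDist().cdf((u - mu) / sigma))
    return u, p

# Hypothetical values: observed entries of one variable vs. the same variable after imputation.
original = [2.0, 3.5, 4.0, 5.5, 6.0, 7.5, 8.0]
imputed = [2.0, 5.2, 3.5, 4.0, 5.2, 5.5, 6.0, 7.5, 5.2, 8.0]
u, p = mann_whitney_u(original, imputed)
print(f"U={u:.1f} p={p:.3f}")
```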

So my question is, are there standard methods for assessing data imputations that I should know about?
