In machine learning problems I increasingly run into continuous (double-typed) columns whose NAs should not be imputed by kNN or median-type methods (or, in fact, at all).

For example, suppose the standard real estate value problem has a garage/no garage column and also an 'age of garage' column. For a house with no garage, the age cannot be imputed, because there is no garage for it to describe. Nor can it be set to zero, because that would mean a brand-new garage. What is the standard method of dealing with this?
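To make the problem concrete, here is a minimal sketch in plain Python. The toy rows and the names `houses`, `has_garage`, `garage_age`, and `fill` are made up for illustration. It only shows why the two obvious fills are misleading; it is not a proposed solution.

```python
import math
from statistics import median

# Hypothetical toy data: garage_age is NaN exactly when has_garage is False.
houses = [
    {"has_garage": True, "garage_age": 0.0},        # brand-new garage
    {"has_garage": True, "garage_age": 12.0},
    {"has_garage": True, "garage_age": 30.0},
    {"has_garage": False, "garage_age": math.nan},  # no garage: age is undefined
]

# Median of the ages that actually exist.
observed = [h["garage_age"] for h in houses if not math.isnan(h["garage_age"])]
median_age = median(observed)


def fill(rows, value):
    """Return the garage_age column with every NaN replaced by `value`."""
    return [value if math.isnan(r["garage_age"]) else r["garage_age"] for r in rows]


median_filled = fill(houses, median_age)
zero_filled = fill(houses, 0.0)

# After median imputation, the no-garage house carries the same age as the
# house with a 12-year-old garage.
print(median_filled)
# After zero filling, the no-garage house carries the same age as the house
# with a brand-new garage.
print(zero_filled)
```

In both cases, looking at the age column alone, the house without a garage becomes indistinguishable from a house that has one.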
