Just as a caveat: enforcing zero error on the training data causes significant overfitting in most classification tasks. If you intend to apply a learner to real applications, non-consistent learners are usually the better choice. Ignore my message if you are only discussing the theory.
As Prof. Bauckhage mentioned, 1NN and candidate elimination are consistent.
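To make the 1NN point concrete, here is a minimal sketch (my own toy data, not anything from this thread): every training point is its own nearest neighbor, so 1NN reproduces the training labels exactly and the training error is zero.

```python
# Minimal sketch: 1NN is consistent with the training data.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 2))                          # arbitrary 2-D features
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)    # arbitrary labels

clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
print(clf.score(X_train, y_train))  # 1.0 -- zero training error
```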
Besides, if one feature has a large enough range of values that each sample can be assigned a unique value for it, I think we can say ID3 is consistent too, as sketched below.
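Here is a small sketch of that remark, using scikit-learn's DecisionTreeClassifier as a stand-in (it grows CART-style binary trees, not textbook ID3, but the argument is the same): if one feature takes a unique value per sample, an unrestricted tree can keep splitting on it until every training sample sits in its own leaf, so the training error is zero even for random labels.

```python
# Sketch: a unique-ID feature lets an unrestricted decision tree memorize the training set.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([
    np.arange(n),            # "ID" feature: one unique value per sample
    rng.normal(size=n),      # an ordinary, uninformative feature
])
y = rng.integers(0, 2, size=n)   # random labels, no real structure

tree = DecisionTreeClassifier().fit(X, y)   # no depth or leaf limits
print(tree.score(X, y))  # 1.0 -- zero training error
```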
Dear Parsapoor, I don't get your point: "If the number of training samples goes to infinity, the error of the k-nearest-neighbor algorithm approaches zero".
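For reference, the standard asymptotic results (Cover and Hart, 1967, for 1NN; Stone, 1977, for kNN) say the limiting error approaches the Bayes error R*, not zero in general; it vanishes only when the classes do not overlap.

```latex
% c = number of classes, R^* = Bayes error
R^* \;\le\; \lim_{n\to\infty} R_{1\mathrm{NN}} \;\le\; R^*\!\left(2 - \tfrac{c}{c-1}\,R^*\right),
\qquad
\lim_{n\to\infty} R_{k\mathrm{NN}} = R^* \quad\text{if } k\to\infty \text{ and } \tfrac{k}{n}\to 0 .
```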
Dear Wu, you are right about overfitting. But, as you guessed, I am thinking about theoretical issues.