I'm working with a multi-class text classification dataset that has train and test sets. There are around 470 unique labels in the training set and around 250 unique labels in the test set. (These 470 + 250 unique labels come from a much larger label set of about 400 thousand labels.)

Around 30 labels appear only in the test set and not in the training set. I don't understand how to handle these labels that are absent from the training set.

Should I discard these 30 labels? Or should I encode each label as a one-hot vector of size 400 thousand rather than 450?
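
To make the situation concrete, here is a minimal sketch of how I identify the test-only labels and what the first option (encoding only over the training labels and dropping the rest) would look like. The label names and the `one_hot` helper are made up for illustration and are not from my actual data.

```python
# Hypothetical toy labels standing in for the real train/test label columns.
train_labels = ["sports", "politics", "tech", "sports"]
test_labels = ["tech", "health", "sports", "travel"]

# Labels that occur in the test set but never in the training set.
train_set = set(train_labels)
unseen = sorted(set(test_labels) - train_set)

# Option 1: build the label index from the training labels only,
# so each one-hot vector has one slot per training label.
label_to_index = {label: i for i, label in enumerate(sorted(train_set))}

def one_hot(label, label_to_index):
    """Return a one-hot list for a label known to the index."""
    vec = [0] * len(label_to_index)
    vec[label_to_index[label]] = 1
    return vec

# Test labels that can be encoded under option 1; the unseen ones are dropped.
kept_test_labels = [label for label in test_labels if label in label_to_index]

print("unseen in training:", unseen)
print("one-hot for 'tech':", one_hot("tech", label_to_index))
print("test labels kept:", kept_test_labels)
```

With the second option, the index would instead be built over the full label set, so the vectors would be far larger and most slots would never be active during training.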
