Hi,
I am confused about one thing, so I need your help figuring out the right approach:
I have introduced a supervised machine-learning framework, which is straightforward to evaluate because all of the data is labeled. I have now extended the same framework with semi-supervised capabilities, so I have both labeled and unlabeled data. I use self-training: the classifier predicts labels for the unlabeled data, and a sample is added to the dataset if its prediction probability is above a threshold. My question is: how do I evaluate the semi-supervised learner and check whether it adds value over the supervised model?
My concern is this: if I use the updated dataset (including the samples newly added by self-training) for evaluation, wouldn't that bias my results? My classifier is not 100% accurate, so the self-training step might have added some incorrectly labeled samples (for example, false positives).
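
For reference, the self-training step I describe above can be sketched roughly as follows. This is only a minimal illustration and not my actual framework: the nearest-centroid classifier, the softmax-over-distances "probability", the function names (`fit_centroids`, `predict_proba`, `self_train`), and the 0.9 threshold are all assumptions made up for the example.

```python
import math


def fit_centroids(X, y):
    """Return {label: centroid} for feature vectors given as lists of floats."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_proba(centroids, x):
    """Softmax over negative distances: a crude stand-in for class probabilities."""
    scores = {c: math.exp(-math.dist(x, m)) for c, m in centroids.items()}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled samples and add them.

    Returns the enlarged training set and the samples that stayed unlabeled.
    """
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        # Refit on everything labeled so far (true labels + pseudo-labels).
        centroids = fit_centroids(X_lab, y_lab)
        remaining, added = [], 0
        for x in pool:
            proba = predict_proba(centroids, x)
            label, p = max(proba.items(), key=lambda kv: kv[1])
            if p >= threshold:
                # Confident prediction: keep it as a pseudo-label.
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:
            break  # nothing crossed the threshold, so stop early
    return X_lab, y_lab, pool


if __name__ == "__main__":
    X_new, y_new, pool = self_train(
        [[0.0], [10.0]], [0, 1], [[0.5], [9.5], [5.0]]
    )
    print(len(X_new), "training samples,", len(pool), "still unlabeled")
```

The pseudo-labels appended to `y_lab` are the classifier's own predictions, which is exactly why I worry about using the enlarged dataset for evaluation.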