Assume we have a corpus of unlabeled documents, and we have applied
an unsupervised ML method to it and obtained some clusters.
How can we measure the quality of the built clusters?
Please remember that we don't have the true class labels; otherwise we would have used a supervised ML method.
One possible way is to draw a random sample of documents, classify them manually, and then compare the system's cluster assignments to these manual labels.
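A minimal sketch of such a comparison, assuming the sampled documents each have a cluster ID from the system and a label assigned by hand. Purity and the Rand index are used here only as two common illustrative measures; the function names and the six-document sample are hypothetical.

```python
from collections import Counter
from itertools import combinations


def purity(cluster_ids, manual_labels):
    """Fraction of sampled documents that carry the majority manual label of their cluster."""
    by_cluster = {}
    for c, y in zip(cluster_ids, manual_labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(cluster_ids)


def rand_index(cluster_ids, manual_labels):
    """Fraction of document pairs on which the clustering and the manual labels agree."""
    pairs = list(combinations(range(len(cluster_ids)), 2))
    agree = sum(
        (cluster_ids[i] == cluster_ids[j]) == (manual_labels[i] == manual_labels[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical sample of six documents: cluster IDs from the system, labels from a human.
sample_clusters = [0, 0, 0, 1, 1, 1]
sample_labels = ["sports", "sports", "politics", "politics", "politics", "sports"]

print(purity(sample_clusters, sample_labels))
print(rand_index(sample_clusters, sample_labels))
```

For this toy sample the purity is 4/6 and the Rand index is 7/15. Purity tends to rise as the number of clusters grows, so a pair-based measure such as the Rand index is probably worth reporting alongside it.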
Dear researchers, do you have other ideas for how to measure the quality of the built clusters?