I have implemented the KMeans algorithm as a MapReduce job in Python, and the code is working fine. The resulting cluster file, generated as the part-00000 output file, contains the data points with all of their dimensions. Now I want to find the index of each data point. How can I do that?
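
One possible approach, sketched here under several assumptions: since a mapper generally sees only its own input split and not a global row number, the index can be attached to each point before the job runs and then carried through as part of the emitted value. The sketch assumes a Hadoop Streaming style job with comma-separated coordinates on each input line; the function names (`add_index`, `map_point`, and so on) and the output layout are hypothetical and not taken from the original code.

```python
def add_index(lines):
    """Preprocessing step (assumed): prefix each raw data line with its 0-based row number."""
    return [f"{i}\t{line.strip()}" for i, line in enumerate(lines)]


def parse_indexed_line(line):
    """Split 'index<TAB>x1,x2,...' into (index, [floats])."""
    idx, coords = line.rstrip("\n").split("\t", 1)
    return int(idx), [float(v) for v in coords.split(",")]


def nearest_centroid(point, centroids):
    """Return the position of the closest centroid (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
    )


def map_point(line, centroids):
    """Mapper body (assumed layout): emit 'cluster_id<TAB>index<TAB>coords'
    so that the original row index is still present in the job output."""
    idx, point = parse_indexed_line(line)
    cid = nearest_centroid(point, centroids)
    return f"{cid}\t{idx}\t{','.join(str(v) for v in point)}"


# Illustrative data only; these points and centroids are made up.
raw_lines = ["1.0,2.0", "9.0,8.0", "1.5,1.8"]
centroids = [[1.0, 2.0], [9.0, 9.0]]
mapped = [map_point(line, centroids) for line in add_index(raw_lines)]
```

With this layout, the reducer can pass the index field through unchanged, so each line of the part-00000 file would carry the row number of the point alongside its coordinates.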
