I have previously used measures such as the Jaccard index and normalized mutual information (NMI) to evaluate a clustering algorithm on benchmarks that come with ground truth.
Now I am working with time-dependent data split into several time slices. For every time slice I have a ground truth against which I can evaluate my clustering algorithm. I am looking for a measure, similar to the ones mentioned above, that summarizes the performance over all time slices in a single number.
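To make concrete what I mean by "one number", the naive baseline I can think of is to compute NMI per slice and average the scores. Below is a minimal sketch of that idea, not an established measure. The names `nmi` and `score_over_time` are hypothetical, the normalization by the arithmetic mean of the two entropies is just one common choice, and the optional weighting by slice size is my own assumption.

```python
import math
from collections import Counter


def nmi(truth, pred):
    """NMI between two label lists, normalized by the mean of the entropies."""
    n = len(truth)
    count_t = Counter(truth)
    count_p = Counter(pred)
    joint = Counter(zip(truth, pred))

    # Mutual information from the contingency counts.
    mi = sum(
        (c / n) * math.log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    # Entropy of each labeling.
    h_t = -sum((c / n) * math.log(c / n) for c in count_t.values())
    h_p = -sum((c / n) * math.log(c / n) for c in count_p.values())

    if h_t == 0 and h_p == 0:
        # Both labelings put everything in one cluster: treat as identical.
        return 1.0
    return mi / ((h_t + h_p) / 2)


def score_over_time(slices, weighted=False):
    """Aggregate per-slice NMI into one number.

    slices: list of (truth_labels, predicted_labels) pairs, one per time slice.
    weighted: if True, weight each slice by its number of items (assumption).
    """
    scores = [nmi(t, p) for t, p in slices]
    if weighted:
        sizes = [len(t) for t, _ in slices]
        return sum(s * w for s, w in zip(scores, sizes)) / sum(sizes)
    return sum(scores) / len(scores)


# Toy example: two time slices with made-up labels.
slices = [
    ([0, 0, 1, 1], [1, 1, 0, 0]),  # same partition, labels permuted
    ([0, 0, 1, 1], [0, 1, 0, 1]),  # partition independent of the truth
]
print(score_over_time(slices))
```

A plain average like this treats every slice independently and ignores the temporal order, which is part of why I am asking whether something better exists.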
Thanks.