I have a large set of molecules and I'd like to assess its structural diversity, assuming I'm designing a screening library. My question is not which tools to use, but how I should interpret the results.
Let's say I choose clustering. How many clusters indicate a diverse library?
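To make the clustering question concrete, here is a minimal sketch of the kind of numbers I would end up with. It assumes fingerprints are represented as plain sets of on-bit indices and uses a simple greedy leader clustering with a Tanimoto similarity threshold. The toy fingerprints, the function names, and the 0.5 threshold are all illustrative assumptions, not taken from any particular toolkit.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def leader_cluster(fps, threshold):
    """Greedy leader clustering (illustrative, order-dependent).

    Each molecule joins the first cluster whose leader is at least
    `threshold` similar to it; otherwise it becomes a new leader.
    """
    leaders = []   # index of the leader molecule of each cluster
    clusters = []  # member indices of each cluster
    for i, fp in enumerate(fps):
        for c, lead in enumerate(leaders):
            if tanimoto(fp, fps[lead]) >= threshold:
                clusters[c].append(i)
                break
        else:
            leaders.append(i)
            clusters.append([i])
    return clusters


# Hypothetical toy fingerprints standing in for real Morgan fingerprints.
toy_fps = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {1, 2, 3, 4, 6},
    {10, 11, 12},
    {10, 11, 13},
    {20, 21, 22, 23},
]

clusters = leader_cluster(toy_fps, threshold=0.5)

# Raw cluster counts depend on library size, so normalise them.
cluster_ratio = len(clusters) / len(toy_fps)
singleton_fraction = sum(len(c) == 1 for c in clusters) / len(clusters)

print(clusters, cluster_ratio, singleton_fraction)
```

The cluster count clearly depends on both the library size and the chosen threshold, which is why I'm unsure whether a ratio such as clusters per molecule, or the fraction of singleton clusters, is the quantity I should be looking at.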
"Modern Approaches in Drug Discovery" suggests comparing Morgan fingerprints against Tanimoto distance matrix with multidimensional scaling applied, among others. But what will it get me?