If one computes multiple topic models on the same corpus, is there a measure that allows one to choose the best model? I looked at various internal coherence metrics, including C_UMass, as well as external measures such as C_UCI, but I found that models trained with a low alpha can score higher on these measures than models trained with a high alpha, even though the low-alpha models are clearly inferior.
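
To make concrete what such a coherence score measures, here is a minimal pure-Python sketch of a UMass-style coherence for a single topic. It is an illustration under stated assumptions, not the implementation of any particular library: the function name `umass_coherence`, the toy corpus, the smoothing constant `eps=1.0`, and the choice to average over word pairs rather than sum are all mine.

```python
import math


def umass_coherence(top_words, documents, eps=1.0):
    """UMass-style coherence for one topic (illustrative sketch).

    top_words: the topic's top words, ordered from most to least probable.
    documents: the training corpus as a list of token lists.
    eps: smoothing constant added to the co-document count (assumed 1.0).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every one of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for i in range(1, len(top_words)):
        for j in range(i):
            w_i, w_j = top_words[i], top_words[j]
            d_j = doc_freq(w_j)
            if d_j == 0:
                # Skip pairs whose conditioning word never occurs.
                continue
            scores.append(math.log((doc_freq(w_i, w_j) + eps) / d_j))

    # Averaging over pairs is an assumption; some formulations sum instead.
    return sum(scores) / len(scores) if scores else float("nan")


# Hypothetical toy corpus, used only to exercise the function.
docs = [
    ["cat", "dog", "pet"],
    ["cat", "dog", "vet"],
    ["stock", "market", "trade"],
    ["stock", "market", "bank"],
]

coherent_topic = umass_coherence(["cat", "dog"], docs)
mixed_topic = umass_coherence(["cat", "stock"], docs)
print(coherent_topic, mixed_topic)
```

The score depends only on how often each topic's top words co-occur in documents. It does not look at the rest of the topic-word distribution or at the document-topic mixtures that alpha controls, which may be part of why it can rank models differently from a human reading of the topics.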