I have a corpus with 50 speakers, and I am a bit confused about how I should divide it to set up cross-validation.
I am using a GMM-UBM approach, so I have to divide the corpus into speakers used to train the UBM and speakers used for testing. The latter also need to be subdivided into registered (enrolled) speakers and impostors.
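To make the question concrete, here is a minimal Python sketch of the kind of split I have in mind. The 5 folds, the 5 registered speakers per fold, the function name `speaker_folds` and the `spkNN` speaker IDs are only assumptions for illustration, not an established protocol. In each fold, 40 speakers would train the UBM and the 10 held-out speakers would be divided into registered speakers and impostors.

```python
import random


def speaker_folds(speakers, n_folds=5, n_registered=5, seed=0):
    """Yield (ubm, registered, impostors) speaker lists for each fold.

    The three lists are disjoint within a fold, and every speaker is
    held out for testing in exactly one fold.
    """
    rng = random.Random(seed)
    shuffled = list(speakers)
    rng.shuffle(shuffled)
    for k in range(n_folds):
        # Held-out test speakers: every n_folds-th speaker, starting at k.
        test = shuffled[k::n_folds]
        held_out = set(test)
        # All remaining speakers train the UBM for this fold.
        ubm = [s for s in shuffled if s not in held_out]
        # Split the held-out speakers into registered speakers and impostors.
        registered, impostors = test[:n_registered], test[n_registered:]
        yield ubm, registered, impostors


# Hypothetical speaker IDs for a 50-speaker corpus.
speakers = [f"spk{i:02d}" for i in range(50)]
folds = list(speaker_folds(speakers))

for ubm, registered, impostors in folds:
    print(len(ubm), len(registered), len(impostors))
```

The split is done at the speaker level rather than the utterance level, so no test speaker contributes data to the UBM of the fold in which it is tested.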
Any material on how to do this would be greatly appreciated.
Thank you very much.