26 September 2017

I had always assumed that, to generate a set of MFCCs for speech synthesis using Hidden Markov Models, there was one HMM per mel-cepstral coefficient (that is, 12 HMMs), plus one HMM for the pitch and yet another for durations. Apparently people just use a single HMM for all the variables. Is it possible to do it the way I first described, and if so, is it efficient?
