I trained a 3-layer LSTM network in Keras to extract d-vector embeddings. I extracted MFCC features from the TIMIT dataset as input to the model, and defined a custom loss function (the GE2E loss).

After fewer than 213444 batches I got zero loss (on both the train and dev sets). However, when I use the model to predict d-vectors (even on inputs from the training set) I keep getting nearly the same output: the cosine similarity between any two output vectors is 0.99999xxx.
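For reference, this is how I confirm the collapse numerically: a quick pairwise cosine-similarity check over the predicted d-vectors (plain NumPy sketch; the function name is mine):

```python
import numpy as np

def pairwise_cosine(vectors):
    """Pairwise cosine similarity between rows of `vectors` (n, dim)."""
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)  # unit-normalize rows
    return v @ v.T  # (n, n) matrix of cosine similarities
```

On a healthy model, random utterances from different speakers should give off-diagonal values well below 1; in my case every off-diagonal entry is 0.99999xxx.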

I double-checked the code and the loss-function implementation, and both seem correct.
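As a sanity check, here is a reference implementation of the GE2E softmax loss (Wan et al.) in plain NumPy that I compare my Keras version against (a sketch under my reading of the paper; `w` and `b` are the learnable scale and bias, fixed here for testing). Note that with fully collapsed embeddings this loss cannot reach zero, since the softmax over speakers becomes uniform:

```python
import numpy as np

def ge2e_softmax_loss(emb, n_speakers, n_utt, w=10.0, b=-5.0):
    """GE2E softmax loss on a batch of (n_speakers * n_utt, dim) embeddings,
    rows grouped by speaker. Returns the total loss over the batch."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # unit-length d-vectors
    emb = emb.reshape(n_speakers, n_utt, -1)
    centroids = emb.mean(axis=1)                            # (n_speakers, dim)
    loss = 0.0
    for j in range(n_speakers):
        for i in range(n_utt):
            e = emb[j, i]
            sims = np.empty(n_speakers)
            for k in range(n_speakers):
                if k == j:
                    # exclude the utterance itself from its own centroid
                    c = (emb[j].sum(axis=0) - e) / (n_utt - 1)
                else:
                    c = centroids[k]
                c = c / np.linalg.norm(c)
                sims[k] = w * float(e @ c) + b
            # softmax cross-entropy against the true speaker j
            loss += np.logaddexp.reduce(sims) - sims[j]
    return loss
```

If all embeddings are identical, every similarity is equal and the loss is exactly `n_speakers * n_utt * log(n_speakers)`, not zero, which makes me suspect the zero loss and the collapsed outputs together point at a bug somewhere in my batch construction or loss wiring.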

Any idea what might cause such a problem?
