I am working on language identification through i-vector.But i am thinking that if a language which use two language like if we speak hindi and some time u prefer some english word than that type of data set shows some problem for model which built for corresponding language.and it also decrease the performance of model