With a Gaussian mixture the model's posterior probability for a label is never fully certain; the general idea of expectation maximisation is to iteratively adjust the model parameters so that the goodness of fit improves. The SAEM and SEM algorithms of Celeux et al. are designed to converge more quickly and more robustly, and to handle missing data more efficiently, than plain EM. They can still get stuck in local minima if those exist; that is a feature of your specific problem formulation, not of the algorithm, and a convex approximation is often useful.
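For the local-minima point, a common practical mitigation is simply to run several random initialisations and keep the fit with the best log-likelihood. A minimal sketch, assuming Python with numpy and scikit-learn (the toy data `X` is made up for illustration, not taken from your problem):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy 1-D data from two well-separated Gaussians (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(200, 1)),
               rng.normal(3.0, 0.5, size=(200, 1))])

# n_init=10 runs EM from 10 random starts and keeps the best local optimum.
gmm = GaussianMixture(n_components=2, n_init=10, random_state=0)
gmm.fit(X)

print("best average log-likelihood:", gmm.score(X))
print("estimated means:", gmm.means_.ravel())
```

Restarts do not change the fact that each run only reaches a local optimum; they just make it more likely that the best run is close to the global one.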
You seem to be working with raw probabilities (likelihoods) instead of log-likelihoods. This will cause underflow errors. Viewed in the log domain, each Gaussian is a paraboloid, and the GMM mixture log-density used in the "E" step is always slightly greater than the maximum of the individual paraboloid log-functions: log(p+q) > max(log(p), log(q)). The "M" step of a single GMM does not require you to compute "E" for all points; it only requires finding the "most likely" classification for each sample point given a set of parameters, which is just max(p). With the "double GMM" block method this should work too if you update only one mixture while holding the other constant, but I am sorry, I am not familiar with that.
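A minimal sketch of that log-domain arithmetic, assuming Python with numpy/scipy (the weights, means and sample points are made up for illustration). It shows the log-sum-exp combination that avoids underflow, and the per-point argmax that gives the "most likely" component:

```python
import numpy as np
from scipy.stats import norm
from scipy.special import logsumexp

x = np.linspace(-50.0, 50.0, 5)              # sample points, some far out in the tails
log_w = np.log(np.array([0.5, 0.5]))         # log mixture weights
mus = np.array([-1.0, 2.0])
sigmas = np.array([1.0, 1.5])

# Log of the weighted component densities, shape (n_points, n_components).
# norm.pdf would underflow to 0 for the extreme points; logpdf does not.
log_pk = log_w + norm.logpdf(x[:, None], mus, sigmas)

# Mixture log-density log(p+q), computed stably with log-sum-exp.
log_mix = logsumexp(log_pk, axis=1)

# Hard "most likely" classification: argmax of the per-component log terms.
hard_label = np.argmax(log_pk, axis=1)

print(log_mix)      # finite even where raw densities would underflow
print(hard_label)

# log_mix >= log_pk.max(axis=1) always holds: log(p+q) > max(log p, log q).
```

If you also need the soft responsibilities for a standard "E" step, they are `np.exp(log_pk - log_mix[:, None])`, which stays well-behaved because the subtraction happens in the log domain.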