Thanks for your answer. The version you suggest is actually the one I'm already using, but it is still quite slow on the CEC Large Scale Contest. I was also wondering whether there is a newer version that exploits separability more explicitly.
Loshchilov, I. (2014, July). A computationally efficient limited memory CMA-ES for large scale optimization. In Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation (pp. 397-404). ACM.