I usually apply the integration scheme to the globally assembled matrices. However, applying it to the local (element) matrices would let me parallelize my code better.
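To make the idea concrete, here is a minimal sketch, assuming the integration scheme only needs matrix-vector products (as explicit schemes typically do). It shows that summing the local products element by element gives the same result as multiplying by the assembled global matrix. The function names, the three-element 1D mesh, and the 2x2 local matrix are all hypothetical and chosen only for illustration.

```python
def assemble_global(local_mats, connectivity, n_dofs):
    """Scatter-add each local matrix into a dense global matrix."""
    K = [[0.0] * n_dofs for _ in range(n_dofs)]
    for Ke, dofs in zip(local_mats, connectivity):
        for a, i in enumerate(dofs):
            for b, j in enumerate(dofs):
                K[i][j] += Ke[a][b]
    return K


def global_matvec(K, u):
    """Plain dense matrix-vector product with the assembled matrix."""
    return [sum(kij * uj for kij, uj in zip(row, u)) for row in K]


def local_matvec(Ke, dofs, u):
    """Product of one local matrix with the gathered local values of u."""
    ue = [u[j] for j in dofs]
    return [sum(k * x for k, x in zip(row, ue)) for row in Ke]


def elementwise_matvec(local_mats, connectivity, u):
    """Same product as global_matvec, without ever forming the global matrix."""
    # The local products are independent of each other, so this step
    # is the part that could be distributed over threads or processes.
    contributions = [
        local_matvec(Ke, dofs, u) for Ke, dofs in zip(local_mats, connectivity)
    ]
    # The scatter-add touches shared entries of y (nodes shared between
    # elements), so a parallel version needs coloring, atomics, or a reduction.
    y = [0.0] * len(u)
    for ye, dofs in zip(contributions, connectivity):
        for a, i in enumerate(dofs):
            y[i] += ye[a]
    return y


# Hypothetical example: three two-node elements on a 1D line, four DOFs.
ke = [[1.0, -1.0], [-1.0, 1.0]]
local_mats = [ke, ke, ke]
connectivity = [[0, 1], [1, 2], [2, 3]]
u = [0.0, 1.0, 3.0, 6.0]

K = assemble_global(local_mats, connectivity, 4)
y_global = global_matvec(K, u)
y_local = elementwise_matvec(local_mats, connectivity, u)
print(y_global, y_local)
```

Both calls should give the same vector. This equivalence does not carry over directly to schemes that need to solve a linear system with the global matrix (for example implicit ones), where an element-by-element approach would usually have to be combined with an iterative solver.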
