We are benchmarking an urban case with two nested domains at resolutions of 36 km and 12 km. Comparing the results, it seems that with the same setup the run takes almost twice as long on 128 processors across parallel machines as it does on 64 processors in a single machine. Does anyone know what causes this difference? Could it be an issue with the efficiency of the communication between processors?
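
To put a number on the slowdown, here is a minimal Python sketch of the relative speedup and parallel efficiency between the two runs. The timings are normalized placeholders (1.0 for the 64-processor run, 2.0 for the 128-processor run, matching "almost twice"), not measured values, and the function names are just for illustration.

```python
def relative_speedup(t_base, t_new):
    """Speedup of the new run relative to the baseline run (>1 means faster)."""
    return t_base / t_new


def relative_efficiency(t_base, p_base, t_new, p_new):
    """Speedup divided by the factor by which the processor count grew.

    1.0 would mean perfect scaling from p_base to p_new processors.
    """
    return relative_speedup(t_base, t_new) / (p_new / p_base)


# Placeholder wall-clock times, normalized to the 64-processor run.
# These are assumptions standing in for the real benchmark timings.
t_64 = 1.0   # single machine, 64 processors
t_128 = 2.0  # parallel machines, 128 processors ("almost twice" the time)

speedup = relative_speedup(t_64, t_128)
efficiency = relative_efficiency(t_64, 64, t_128, 128)

print(f"relative speedup:    {speedup:.2f}")
print(f"relative efficiency: {efficiency:.2f}")
```

With these placeholder numbers the speedup is 0.5 and the relative efficiency is 0.25, i.e. doubling the processor count made the run slower rather than faster. Plugging in the actual wall-clock times gives the figures to compare against other configurations.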