Hello,
After reading the well-known publication "More Bang for your Bucks", I built a workstation of my own with a Xeon CPU (E5 series, >3 GHz), one RTX GPU and one GTX GPU. For single runs of systems with approx. 60k atoms, I am getting 140-150 ns/day.
The problem starts when I try to run two simulations in parallel without oversubscribing the 16 available threads. I am not even going beyond 8 threads.
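To make the setup concrete, here is a minimal Python sketch of how two non-overlapping mdrun launches could be put together. It is only an illustration, not my exact commands: the helper name `mdrun_commands`, the `run1`/`run2` file prefixes, the GPU ids 0 and 1, the even 8/8 thread split and the pin offsets are all assumptions, and the right `-pinstride` depends on the CPU topology.

```python
def mdrun_commands(n_runs, total_threads, gpu_ids, prefix="run"):
    """Build one `gmx mdrun` command line per simulation.

    Each run gets an equal share of the hardware threads and a pin
    offset that keeps its threads off the cores used by the other runs.
    """
    if n_runs != len(gpu_ids):
        raise ValueError("need exactly one GPU id per run")
    threads_per_run = total_threads // n_runs
    commands = []
    for i, gpu in enumerate(gpu_ids):
        args = [
            "gmx", "mdrun",
            "-deffnm", f"{prefix}{i + 1}",            # hypothetical file prefix
            "-ntmpi", "1",                            # one thread-MPI rank per run
            "-ntomp", str(threads_per_run),           # OpenMP threads for this run
            "-pin", "on",                             # pin threads to cores
            "-pinoffset", str(i * threads_per_run),   # assumed disjoint core range
            "-pinstride", "1",                        # assumption, topology dependent
            "-gpu_id", str(gpu),                      # one GPU per run
        ]
        commands.append(" ".join(args))
    return commands


if __name__ == "__main__":
    # Two runs sharing 16 hardware threads, one GPU each.
    for cmd in mdrun_commands(2, 16, [0, 1]):
        print(cmd)
```

The point of the sketch is only that each run gets its own GPU and its own block of pinned threads, so neither run should be competing with the other for cores.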
For a single run, the PME/PP ratio is around 1.04-1.05 and the load imbalance is around 2-3% with DLB on. The Fourier spacing is kept at 0.10 nm and the cut-offs at 1.0 nm.
Is there any specific reason for this? Is there any way to solve this?