Hi,
I am trying to run a Quantum ESPRESSO SCF calculation on a workstation with two Intel Xeon Gold 5120 CPUs @ 2.20 GHz. It has 56 cores and 96 GB of RAM.
I am trying to do a parallel calculation on this workstation by using:
mpirun -np (no of cores) pw.x -npool (no of pools) -inp si.pw.in
Following advice I found online, I have tried to improve performance by setting OMP_NUM_THREADS=1 and I_MPI_PIN_DOMAIN=1.
Can anyone please guide me on how to choose the optimal number of cores and the number of pools for this calculation?
The input file is attached below.
The FFT grid dimensions are (48 48 48) and the maximum number of k-points is 40000.
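To make the question more concrete, here is a minimal sketch of the combinations I am considering. It assumes that the number of pools has to divide the number of MPI processes and should not exceed the number of k-points. The function names and the restriction to process counts that divide 56 evenly are my own choices for illustration, and I pass 40000 only as an upper bound because it is the figure quoted above, not the actual k-point count of my run.

```python
def candidate_layouts(max_procs, n_kpoints):
    """Return (nproc, npool) pairs with npool dividing nproc and npool <= n_kpoints."""
    layouts = []
    for nproc in range(1, max_procs + 1):
        # Assumption: only try process counts that divide the machine evenly.
        if max_procs % nproc != 0:
            continue
        for npool in range(1, nproc + 1):
            # Each pool must receive the same number of processes.
            if nproc % npool == 0 and npool <= n_kpoints:
                layouts.append((nproc, npool))
    return layouts


def command(nproc, npool, input_file="si.pw.in"):
    """Build the command line in the same form as the one quoted above."""
    return f"mpirun -np {nproc} pw.x -npool {npool} -inp {input_file}"


if __name__ == "__main__":
    # 56 and 40000 are the figures from this post; 40000 is only an upper bound.
    for nproc, npool in candidate_layouts(56, 40000):
        print(command(nproc, npool))
```

My plan would be to time each printed command on the same input and compare wall times, unless there is a better rule of thumb.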
Subsidiary Questions:
1. Should the subspace diagonalization in the iterative solution of the eigenvalue problem be run with a serial algorithm or with the ELPA distributed-memory algorithm?
2. Should the maximum dynamical memory per process be high or low for better performance?