Hello, I am working on materials simulation and I am new to all of this.
I need to parallelize my code within a single shared-memory node, using 16 cores.
The server looks like this:
----------- node -----------
| mem1                mem3 |
|      cpu1 === cpu2       |
| mem2                mem4 |
----------------------------
The two CPUs (16x2 cores) are connected to each other, and there are 4 memory slots (several memory cards per slot).
I heard that OpenMP (OMP) is better for shared memory. However, my problem is that synchronization inside a loop is not available there. So I am trying to do shared-memory parallelization with MPI instead.
I heard that cpu1 can access mem3 and mem4 without MPI communication.
Is it possible to use 16 cores with shared memory within a single node using MPI, but without MPI communication calls such as MPI_Send()?
Thank you in advance!