I have a Navier-Stokes-based Numerical Wave Tank (NWT) code that has been parallelized using MPI (Message Passing Interface). I am running this code on a single multicore node/computer (an i7 machine with 4 GB of RAM).
The problems I am currently simulating (using my parallel NWT code) are two-dimensional and not memory-intensive (the maximum number of computational cells is < 10^5); in other words, the serial version of the code is also able to simulate these problems (albeit more slowly) without running into memory issues.
I reckon that the computing architecture I am using (a single node) is better suited to OpenMP-based parallel codes (shared-memory model) than to MPI-based codes (distributed-memory model).
So my question is this: would the speedup (ψ) be adversely affected if a non-memory-intensive CFD code were parallelized using MPI and then run on a single multicore machine/node (say, an i7)?
P.S.: I ask this question because I am struggling to achieve linear speedup (ψ = number of cores) with my code; currently ψ_max < 4 on eight cores, irrespective of the nature of the problem being simulated.
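For context on the numbers above: Amdahl's law alone can already cap ψ well below the core count, independent of MPI vs. OpenMP. The sketch below (plain Python, just illustrative arithmetic, not part of my NWT code) inverts Amdahl's law to show what serial fraction would be consistent with the observed ψ ≈ 4 on eight cores.

```python
def amdahl_speedup(f, p):
    """Amdahl's law: speedup on p cores if a fraction f of the
    runtime is perfectly parallelizable (the rest is serial)."""
    return 1.0 / ((1.0 - f) + f / p)

def parallel_fraction(psi, p):
    """Invert Amdahl's law: given a measured speedup psi on p cores,
    return the implied parallelizable fraction f."""
    return (1.0 - 1.0 / psi) / (1.0 - 1.0 / p)

# Observed: psi = 4 on p = 8 cores
f = parallel_fraction(4.0, 8)
print(f"implied parallel fraction f = {f:.3f}")          # ~0.857
print(f"speedup ceiling as p -> inf: {1.0/(1.0 - f):.1f}x")  # 7.0x
```

So even a modest ~14% serial/communication fraction would limit ψ to 4 on eight cores and to 7 asymptotically; on a single node, memory bandwidth shared between ranks typically adds to that serial-like overhead.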