MATLAB is not a compiled language, as you well know, and loops are very inefficient. If an operation cannot be vectorized, which is often difficult, then the only other option is parfor. I've done extensive testing comparing "parfor" to "for" for an ongoing project. The parfor structure will force you to write much better software, e.g. preallocate memory, remove any dependence on the order of the loop iterations, etc. If the code is mostly computational, then your efficiency will be higher than if there is some associated "accounting code."
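To make the restructuring concrete, here is a minimal sketch of a loop written the way parfor wants it. The size n and the per-iteration computation are made up for illustration; the point is only that the output is preallocated and each iteration touches nothing but its own element. As far as I know, without the Parallel Computing Toolbox the parfor below just runs serially.

```matlab
n = 200;                  % number of independent iterations (illustrative)
result = zeros(1, n);     % preallocate the output once, up front

parfor k = 1:n
    % Each iteration depends only on k and writes only result(k),
    % so the iterations can execute in any order.
    A = magic(4) + k;     % stand-in for the real per-iteration work
    result(k) = sum(A(:));
end
```

If the loop body reads something written by an earlier iteration, it has to be reworked before parfor will accept it.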
My experience on this project: on a MacBook Pro Retina with a four-core i7 and 16 GB of memory, about a 1.7x to 1.9x speed-up in execution time. On a Mac Pro with an 8-core Xeon and 64 GB of memory, about a 3x speed-up. It may have been more, but parfor gave us real-time throughput (with the processor waiting) on the Mac Pro, so the speed-up was measured on playback from disk, which introduces inefficiencies of its own.
The one thing is that you need to debug your code using a for loop, since you cannot set breakpoints within a parfor loop, so changing "for" to "parfor" is the last thing you do.
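One way to keep both versions around while developing is sketched below. The useParallel flag and the work function are hypothetical names of my own, and the loop body is a stand-in; treat it as a pattern, not a recipe.

```matlab
useParallel = false;          % hypothetical flag: keep false while debugging
n = 16;
out = zeros(1, n);            % preallocated output
work = @(k) sqrt(k) + 1;      % stand-in for the real loop body

if useParallel
    parfor k = 1:n
        % feval, so that parfor does not treat work(k) as indexing a variable
        out(k) = feval(work, k);
    end
else
    for k = 1:n
        out(k) = feval(work, k);   % breakpoints can be set in this branch
    end
end
```

Once the serial branch gives the right answers, flip the flag and check that the results match.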
I don't have specific experience with MATLAB parfor loops, but in general, the speedup increases as the work done in each pass of the loop increases; otherwise, the overhead associated with thread coordination/communication dominates.
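A rough sketch of that idea: rather than one parallel pass per tiny work item, hand each pass a whole block of items. The numbers here (n, nChunks) and the summed quantity are assumptions chosen only to keep the example small.

```matlab
n = 10000;                 % many cheap work items (illustrative)
nChunks = 8;               % assumed: roughly the number of workers
edges = round(linspace(0, n, nChunks + 1));   % chunk boundaries
partial = zeros(1, nChunks);

parfor c = 1:nChunks
    idx = (edges(c) + 1):edges(c + 1);   % items handled by this pass
    partial(c) = sum(idx .^ 2);          % a whole block of work per pass
end

total = sum(partial);      % combine the per-chunk results
```

With only nChunks passes, the per-pass overhead is paid a handful of times instead of n times.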
If all operations are vectorized (i.e. written without explicit loops), then the core MATLAB engine usually does a pretty good job of distributing the computational load evenly over all available cores, for a nice speedup.
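For anyone unsure what "vectorized" means here, a small sketch with a made-up formula: the same computation written as an element-by-element loop and as a single array expression. Whether a given operation actually gets spread over several cores depends on the function and the array size, so take that part as a general tendency.

```matlab
x = linspace(0, 1, 1000);     % illustrative input

% Loop version: one element per iteration.
yLoop = zeros(size(x));
for k = 1:numel(x)
    yLoop(k) = x(k)^2 + 3*x(k);
end

% Vectorized version: one expression over the whole array.
yVec = x.^2 + 3*x;
```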
One (maybe trivial) thing to keep in mind is that memory really constrains your parallel code. As Truman Pravatt already pointed out, the execution of the parfor may be constrained by disk I/O. If your loop at some point runs out of "normal" memory (since a parallel execution may create much more temporary data at the same point in time) and starts swapping, then PARFOR will be severely slower than FOR.