I have code that runs in Python, and it takes more than 10 minutes to complete a calculation. I am therefore looking for tricks or techniques to optimize the running time of this code.
Simply switching to C/C++/Matlab is usually not a good solution unless the researcher is already skilled in those languages, especially C/C++. The time they spend just finding their way around will overwhelm any computational speed gains they might make.
In any case, Python's computational libraries (numpy and scipy) are based on C code anyway, so a dramatic increase in computational speed depends on the level of optimization, and that is much more dependent on the user's skill with a particular language than on the language itself.
A few recommendations for code optimization:
1. Use the built-in multiprocessing library to parallelize computation. I recommend the Pool.map_async method in case the computational sub-tasks don't all take exactly the same amount of time to finish. Generally, this works if your model involves lots of computations that are independent of each other. For example, Monte Carlo simulations are perfectly suited to this approach (see the first sketch after this list).
2. Use pre-computation. For example, if you are doing combinatorics, there will be lots of pre-factors that are integer-only, so there is only a finite number of combinations you will actually need. Simply compute a table of these values once (see the second sketch after this list). There is a memory vs. speed trade-off here, but memory is usually plentiful; just make sure you don't exceed your memory limits. Even pre-computing factors like the square root of pi can save you time (square roots are comparatively slow).
2a. Look carefully at your computation procedure and try to find chunks of values that can be calculated once and re-used. For example, some function optimization routines rely on computing the function's Jacobian. All of these derivatives look very similar, especially if your function has exponential terms, so you can compute the exponential term once and then use it in every element of the Jacobian.
3. Speaking of integers: if part of the calculation cycle can be done using pure integers, do it in integers and convert to floats at the appropriate time. Integer computation is almost always significantly faster than floating-point operations.
4. if statements and for loops are generally slow in pure Python. Use either vectorization (numpy arrays support it intrinsically) or list comprehensions when iterating over quantities (see the third sketch after this list). It's also more "Pythonic" this way.
5. Following Daewonn Lee's advice: profile your code to figure out where you're losing time. It's possible you are not stuck on computations but rather on reading from / writing to disk.
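Below are three minimal sketches illustrating points 1, 2, and 4. The function names, sample counts, and array sizes are invented for illustration; adapt them to your own model.

A sketch of point 1, parallel Monte Carlo with Pool.map_async:

    import random
    from multiprocessing import Pool

    def mc_chunk(n_samples):
        # One independent sub-task: count hits inside the unit quarter-circle.
        rng = random.Random()
        return sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0
                   for _ in range(n_samples))

    if __name__ == "__main__":
        chunks = [250_000] * 8  # eight independent sub-tasks
        with Pool() as pool:
            # map_async returns immediately; workers pull chunks as they
            # free up, so uneven task durations waste less time.
            result = pool.map_async(mc_chunk, chunks)
            hits = sum(result.get())
        print("pi ~", 4 * hits / sum(chunks))

A sketch of points 2 and 2a, pre-computing reusable values once:

    import math

    MAX_N = 1000
    # Table of factorials 0! .. MAX_N!, built once (memory vs. speed trade-off).
    FACT = [1] * (MAX_N + 1)
    for n in range(1, MAX_N + 1):
        FACT[n] = FACT[n - 1] * n

    SQRT_PI = math.sqrt(math.pi)  # computed once, reused everywhere

    def comb(n, k):
        # n choose k straight from the table instead of recomputing factorials.
        return FACT[n] // (FACT[k] * FACT[n - k])

A sketch of point 4, replacing an explicit loop with numpy vectorization:

    import math
    import numpy as np

    x = np.linspace(0.0, 10.0, 1_000_000)

    # Slow: element-by-element Python loop.
    y_loop = np.empty_like(x)
    for i in range(x.size):
        y_loop[i] = math.sin(x[i]) * math.exp(-x[i])

    # Fast: one vectorized expression evaluated in compiled numpy code.
    y_vec = np.sin(x) * np.exp(-x)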
I don't think 10 minutes is that long, but if time is of the essence, consider choosing a different language that is faster and more efficient, such as C, C++, or Java.
Also, you might want to compute in parallel some of the functions that are independent of each other.
You can extend Python with C or C++ for the numerically intensive parts of the program; see the first link below. To identify those parts, you can use profiling; see the second link below. Sometimes, performance issues are caused by nested loops. You can then often optimize the code by computing quantities needed in the innermost loops at least partly outside the loop structure and storing them.
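As an illustration of that last point, here is a minimal sketch (the array and the weight function are invented for the example) of hoisting a loop-invariant quantity out of an inner loop:

    import math

    a = [[float(i + j) for j in range(200)] for i in range(200)]

    # Before: math.exp(-i) is recomputed in every inner iteration.
    total = 0.0
    for i in range(200):
        for j in range(200):
            total += math.exp(-i) * a[i][j]

    # After: the factor depends only on i, so compute it once per outer pass.
    total = 0.0
    for i in range(200):
        w = math.exp(-i)      # hoisted out of the inner loop
        total += w * sum(a[i])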
You can use parallel programming for faster execution.
First, find out which part of your code takes the most time; for that you can use Intel's profiling tools. Then apply a parallel approach. An MPI library (mpi4py) is available for Python.
C/C++ will help the code run faster than Python.
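As a minimal sketch of the MPI approach (assuming the mpi4py package; the work function and task count are placeholders):

    # Run with, e.g.: mpiexec -n 4 python script.py
    from mpi4py import MPI

    def heavy_task(i):
        return i * i  # placeholder for one independent unit of work

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Each rank takes every size-th task (simple static partitioning).
    local_sum = sum(heavy_task(i) for i in range(rank, 1000, size))

    # Combine partial results on rank 0.
    total = comm.reduce(local_sum, op=MPI.SUM, root=0)
    if rank == 0:
        print("total:", total)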
Without information about what you're computing, or some code snippets, you cannot receive answers that will really be helpful to you.
In a few answers, parallel computation or switching to C has been suggested. However, I think you should avoid those options as long as simpler issues remain to be solved (I like the list provided by Boris L. Glebov).
I think that depends on your specific problem, but in general, when your Python code is too slow, you can follow these steps:
(1) Use a profiling tool such as cProfile or kernprof to measure the execution time of your code and find the performance bottleneck (see the sketch after this list).
(2) Try to improve the data structures and remove unnecessary computation in the bottleneck code. If necessary, implement that part in C and use it as a library from Python.
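For step (1), a minimal cProfile sketch (the functions here are placeholders for your own code):

    import cProfile
    import pstats

    def bottleneck():
        return sum(i * i for i in range(2_000_000))

    def main():
        return bottleneck()

    cProfile.run("main()", "profile.out")
    # Print the ten entries with the largest cumulative time.
    pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)

From the command line, running "python -m cProfile -s cumtime your_script.py" gives the same information without modifying the code.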
Some of you noted that more needs to be known about the source code; indeed, this is important for narrowing down the problem. The code calculates heat transfer between a huge number of areas, which need meshing. It uses the view factor concept, which involves massive integral calculations.
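Since the bottleneck appears to be pairwise view-factor integrals, here is a hedged sketch of evaluating one such double area integral with a vectorized numpy midpoint rule instead of nested Python loops. The geometry (two parallel unit squares at separation d) is invented for illustration; the real integrand and mesh come from your own model:

    import numpy as np

    n = 50                        # quadrature points per axis
    d = 1.0                       # separation between the two parallel patches
    u = (np.arange(n) + 0.5) / n  # midpoint nodes on [0, 1]

    # All combinations of the four integration variables, via broadcasting.
    x1, y1, x2, y2 = np.meshgrid(u, u, u, u, indexing="ij", sparse=True)

    r2 = (x1 - x2) ** 2 + (y1 - y2) ** 2 + d ** 2
    # cos(t1) * cos(t2) / (pi * r^2), with cos(t) = d / r for parallel patches.
    integrand = d * d / (np.pi * r2 ** 2)

    # Midpoint rule: mean of the integrand times the domain measure (1 here).
    F12 = integrand.mean()
    print("view factor ~", F12)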