Dear Munesh, I obviously can, but I'm wondering which profiling options I should use to get a really detailed code profile with source code correlation.
Dear Ion, thank you for your reply, but unfortunately these profiling options don't give me a source-correlated profile. What I'm actually looking for is the profiler setting that produces a result like this: http://devblogs.nvidia.com/parallelforall/cuda-pro-tip-view-assembly-code-correlation-nsight-visual-studio-edition/
This comes from a quick skim through Google, combined with general experience that isn't specific to CUDA. From the documentation:
"The -G option to nvcc forces the compiler to generate debug information for the CUDA application. To generate line number information for applications without affecting the optimization level of the output, the -lineinfo option to nvcc can be used."
This gives you two things: the -G option generates debug information for the device code, but it also turns off device code optimizations, so it is best left out of a build you want to profile (nvprof does not need it and runs fine without it).
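Then, -lineinfo generates the line-number information mentioned in item 1 of your link, which is what the tools use to correlate source lines with assembly. To go further, I'm just guessing, but you will probably have to look into the --events and --metrics options of nvprof in the documentation and find exactly what you want to collect.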
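When you first compile, do you ask nvcc to emit the assembly code? With gcc the switch for that is -S. As far as I know nvcc does not use -S; the closest equivalents are -ptx, which emits the PTX intermediate assembly, and running cuobjdump -sass on the built binary, which dumps the machine-level SASS. Please check the documentation for your toolkit version to confirm.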
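To make the flags above concrete, here is a minimal sketch of how I would try it. The file name saxpy.cu, the output names and the kernel itself are made up for illustration, and I have not run this on your setup, so treat the command lines in the header comment as a starting point rather than a recipe.

```cuda
// saxpy.cu (hypothetical file name): a small kernel to experiment with.
//
// Build with line-number info and optimizations left on (note: no -G):
//   nvcc -O3 -lineinfo -o saxpy saxpy.cu
// Collect analysis metrics into a file the Visual Profiler can import:
//   nvprof --analysis-metrics -o saxpy.nvprof ./saxpy
// Look at the assembly outside the GUI:
//   nvcc -lineinfo -ptx -o saxpy.ptx saxpy.cu   (PTX)
//   cuobjdump -sass saxpy                       (SASS from the binary)
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// y[i] = a * x[i] + y[i] for every i below n.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main()
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Host data: x is all ones, y is all twos.
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);

    float *dx = nullptr, *dy = nullptr;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy.data(), bytes, cudaMemcpyHostToDevice);

    // One thread per element, 256 threads per block.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, 2.0f, dx, dy);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(hy.data(), dy, bytes, cudaMemcpyDeviceToHost);
    std::printf("y[0] = %f\n", hy[0]);

    cudaFree(dx);
    cudaFree(dy);
    // Reset the device before exit so the profiler can flush its data.
    cudaDeviceReset();
    return 0;
}
```

The point of building with -lineinfo instead of -G is that the profile then reflects optimized code while the tools can still map instructions back to source lines.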