Can I profile OpenACC kernels at the C source-code level?

Hi.
I’m trying to speed up my code with OpenACC,
so I want to profile it at the source level.
I’m using the ‘nvvp’ profiler from CUDA 7.0.
When I run nvvp, I can use the ‘Analysis’ tab to see which kind of latency slows my code down (data dependency, conditional branch, bandwidth, etc.).

However, I only get kernel-level analysis, not line-based analysis
(e.g. the main_300_gpu kernel took 10 s).

Is there any way to profile my code at the source level?


I’m using
PGI 15.7 (using pgcc)
CUDA 7.0
GTX 960
Ubuntu 14.04 LTS x86_64

Hi slee91955,

Try adding “-ta=tesla:lineinfo” to your compile flags. This embeds line information in the generated kernel, which NVVP can then use to correlate profiling results with your source lines.
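
As a rough illustration only (not taken from the original code), here is a minimal sketch of an OpenACC loop and how the flag might be passed to pgcc. The file name saxpy.c, the function saxpy, and the extra -acc and -Minfo=accel options on the build line are assumptions made for this example; adapt them to your own build.

```c
/* Hypothetical file: saxpy.c
 *
 * Assumed build line for PGI 15.x (only -ta=tesla:lineinfo comes from the
 * suggestion above; the other options are illustrative):
 *   pgcc -acc -ta=tesla:lineinfo -Minfo=accel -c saxpy.c
 */
#include <assert.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * With an OpenACC compiler the loop is offloaded to the GPU; a compiler
 * without OpenACC support ignores the pragma and runs the loop on the host. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    assert(n >= 0); /* host-side sanity check before the offloaded region */

    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Once the application is rebuilt this way and profiled again, the per-kernel results in NVVP should be able to point back to individual lines of the loop body rather than only to the kernel as a whole.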

  • Mat