I’m currently reading some 2017 slides from Boston University on OpenACC, and they use
pgprof as the profiler for OpenACC code. It produces output like the one shown below:
This basically shows a breakdown of device time for each compute region, per data region and per kernel.
From the forums I understand that PGI was acquired by NVIDIA and that pgprof has now been merged with nvprof (forgive me if I got this wrong). Is this view still available? If not, is there a view in
nvprof that shows a similar breakdown of kernel region information, such as device time, calls, and data region information, in a structure as clear as the old pgprof's?
From experimenting, I couldn’t find anything similar except
--print-gpu-trace, but it’s still not as clear as the output above.
Any suggestions would be greatly appreciated.