Nsight Visual Studio Edition and OpenACC Kernels

In version 12.x of Visual Fortran, when I used NVIDIA’s Nsight to profile and analyze OpenACC accelerator regions, they would appear as actual kernels in the profiling output and show the number of blocks and threads along with other useful information. Now with version 13.2, they seem to appear only as a call to cuMemFree_v2. It looks like the kernels aren’t even being registered by the profiler. Is there a way to get around this issue?

Thanks,
~David

Hi David,

While I’m not positive, my best guess is that there is a mismatch between the CUDA versions. In later versions of 12.x we targeted CUDA 4.1 by default, but we moved to CUDA 4.2 in 13.x. Can you try setting the option to target CUDA 4.1 (i.e. -ta=nvidia,4.1) to see if that helps?
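One way to test the version-mismatch theory in isolation is a tiny program with a single accelerator region. The sketch below is only an illustration, not code from your project: the program name acc_probe, the array names, and the size n are all made up.

```fortran
! Minimal OpenACC test case (hypothetical; all names are illustrative).
! It contains exactly one kernels region, so a working profiler
! should report at most a handful of kernel launches for it.
program acc_probe
  implicit none
  integer, parameter :: n = 1024
  real :: a(n), b(n)
  integer :: i

  ! Fill the input array on the host.
  do i = 1, n
     b(i) = real(i)
  end do

  ! The loop in this region is the one the compiler should offload.
  !$acc kernels
  do i = 1, n
     a(i) = 2.0 * b(i)
  end do
  !$acc end kernels

  ! Use the result so the loop cannot be optimized away.
  print *, 'a(n) =', a(n)
end program acc_probe
```

Assuming the file is saved as acc_probe.f90 (a hypothetical name), you could build it with something like `pgfortran -acc -Minfo=accel acc_probe.f90`, once with and once without the -ta option mentioned above. The -Minfo=accel feedback should say whether a kernel was generated for the loop. If it was and the profiler still shows no kernel launch, the problem is less likely to be specific to your code.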

Another possibility is that we changed the Windows Open Tools version in order to support Windows 8, and that might have caused some incompatibility. I’ll ask our tools folks to take a look.

Thanks,
Mat

Thanks for looking into this, Mat.

I tried targeting CUDA 4.1 and 4.0 but get the following error message:
pgfortran-Error_CUDA version 4.0 is not available in this installation

I also tried manually copying the toolkits from the PGI 2012 directory to the PGI 2013 directory, but I get an internal compiler error when I recompile.

One other possible cause is that I am now using Nsight 3.0. I will keep looking into this.

It could be on our end as well. We did make a lot of changes in the 13.x runtime in order to support additional devices.

- Mat