Only CUDA Context 0 is shown on Nsight timeline with RTX 2080 Ti

When I trace a 64-bit app that uses the CUDA Driver API in Nsight on a machine with an RTX 2080 Ti running Windows 10 x64, I only see CUDA Context 0 (the first context the app creates) on the timeline. The other contexts, where most of the app’s useful activity (kernel calls, memory transfers, etc.) takes place, do not appear.

This used to work well on the same computer with the same version of Nsight (6.0.0.18297) and a 1080 Ti (and earlier with a 980 Ti too), before I switched to the 2080 Ti.

I tried drivers 411.63 and 416.34, with the same result.

Unfortunately NsightVSE’s trace support ends with the Pascal family of GPUs. We are steering people to Nsight Systems ([url]https://developer.nvidia.com/nsight-systems[/url]) or Visual Profiler / NvProf ([url]https://developer.nvidia.com/nvidia-visual-profiler[/url]).

Thank you for your response.

Is it going to change any time in the future?

I tried Nsight Systems and Visual Profiler and found them significantly less mature than NsightVSE:

  • it is more convenient to use an integrated tool rather than launching an external application
  • it took NsightVSE less time to collect a trace than either of the other two tools
  • both Visual Profiler and Nsight Systems seem somewhat more fragile: Nsight Systems stopped working after I killed the app being profiled and required a system reboot, while Visual Profiler hung with one of the test apps I tried, so I had to set an execution timeout to collect even a partial timeline
  • NsightVSE colored kernel calls on the timeline in such a way that it was easier to spot different parts of the algorithms. I am not sure what the exact coloring algorithm was, but it worked beautifully. In contrast, Nsight Systems does not color kernel calls at all, while Visual Profiler seems to pick colors randomly, which does not help much.
  • Visual Profiler does not zoom to a point on Ctrl+wheel. In addition, it takes a long time to zoom into an area (the zoom percentage increases in very small steps). This makes timeline navigation very difficult.
  • Neither of the two tools offers occupancy calculators in the same convenient way NsightVSE does
  • Nsight Systems shows very little information about each kernel execution (no register usage, no shared mem usage, no occupancy etc.)
  • Neither of the two tools offers analysis results grouped by kernel in the same way NsightVSE does

This list could go on and on. Is there any hope that tracing for Turing GPUs will ever be implemented in NsightVSE, or is this a permanent decision that won’t be changed?

The bad news is that there is no plan to add Turing support for Nsight Visual Studio’s legacy trace.
The good news is that there IS a plan to instead focus on improving our next-gen analysis and trace capabilities for use in all products, so you should see improvements in Nsight Systems each release.

Thank you for the feedback and feature requests. We will definitely use your input in our planning.

Thank you! Looking forward to the new versions of Nsight Systems.

It appears that support for Turing was added to the Visual Studio plugin. However, I am still able to see only Context 0. Why?