Which application accesses the driver's performance monitor

I get this when profiling:

Profiling failed because a driver resource was unavailable.
Ensure that no other tool (like DCGM) is concurrently collecting
profiling data. See https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html#faq
for more details.

The URL didn’t help. I don’t have dcgmi installed. Also VS 2019 was closed during profiling, and there was no other instance of Nsight Compute running.

Is there any way to see which application accesses the driver’s performance monitor?

No, there is no tool available to report this. Does the issue persist after rebooting the machine? Is the application you are trying to profile itself using CUPTI to collect performance data?

  • The issue can be reproduced.
  • Rebooting system doesn’t change anything.
  • The app doesn’t use CUPTI.

Is it allowed to use the Debug binary (as opposed to the Release binary)?

Yes, you can profile debug builds (even though it might not be recommended in terms of performance analysis).

Hi.

  • 465.89
  • "Enable Developer Settings" in NVIDIA Control Panel is disabled while profiling. To recap: I (re)start the system, run ncu --mode launch from Command Prompt, start Nsight Compute, attach to my just-launched app, and profile. The error quoted in the first post shows up. It even seems to affect the entire system so deeply that I'm unable to start NVIDIA Control Panel afterwards; a system restart is required.
  • Correct (2021.1.0.0 (build 29693910))

Thanks.

I did some more testing:

  • profiling debug version, using <<< 1,1 >>> - ok
  • profiling debug version, using <<< 1,512 >>> - ok
  • profiling debug version, using <<< 16,512 >>> - ok
  • profiling debug version, using <<< 512,512 >>> - ok
  • profiling debug version, using <<< 1024*256,512 >>> - FAILS w/ error shown in #1
  • profiling release version, using <<< 1024*256,512 >>> - ok
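For scale, the configurations above can be compared by total thread count (blocks per grid times threads per block). This is only a small shell sketch to put the numbers side by side; `total_threads` is a helper name made up for illustration, and whether thread count is what triggers the failure is not established here.

```shell
#!/bin/sh
# Total threads launched = blocks per grid * threads per block.
# total_threads is a hypothetical helper, not part of any NVIDIA tool.
total_threads() {
    echo $(( $1 * $2 ))
}

# The five <<< grid,block >>> configurations tested above.
for cfg in "1 1" "1 512" "16 512" "512 512" "$((1024 * 256)) 512"; do
    set -- $cfg
    printf '<<< %s,%s >>> -> %s threads\n' "$1" "$2" "$(total_threads "$1" "$2")"
done
```

The failing configuration launches 134,217,728 threads, 512 times as many as the largest passing one (262,144).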

Any ideas on this?

I can use the data collected from the final release build, but I was wondering why the debug build failed.

Besides that, what are the recommended grid,block values for profiling?
The actual grid,block from the production environment?

profiling debug version, using <<< 1024*256,512 >>> - FAILS w/ error shown in #1

My guess would be that the kernel needs too many resources when being profiled, i.e. the additional resource requirements added by the profiler push it over the limit. You didn't specify exactly how you profile, but I would recommend trying a single metric first to see if that works better. If it does, you can add more metrics or sections. If those start failing, application replay can also be an option.
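From the command line, that progression could look roughly like the sketch below. It assumes the `ncu` CLI rather than the interactive attach workflow; `./my_app` is a placeholder for the binary being profiled, and the metric `gpu__time_duration.sum` and section `SpeedOfLight` are just example choices. Check `ncu --help` for the options your version supports. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Placeholder application path (assumption); point APP at the real binary.
APP="${APP:-./my_app}"
# Prints the commands by default; set DRY_RUN=0 to actually invoke ncu.
DRY_RUN="${DRY_RUN:-1}"

# Echo a command and, if not a dry run, execute it.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 1: collect a single metric to keep profiler overhead minimal.
run ncu --metrics gpu__time_duration.sum "$APP"

# Step 2: if that works, widen to one section.
run ncu --section SpeedOfLight "$APP"

# Step 3: if wider collection fails, try application replay, which
# re-runs the whole application instead of replaying the kernel.
run ncu --replay-mode application --section SpeedOfLight "$APP"
```

Application replay requires the application to behave deterministically across runs, since the metrics are gathered over several executions.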

Besides that, what are the recommended grid,block values for profiling?
The actual grid,block from the production environment?

Yes, you should use the configuration you are planning to productize, since that is what you are optimizing for, and the performance characteristics of the kernel can vary with different configurations.