CUDA_ERROR_INVALID_CONTEXT and driver resource unavailable when profiling

Hello! I’m back to profiling kernels, but I’m having some issues with Nsight Compute. I have a big project that makes extensive use of torch plus some custom kernels, and I run it through an Nsight Compute project.

After the program is launched it stops at the first API call, cuGetProcAddress. I then select “Run to next kernel” as I usually do, and it runs through a lot of API calls: cuDeviceGetAttribute, cuDeviceGetUuid and cuDeviceGetLuid. Then, inside the API call cudaGetDevice -> cuCtxGetDevice, I get the return value CUDA_ERROR_INVALID_CONTEXT(201) and the profiler pauses the program again.

I press “Run to next kernel” again, and the same thing happens a few more times until I reach the kernel I want. However, when I ask Nsight Compute to profile this kernel with full statistics, it fails with the following messages:

Profiling failed because a driver resource was unavailable. Ensure that no other tool (like DCGM) is concurrently collecting profiling data. See … for more details.
Failed to prepare kernel for profiling

I suspect these two issues might be related, and that they could also be related to permissions. Some things I have checked:

  1. I always run Nsight Compute as administrator
  2. “Allow access to the GPU performance counters to all users” is ticked in the NVIDIA Control Panel
  3. DCGM is not recognized as an application and I don’t see it running anywhere
  4. I restarted the computer in case something else was blocking GPU profiling

I’m using Windows 11 with CUDA 11.8. My nvcc and nvidia-smi outputs (I removed some sensitive program names) are:

$> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
$> nvidia-smi
nvidia-smi
Sat Nov  9 23:50:16 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 566.03                 Driver Version: 566.03         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2060      WDDM  |   00000000:01:00.0  On |                  N/A |
|  0%   53C    P0             38W /  190W |    1745MiB /   6144MiB |      6%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      4180    C+G   ...__8wekyb3d8bbwe\WindowsTerminal.exe      N/A      |
|    0   N/A  N/A      4700    C+G   ...5n1h2txyewy\ShellExperienceHost.exe      N/A      |
|    0   N/A  N/A      4752    C+G   ...GeForce Experience\NVIDIA Share.exe      N/A      |
|    0   N/A  N/A      7480    C+G   C:\Windows\explorer.exe                     N/A      |
|    0   N/A  N/A      8348    C+G   ...nt.CBS_cw5n1h2txyewy\SearchHost.exe      N/A      |
|    0   N/A  N/A      8372    C+G   ...2txyewy\StartMenuExperienceHost.exe      N/A      |
|    0   N/A  N/A     11956    C+G   ...werToys\PowerToys.ColorPickerUI.exe      N/A      |
|    0   N/A  N/A     11992    C+G   ...werToys\PowerToys.PowerLauncher.exe      N/A      |
|    0   N/A  N/A     12236    C+G   ...CBS_cw5n1h2txyewy\TextInputHost.exe      N/A      |
|    0   N/A  N/A     15176    C+G   ...GeForce Experience\NVIDIA Share.exe      N/A      |
+-----------------------------------------------------------------------------------------+

I have also tried a simple kernel (source below), and I get the same error when profiling it:

Profiling failed because a driver resource was unavailable. Ensure that no other tool (like DCGM) is concurrently collecting profiling data. See … for more details.
Failed to prepare kernel for profiling

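#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Single-thread kernel: writes a + b to device memory.
__global__ void add(int a, int b, int *c) {
    *c = a + b;
}

int main()
{
    int c = 0;
    int *dev_c = nullptr;
    check(cudaMalloc((void**)&dev_c, sizeof(int)), "cudaMalloc");
    add<<<1,1>>>(2, 7, dev_c);
    check(cudaGetLastError(), "kernel launch");

    // cudaMemcpy blocks until the kernel has finished.
    check(cudaMemcpy(&c, dev_c, sizeof(int), cudaMemcpyDeviceToHost), "cudaMemcpy");
    std::cout << "2 + 7 = " << c << std::endl;

    check(cudaFree(dev_c), "cudaFree");
    return 0;
}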

Hi, @jsierra.siete

Could you please update Nsight Compute to the latest version and try again? Thanks!

I can profile the kernel now without any permission issues. I did reinstall Nsight Compute using the CUDA toolkit installer, but I didn’t expect it to install an older version (2022). Maybe I misconfigured something during installation. Your link works fine.

The CUDA_ERROR_INVALID_CONTEXT(201) return value in some API calls is still there, but I believe it might be related to the libtorch sources.

Thanks for your help
