Nvprof works but nsight compute gives "no kernels were profiled" warning

fadragertu · August 19, 2021, 6:34am

I have a Titan V (Volta) GPU.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P620         Off  | 00000000:C1:00.0 Off |                  N/A |
| 34%   45C    P8    N/A /  N/A |      2MiB /  1999MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA TITAN V      Off  | 00000000:E1:00.0 Off |                  N/A |
| 32%   47C    P8    28W / 250W |      4MiB / 12066MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

I am trying to profile some code. For the sake of brevity, let’s consider a simple vector-add program I picked up here: https://www.olcf.ornl.gov/tutorials/cuda-vector-addition/

nvprof works fine

nvprof --devices 1 ./a.out
==208985== NVPROF is profiling process 208985, command: ./a.out
final result: 1.000000
==208985== Profiling application: ./a.out
==208985== Profiling result:
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   68.54%  145.57us         2  72.784us  68.160us  77.408us  [CUDA memcpy HtoD]
                   29.17%  61.952us         1  61.952us  61.952us  61.952us  [CUDA memcpy DtoH]
                    2.29%  4.8640us         1  4.8640us  4.8640us  4.8640us  vecAdd(double*, double*, double*, int)
      API calls:   98.08%  211.47ms         3  70.491ms  11.650us  211.27ms  cudaMalloc
                    0.64%  1.3889ms         3  462.96us  14.429us  1.2802ms  cudaFree
                    0.42%  910.12us         3  303.37us  116.25us  629.10us  cudaMemcpy
                    0.41%  882.48us         2  441.24us  150.76us  731.72us  cuDeviceTotalMem
                    0.38%  817.99us       202  4.0490us     350ns  175.41us  cuDeviceGetAttribute
                    0.04%  96.391us         2  48.195us  36.210us  60.181us  cuDeviceGetName
                    0.01%  29.326us         1  29.326us  29.326us  29.326us  cudaLaunchKernel
                    0.01%  12.221us         2  6.1100us  2.8180us  9.4030us  cuDeviceGetPCIBusId
                    0.00%  3.3020us         4     825ns     328ns  1.8740us  cuDeviceGet
                    0.00%  3.2020us         3  1.0670us     558ns  2.0160us  cuDeviceGetCount
                    0.00%  1.1250us         2     562ns     468ns     657ns  cuDeviceGetUuid

ncu, however, does not detect any kernels.

ncu --devices 1 ./a.out
==PROF== Connected to process 209016 (/home/adityap/a.out)
final result: 1.000000
==PROF== Disconnected from process 209016
==WARNING== No kernels were profiled.
==WARNING== Profiling kernels launched by child processes requires the --target-processes all option.

I tried installing older CUDA versions. As I am on Debian, the Ubuntu packages seem to be broken. The runfile installer fails on the driver. If I install only the CUDA toolkit (without the driver), it installs, but ncu still doesn’t work.

The same GPU worked fine on another machine; the issue only occurred after we moved it to this one. So my guess is that only some specific combination of CUDA toolkit and driver version works for this device. Is anyone aware of it?

lonnie.souder · August 23, 2022, 5:49pm

I am also having this issue.

jmarusarz · August 23, 2022, 6:11pm

Does the same problem occur if you omit the --devices flag?
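
To make that check concrete, here is a rough sketch of two runs worth comparing. It is only a guess at a diagnosis, not a confirmed fix: the idea is that the device numbering shown by nvidia-smi (PCI bus order) is not necessarily the order CUDA uses by default, so `--devices 1` may not point at the TITAN V. `./a.out` is the binary from the original post; the `run_if_available` helper and the `APP` variable are hypothetical names added for illustration, and `CUDA_DEVICE_ORDER` / `CUDA_VISIBLE_DEVICES` are the standard CUDA environment variables.

```shell
#!/bin/sh
# Sketch only: APP defaults to the vector-add binary from the original post.
APP=${APP:-./a.out}

# Hypothetical helper: run the command only if ncu and the binary exist,
# otherwise just print what would have been run.
run_if_available() {
    if command -v ncu >/dev/null 2>&1 && [ -x "$APP" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# 1. No device filter: ncu profiles kernels on whichever device the app uses.
run_if_available ncu "$APP"

# 2. Assumption: the TITAN V is index 1 in PCI bus order (as in the nvidia-smi
#    output above). Force CUDA to use that ordering and expose only that GPU.
run_if_available env CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 ncu "$APP"
```

If the first run reports the vecAdd kernel, the `--devices` index was probably the culprit; if neither does, the cause is likely elsewhere (for example the toolkit/driver combination).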
