`ncu` "No kernels profiled"

Hi,

I’m running my application with

ncu --target-processes all ./application args

And consistently getting

==PROF== Disconnected from process 89889
==WARNING== No kernels were profiled.

at the end of my application. I'm on Ubuntu 20.04 with CUDA 11.1. I am certain that my GPU is being used (nvidia-smi reports activity correctly). I've seen lots of similar topics on these forums, but none of the fixes in them help.

Additional points:

  • nsys profile ./application args seems to work fine.
  • I am using a V100
  • I cannot use Visual Studio, but I'd like to be able to get specific GPU performance metrics for my application
  • nvprof similarly tells me that “No kernels were profiled”

Any next steps?

Thanks!

Can you share some more information about how your application executes? For example, does your application fork child processes that launch CUDA kernels? Is that the reason you're using --target-processes all? Is there hand-written CUDA in there, a third-party library, or a higher-level framework like PyTorch?

Do you have access to the GUI? If so, you could launch an interactive profile to step through the API calls and see whether you hit a CUDA kernel launch.
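If launching from the GUI is awkward over your connection, one option (just a sketch, assuming a reasonably recent Nsight Compute) is to start the target with ncu --mode=launch ./application args, which suspends the process at its first intercepted CUDA API call so the GUI, or a later ncu --mode=attach, can connect to it.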

My application does not fork anything. I just added --target-processes all to make sure that I capture everything. There is a mix of hand-written CUDA, cuBLAS, and even cuSOLVER code in there. I have (crappy X11) access to the nvprof GUI, and I have seen that the kernels are executing as expected.

You mentioned X11; what environment are you using? Is this all happening locally on an Ubuntu box, or are you SSH'd into the Ubuntu machine where the app and profiler are running? Are there multiple devices (GPUs) on the target machine? Can you share the output of nvidia-smi?

I would recommend checking whether you can profile a simple CUDA sample. If you don't have the samples installed, they are on GitHub at https://github.com/NVIDIA/cuda-samples, which demonstrates the features of the CUDA Toolkit. Something like Samples/0_Introduction/matrixMul is a good test of whether any profiling works at all.
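If building the samples on that machine is inconvenient, a minimal standalone kernel is enough to check whether the profiler can see any launch at all. A sketch (the file name test_kernel.cu and the kernel itself are just placeholders, not taken from your application):

// test_kernel.cu -- build with: nvcc test_kernel.cu -o test_kernel
// then run under the profiler:  ncu ./test_kernel   (or: nvprof ./test_kernel)
#include <cstdio>
#include <cuda_runtime.h>

__global__ void addOne(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

int main() {
    const int n = 1 << 20;
    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));          // device-only buffer; no host copies needed for this test
    cudaMemset(d, 0, n * sizeof(float));
    addOne<<<(n + 255) / 256, 256>>>(d, n);     // single kernel launch for the profiler to catch
    cudaError_t err = cudaDeviceSynchronize();  // make sure the kernel actually ran before exiting
    printf("kernel status: %s\n", cudaGetErrorString(err));
    cudaFree(d);
    return 0;
}

If the profiler reports a kernel for this but not for your application, the problem is on the application side (e.g. the kernels never actually launch in that configuration); if it reports nothing here either, it points to an environment or driver issue.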

I'm SSHing into the Ubuntu machine with the GPU.

The output of nvidia-smi is:
Wed Sep 28 21:29:11 2022

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000001:00:00.0 Off |                  Off |
| N/A   28C    P0    35W / 250W |      0MiB / 16160MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Just tested matrixMul and it works fine with: nvprof --metrics all ./matrixMul (which is what I would want for my own application).

… now it seems fixed on my own application too? Honestly I’m super confused…

Does ncu work on your application or just nvprof?
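If it does, a rough ncu counterpart of nvprof --metrics all would be something like ncu --set full ./application args, and ncu --query-metrics lists what the tool can collect (I'm assuming the ncu that ships with CUDA 11.1 here).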

Yeah, somehow it works now…? I’m sort of baffled.