Why are NVPROF and Nsight not profiling one of the kernels?

Lorin · September 13, 2013, 9:28am

I have a CFD program in CUDA. When I run it with a block dimension of 16 * 16 and profile it, it is profiled correctly and shows a kernel, “NLMMNT”, taking most of the GPU time. But when I run the same program with a block dimension of 32 * 32, it runs up to 5 times faster than before and the results are correct, yet the profiler no longer shows any output for NLMMNT. The Nsight log also does not show the profiling of NLMMNT as complete. I can't figure out what the reason may be. I tried running the application for hours, but NLMMNT’s profile info is still absent from the profiler’s output.

Lorin · September 13, 2013, 9:29am

The Nsight log can be seen in this screenshot:

http://s23.postimg.org/hbaect7uz/profoutput.png

njuffa · September 13, 2013, 5:33pm

Chances are the kernel isn’t showing up in the profiler output because it did not execute. For example, the kernel may have failed to launch due to an out-of-resources condition. Does the code check the return status of every CUDA API call and every kernel launch? Note that checking the status of kernel launches is a two-step process, to catch both pre-launch and post-launch errors.
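For reference, here is a minimal sketch of that two-step check. The kernel `dummyKernel` and the helper `launchChecked` are hypothetical names made up for illustration, not taken from the poster's code. Step 1 calls cudaGetLastError() right after the launch to catch pre-launch errors such as an invalid configuration or too many resources requested. Step 2 calls cudaDeviceSynchronize() to catch errors that occur while the kernel runs.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for a real kernel such as NLMMNT.
__global__ void dummyKernel(float *data, int n)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int i = y * gridDim.x * blockDim.x + x;
    if (i < n) data[i] = 1.0f;
}

// Launches dummyKernel with one block of side x side threads and
// returns the first error encountered (cudaSuccess if none).
cudaError_t launchChecked(int side)
{
    const int n = side * side;
    float *d_data = nullptr;

    cudaError_t err = cudaMalloc(&d_data, n * sizeof(float));
    if (err != cudaSuccess) return err;

    dim3 block(side, side);
    dummyKernel<<<1, block>>>(d_data, n);

    // Step 1: pre-launch errors (bad launch configuration, out of resources).
    err = cudaGetLastError();

    // Step 2: errors raised while the kernel was executing.
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    if (err != cudaSuccess)
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));

    cudaFree(d_data);
    return err;
}
```

Calling launchChecked(16) and launchChecked(32) side by side is one way to see whether the larger block size still launches. A 32 * 32 block is 1024 threads, which is at the per-block thread limit on many GPUs and above it on older ones, so a kernel that also uses many registers per thread may be rejected at step 1. If that happens silently, the program appears to run faster simply because the kernel never ran.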

Lorin · September 14, 2013, 6:02am

As I said, the output is correct, and I cross-checked the return status of the kernel.

egodfred · October 26, 2015, 6:08pm

I have a similar problem. If this was resolved can anyone tell me how?

njuffa · October 26, 2015, 6:16pm

Does your program carefully check the return status of every CUDA API call and every kernel launch? If not, there is a chance the kernel may never have executed.

Also, make sure the profiling data is properly flushed at the end of the application run. To do so

#include <cuda_profiler_api.h>

and at application termination, call

cudaProfilerStop();
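Put together, the end of a run might look like the sketch below. The kernel `work` and the function `runAndFlush` are hypothetical names for illustration only. The idea is to finish all GPU work first (here the blocking cudaMemcpy does that) and only then call cudaProfilerStop().

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cuda_profiler_api.h>

// Hypothetical kernel standing in for the application's real work.
__global__ void work(int *out) { *out = 42; }

// Runs the kernel, copies the result back, then flushes profiling data.
// Returns the kernel's result, or -1 if any CUDA call failed.
int runAndFlush()
{
    int *d_out = nullptr;
    int h_out = -1;

    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;

    work<<<1, 1>>>(d_out);
    bool ok = (cudaGetLastError() == cudaSuccess);

    // cudaMemcpy blocks until the kernel has finished, so it also
    // reports errors that occurred during execution.
    if (ok)
        ok = (cudaMemcpy(&h_out, d_out, sizeof(int),
                         cudaMemcpyDeviceToHost) == cudaSuccess);

    cudaFree(d_out);

    // All GPU work is done; ask the profiler to flush what it collected.
    cudaProfilerStop();

    return ok ? h_out : -1;
}
```

In a real application the cudaProfilerStop() call would sit at the very end of main(), after the last kernel and the last synchronization.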
