Strange cudaLaunch stall in NV Visual Profiler

Fxpower · November 13, 2012, 5:16pm

Hi all,

I am working with CUDA code that works well, but it shows some strange behavior: 2 or 3 function calls are shown as taking an extremely long time in the profiler (see the attached picture).
It is often a ‘cudaLaunch’ that takes 110 ms according to the profiler, and sometimes a ‘cudaMemcpy’…

I am using some Thrust combined with homemade kernels, and I want to stress that the kernels before and after the cudaLaunch ‘gap’ are the same.

I do not expect a direct solution from this forum without the code (which I can’t provide), but has anyone seen this?
→ Could it be an NVVP display bug?
→ Judging by the speed of the release build outside nvvp, I suppose this phenomenon does not happen there.
→ How can I be sure the profiler is telling me the truth?
→ Can kernel launches stall if a great many kernels are launched asynchronously?

Feel free to share your insights on this one!

Thanks

Fxpower · November 29, 2012, 11:15am

Hi,
Replying to myself because I have found the reason: this is due to nvvp itself, apparently while it is flushing its profiling data (or doing something similar).
This can artificially create this “stall” on any function, but in the new version of nvvp (CUDA 5.0) this time is marked in red as “non accountable in real execution time”.

So much clearer now.
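
For anyone who wants to cross-check this outside nvvp, one option is to time the host-side calls yourself in a build that runs without the profiler attached. The following is only a minimal sketch in plain C++ (the host language of CUDA), not code from my application: `time_call_ms` and `find_stalls` are hypothetical helper names, and the 50 ms threshold in the usage note is an arbitrary assumption. Since kernel launches are asynchronous, wrapping a launch this way measures only the time the launch call blocks the host, which is the quantity the ‘cudaLaunch’ row in the profiler reports.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Wall-clock duration of one host-side call, in milliseconds.
// In a CUDA program, `call` would be a lambda wrapping a single kernel
// launch or cudaMemcpy (hypothetical usage, no CUDA call is made here).
template <typename F>
double time_call_ms(F&& call) {
    const auto start = std::chrono::steady_clock::now();
    call();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Indices of the calls whose host-side duration exceeded `threshold_ms`.
std::vector<std::size_t> find_stalls(const std::vector<double>& durations_ms,
                                     double threshold_ms) {
    std::vector<std::size_t> stalls;
    for (std::size_t i = 0; i < durations_ms.size(); ++i) {
        if (durations_ms[i] > threshold_ms) {
            stalls.push_back(i);
        }
    }
    return stalls;
}
```

In use, you would push the result of `time_call_ms` for each launch into a vector and then call, for example, `find_stalls(durations, 50.0)`. If the list comes back empty when the program runs on its own but nvvp still shows a 110 ms ‘cudaLaunch’, the stall is most likely profiler overhead rather than something in your code.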
