strange GPU idle time in profiler

polotenchiko · June 15, 2011, 2:28pm

Hello!
I have a question about the profiler.
My code looks like this:

memcpyAsync(…, host2device);
kernel1<<<>>>();
memcpyAsync(…, device2device);
kernel2<<<>>>();
…

So I assume this code should run strictly sequentially (there are no CPU instructions between the CUDA calls). But in the profiling output (attached) there is a gap between the kernel1 invocation (blue strip) and the following memcpy (red strip), and the same gap appears after every kernel invocation.
What could be causing this GPU idle time?

polotenchiko · June 27, 2011, 9:06am

Still interested in an answer.

tera · June 27, 2011, 11:25am

If you are using the WDDM driver on Windows, it could be due to batching. Maybe a cudaStreamQuery(0) or a cudaStreamSynchronize(0) after the kernel launch fixes things.
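
As a rough sketch of that suggestion, the sequence from the question could look like the code below, with a cudaStreamQuery(0) after each kernel launch. The kernel bodies, the buffer names (devA, devB) and the runPipeline helper are made up for illustration, since the original post elides them. Whether the extra call actually closes the gaps depends on the driver model, so treat it as something to try rather than a guaranteed fix.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in kernels; the original post does not show their bodies.
__global__ void kernel1(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

__global__ void kernel2(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Runs copy -> kernel1 -> copy -> kernel2 on the default stream (stream 0).
// Returns cudaSuccess if the final copy back to the host succeeded.
cudaError_t runPipeline(const float *hostIn, float *hostOut, int n) {
    float *devA = nullptr;
    float *devB = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(float);

    cudaError_t err = cudaMalloc(&devA, bytes);
    if (err != cudaSuccess) return err;
    err = cudaMalloc(&devB, bytes);
    if (err != cudaSuccess) { cudaFree(devA); return err; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    cudaMemcpyAsync(devA, hostIn, bytes, cudaMemcpyHostToDevice, 0);
    kernel1<<<blocks, threads>>>(devA, n);
    // Non-blocking status query. The return value is ignored on purpose:
    // the call is only here to nudge the driver into submitting queued work.
    cudaStreamQuery(0);

    cudaMemcpyAsync(devB, devA, bytes, cudaMemcpyDeviceToDevice, 0);
    kernel2<<<blocks, threads>>>(devB, n);
    cudaStreamQuery(0);

    // Blocking copy: waits for all the work queued above before returning.
    err = cudaMemcpy(hostOut, devB, bytes, cudaMemcpyDeviceToHost);

    cudaFree(devA);
    cudaFree(devB);
    return err;
}
```

cudaStreamQuery(0) does not block the host, so it is the cheaper of the two options to try first. cudaStreamSynchronize(0) in the same places would also force submission, but it stalls the CPU until the GPU has finished the queued work.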

DrAnderson42 · June 27, 2011, 12:03pm

Do you have profiler counters enabled? The profiler needs time to read the counters back to the host and tally up the stats. I find that if you want to see the best idle timing results, you need to turn off all counters. Even then, the recorded gaps will be larger than the gaps in a profiler-disabled run.
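
One way to get a number from a profiler-disabled run is to bracket the launches with CUDA events and compare the total against the sum of the kernel durations the profiler reports. The sketch below assumes a made-up busyKernel and a hypothetical timeLaunches helper; neither comes from the original post. It only gives the total elapsed GPU time, not a per-gap breakdown.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel used only to give the GPU some work to do.
__global__ void busyKernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 1.0001f + 1.0f;
}

// Launches busyKernel `launches` times back to back on the default stream
// and returns the elapsed GPU time in milliseconds, or -1.0f on error.
float timeLaunches(int launches, int n) {
    float *dev = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return -1.0f;
    cudaMemset(dev, 0, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    cudaEventRecord(start, 0);
    for (int i = 0; i < launches; ++i) {
        busyKernel<<<blocks, threads>>>(dev, n);
    }
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);  // wait until the stop event has been reached

    float ms = -1.0f;
    if (cudaEventElapsedTime(&ms, start, stop) != cudaSuccess) ms = -1.0f;

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dev);
    return ms;
}
```

If the event-based total from a run without the profiler is close to the sum of the individual kernel times, most of the gaps in the timeline are probably profiler overhead rather than real idle time.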
