CUDA overhead

marjan_919 · May 19, 2009, 4:34pm

Hello,
I have a question about the overhead of calling a kernel (a function that runs on the GPU).
What does this overhead consist of? Does it include copying the input parameters to registers? What else does it include?
And what is startup overhead? What does it consist of?
For example, the scanLargeArray sample in CUDA SDK 2.1 contains these lines:

[b]

…
// run once to remove startup overhead
prescanArray(d_odata, d_idata, num_elements);

// Run the prescan
cutStartTimer(timerGPU);
…

prescanArray(d_odata, d_idata, num_elements);
 ....
cutStopTimer(timerGPU);
....

[/b]

I don’t understand: why does making the first kernel call remove the startup overhead from the second kernel call?

bog · May 29, 2009, 2:05pm

Hi,

It’s CUDA initialization, which happens at the first call of a CUDA kernel or a cudaMemcpy and takes some time. What exactly happens during that initialization, I can only guess.
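One way to see this for yourself is to time the same launch twice with a host timer. Below is a minimal sketch, not the SDK sample: scaleKernel and timedLaunchMs are hypothetical names made up for illustration, and it assumes a toolkit recent enough to provide cudaDeviceSynchronize (toolkits from the SDK 2.1 era used cudaThreadSynchronize instead). How much of the one-time cost lands in the first launch rather than in the earlier cudaMalloc probably depends on the CUDA version and driver, so treat the numbers as indicative only.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel (not the SDK's prescanArray): doubles each element.
__global__ void scaleKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= 2.0f;
    }
}

// Hypothetical helper: launch the kernel once and return host wall-clock time in ms.
static double timedLaunchMs(float *d_data, int n)
{
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    auto t0 = std::chrono::steady_clock::now();
    scaleKernel<<<blocks, threads>>>(d_data, n);
    // Kernel launches are asynchronous, so wait for completion before stopping the timer.
    cudaDeviceSynchronize();
    auto t1 = std::chrono::steady_clock::now();

    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main()
{
    const int n = 1 << 20;
    float *d_data = NULL;

    // The first runtime API call typically triggers context creation as well.
    if (cudaMalloc((void **)&d_data, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    // First launch: may include one-time costs on top of the kernel itself.
    double firstMs = timedLaunchMs(d_data, n);
    // Second launch: the one-time costs should already have been paid.
    double secondMs = timedLaunchMs(d_data, n);

    printf("first launch:  %.3f ms\n", firstMs);
    printf("second launch: %.3f ms\n", secondMs);

    cudaFree(d_data);
    return 0;
}
```

If the first number is clearly larger than the second, the difference is the one-time cost that the untimed warm-up call in the sample is meant to keep out of the measurement.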
