What's the cost of loading in blocks?

Hella_Yu · April 9, 2008, 5:46pm

When an old block retires (meaning it has finished executing the kernel) and a new block is brought in, is there any overhead associated with loading the new block, and if so, how much?

Thanks.

MisterAnderson42 · April 9, 2008, 6:09pm

See http://forums.nvidia.com/index.php?showtopic=59537&view=findpost&p=341967
But I think the conclusions about the “overhead” in that post are incorrect (it is an artifact of inefficient scheduling). See my thoughts on this matter at
http://forums.nvidia.com/index.php?showtop...ndpost&p=342244

Note that the tests in those posts were specifically about kernels with many blocks that do “almost nothing”.

In kernels where all blocks do a similar amount of work, I have never detected any overhead that I would associate with block scheduling, other than a linear overhead of 1.0 ms per 60,000 blocks for the kernel launch (tested with an empty kernel).

Hella_Yu · April 9, 2008, 6:49pm

Could you please clarify what you mean by the kernel launch time?

Is it the overhead associated with the first batch of blocks executing that kernel?

Thanks.

MisterAnderson42 · April 9, 2008, 7:13pm

No, there is no way to measure the time of execution of the first batch of blocks. What I referred to as the “kernel launch time” was the average time taken to execute an empty kernel with no arguments. This time linearly increases with the number of blocks with a slope of 1.0ms / 60,000 blocks.
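
For anyone who wants to repeat this kind of measurement, here is a minimal sketch of how it could be set up. It is not the original test code: the name `time_empty_launch`, the block size of 64 threads, the repetition count, and the list of block counts are all assumptions chosen for illustration. It times repeated launches of an empty, argument-free kernel with CUDA events and reports the average per launch for several grid sizes.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Empty kernel with no arguments, matching the measurement described above.
__global__ void empty_kernel() {}

// Hypothetical helper: average time in ms of one launch of empty_kernel
// using `blocks` blocks of `threads` threads, averaged over `reps` launches.
static float time_empty_launch(int blocks, int threads, int reps)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up launch so one-time initialization is not included in the timing.
    empty_kernel<<<blocks, threads>>>();
    cudaDeviceSynchronize();

    cudaEventRecord(start);
    for (int i = 0; i < reps; ++i)
        empty_kernel<<<blocks, threads>>>();
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until all timed launches have finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms / reps;
}

int main()
{
    const int threads = 64;   // assumed block size
    const int reps = 100;     // assumed number of launches to average over
    const int counts[] = {1, 1000, 10000, 30000, 60000};

    for (int i = 0; i < 5; ++i) {
        float ms = time_empty_launch(counts[i], threads, reps);
        printf("%6d blocks: %.4f ms per launch\n", counts[i], ms);
    }

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
        printf("CUDA error: %s\n", cudaGetErrorString(err));
    return 0;
}
```

If the time really is linear in the number of blocks, plotting the output against the block count should give a straight line. The slope and intercept will depend on the GPU and driver, so the 1.0 ms / 60,000 blocks figure should not be expected to carry over to other hardware.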
