Overhead of block scheduling?

eastgun · May 13, 2009, 12:34pm

The GPU implements zero-overhead scheduling to interleave warps, but I don't understand the following situation:
The G80 supports a maximum of 768 active threads per SM. If I set the block size to 32x16 (512 threads), then only one block fits on an SM at a time, so only 512 threads can execute on that SM. My questions are:

1. Is there any overhead when the SM switches to the next block?
2. In the GPU implementation, blocks are divided into warps of 32 threads. Is the time spent on a 32x16 block greater than 2x the time spent on a 16x16 block?
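
To make the arithmetic behind the question concrete, here is a rough sketch in Python. It assumes the only limits that matter are the 768 threads per SM mentioned above and a cap of 8 resident blocks per SM; register and shared-memory usage are ignored, so real kernels may fit fewer blocks. The function names `resident_blocks` and `summarize` are made up for this illustration.

```python
# Rough occupancy arithmetic for a G80-class SM.
# Assumptions (not measured): 768 threads per SM, at most 8 resident
# blocks per SM, 32 threads per warp; registers and shared memory ignored.

def resident_blocks(block_x, block_y, max_threads_per_sm=768, max_blocks_per_sm=8):
    """Number of blocks that fit on one SM by thread count alone."""
    threads_per_block = block_x * block_y
    return min(max_threads_per_sm // threads_per_block, max_blocks_per_sm)

def summarize(block_x, block_y, max_threads_per_sm=768, warp_size=32):
    """Collect the per-SM numbers for one block shape."""
    threads_per_block = block_x * block_y
    blocks = resident_blocks(block_x, block_y, max_threads_per_sm)
    active_threads = blocks * threads_per_block
    return {
        "threads_per_block": threads_per_block,
        "blocks_per_sm": blocks,
        "active_threads": active_threads,
        "active_warps": active_threads // warp_size,
        "occupancy": active_threads / max_threads_per_sm,
    }

for shape in [(32, 16), (16, 16)]:
    print(shape, summarize(*shape))
```

Under these assumptions a 32x16 block leaves the SM with 512 active threads (16 warps), while 16x16 blocks allow three resident blocks and all 768 threads (24 warps). This says nothing about the switching overhead itself; it only shows how many warps are available for the scheduler to interleave.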
