Scheduling thread blocks adaptively: how to distribute thread blocks over arbitrary canvas dimensions

Hi Guys,

I am relatively new to CUDA and I am interested to know how you guys would tackle a problem I am having at the moment.

Maybe it’s a stupid question, but please bear with me…

I am writing a ray tracing application. Naturally you would want your end user to be able to resize the canvas to any desired width and height.

This brings me to my problem. If I am not mistaken, it’s best to keep the thread dimensions of a thread block a power of two. Please correct me if I am wrong.

Say we have a canvas of 513 x 513 pixels and thread blocks of 16 x 16 threads; one would need at least 33 x 33 thread blocks in order to fully cover the canvas.

However, in the 65 boundary blocks (the 33 blocks in the last row plus the 33 in the last column, minus the corner block counted twice) only a marginal percentage of the threads would be occupied with rendering. I would like to find a way to ensure that all the threads in my thread blocks are in fact doing the work they are supposed to do.

I hope you guys understand my question; if you have any questions, please let me know.

Thanks in advance,

T Kroes

The Netherlands

Block dimensions should preferably be chosen so that the total number of threads per block is a multiple of 32 (the warp size).

Your question is how to adapt a grid of thread blocks whose sizes are multiples of 32 to data that is not?

I think the most common way to do it is to make a grid that is bigger than your dataset and then have a conditional check in the kernel to make sure you’re not going out of bounds. What happens is that your boundary blocks are branched and there might be some slight loss of performance, but probably no biggie.
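In case it helps, here is a rough sketch of what such a kernel could look like. The kernel name, the pixel buffer and the width/height parameters are just placeholders for illustration, not your actual code:

// Sketch of a kernel with a bounds check; renderKernel, the uchar4
// output buffer and the parameter names are made-up placeholders.
__global__ void renderKernel(uchar4* canvas, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;

    // Threads in the boundary blocks that fall outside the canvas
    // simply return without doing any work.
    if (x >= width || y >= height)
        return;

    // Your ray tracing result would go here instead of a constant color.
    canvas[y * width + x] = make_uchar4(0, 0, 0, 255);
}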

ex:

dim3 dimGrid( (DIM_X + cols - 1)/DIM_X , (DIM_Y + rows - 1)/DIM_Y );
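And to tie it to the 513 x 513 example, the host-side launch could look roughly like this (DIM_X/DIM_Y, renderKernel and d_canvas are the placeholder names from the sketch above, with d_canvas assumed to be an already allocated device buffer):

#define DIM_X 16
#define DIM_Y 16

int cols = 513;   // canvas width in pixels
int rows = 513;   // canvas height in pixels

// Integer ceiling division: 513 pixels / 16 threads -> 33 blocks per axis
dim3 dimBlock(DIM_X, DIM_Y);
dim3 dimGrid( (DIM_X + cols - 1)/DIM_X , (DIM_Y + rows - 1)/DIM_Y );

renderKernel<<<dimGrid, dimBlock>>>(d_canvas, cols, rows);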

Hope I understood your question.