Calculating the optimal grid and block size?

MartyMcFly · August 30, 2011, 11:18am

Hello,

Is there a general algorithm or rule for calculating the optimal grid/block size for 2D image processing?

I’m doing image processing with a lot of pixel-wise algorithms.
For most of my applications it makes sense to derive the grid and block dimensions from the image width and height.
I assume there might be an optimal way to distribute the threads and blocks
using the information I get from the CUDA device info.

How do you solve this kind of problem?

Thanks
Martin

Gaszton · August 30, 2011, 12:12pm

Hello, I am computing 2D images (phase holograms) with pixel-wise operations only.
I arrange the data, the thread blocks, and the threads in 1D; I see no advantage in a 2D arrangement in my case.
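To illustrate the 1D arrangement, here is a minimal host-side sketch in plain C++ of the index arithmetic involved. The names `Launch1D`, `makeLaunch1D`, `linearIndex`, and `toPixel` are hypothetical helpers made up for this example, not part of CUDA; in a real kernel the linear index would come from `blockIdx.x * blockDim.x + threadIdx.x`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: a 1D launch configuration.
struct Launch1D {
    std::size_t blocks;           // number of thread blocks in the grid
    std::size_t threadsPerBlock;  // threads in each block
};

// Cover every pixel of a width x height image with a 1D grid.
// The division rounds up so a partial last block still covers the tail.
inline Launch1D makeLaunch1D(std::size_t width, std::size_t height,
                             std::size_t threadsPerBlock) {
    assert(threadsPerBlock > 0);
    const std::size_t pixels = width * height;
    return {(pixels + threadsPerBlock - 1) / threadsPerBlock, threadsPerBlock};
}

// Linear index of a thread, mirroring what a kernel would compute from
// blockIdx.x * blockDim.x + threadIdx.x.
inline std::size_t linearIndex(std::size_t blockIdx, std::size_t blockDim,
                               std::size_t threadIdx) {
    return blockIdx * blockDim + threadIdx;
}

// Recover 2D pixel coordinates (row-major layout assumed) when a
// pixel-wise operation needs them.
inline void toPixel(std::size_t idx, std::size_t width,
                    std::size_t& x, std::size_t& y) {
    x = idx % width;
    y = idx / width;
}
```

When the pixel count is not a multiple of the block size, the last block has surplus threads, so a kernel using this scheme would need a bounds check such as `if (idx < width * height)` before touching memory.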

When determining the number of threads per block, and thereby the number of blocks, you should be aware of these limits:
Maximum number of resident blocks per multiprocessor
Maximum number of resident warps per multiprocessor: 48 in my case
Maximum number of resident threads per multiprocessor: 1536 in my case (=48*32)

I am computing an image with 1024x768 pixels, and I use thread blocks of 768 threads, so I have 1024 thread blocks, each containing 768 threads.
Two thread blocks are resident concurrently on each multiprocessor, 1536 threads in total, so I get full occupancy.
If the number of threads per block is higher than 768, for example 1024,
only one block will be resident on each multiprocessor, so the occupancy drops to 1024/1536 (about 67%).
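The occupancy arithmetic above can be sketched as a small host-side C++ calculation. This is a simplified model under stated assumptions: it considers only the thread and block-count limits per multiprocessor and ignores register and shared-memory usage, which can lower occupancy further. `SmLimits`, `residentBlocks`, and `occupancy` are hypothetical names; the limit of 8 resident blocks and 1024 threads per block are assumed values that are not given in the post and should be checked against your device.

```cpp
#include <algorithm>
#include <cassert>

// Per-multiprocessor limits. Fill these in from your device's documentation
// or device query; the values used in the note below are assumptions.
struct SmLimits {
    int maxThreadsPerSm;     // 1536 in the post
    int maxBlocksPerSm;      // not stated in the post; assumed
    int maxThreadsPerBlock;  // not stated in the post; assumed
};

// Blocks that can be resident on one multiprocessor, limited by the
// thread count and the block count only (registers/shared memory ignored).
inline int residentBlocks(const SmLimits& sm, int threadsPerBlock) {
    assert(threadsPerBlock > 0 && threadsPerBlock <= sm.maxThreadsPerBlock);
    return std::min(sm.maxThreadsPerSm / threadsPerBlock, sm.maxBlocksPerSm);
}

// Fraction of the multiprocessor's thread slots that are filled.
inline double occupancy(const SmLimits& sm, int threadsPerBlock) {
    const int residentThreads = residentBlocks(sm, threadsPerBlock) * threadsPerBlock;
    return static_cast<double>(residentThreads) / sm.maxThreadsPerSm;
}
```

With `SmLimits{1536, 8, 1024}`, a block size of 768 gives 2 resident blocks and an occupancy of 1.0, while 1024 gives 1 resident block and 1024/1536. Under the same assumed limits, very small blocks also lose occupancy: 64 threads per block hits the 8-block limit and fills only 512 of the 1536 thread slots.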

Maximizing the threads per block can pay off if you are loading constant global parameters into the shared memory of each block. More threads per block result in fewer
thread blocks, and thereby fewer loads from global to shared memory.
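To put a rough number on this, here is a small C++ sketch that counts how many parameter elements get copied from global to shared memory in total. It assumes every block stages the same `paramCount` parameters exactly once; `totalParamLoads` is a hypothetical helper, and the parameter count of 16 in the note is only an illustrative value.

```cpp
#include <cassert>
#include <cstddef>

// Total elements copied from global to shared memory when each block
// stages the same paramCount parameters once (assumed access pattern).
inline std::size_t totalParamLoads(std::size_t pixels,
                                   std::size_t threadsPerBlock,
                                   std::size_t paramCount) {
    assert(threadsPerBlock > 0);
    // One block per threadsPerBlock pixels, rounded up.
    const std::size_t blocks = (pixels + threadsPerBlock - 1) / threadsPerBlock;
    return blocks * paramCount;
}
```

For the 1024x768 image with 16 parameters, 768 threads per block means 1024 blocks and 16384 element loads, while 256 threads per block means 3072 blocks and 49152 element loads, three times as many.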

Gaszton
