Suppose I have n similar (they execute the same code) and independent (no inter-thread communication needed) tasks. How do I choose a proper grid size?
#1 Make the grid size equal to the number of SMs. Assign tasks 1 → n/grid_size to block 1, tasks n/grid_size + 1 → 2 * n/grid_size to block 2, and so on.
Then each thread in a given block is assigned n / grid_size / block_size tasks, and each thread does:
```
for i in task_set:
    task(i)
```
#2 Assign every task to a unique thread. Namely, the thread at global index blockIdx.x * blockDim.x + threadIdx.x is assigned task(blockIdx.x * blockDim.x + threadIdx.x).
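A minimal CUDA sketch of the two strategies above, assuming a hypothetical device function task(int i) and (for strategy #1) that n divides evenly among blocks and threads:

```cuda
// Hypothetical device function representing one task; not from the original post.
__device__ void task(int i);

// Strategy #1: grid size = number of SMs; each thread loops over its own
// contiguous chunk of tasks.
__global__ void chunked(int n)
{
    int tasks_per_block  = n / gridDim.x;                 // assumes n % gridDim.x == 0
    int tasks_per_thread = tasks_per_block / blockDim.x;  // assumes even division
    int first = blockIdx.x * tasks_per_block + threadIdx.x * tasks_per_thread;
    for (int i = first; i < first + tasks_per_thread; ++i)
        task(i);
}

// Strategy #2: one thread per task.
__global__ void one_per_thread(int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)   // guard in case the grid overshoots n
        task(i);
}
```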
As far as I know, the number of SMs is fixed for a given GPU. Does that mean that to get the best performance, I only need to keep the grid size larger than the number of SMs?
(1) Each thread is responsible for producing one output element
(2) Choose between 128 and 256 threads per thread block (a multiple of 32)
(3) Make a 1D grid that comprises enough blocks that the total number of threads covers all output elements
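Put together, steps (1)–(3) look like this (a sketch only; the kernel name and per-element work are placeholders, not from the original answer):

```cuda
__global__ void produce(float *out, const float *in, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // (1) one output element per thread
    if (i < n)                                      // the grid may overshoot n; guard it
        out[i] = in[i] * 2.0f;                      // placeholder per-element work
}

// Host-side launch configuration:
int block = 256;                      // (2) a multiple of 32, in the 128-256 range
int grid  = (n + block - 1) / block;  // (3) enough blocks to cover all n elements
produce<<<grid, block>>>(out, in, n);
```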
This has been covered in these forums multiple times. Of course, numerous variants and modifications are possible depending on the details of the processing. For example, 2D grids may be more naturally suited to processing 2D images, where each thread block produces one tile of pixels in the image.
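For the 2D-image variant mentioned above, a sketch might look like the following (the 16×16 tile size, kernel name, and pixel operation are illustrative assumptions):

```cuda
__global__ void process(uchar4 *img, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;  // column within the image
    int y = blockIdx.y * blockDim.y + threadIdx.y;  // row within the image
    if (x < width && y < height)                    // image need not be tile-aligned
        img[y * width + x] = make_uchar4(255, 0, 0, 255);  // placeholder pixel op
}

// Host side: each 16x16 block produces one 16x16 tile of pixels.
dim3 block(16, 16);
dim3 grid((width + 15) / 16, (height + 15) / 16);
process<<<grid, block>>>(img, width, height);
```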