Maximum number of threads per block

brightsongeorge · September 15, 2021, 8:21pm

Hi! I am new to CUDA. What decides the maximum number of threads per block for a GPU? Why hasn’t there been any increase in the maximum limit with increasing compute capability?

njuffa · September 15, 2021, 8:47pm

Only a GPU architect could answer that authoritatively, and they don’t normally frequent this forum.

However, practical programming experience with CUDA would seem to indicate that allowing larger thread blocks does not provide a significant benefit. At the same time, increasing the maximum block size would lead to an increase in hardware complexity, while the overarching goal of GPU hardware design is to minimize the complexity of processing elements while providing more of them. In other words, with high likelihood, the trade-offs simply do not justify a larger maximum block size.

From a performance perspective, the GPU hardware is usually utilized most efficiently (and performance is maximized) when work is split at a finer granularity, i.e. into medium-sized blocks, with enough of them launched to cover the data. A reasonable starting point when designing CUDA code is to plan for 128 to 256 threads per block, and then adjust this up or down only where use cases require it.
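
As a rough illustration of that starting point, here is a minimal sketch of a launch that uses 256 threads per block and enough blocks to cover the data. The kernel `scale` and the helpers `blocksFor`, `queryMaxThreadsPerBlock`, and `scaleOnDevice` are hypothetical names made up for this example; only the CUDA runtime calls (`cudaGetDeviceProperties`, `cudaGetLastError`) are real API.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: number of blocks needed so that
// blocks * threadsPerBlock >= n (ceiling division).
constexpr int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical example kernel: multiplies each of the n elements of x by a.
__global__ void scale(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard: the last block may be only partially used
        x[i] *= a;
    }
}

// Reads the per-block thread limit of the given device.
// Returns -1 if the device properties cannot be queried.
int queryMaxThreadsPerBlock(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        return -1;
    }
    return prop.maxThreadsPerBlock;
}

// Launches `scale` on device memory d_x with a medium-sized block.
// 256 is only the suggested starting point, not a tuned value.
cudaError_t scaleOnDevice(float *d_x, float a, int n, int threadsPerBlock = 256)
{
    if (n <= 0) {
        return cudaSuccess;  // nothing to do; avoids a zero-block launch
    }
    scale<<<blocksFor(n, threadsPerBlock), threadsPerBlock>>>(d_x, a, n);
    return cudaGetLastError();  // reports launch-configuration errors
}
```

The block size is a parameter so it can be varied during tuning, and the bounds check in the kernel means `n` does not have to be a multiple of the block size.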
