Relationship between Warp and Thread Block on SM

laurelwoods0102 · November 10, 2023, 8:32pm

I’m currently studying CUDA programming and am confused about the relationship between warps and thread blocks on an SM.

On compute capability 9.0, the maximum number of resident blocks per SM is 32, and the maximum number of threads per block is 1024.

So I thought the maximum number of resident warps per SM should be 32 * (1024 / 32) = 1024, where the divisor 32 is the warp size.

But it is actually 64.

Can you explain why, and how that number is calculated?

striker159 · November 10, 2023, 10:26pm

A maximum of 32 resident blocks per SM does not imply that 32 blocks of 1024 threads each fit on an SM. Each limit is a separate upper bound.

The maximum number of threads per SM is 2048, which is equal to 64 warps (2048 / 32). With 1024 threads per block, at most 2 blocks can be resident on an SM at once.
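
To make the arithmetic concrete, here is a minimal host-side sketch of how these limits combine. It uses only the compute capability 9.0 numbers quoted in this thread; the struct and function names are made up for illustration and are not part of the CUDA API. It assumes threads are allocated in whole warps, and it ignores the register and shared memory limits, which can lower residency further for a real kernel.

```cpp
#include <algorithm>

// Per-SM limits for compute capability 9.0, as quoted in this thread.
// (Hypothetical helper struct, not a CUDA type.)
struct SmLimits {
    int max_blocks_per_sm;      // 32
    int max_threads_per_sm;     // 2048
    int max_threads_per_block;  // 1024
    int warp_size;              // 32
};

constexpr SmLimits kCc90{32, 2048, 1024, 32};

// Warps needed by one block; a partially filled warp still counts as a whole warp.
int warps_per_block(int block_size, const SmLimits& lim = kCc90) {
    return (block_size + lim.warp_size - 1) / lim.warp_size;
}

// Blocks that can be resident at once: the smaller of the block limit
// and what the per-SM thread (warp) limit allows.
// Assumption: registers and shared memory are not the limiting factor.
int resident_blocks_per_sm(int block_size, const SmLimits& lim = kCc90) {
    if (block_size <= 0 || block_size > lim.max_threads_per_block) {
        return 0;  // not a launchable block size
    }
    const int max_warps_per_sm = lim.max_threads_per_sm / lim.warp_size;  // 64
    const int by_warps = max_warps_per_sm / warps_per_block(block_size, lim);
    return std::min(lim.max_blocks_per_sm, by_warps);
}

// Resident warps per SM for a given block size.
int resident_warps_per_sm(int block_size, const SmLimits& lim = kCc90) {
    return resident_blocks_per_sm(block_size, lim) * warps_per_block(block_size, lim);
}
```

With these numbers, a block size of 1024 gives 2 resident blocks and 64 warps, a block size of 64 gives 32 resident blocks and again 64 warps, and a block size of 32 hits the 32-block limit first, so only 32 warps are resident. On a real device the same limits can be read from `cudaGetDeviceProperties` (`maxBlocksPerMultiProcessor`, `maxThreadsPerMultiProcessor`, `maxThreadsPerBlock`, `warpSize`) rather than hard-coded.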

