CUDA thread and SM

uniadam · September 30, 2021, 1:22pm

Hi,

I have 2 CUDA streams and 2 different kernels. To execute them in parallel, I reduced the number of threads per block, and now I am seeing some parallel behavior.

My question is about the number of threads on each SM. Is it possible to have one SM with 512 threads and another SM with 256 threads? If the maximum number of threads per SM is 512, am I wasting half of the threads on the second SM with 256 threads?

Also, the relationship between the number of CUDA kernels and the number of CUDA threads, and the maximum of each, is not clear to me.

Robert_Crovella · September 30, 2021, 1:52pm

You can have up to 2048 threads per SM on most GPUs up through the Volta architecture. The maximum number of threads per SM is discoverable using a tool like deviceQuery and is also reported in the programming guide, table 14.
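To make the per-SM limit concrete, here is a rough host-side sketch in plain C++ of how many blocks of a given size could be resident on one SM at the same time. It assumes the 2048-thread figure mentioned above, the 1024-threads-per-block limit, and a hypothetical cap of 32 resident blocks per SM; the real values vary by GPU and should be read from deviceQuery. Registers and shared memory, which can lower the count further, are ignored. The names `SmLimits`, `residentBlocksPerSm`, and `residentThreadsPerSm` are made up for illustration.

```cpp
#include <algorithm>

// Hypothetical per-SM limits for illustration; actual values depend on the GPU.
struct SmLimits {
    int maxThreadsPerSm    = 2048;  // figure quoted above for most GPUs up through Volta
    int maxThreadsPerBlock = 1024;  // per-block limit on current GPUs
    int maxBlocksPerSm     = 32;    // assumed value, varies by architecture
};

// Upper bound on how many blocks of `threadsPerBlock` threads can be resident
// on one SM, considering only the thread-count and block-count limits.
int residentBlocksPerSm(int threadsPerBlock, const SmLimits& lim = SmLimits{}) {
    if (threadsPerBlock <= 0 || threadsPerBlock > lim.maxThreadsPerBlock) {
        return 0;  // not a valid block size under these limits
    }
    int byThreads = lim.maxThreadsPerSm / threadsPerBlock;
    return std::min(byThreads, lim.maxBlocksPerSm);
}

// Threads resident on the SM when it holds that many blocks.
int residentThreadsPerSm(int threadsPerBlock, const SmLimits& lim = SmLimits{}) {
    return residentBlocksPerSm(threadsPerBlock, lim) * threadsPerBlock;
}
```

Under these assumptions, blocks of 512 threads and blocks of 256 threads can both fill an SM (4 and 8 blocks respectively), so a smaller block size does not by itself leave threads unused.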

There is no relationship between the number of CUDA kernels you decide to define and the number of CUDA threads, or the maximum of each.

The number of CUDA threads that a kernel requires is referred to as the grid. You define the grid at launch time. A subset of the grid will begin executing on SMs sometime after you launch your kernel(s).
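As a rough illustration of what choosing the grid at launch time amounts to, the sketch below does the usual arithmetic in plain host-side C++: pick a block size, then round the number of blocks up so the grid covers every element. `blocksForElements` and `totalThreadsInGrid` are hypothetical helper names, and a one-dimensional grid is assumed.

```cpp
#include <cstdint>

// Number of blocks needed so that blocks * threadsPerBlock >= n
// (ceiling division). Assumes n >= 0 and threadsPerBlock > 0.
std::int64_t blocksForElements(std::int64_t n, int threadsPerBlock) {
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Total number of threads in the grid. This can exceed n, which is why
// kernels usually guard their work with a bounds check such as `if (idx < n)`.
std::int64_t totalThreadsInGrid(std::int64_t blocks, int threadsPerBlock) {
    return blocks * threadsPerBlock;
}
```

In a CUDA launch these two numbers would be passed as `kernel<<<blocks, threadsPerBlock>>>(...)`. The grid size is chosen per launch and does not depend on how many kernels the program defines.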
