What is an effective way to find out this information?
I understand that cores should not be treated equally on CPUs vs GPUs.
If a warp can have 32 threads running concurrently, how many warps are running concurrently on my GeForce GTX 1080 Ti?
Understanding this would help me better design my application.
You have a number of active threads that the physical “GPU cores” are context switching between.
The number of active threads will depend on their resource requirements (registers, shared memory) or on the upper limits specified by your particular GPU's compute capability (e.g. max 1024 threads per SM, and then you have N SMs on your GPU).
The number of threads executing each clock cycle should be equal to the total number of FPUs/SPs/"CUDA cores" on your device (3584 on your card), so #warps = NbCores / 32.
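As a back-of-the-envelope sketch of the arithmetic above (the core count and warp size are the values for a GTX 1080 Ti as discussed in this thread):

```python
# Rough warp arithmetic for a GTX 1080 Ti.
CUDA_CORES = 3584  # "CUDA cores" (SPs) on a GTX 1080 Ti
WARP_SIZE = 32     # threads per warp on current NVIDIA GPUs

# One warp's worth of lanes can issue per 32 cores per clock.
warps_per_clock = CUDA_CORES // WARP_SIZE
print(warps_per_clock)  # 112
```

Note this counts warps *issuing* in a given clock, not warps *resident* on the device, which is a separate (and larger) number governed by occupancy limits.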
1024 threads is the limit per thread block, not per SM.
Each GPU core may run up to 16 threads simultaneously. The 1080 Ti has 3584 cores, hence it may run up to 16 * 3584 threads.
I wouldn’t describe it that way. The maximum number of threads in flight is 2048 * # of SMs, for all GPUs of compute capability 3.0 and higher (but less than 7.5: Turing GPUs are limited to a maximum of 1024 threads/SM).
This is an upper bound, not necessarily achievable with every code. Some codes may have resource utilization that dictates a lower maximum instantaneous thread carrying capacity (“occupancy”).
The 1080 Ti has 28 SMs, so the maximum instantaneous threads-in-flight number is 28 * 2048 (which does happen to be the same as 16 * 3584; however, the 16 * core count methodology will not give a correct upper bound for other GPUs that do not have 128 cores/SM, including all Kepler GPUs, and also cc 6.0 and 7.0 GPUs).
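To see why the per-SM limit gives the right upper bound while 16 * core count only coincidentally matches on the 1080 Ti, it may help to compare it against a GPU without 128 cores/SM. The Tesla P100 figures below (cc 6.0, 56 SMs at 64 cores/SM) are illustrative values not stated in this thread:

```python
# Maximum resident ("in flight") threads = max_threads_per_SM * num_SMs.
# The "16 * core count" rule only matches on GPUs with 128 cores/SM.
gpus = {
    # name:         (SMs, cores_per_SM, max_threads_per_SM)
    "GTX 1080 Ti": (28, 128, 2048),  # cc 6.1: 16 * cores happens to agree
    "Tesla P100":  (56,  64, 2048),  # cc 6.0: 16 * cores undercounts by 2x
}

for name, (sms, cores_per_sm, thr_per_sm) in gpus.items():
    correct = sms * thr_per_sm
    naive = 16 * sms * cores_per_sm
    print(f"{name}: per-SM bound = {correct}, 16 * cores = {naive}")
```

Both GPUs have 3584 cores, yet only the 1080 Ti's true bound (57344) equals 16 * 3584; the P100 can hold twice as many resident threads as the naive rule predicts.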
Not on all devices: Fermi only allows 1536 threads for the whole SM, with a maximum of 1024 per block, and cc 1.x devices allow even fewer.