Partitioning

Markuss.london · October 6, 2011, 4:19pm

Hi there,

I am working on a CFD case on a Tesla C1060.

My whole problem is unstructured, and I am using METIS for domain decomposition.

I have a few questions regarding warps and the maximum number of active threads.

As I understand how CUDA works, I always assumed I have 30 SMs with a maximum of 8 SPs each, which would give me a total of 240 blocks I can use concurrently. I further understood that 3 SMs share one thread block scheduler and each SM has its own warp scheduler.

My question now is: what does "max number of active threads per SM" (apparently 1024) mean? I always assumed that 240 active blocks means that 240 warps (or 240 * 32 = 7680 threads) can be executed concurrently.

But now I have seen that, because there are 30 SMs, 30 blocks can be executed concurrently (does that mean a maximum of 30 SMs * 32 threads = 960 threads concurrently?). And because of the max number of active threads, 30 * 1024 = 30720 threads can run concurrently. How can I divide up my problem to get the optimal number of blocks and warps?
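To make the numbers above concrete, here is a minimal sketch of the arithmetic, assuming the per-SM limits the CUDA programming guide lists for compute capability 1.3 (the C1060): 8 SPs, 1024 resident threads and 8 resident blocks per SM. The constant names are made up for illustration, and register and shared memory limits are ignored.

```python
# Per-SM limits assumed for compute capability 1.3 (Tesla C1060).
NUM_SMS = 30
SPS_PER_SM = 8
WARP_SIZE = 32
MAX_RESIDENT_THREADS_PER_SM = 1024
MAX_RESIDENT_BLOCKS_PER_SM = 8

# Scalar processors on the whole device.
total_sps = NUM_SMS * SPS_PER_SM

# Threads that can be resident (scheduled, not all executing in the same cycle).
max_resident_threads = NUM_SMS * MAX_RESIDENT_THREADS_PER_SM

# The same per-SM limit expressed in warps.
max_resident_warps_per_sm = MAX_RESIDENT_THREADS_PER_SM // WARP_SIZE

# Upper bound on blocks resident on the device at once.
max_resident_blocks = NUM_SMS * MAX_RESIDENT_BLOCKS_PER_SM

print("SPs on device:           ", total_sps)
print("max resident threads:    ", max_resident_threads)
print("max resident warps / SM: ", max_resident_warps_per_sm)
print("max resident blocks:     ", max_resident_blocks)
```

Under these assumptions the figure 240 shows up twice for unrelated reasons: once as 30 * 8 SPs and once as 30 * 8 resident blocks. The 30720 figure counts threads an SM can keep resident and switch between, not threads that all execute in the same clock cycle.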

So, for example, which approach would be faster in theory: 1. make sure all 240 blocks are in use, say with 32 threads (1 warp) each, each of which should be processed in 4 cycles, concurrently; or 2. use 120 blocks with 64 threads (2 warps) each?
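One way to compare the two options on paper is to count how many warps end up resident on each SM. The sketch below does that, assuming blocks are spread evenly over the 30 SMs, the compute capability 1.3 limits of 8 resident blocks and 32 resident warps per SM, and no register or shared memory pressure. The helper `warps_per_sm` is hypothetical, not part of any CUDA API.

```python
import math

# Limits assumed for compute capability 1.3 (Tesla C1060).
NUM_SMS = 30
WARP_SIZE = 32
MAX_RESIDENT_WARPS_PER_SM = 32   # 1024 threads / 32 threads per warp
MAX_RESIDENT_BLOCKS_PER_SM = 8


def warps_per_sm(num_blocks, threads_per_block):
    """Resident warps on one SM, assuming blocks spread evenly over all SMs."""
    warps_per_block = math.ceil(threads_per_block / WARP_SIZE)
    blocks_per_sm = math.ceil(num_blocks / NUM_SMS)
    # Resident blocks are capped by the block limit and by the warp limit.
    resident_blocks = min(
        blocks_per_sm,
        MAX_RESIDENT_BLOCKS_PER_SM,
        MAX_RESIDENT_WARPS_PER_SM // warps_per_block,
    )
    return resident_blocks * warps_per_block


for num_blocks, threads in [(240, 32), (120, 64)]:
    w = warps_per_sm(num_blocks, threads)
    print(f"{num_blocks} blocks x {threads} threads: {w} warps per SM "
          f"({w / MAX_RESIDENT_WARPS_PER_SM:.0%} of the warp limit)")
```

With these assumptions both launches put 8 warps on each SM, a quarter of the 32-warp limit, so this count alone does not separate them. Any real difference would have to come from memory access patterns, synchronization and per-block resource use, which only measurement on the actual kernel can show.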

Any help is very much appreciated.

Cheers,

Markus
