Quick block question!


Is an active block interruptible or not?



Also, I don’t understand the occupancy calculator. It says the maximum number of “active thread blocks per multiprocessor” is 6. What happens if there are more than 6 blocks? And what does “active” mean? Are there inactive blocks, too?

Please help me! I thought I understood the calculator…


“Active” means scheduled. Active blocks are those which have active threads running on a given multiprocessor. There can be up to 768 active threads on any multiprocessor - so if the occupancy calculator says 6 active blocks per multiprocessor, that means that your blocks probably use between 64 and 128 threads, each thread uses less than (register file size / 768) registers, and each block uses less than (16384/6) bytes of shared memory.

If your kernel launch contains more than (6 x the number of multiprocessors in your card) blocks, then the excess blocks in the grid wait until a multiprocessor becomes vacant, and then are scheduled in groups of 6 per multiprocessor until the entire grid is processed.
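As a rough illustration of that arithmetic, the sketch below estimates how many blocks can be active across the card at once and roughly how many rounds a grid needs. The multiprocessor count (16) and the grid size (1000 blocks) are made-up example values and the function names are hypothetical. Treating the work as strict rounds is a simplification.

```python
import math

def resident_blocks(active_blocks_per_mp, num_multiprocessors):
    """Blocks that can be active across the whole card at the same time."""
    return active_blocks_per_mp * num_multiprocessors

def scheduling_rounds(grid_blocks, active_blocks_per_mp, num_multiprocessors):
    """Approximate number of rounds needed to work through the whole grid."""
    capacity = resident_blocks(active_blocks_per_mp, num_multiprocessors)
    return math.ceil(grid_blocks / capacity)

# 6 active blocks per multiprocessor comes from the calculator;
# 16 multiprocessors and a 1000-block grid are assumed for illustration.
print(resident_blocks(6, 16))          # 96
print(scheduling_rounds(1000, 6, 16))  # 11
```

Any grid of 96 blocks or fewer would fit in a single round under these assumptions.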

Thank you for your answer! I think I understand now. But why are the registers (and shared memory) shared by all active blocks of a multiprocessor? Look, the calculator shows these values:

ThreadsPerBlock: 128
RegistersPerThread: 8
SharedMemoryPerBlock: 2048

Allocation per thread block
Warps: 4
Registers: 1024
SharedMemory: 2048

Maximum thread blocks per multiprocessor
Registers/Multiprocessor: 8
SharedMemory/Multiprocessor: 8

TotalOfRegisters = 8192;

Registers/Multiprocessor (not per block) = TotalOfRegisters / (RegistersPerThread * ThreadsPerBlock)

I thought, the registers and shared memory are reserved by ONE block!

Thanks in advance!

You can of course use up all registers and all shared memory of a multiprocessor with one block. But if a block doesn’t require all registers or all shared memory of a multiprocessor, it is logical that several blocks are scheduled; otherwise its resources would not be used efficiently.
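To make this concrete, here is a minimal sketch that reproduces the per-resource figures from the calculator output quoted earlier. It assumes 8192 registers and 16384 bytes of shared memory per multiprocessor (the totals mentioned in this thread). The helper name is hypothetical.

```python
# Values from the calculator output quoted in this thread.
THREADS_PER_BLOCK = 128
REGISTERS_PER_THREAD = 8
SHARED_MEMORY_PER_BLOCK = 2048   # bytes

# Per-multiprocessor totals assumed from the thread.
TOTAL_REGISTERS = 8192
TOTAL_SHARED_MEMORY = 16384      # bytes

def blocks_that_fit(total, per_block):
    """How many blocks fit into one resource of a multiprocessor."""
    return total // per_block

registers_per_block = REGISTERS_PER_THREAD * THREADS_PER_BLOCK
blocks_by_registers = blocks_that_fit(TOTAL_REGISTERS, registers_per_block)
blocks_by_shared_memory = blocks_that_fit(TOTAL_SHARED_MEMORY, SHARED_MEMORY_PER_BLOCK)

print(registers_per_block)      # 1024
print(blocks_by_registers)      # 8
print(blocks_by_shared_memory)  # 8
```

One block of this size takes only an eighth of each resource, so running a single block at a time would leave most of the multiprocessor idle.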

But if a block is not interruptible, it’s not necessary to share the memory. Each block will be performed serially. So, memory that is occupied by a block can be freed after the execution of its threads. The next block is then able to allocate the complete memory (or all registers) of the multiprocessor. There can’t be conflicts between memory accesses of different blocks.

I think there are only two options: either blocks are interruptible, in which case sharing the memory makes sense, or they are not interruptible, in which case it doesn’t.

If I’m mistaken, help me PLEASE!

I think there is a misunderstanding here about what “interruptible” means. Blocks cannot be interrupted or preempted in the usual sense. Once a block is scheduled, it has to run to completion (that is, all of its threads must exit). There is no way to use signals or interrupts or any comparable mechanism to make a block yield or stop from user space. But blocks do not run serially. The finest level of scheduling granularity is a warp of 32 threads, and a given multiprocessor runs a warp in a SIMD fashion that might be thought of as serial - each of the eight cores in a multiprocessor runs each broadcast instruction 4 times to complete one instruction for all 32 threads in an active warp. The scheduler in the GPU can context switch and schedule warps of threads from multiple active blocks on a given multiprocessor - this allows for instruction pipelining and memory latency hiding.
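The warp numbers in that explanation can be checked with a few lines of arithmetic. This is only a sketch using the figures given in this thread (32 threads per warp, 8 cores and 768 threads per multiprocessor). The function names are hypothetical.

```python
import math

WARP_SIZE = 32                        # threads per warp
CORES_PER_MULTIPROCESSOR = 8
MAX_THREADS_PER_MULTIPROCESSOR = 768

def passes_per_warp_instruction():
    """Times each core repeats one instruction to cover a whole warp."""
    return WARP_SIZE // CORES_PER_MULTIPROCESSOR

def warps_per_block(threads_per_block):
    """Warps needed for one block; a partly filled warp still counts."""
    return math.ceil(threads_per_block / WARP_SIZE)

def max_resident_warps():
    """Warps the scheduler can switch between on one multiprocessor."""
    return MAX_THREADS_PER_MULTIPROCESSOR // WARP_SIZE

print(passes_per_warp_instruction())  # 4
print(warps_per_block(128))           # 4
print(max_resident_warps())           # 24
```

With 128-thread blocks, 6 active blocks supply all 24 warps, which gives the scheduler the most warps to choose from when one of them stalls on memory.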

The scheduler will run as many active blocks on a multiprocessor as there are free resources: registers (16384 or 8192, depending on the device), shared memory (16 KB) and thread state (768 threads). The occupancy spreadsheet tells you how many blocks that will be, for any given set of kernel resource requirements.
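A simplified version of what the spreadsheet computes might look like the sketch below: take the tightest of the three limits. It is not the spreadsheet's actual formula, since it ignores allocation granularity and any other per-multiprocessor caps. The function name and the default totals (8192 registers, 16384 bytes, 768 threads) are assumptions taken from the numbers in this thread.

```python
def max_active_blocks(threads_per_block, registers_per_thread,
                      shared_memory_per_block,
                      total_registers=8192,
                      total_shared_memory=16384,
                      max_threads=768):
    """Blocks that fit on one multiprocessor under the tightest resource limit."""
    # Thread state always limits the number of blocks.
    limits = [max_threads // threads_per_block]

    # Registers are allocated for every thread of every active block.
    registers_per_block = registers_per_thread * threads_per_block
    if registers_per_block > 0:
        limits.append(total_registers // registers_per_block)

    # Shared memory only limits kernels that actually use it.
    if shared_memory_per_block > 0:
        limits.append(total_shared_memory // shared_memory_per_block)

    return min(limits)

# Kernel from this thread: 128 threads, 8 registers per thread, 2048 bytes.
print(max_active_blocks(128, 8, 2048))   # 6 (thread state is the limit)

# Same kernel with 16 registers per thread: registers become the limit.
print(max_active_blocks(128, 16, 2048))  # 4
```

For the kernel discussed here, registers and shared memory would each allow 8 blocks, but 768 / 128 threads allows only 6, which is consistent with the 6 active blocks reported by the calculator.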

Thank you very much! Now everything is absolutely clear! :thumbup: