Sm90: does setmaxnreg change occupancy dynamically?

setmaxnreg is a new feature introduced with Hopper. I noticed this in CUTLASS:

From the above, the consumer register count is 232 and the producer register count is 40.
Can different warps use different numbers of registers? That would affect occupancy.
Also, can we dynamically modify the register count while the kernel is running? That would also affect occupancy.
Using NCU, I see a static register count of 168, not 232 and not 40. Is something wrong?

Here we have one producer (128 threads with 40 registers each, but only 1 warp is active in the mainloop) and two consumers (256 threads with 232 registers each).
So (40 * 32 + 256 * 232) / 384 = 158, which is roughly 168?

A CUDA grid launch specifies the number of registers per warp. All warps in the CTA (thread block) launch with the same number of registers. The PTX instruction setmaxnreg allows a warp (e.g. a producer warp) to return registers to the CTA register pool (setmaxnreg.dec) and other warps to acquire registers from the CTA register pool (setmaxnreg.inc). The total number of registers allocated to the CTA does not change. The total number of registers allocated to all warps in the CTA cannot exceed the number allocated to the CTA at launch. Because of this constraint, SM warp occupancy does not change.

Registers allocated to a warp are freed when the warp completes (EXIT).
Registers returned to the CTA pool are freed when the CTA completes, after all of its warps have completed.
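
The pool constraint can be checked with the numbers from the question. This is a minimal sketch, assuming that all 128 producer threads (not only the one active warp) release down to 40 registers and that the 168 reported by NCU is the per-thread launch allocation. The helper name `group_regs` and the constants are hypothetical and exist only for illustration.

```cpp
// Hypothetical helper: total registers held by a group of threads.
constexpr int group_regs(int threads, int regs_per_thread) {
    return threads * regs_per_thread;
}

// Values taken from the question; the interpretation is an assumption.
constexpr int kThreadsPerCta   = 384;  // 128 producer + 256 consumer threads
constexpr int kLaunchRegs      = 168;  // per-thread count reported by NCU
constexpr int kProducerThreads = 128;
constexpr int kProducerRegs    = 40;   // after setmaxnreg.dec
constexpr int kConsumerThreads = 256;
constexpr int kConsumerRegs    = 232;  // after setmaxnreg.inc

// Registers allocated to the CTA at launch.
constexpr int kCtaPool = group_regs(kThreadsPerCta, kLaunchRegs);

// Registers held by all warps after the producer/consumer rebalance.
constexpr int kAfterRebalance =
    group_regs(kProducerThreads, kProducerRegs) +
    group_regs(kConsumerThreads, kConsumerRegs);

// The rebalanced total must fit in the launch allocation.
static_assert(kAfterRebalance <= kCtaPool,
              "rebalanced registers exceed the CTA pool");
```

Under these assumptions both totals come to 64512 registers, so the 40/232 split redistributes the launch allocation of 168 registers per thread rather than growing it.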

NCU captures the number of registers per warp at grid launch. The use of setmaxnreg is a dynamic operation at runtime, and NCU does not have a method to trace the changes. A given warp could go through many different register counts during execution of a grid.
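
One plausible reading of where a launch value of 168 could come from is sketched below. It is not confirmed in this thread and rests on three assumptions: 65536 registers per SM, one 384-thread CTA resident per SM, and a per-thread allocation granularity of 8 registers. The function name `max_launch_regs_per_thread` is hypothetical.

```cpp
// Hypothetical helper: largest per-thread register count that lets
// `ctas_per_sm` CTAs of `threads_per_cta` threads fit in the SM register
// file, rounded down to the assumed allocation granularity.
constexpr int max_launch_regs_per_thread(int regs_per_sm,
                                         int threads_per_cta,
                                         int ctas_per_sm,
                                         int granularity) {
    return (regs_per_sm / (threads_per_cta * ctas_per_sm)) / granularity
           * granularity;
}

// Assumed values: 65536 registers per SM, 384 threads, 1 CTA, granularity 8.
constexpr int kAssumedLaunchRegs =
    max_launch_regs_per_thread(65536, 384, 1, 8);
```

With these inputs the integer division gives 170, which rounds down to 168, matching the static value NCU shows.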

For more details see PTX ISA 8.5 section on setmaxnreg.

Here we have one producer (128 threads with 40 registers each, but only 1 warp is active in the mainloop) and two consumers (256 threads with 232 registers each).
So (40 * 32 + 256 * 232) / 384 = 158, which is roughly 168?

Does this seem to make sense?

The total number of registers allocated to all warps in the CTA cannot exceed the number allocated to the CTA at launch. Because of this constraint, SM warp occupancy does not change.

So the register count reported by NCU may not be accurate (because it changes at runtime), but the occupancy is still correct and useful, right?
