Hi, I’m debugging a CUTLASS kernel on an RTX 5090 (which does not support clusters), and I noticed something strange.
As I understand it, shared memory is allocated per CTA, so different thread blocks should never access the same shared memory address. However, in my debug prints, different CTAs report the same shared memory pointer value. For example, the logs look like this:
THR (0,0,0) BLK (6,0,0) TMACRD (0,0,0,0,0) SMEMADDR (0x7f8700000400)
THR (0,0,0) BLK (12,0,0) TMACRD (0,0,0,0,0) SMEMADDR (0x7f8700000400)
THR (0,0,0) BLK (4,0,0) TMACRD (0,0,0,0,0) SMEMADDR (0x7f8700000400)
THR (0,0,0) BLK (10,0,0) TMACRD (0,0,0,0,0) SMEMADDR (0x7f8700000400)
THR (0,0,0) BLK (7,0,0) TMACRD (0,0,0,0,0) SMEMADDR (0x7f8700000400)
THR (0,0,0) BLK (8,0,0) TMACRD (0,0,0,0,0) SMEMADDR (0x7f8700000400)
...
THR (0,0,0) BLK (49,0,0) TMACRD (0,128,0,0,0) SMEMADDR (0x7f8700008400)
THR (0,0,0) BLK (56,0,0) TMACRD (0,1024,0,0,0) SMEMADDR (0x7f8700008400)
THR (0,0,0) BLK (47,0,0) TMACRD (0,1920,0,0,0) SMEMADDR (0x7f8700008400)
THR (0,0,0) BLK (57,0,0) TMACRD (0,1152,0,0,0) SMEMADDR (0x7f8700008400)
THR (0,0,0) BLK (15,0,0) TMACRD (0,1920,0,0,0) SMEMADDR (0x7f8700008400)
THR (0,0,0) BLK (48,0,0) TMACRD (0,0,0,0,0) SMEMADDR (0x7f8700008400)
...
Within each group above, multiple CTAs show the same shared memory address, even though the GPU (RTX 5090) does not support clusters.
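To separate the print itself from anything TMA-related, a standalone sketch like the one below could be used. It is not my actual CUTLASS kernel: the kernel name `print_smem_addr` and the launch configuration are hypothetical, and it assumes the logged SMEMADDR value is a generic pointer printed with `%p`. It prints both the generic pointer and the shared-window offset obtained via `__cvta_generic_to_shared`, so the two can be compared across CTAs.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical standalone kernel (not part of CUTLASS): thread 0 of each CTA
// prints the address of that CTA's dynamic shared memory buffer.
__global__ void print_smem_addr() {
    extern __shared__ unsigned char smem[];
    if (threadIdx.x == 0) {
        // Generic (flat) pointer; assumed to correspond to the SMEMADDR
        // value in the logs above.
        void* generic = smem;
        // Address within the shared state space, which is the form that
        // PTX instructions taking a .shared operand expect.
        unsigned shared_off =
            static_cast<unsigned>(__cvta_generic_to_shared(generic));
        printf("BLK (%d,0,0) SMEMADDR (%p) SHARED_OFFSET (0x%x)\n",
               static_cast<int>(blockIdx.x), generic, shared_off);
    }
}

int main() {
    // 8 CTAs, 32 threads each, 1024 bytes of dynamic shared memory per CTA
    // (arbitrary values chosen for illustration).
    print_smem_addr<<<8, 32, 1024>>>();
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

If identical values also appear here for plain `__shared__` buffers, the matching pointers in my logs would not be specific to CUTLASS or TMA.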
After this happens, the kernel hits an illegal instruction error when executing the CUTLASS TMA-related operations that follow (cp.async.bulk.tensor.2d etc.).
So my questions are:
- Is it expected that different CTAs print the same shared memory pointer value? My impression is that shared memory is CTA-scope only, so this result seems wrong.
- Could this incorrect shared memory address be the reason for the illegal instruction exception? Since the kernel uses TMA instructions, I want to confirm whether incorrect smem addressing could trigger this.
- Is there any known issue with smem pointer printing or smem addressing in CUTLASS kernels when debug-printing?
Any insights would be appreciated. Thanks!