I’m not aware of any method to retrieve this info.
It’s hard to imagine how it could be useful from a CUDA programming perspective. An SP, in CUDA marketing speak, is a floating point unit that can handle a limited set of floating point instructions. All CUDA GPUs I am familiar with (except cc 2.1 devices, which are now obsolete) have a multiple of 32 of these in each SM. We could therefore imagine that warp lane 0 always uses SP 0, or SP 0 or 32 if there are 64 SPs, or SP 0, 32, 64, or 96 if there are 128 SPs. Even if that mapping were known, I don’t see what a CUDA program could do with it.
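For comparison, here is a minimal sketch of what a thread can query about its placement: the PTX special registers `%smid`, `%warpid`, and `%laneid`. As far as I know there is no equivalent register for an individual SP. The helper and kernel names (`readSmId`, `whereAmI`, and so on) are made up for illustration, and the PTX documentation describes `%smid` and `%warpid` as volatile, so treat the values as a snapshot rather than a stable identity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helpers: each reads one PTX special register via inline asm.
__device__ unsigned int readSmId() {
    unsigned int v;
    asm volatile("mov.u32 %0, %%smid;" : "=r"(v));
    return v;
}

__device__ unsigned int readWarpId() {
    unsigned int v;
    asm volatile("mov.u32 %0, %%warpid;" : "=r"(v));
    return v;
}

__device__ unsigned int readLaneId() {
    unsigned int v;
    asm volatile("mov.u32 %0, %%laneid;" : "=r"(v));
    return v;
}

// Lane 0 of every warp reports where it is currently running.
__global__ void whereAmI() {
    if (readLaneId() == 0) {
        printf("block %u, thread %u: SM %u, warp slot %u\n",
               blockIdx.x, threadIdx.x, readSmId(), readWarpId());
    }
}

int main() {
    // Two blocks of 64 threads (two warps each); sizes chosen arbitrarily.
    whereAmI<<<2, 64>>>();
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

The output tells you which SM and which warp slot a warp landed on, and that is as fine-grained as it gets; nothing here identifies the SP that executes a given lane.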
Consider two very simple kernels, each with one CUDA block of 33 threads (two warps),
and one SM with 128 SPs.
Since there’s one block per kernel, we can refer to each kernel by its block and call the two kernels Block0 and Block1, respectively.
If the two blocks (kernels) are running simultaneously, the first warp of Block0 may occupy SP[0:31], but what about the SP occupancy of the second warp of Block0?
Does the second warp of Block0 monopolize SP[32:63]? Or does the second warp of Block0 take up only SP32, while the first warp of Block1 occupies SP[33:64], and so forth?
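The setup described above can be sketched roughly as follows. This is only an illustration: the kernel name `probe` and the use of two streams are my own choices, launching in separate streams merely permits the two kernels to overlap and does not guarantee it, and `__activemask()` reports the lanes that happen to be active at the call, which I am assuming here matches the lanes that exist in each warp.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical probe kernel: one thread per warp reports how many lanes
// of that warp are active at this point.
__global__ void probe(int kernelTag) {
    unsigned int mask = __activemask();              // lanes currently active
    unsigned int lane = threadIdx.x % warpSize;
    unsigned int firstActive = static_cast<unsigned int>(__ffs(mask) - 1);
    if (lane == firstActive) {
        printf("Block%d, warp %u: %d active lane(s)\n",
               kernelTag, threadIdx.x / warpSize, __popc(mask));
    }
}

int main() {
    cudaStream_t s0, s1;
    cudaStreamCreate(&s0);
    cudaStreamCreate(&s1);

    // One block of 33 threads per kernel, i.e. one full warp plus one
    // warp that contains a single thread.
    probe<<<1, 33, 0, s0>>>(0);
    probe<<<1, 33, 0, s1>>>(1);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaStreamDestroy(s0);
    cudaStreamDestroy(s1);
    return 0;
}
```

This shows how the 33 threads of each block are split into warps, but it says nothing about which SPs those warps end up on, which is the part I am asking about.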