We can find the per-thread register usage and the shared memory usage in the cubin file:
name = Kernel
reg = 18
It is clear that the kernel uses 18 registers per thread.
But when I check the PTX code, I also find some information about registers:
.reg .u16 %rh<4>; .reg .u32 %r<74>; .reg .f32 %f<75>; .reg .pred %p<9>;
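
To make the PTX numbers concrete, here is a small Python sketch that tallies those declarations. It assumes that the `<N>` suffix in a `.reg` declaration is a count, i.e. that `%r<74>` declares 74 registers named `%r0` to `%r73`. The helper name `count_declared_registers` is my own, not part of any CUDA tool.

```python
import re

# PTX register declarations copied from the post above.
PTX_DECLS = ".reg .u16 %rh<4>; .reg .u32 %r<74>; .reg .f32 %f<75>; .reg .pred %p<9>;"

# Matches e.g. ".reg .u32 %r<74>" and captures the type, name prefix, and count.
REG_DECL = re.compile(r"\.reg\s+\.(\w+)\s+%(\w+)<(\d+)>")

def count_declared_registers(ptx: str) -> dict:
    """Return {(type, prefix): count} for each parameterized .reg declaration.

    Assumption: the number inside <...> is how many registers are declared.
    """
    return {(t, name): int(n) for t, name, n in REG_DECL.findall(ptx)}

counts = count_declared_registers(PTX_DECLS)
total = sum(counts.values())

for (reg_type, prefix), n in counts.items():
    print(f".{reg_type:<5} %{prefix:<3} {n}")
print("total declared:", total)
```

Under that assumption the PTX declares 162 registers in total (4 + 74 + 75 + 9), far more than the 18 reported in the cubin file.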
So, what's the difference between them? What do the register declarations in the PTX code mean? Are they the number of registers used? Thanks!