How to count the number of registers required per thread?

To optimize my code, first I’m trying to optimize register use.

How can I measure the number of registers required per thread?

NVVP tells me that I’m using 63 registers per thread, which is the maximum, but I don’t have that many scalar variables.

You can see the number of registers used per thread by compiling with the flag “-ta=tesla:ptxinfo”. You can cap the number of registers used per thread with the flag “-ta=tesla:maxregcount:n”, where “n” is the maximum number of registers.
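As a rough illustration of how the two flags might be used, here is a minimal sketch with a small OpenACC kernel. The file name saxpy.c, the function saxpy, the compiler driver name pgcc, the -acc switch, and the cap of 32 are assumptions for the example; only the two -ta=tesla sub-options come from the answer above.

```c
/*
 * saxpy.c (hypothetical file name)
 *
 * Assumed build lines; "pgcc" and "-acc" are assumptions, the -ta
 * sub-options are the ones quoted in the answer:
 *
 *   pgcc -acc -ta=tesla:ptxinfo -c saxpy.c          (report register use)
 *   pgcc -acc -ta=tesla:maxregcount:32 -c saxpy.c   (cap registers at 32)
 */

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * With OpenACC enabled the loop is offloaded to the GPU; a compiler
 * without OpenACC support ignores the pragma and runs the loop on the host. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

With ptxinfo enabled, the per-kernel register count should show up in the assembler messages printed during compilation. If you lower the cap with maxregcount, it is worth checking the same messages again, since values that no longer fit in registers may be spilled to local memory.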

  • Mat