Better control of register use

I haven’t been able to find specific information on how to get better control over register use in my kernels without resorting to PTX programming. Is PTX really the only way to accomplish this? Are there other techniques that can be employed in C code to force the compiler to re-use registers, for example? I’ve had some success freeing registers by moving variables to shared memory arrays, but this seems to slow down kernel execution. I also found some information on the use of volatile variables, with the claim that this can help PTX be more parsimonious with registers, but the programming guide indicates that volatile is for global or shared memory and makes no mention of its use with automatic variables, which typically end up in registers.

I’m sure this information must be out there but I’ve searched the forum with various phrases using the word “registers” with no joy.

Thanks to anyone who can provide a pointer.

 - Richard

You can always try passing the -maxrregcount option to nvcc to reduce register usage.

If you’re moving variables to shared memory you should also try to reduce bank conflicts as much as possible.
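For what it’s worth, here is a minimal sketch of parking per-thread temporaries in shared memory with a layout that avoids bank conflicts. The kernel name `scale_kernel`, the helper `scratch_index`, and the `THREADS`/`SLOTS` sizes are all made up for illustration; the only point is the indexing.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define THREADS 64   // hypothetical block size (launch with blockDim.x <= THREADS)
#define SLOTS   4    // hypothetical number of per-thread temporaries

// Slot-major index: for a fixed slot, thread t touches word (slot * nthreads + t),
// so neighbouring threads hit neighbouring 32-bit words, i.e. neighbouring banks.
// A thread-major layout (tid * SLOTS + slot) would give a stride of SLOTS words,
// which conflicts whenever SLOTS shares a factor with the number of banks.
__host__ __device__ inline int scratch_index(int slot, int tid, int nthreads)
{
    return slot * nthreads + tid;
}

__global__ void scale_kernel(const float *in, float *out, int n)
{
    __shared__ float scratch[SLOTS * THREADS];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    if (i >= n) return;

    // Temporaries kept in shared memory instead of registers.
    // Each thread only touches its own slots, so no __syncthreads() is needed.
    for (int s = 0; s < SLOTS; ++s)
        scratch[scratch_index(s, tid, blockDim.x)] = in[i] * (s + 1);

    float sum = 0.0f;
    for (int s = 0; s < SLOTS; ++s)
        sum += scratch[scratch_index(s, tid, blockDim.x)];
    out[i] = sum;
}

int main()
{
    const int n = 256;
    float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = 1.0f;

    float *d_in, *d_out;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    scale_kernel<<<n / THREADS, THREADS>>>(d_in, d_out, n);
    cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("out[0] = %f\n", h_out[0]);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Shared memory is still slower than a register even when conflict-free, so this trades some speed for occupancy; whether it pays off depends on the kernel.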

N.

I’m already using --maxrregcount set to 32. I can look at the PTX and see that virtually none of the registers are being re-used, and this seems to me to be the major problem. Even when only a conversion is being done, the compiler takes up a register and does not re-use it, and whatever I do to create intermediate automatic variables and re-use them seems to have little or no effect.

  • R

Google: ptx, register, reuse

PTX is an intermediate language, not the final assembly output. Use decuda to verify your assumption. Consensus here, so far, has been that register reuse is done in the final stage of translating the PTX code to native machine instructions.
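If a disassembler is not available, the register count of the final machine code can also be read back from the runtime. A rough sketch, assuming a toolkit that provides `cudaFuncGetAttributes`; the kernel `convert_kernel` and the helper `registers_per_thread` are made-up stand-ins, not anything from the original poster’s code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel with a conversion and a few temporaries.
__global__ void convert_kernel(const int *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float a = (float)in[i];
        float b = a * 0.5f;
        float c = b + 1.0f;
        out[i] = c * c;
    }
}

// Returns the registers per thread allocated in the compiled machine code,
// or -1 if the query fails (for example, when no CUDA device is present).
int registers_per_thread()
{
    cudaFuncAttributes attr;
    if (cudaFuncGetAttributes(&attr, convert_kernel) != cudaSuccess)
        return -1;
    return attr.numRegs;
}

int main()
{
    printf("registers per thread: %d\n", registers_per_thread());
    return 0;
}
```

Compiling with `nvcc --ptxas-options=-v` prints the same per-kernel figure at build time. The registers that appear in the `.ptx` file are virtual, so counting them says little about what the hardware ends up using.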

http://forums.nvidia.com/index.php?showtopic=89573

jma, thanks so much for this link. I think I’ll start using Google to search rather than the search engine in this forum. BTW, I haven’t been able to get decuda to run with G10 (it was created for G8 and G9) and the last update to decuda is quite old. Is there a new version of decuda that is not listed on the main decuda page?

  • R