Saving registers with smaller data types?

I’ve been playing around with the kernels in my current project, trying to reduce register counts, as they’re all pretty hefty and some require lmem to run properly. I had a thought that perhaps I could change some of my current integers to 16-bit integers and save some space that way, but thus far none of my register counts seem to go down when I make this change.

Is it actually possible to reduce register usage by shortening variables? I’m not sure if it’s just the compiler being clever so that my changes don’t actually help, or if it’s just not something that is possible to do.

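All of the registers are natively 32-bit, so a 16-bit integer still occupies a whole register and shortening a variable's type does not by itself reduce the count. On compute capability 1.3 devices, doubles effectively take two registers, which is why the register count was doubled.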

If you had smaller variables, you could always pack them into a register (e.g. 2 shorts) and use a bitmask + shift operation to read them back. I’ve also thought about this as a technique to reduce register usage to allow more threads to run per block (at the cost of the extra unpacking operations), but never really had a chance to test it to see if it worked well.
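For what it's worth, here is a rough sketch of the packing idea, assuming two signed 16-bit values share one 32-bit word. The names `pack2`, `unpack_lo`, `unpack_hi` and the `HD` macro are made up for illustration and are not from any library. The macro only lets the same helpers compile both under nvcc and with a plain host compiler.

```cpp
#include <cstdint>

// Hypothetical helper macro: CUDA qualifiers under nvcc, nothing otherwise.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Pack two signed 16-bit values into one 32-bit word.
// 'lo' goes in bits 0..15, 'hi' in bits 16..31.
HD inline uint32_t pack2(int16_t lo, int16_t hi) {
    return (static_cast<uint32_t>(static_cast<uint16_t>(hi)) << 16)
         | static_cast<uint32_t>(static_cast<uint16_t>(lo));
}

// Read the low half back: mask off the upper 16 bits, then reinterpret as signed.
HD inline int16_t unpack_lo(uint32_t p) {
    return static_cast<int16_t>(p & 0xFFFFu);
}

// Read the high half back: shift it down, then reinterpret as signed.
HD inline int16_t unpack_hi(uint32_t p) {
    return static_cast<int16_t>(p >> 16);
}
```

Inside a kernel you would keep the single `uint32_t` live and only unpack at the point of use. Whether this actually lowers the register count depends on what the compiler does with the temporaries, so it is worth checking the reported usage (for example with `nvcc --ptxas-options=-v`) before and after.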

Bitmask/shift works well - ta.