I am a little mystified by how the nvcc compiler allocates registers. Currently I have a kernel that uses 41 registers. I have the following bit in my kernel:
float4 first_value = tex1Dfetch(firtex_var,tid);
float4 sec_value = tex1Dfetch(sectex_var,tid);
Now, somewhere down the line I have the following:
temp -= (x * first_value.x * y) + (z * sec_value.x * w);
Now, if I instead have the following (so not using these variables in this bit of code):
temp -= (x * y) + (z * w);
The register count drops by 4. I use these variables earlier (in an if condition), so it is not as if they are otherwise unused in the code. So I am guessing the compiler is creating temporaries in registers somewhere. I tried putting these variables in shared memory, but to no avail.
Also, the fact that I am using the -= operator seems to bring the register count up significantly. I am guessing this is also because of temporary variables being created?
Does anyone know how I might be able to reduce the register count in this scenario? Would appreciate any help/suggestions.
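In case it helps anyone reproduce this, below is a minimal sketch of the pattern described above, not the actual kernel. Everything except the two fetches and the two forms of the `temp -=` line is an assumption made for illustration: the kernel name, the `in`/`out` buffers, how x, y, z and w are computed, the contents of the if condition, and the `USE_TEX_IN_UPDATE` macro are all hypothetical. It also assumes texture objects (`cudaTextureObject_t`) rather than the texture references used above, so that it builds with recent toolkits.

```cuda
// Build both variants and compare the register counts reported by ptxas:
//   nvcc -Xptxas -v sketch.cu
//   nvcc -Xptxas -v -DUSE_TEX_IN_UPDATE sketch.cu
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: the parameters and the if condition are placeholders.
__global__ void sketch_kernel(cudaTextureObject_t first_tex,
                              cudaTextureObject_t sec_tex,
                              const float* in, float* out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    float4 first_value = tex1Dfetch<float4>(first_tex, tid);
    float4 sec_value   = tex1Dfetch<float4>(sec_tex, tid);

    // Placeholder operands standing in for x, y, z, w from the question.
    float x = in[tid], y = x + 1.0f, z = x + 2.0f, w = x + 3.0f;
    float temp = 0.0f;

    // Earlier use of the fetched values (stands in for the "if condition").
    if (first_value.y > sec_value.y) temp = first_value.z;

#ifdef USE_TEX_IN_UPDATE
    temp -= (x * first_value.x * y) + (z * sec_value.x * w);
#else
    temp -= (x * y) + (z * w);
#endif
    out[tid] = temp;
}

// Wrap a linear device buffer of float4 in a texture object.
static cudaTextureObject_t make_tex(const float4* d_buf, int n)
{
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeLinear;
    res.res.linear.devPtr = const_cast<float4*>(d_buf);
    res.res.linear.desc = cudaCreateChannelDesc<float4>();
    res.res.linear.sizeInBytes = n * sizeof(float4);

    cudaTextureDesc tex = {};
    tex.readMode = cudaReadModeElementType;

    cudaTextureObject_t obj = 0;
    cudaCreateTextureObject(&obj, &res, &tex, nullptr);
    return obj;
}

int main()
{
    const int n = 256;
    float4 *d_a = nullptr, *d_b = nullptr;
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_a, n * sizeof(float4));
    cudaMalloc(&d_b, n * sizeof(float4));
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemset(d_a, 0, n * sizeof(float4));
    cudaMemset(d_b, 0, n * sizeof(float4));
    cudaMemset(d_in, 0, n * sizeof(float));

    cudaTextureObject_t ta = make_tex(d_a, n);
    cudaTextureObject_t tb = make_tex(d_b, n);

    sketch_kernel<<<(n + 127) / 128, 128>>>(ta, tb, d_in, d_out, n);
    cudaError_t err = cudaDeviceSynchronize();
    printf("kernel status: %s\n", cudaGetErrorString(err));

    cudaDestroyTextureObject(ta);
    cudaDestroyTextureObject(tb);
    cudaFree(d_a); cudaFree(d_b); cudaFree(d_in); cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

With `-Xptxas -v`, ptxas prints a "Used N registers" line per kernel, so building the file with and without `-DUSE_TEX_IN_UPDATE` gives a direct comparison of the two forms of the update. The counts for this sketch will not match the 41 registers of the real kernel, and may not even differ by the same amount, since they depend on the rest of the code, the target architecture and the toolkit version.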