Is there a limit on the number of variables within a kernel?

Recently I've had some problems with kernels that call lots of device functions.

I was losing the values of some variables until I lowered the block size from 512 to 128. Now it seems to work well, but I cannot confirm that 100% yet.

Is there a limit on how many variables you can use in a kernel?

Ehh, the number of variables has no limit as such, but the number of registers your kernel uses determines the maximum block size you can run.
Do you check the number of registers used by your kernels?
Do you check for errors? Losing the values of some variables sounds like your kernel did not run at all because of a “too many resources requested for launch” error.
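A minimal sketch of the error check, in case it helps. The kernel name myKernel and the launch sizes are made up for illustration; substitute your own. The point is only to call cudaGetLastError() right after the launch and to check the result of the next synchronization.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the real one.
__global__ void myKernel(float *out)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    out[idx] = 2.0f * idx;
}

int main()
{
    const int blocks = 4, threads = 512;  // illustrative launch configuration
    float *d_out = NULL;

    cudaError_t err = cudaMalloc((void **)&d_out, blocks * threads * sizeof(float));
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    myKernel<<<blocks, threads>>>(d_out);

    // Launch-time failures (for example "too many resources requested
    // for launch") are reported here; the kernel never ran in that case.
    err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("kernel launch failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    // Failures that happen while the kernel executes are reported
    // by the next synchronizing call.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("kernel execution failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    printf("kernel ran without errors\n");
    cudaFree(d_out);
    return 0;
}
```

If the launch fails, the output buffer keeps whatever it held before, which can look like variables "losing" their values.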

If you are using Visual Studio with the CUDA Build Rules, you can set “PtxAsOptionV” to Yes. This makes the build print each kernel's resource usage in the output window.
If you use nvcc on the command line, you can add --ptxas-options=-v for the same effect.
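The same numbers can also be read at run time with cudaFuncGetAttributes(). A rough sketch follows; myKernel is again a hypothetical placeholder, and the registers-per-thread budget printed at the end is a simple division that ignores the hardware's allocation granularity, so treat it as an estimate only.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the real one.
__global__ void myKernel(float *out)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    out[idx] = 2.0f * idx;
}

int main()
{
    // Per-kernel resource usage as seen by the runtime.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, myKernel);
    if (err != cudaSuccess) {
        printf("cudaFuncGetAttributes failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("registers per thread:     %d\n", attr.numRegs);
    printf("local memory per thread:  %lu bytes\n", (unsigned long)attr.localSizeBytes);
    printf("max threads per block:    %d\n", attr.maxThreadsPerBlock);

    // Device-wide register file available to one block (device 0 assumed).
    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("registers per block:      %d\n", prop.regsPerBlock);

    // Approximate register budget per thread for a few block sizes.
    const int sizes[] = {128, 256, 512};
    for (int i = 0; i < 3; ++i) {
        printf("block size %3d -> about %d registers per thread\n",
               sizes[i], prop.regsPerBlock / sizes[i]);
    }
    return 0;
}
```

On a device that reports 8192 registers per block, for example, the last loop gives about 16 registers per thread at a block size of 512 and about 64 at 128.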

I'm on Linux. I have never checked the register usage, and I really need to.

So I should use --ptxas-options=-v?


Thanks, I was hitting a limit of 16 registers per thread.

I have a short question: is it guaranteed that the compiler will put small arrays like float[3] into registers, or is there a chance they end up in local memory?

It depends on how you access it. If you always access the elements with constant indices (float[0], float[1], and float[2]), then yes, you are basically guaranteed that the array will be put in registers.

If you access float[i], where i is a variable, you are pretty much guaranteed that the array will be put into local memory.
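To make the two cases concrete, here is a small sketch with two made-up kernels, constantIndex and variableIndex, that differ only in how they index a float[3] array. The host code just prints the local memory each one uses as reported by cudaFuncGetAttributes(); the actual numbers depend on the compiler version and target architecture, so check them on your own setup (or compare the --ptxas-options=-v output).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Every index into v is a compile-time constant, so the compiler
// is free to keep v[0], v[1], v[2] in three registers.
__global__ void constantIndex(const float *in, float *out)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    float v[3];
    v[0] = in[3 * idx + 0];
    v[1] = in[3 * idx + 1];
    v[2] = in[3 * idx + 2];
    out[idx] = v[0] * v[0] + v[1] * v[1] + v[2] * v[2];
}

// The index k is only known at run time. Registers cannot be
// addressed dynamically, so v is a candidate for local memory.
__global__ void variableIndex(const float *in, const int *pick, float *out)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    float v[3];
    v[0] = in[3 * idx + 0];
    v[1] = in[3 * idx + 1];
    v[2] = in[3 * idx + 2];
    int k = pick[idx];  // assumed to hold a value in 0..2
    out[idx] = v[k];
}

int main()
{
    cudaFuncAttributes a, b;
    if (cudaFuncGetAttributes(&a, constantIndex) != cudaSuccess ||
        cudaFuncGetAttributes(&b, variableIndex) != cudaSuccess) {
        printf("cudaFuncGetAttributes failed\n");
        return 1;
    }
    printf("constantIndex: %d registers, %lu bytes local\n",
           a.numRegs, (unsigned long)a.localSizeBytes);
    printf("variableIndex: %d registers, %lu bytes local\n",
           b.numRegs, (unsigned long)b.localSizeBytes);
    return 0;
}
```

A loop over the array with a fixed trip count is usually fine as well, provided the compiler unrolls it so that each index becomes a constant.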