Are register declarations respected by CUDA?

The topic title says it all: if a variable is declared as a register variable, can I be sure that CUDA really puts it into a register and not into (any of the various layers of) memory? So far, all I can see is that, syntactically, “register float x” doesn’t choke nvcc :)

A related question: if a kernel contains a simple declaration like ‘float x’, which layer of memory will this variable x end up in?

The manual says that automatic variables end up in registers unless they are too “big”. How big “big” is seems to be up to the compiler.
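To make that a bit more concrete, here is a rough sketch of the two cases. The kernel name `scale`, the array `scratch` and the parameter `k` are made up for illustration, and where each variable actually lands is the compiler's decision, so treat the comments as expectations rather than guarantees.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only.
__global__ void scale(const float *in, float *out, int n, int k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    // Small scalar automatic variable: a natural candidate for a register.
    float x = in[i];

    // Larger automatic array read with an index only known at run time:
    // the compiler may decide to keep this in local memory instead.
    float scratch[64];
    for (int j = 0; j < 64; ++j)
        scratch[j] = x * j;

    out[i] = scratch[k & 63];
}
```

Rather than guessing, you can compile with `nvcc -Xptxas -v` and read the ptxas report, which lists the registers used per kernel along with any local memory or spill usage.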

So if you define a struct, it is probably best to play by the rules and always align it nicely. See vector_types.h for how this is done for float4, for example. I would guess that those definitions are optimal.
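As a minimal sketch of what "aligning it nicely" could look like: the struct `Particle` and the kernel `shift` below are hypothetical names, and the 16-byte alignment is chosen to match a four-float layout in the spirit of float4, not copied from the header.

```cuda
#include <cuda_runtime.h>

// Hypothetical four-float struct, aligned to 16 bytes like a float4-sized type.
struct __align__(16) Particle {
    float x, y, z, w;
};

// Compile-time checks that the layout is what we expect.
static_assert(sizeof(Particle) == 16, "Particle should be 16 bytes");
static_assert(alignof(Particle) == 16, "Particle should be 16-byte aligned");

// Hypothetical kernel: each thread copies one aligned element, updates it,
// and writes it back.
__global__ void shift(Particle *p, int n, float dx)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        Particle q = p[i];   // local copy of one 16-byte element
        q.x += dx;
        p[i] = q;
    }
}
```

With the alignment declared, the compiler can treat each element as a single 16-byte unit when loading and storing it. Whether it actually does so for your struct is something to verify in the generated code.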


Thank you very much, Peter.