If I have an int that is stored on the host and pass it to a kernel like this:
__global__ static void kernel(int x)
{
    int z = x;
}
How does this work, given that x lives on the host? Does CUDA copy or cache it somewhere on the device? If so, is it placed in global memory (meaning really slow access), or, since it has such a small footprint, is it kept in a register or in local memory?
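To make the question concrete, here is a minimal sketch of the situation I mean. The extra `out` parameter, the helper name `read_back_kernel_arg`, and the `<<<1, 1>>>` launch configuration are placeholders I made up for illustration; they are not part of my real code. It only shows that the kernel does see the host value. It does not show where that value is kept on the device, which is what I am asking.

```cuda
#include <cuda_runtime.h>

// Same kernel as in the question, plus a hypothetical output pointer so
// the host can check which value the kernel actually saw.
__global__ static void kernel(int x, int *out)
{
    int z = x;   // x was passed by value from the host
    *out = z;    // store it in device memory so it can be copied back
}

// Hypothetical helper: launches the kernel with a plain host int and
// returns the value the device observed (or -1 if a CUDA call failed,
// so -1 is not a useful test input for this sketch).
int read_back_kernel_arg(int host_x)
{
    int *d_out = nullptr;
    int h_out = -1;

    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) {
        return -1;
    }

    // host_x is an ordinary host variable; no explicit cudaMemcpy is
    // issued for it before the launch.
    kernel<<<1, 1>>>(host_x, d_out);

    // This cudaMemcpy on the default stream waits for the kernel to finish.
    if (cudaMemcpy(&h_out, d_out, sizeof(int),
                   cudaMemcpyDeviceToHost) != cudaSuccess) {
        h_out = -1;
    }

    cudaFree(d_out);
    return h_out;
}
```

Calling `read_back_kernel_arg(42)` from `main` and printing the result is how I would try it out, assuming a CUDA-capable device is present and the file is compiled with `nvcc`.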