Some Questions About Passing Arguments to a Kernel

__global__ void foo(int &a)
{
    a = 10;
}
Does “a” have multiple copies? Is there only one copy shared by all threads, or one copy per thread?
If I modify “a” in the kernel, what will happen? For example:

int x = 1;          // a variable in CPU memory
foo<<<10,10>>>(x);  // is x now 1 or 10?

Does this mean the compiler will generate code that automatically copies “x” to GPU memory and copies it back after the kernel finishes?

  1. Another similar situation:
    struct Bar
    {
        int temp;
        Bar(int i) { temp = i; }
        __device__ void operator()(int *array)
        {
            array[threadIdx.x] = temp;  // we use “temp” in a device function
        }
    };
    This code seems to work well. Does this mean “temp” is automatically copied to GPU memory?

Does each CUDA thread have its own local stack for parameter passing?

Thanks!

  1. That won’t compile. __global__ functions can’t take arguments by reference.
  2. Nothing is copied to the GPU automatically unless you pass it by value to a __global__ function. Function arguments are limited to 256 bytes per __global__ function.
  3. There is no call stack on the device at all. Everything is inlined and arguments are passed directly via shared memory.
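
To make points 1 and 2 concrete, here is a minimal sketch that is not from the original posts. It assumes a working CUDA device and a reasonably recent toolkit, and it omits error checking for brevity. The names set_by_value, set_through_pointer, apply, and the three host helpers are made up for this example. It shows that an argument passed by value never changes the host variable, that getting a result back takes an explicit cudaMemcpy, and that a small functor like Bar can be passed by value so that “temp” travels with the kernel arguments.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Passed by value: each thread works on its own private copy of a,
// so this assignment never reaches the host variable.
__global__ void set_by_value(int a)
{
    a = 10;
}

// Passed as a pointer to device memory that the host allocated.
__global__ void set_through_pointer(int *a)
{
    *a = 10;  // every thread writes the same value
}

// Functor in the spirit of Bar from the question (hypothetical variant).
struct Bar
{
    int temp;
    __host__ __device__ Bar(int i) : temp(i) {}
    __device__ void operator()(int *array) const
    {
        array[threadIdx.x] = temp;
    }
};

// The functor is a by-value kernel argument, so temp is copied with it.
__global__ void apply(Bar bar, int *array)
{
    bar(array);
}

// Returns the host variable after a by-value launch.
int host_value_after_by_value_launch(int x)
{
    set_by_value<<<10, 10>>>(x);
    cudaDeviceSynchronize();
    return x;  // the host copy was never touched
}

// Returns the value after an explicit copy to the device and back.
int value_after_explicit_copy(int x)
{
    int *d_x = nullptr;
    cudaMalloc(&d_x, sizeof(int));
    cudaMemcpy(d_x, &x, sizeof(int), cudaMemcpyHostToDevice);
    set_through_pointer<<<10, 10>>>(d_x);
    cudaMemcpy(&x, d_x, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    return x;
}

// Returns true if every element written by the functor equals value.
bool functor_fills_array(int value)
{
    const int n = 10;
    int h_array[n] = {0};
    int *d_array = nullptr;
    cudaMalloc(&d_array, n * sizeof(int));
    apply<<<1, n>>>(Bar(value), d_array);
    cudaMemcpy(h_array, d_array, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_array);
    for (int i = 0; i < n; ++i)
        if (h_array[i] != value) return false;
    return true;
}

int main()
{
    printf("by value:      x = %d\n", host_value_after_by_value_launch(1));
    printf("explicit copy: x = %d\n", value_after_explicit_copy(1));
    printf("functor ok:    %d\n", functor_fills_array(7));
    return 0;
}
```

Built with nvcc, the first line should still report x = 1, because the kernel only ever saw a copy. The other two results depend on the launches succeeding, so real code should check the return values of the CUDA calls.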

Thanks for your reply!