globally declared device variables fail silently?

Can someone explain to me why the following doesn’t work?

__device__ unsigned char *foo;

__host__ void whatever() {
    CUDA_SAFE_CALL(cudaMalloc((void **)&foo, 192));
}

__global__ void run_my_global() {
    unsigned char *otherfoo = foo;
}
If I use that pattern, foo has value 0x0 (in the debugger, in emulation mode) both after the cudaMalloc call and in run_my_global. I saw nothing in the documentation that says this shouldn’t work, nor does the compiler give me a warning.

If I declare foo to be local to the host function and then pass it as an argument to the global function, it seems happier.

Are globally declared device variables simply not allowed? Only usable in the device context?


foo is a variable that resides on the device. As such, it cannot be modified directly by host code, which is what your code attempts to do with cudaMalloc. cudaMalloc requires the address of a host variable, which it fills with a device address if the allocation is successful. Afterwards, when you pass that address to a __global__ function, only the address is passed (parameters are passed by value), which is OK. That’s why declaring foo as local to the host function works as you expect.
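
To make that concrete, here is a minimal sketch of two patterns that should work, assuming a CUDA toolkit that provides cudaMemcpyToSymbol. The first passes the allocated address as a kernel argument, as described above. The second keeps the global __device__ pointer but sets it from the host by copying the pointer value into the symbol. The names d_buf, fill_from_argument and fill_from_global are made up for illustration and are not from the original post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Pointer stored in device global memory; host code must not write it directly.
__device__ unsigned char *foo;

// Pattern 1 (hypothetical kernel): the buffer address arrives as a by-value argument.
__global__ void fill_from_argument(unsigned char *buf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = 1;
}

// Pattern 2 (hypothetical kernel): the buffer address is read from the __device__ variable.
__global__ void fill_from_global(int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) foo[i] += 1;
}

int main() {
    const int n = 192;
    unsigned char *d_buf = NULL;  // host variable that will hold a device address

    // cudaMalloc writes the device address into the host variable d_buf.
    if (cudaMalloc((void **)&d_buf, n) != cudaSuccess) return 1;

    // Pattern 1: hand the address to the kernel as an argument.
    fill_from_argument<<<1, n>>>(d_buf, n);

    // Pattern 2: copy the pointer value itself (not the buffer) into foo.
    if (cudaMemcpyToSymbol(foo, &d_buf, sizeof(d_buf)) != cudaSuccess) return 1;
    fill_from_global<<<1, n>>>(n);

    // Copy the buffer back to the host to inspect what the kernels wrote.
    unsigned char h_buf[n];
    if (cudaMemcpy(h_buf, d_buf, n, cudaMemcpyDeviceToHost) != cudaSuccess) return 1;
    printf("h_buf[0] = %d\n", h_buf[0]);

    cudaFree(d_buf);
    return 0;
}
```

In both cases cudaMalloc only ever writes to a host variable. The __device__ pointer is updated through the runtime API rather than by taking its address in host code.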

The Programming Guide does say that referencing device variables in host code (and vice versa) leads to undefined behavior.

Let me know if this still doesn’t make sense.


Yeah, I guess that makes sense. The problem, I suppose, is that the programming model is so bizarre.

That should be a compiler warning, methinks, not a silent runtime failure. :/