Usage of "extern" in C with CUDA

I have been trying to start a new code base split across multiple files for modularity. My plan was to make the variables used by most functions global, define them in one file (file_m, let's say), and then make them available to the other files with the extern keyword. This works as you would expect for host-allocated memory, but not for device memory. To me a pointer is a pointer, but apparently not when it points to device memory. Here is what I have discovered with my code.
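For reference, the host-side pattern described above can be sketched in plain C. This is only an illustration: the three "files" are collapsed into one translation unit so the sketch compiles on its own, and the names ptr_host, alloc_host, free_host and sum_host are hypothetical, not taken from my actual code.

```c
#include <stdlib.h>

/* ---- file_m.h (hypothetical header): declarations only, no storage ---- */
extern float *ptr_host;           /* defined exactly once, in file_m.c */
int   alloc_host(size_t n);       /* hypothetical helper */
void  free_host(void);            /* hypothetical helper */
float sum_host(size_t n);         /* hypothetical helper */

/* ---- file_m.c: the single definition plus the allocation helpers ---- */
float *ptr_host = NULL;

int alloc_host(size_t n)
{
    ptr_host = malloc(n * sizeof *ptr_host);
    return ptr_host != NULL;      /* 1 on success, 0 on failure */
}

void free_host(void)
{
    free(ptr_host);
    ptr_host = NULL;
}

/* ---- file_f.c: sees ptr_host only through the extern declaration ---- */
float sum_host(size_t n)
{
    float s = 0.0f;
    size_t i;
    for (i = 0; i < n; ++i)
        s += ptr_host[i];
    return s;
}
```

In a real project each marked section would live in its own file, and every .c file would include the header rather than repeat the extern line.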

Let's say we have these files:

main: contains the main function
file_m: contains global variables to host and device arrays
file_f: has a kernel that will execute on a device array

Let's say we have a device pointer ptr_dev and a kernel k.

:) These scenarios work for me:
ptr_dev is declared globally in main
ptr_dev is cudaMalloced in main
k is in main and is called in main with ptr_dev passed by value

ptr_dev is declared globally in main
ptr_dev is cudaMalloced in main
k is in file_f and is called in main with ptr_dev passed by value

:( These scenarios don’t work, and I don’t see why. Please explain:
ptr_dev is declared globally in file_m and is “theoretically” made available in main by extern
ptr_dev is cudaMalloced in main
k is in main and is called in main with ptr_dev passed by value

ptr_dev is declared globally in file_m and is “theoretically” made available in main by extern
ptr_dev is cudaMalloced in file_m
k is in main and is called in main with ptr_dev passed by value

ptr_dev is declared globally in main
ptr_dev is cudaMalloced in file_m
k is in main and is called in main with ptr_dev passed by value

ptr_dev is declared globally in file_f
ptr_dev is cudaMalloced in file_f
k is in main and is called in file_f with ptr_dev passed by value

??? So it seems the pointers need to be declared in the main file, cudaMalloced there, and passed by value. I couldn’t get it to work any other way, except by including the other .cu files in the main file, which is the same as just having one big file.

Ideas? What am I not understanding?

Your problems will only get worse when you try to share a texture reference between two .cu files like this. Bind a texture in one .cu file, call a kernel in another .cu file, and you get an “unbound texture” error. nvcc seems to understand the extern keyword, but something in the context management breaks the whole concept.

I gave in and used the #include method. It at least gives you multi-file organization, even if all of the .cu code is recompiled every single time. Maybe we can cross our fingers and hope for a true multi-file nvcc in CUDA 1.1.

Could you post your exact statements? I don’t quite understand what you mean by “passed by reference”. There’s no reference in C, and passing a pointer to pointer surely won’t work?

To Mr.Anderson42:
I don’t think multi-file texture sharing will be possible in the near future. After all, the device part is already in cubin format before linking. nvcc would have no clue how to set the extern texture, since doing that requires a module handle in another obj file that may not yet exist.

You are right, I should have said “by value”, meaning the value of the pointer.

For the record, it is possible to take the address of a pointer and pass that to a function. It looks like that’s what cudaMalloc does, but I could be wrong.

i.e.

void function(void **ptr_);

void *ptr;

function(&ptr);
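To spell out the snippet above, here is a small plain-C sketch of the same idea. my_alloc is a hypothetical stand-in that only imitates the shape of cudaMalloc(void **devPtr, size_t size) using host malloc; it is not how CUDA implements it. no_effect is another hypothetical function, included to show why a plain pointer argument cannot hand an allocation back to the caller.

```c
#include <stdlib.h>

/* Hypothetical stand-in with a cudaMalloc-like signature: the new
   address is written back through the pointer-to-pointer argument. */
int my_alloc(void **ptr, size_t size)
{
    if (ptr == NULL)
        return -1;
    *ptr = malloc(size);            /* overwrites the caller's pointer */
    return (*ptr != NULL) ? 0 : -1; /* 0 on success, -1 on failure */
}

/* Hypothetical counter-example: the function receives a copy of the
   pointer, so assigning to the parameter leaves the caller's
   variable unchanged. */
void no_effect(void *ptr, size_t size)
{
    ptr = malloc(size);             /* changes only the local copy */
    free(ptr);                      /* release it to avoid a leak */
}
```

A caller would write something like float *p = NULL; my_alloc((void **)&p, n * sizeof *p); which is the same calling shape normally used with cudaMalloc. The pointer itself is still passed to a kernel by value afterwards.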