What address space are device functions/kernels aware of?

Are device functions/kernels aware of the address space (that is, the variable names) that cudaMallocManaged() uses, or do I still have to pass these global memory pointers to kernels?

Similarly, are device functions aware of `extern __shared__` pointers that I declare in a kernel - that is, can I call device functions from kernels without passing them a pointer to shared memory?


You have to pass the global memory pointers allocated using any dynamic allocation scheme (cudaMalloc, cudaMallocManaged, cudaMallocPitch, etc.) to the kernels.
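A minimal sketch of this (the kernel name, sizes, and launch configuration are illustrative, not from the original posts): even with managed memory, the pointer reaches the kernel as an ordinary parameter.

```cuda
#include <cstdio>

// The managed pointer arrives as a plain kernel parameter.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1024;
    float *data;                                    // ordinary host-scope pointer
    cudaMallocManaged(&data, n * sizeof(float));    // managed allocation
    for (int i = 0; i < n; ++i) data[i] = 1.0f;

    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f); // pointer passed explicitly
    cudaDeviceSynchronize();

    printf("data[0] = %f\n", data[0]);              // host can read managed memory
    cudaFree(data);
    return 0;
}
```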

The only variables you don’t have to pass pointers to are those that are declared statically, at global/file scope, using `__device__`, `__constant__`, or `__managed__`.

Shared memory pointers have to be passed to the device functions that will use them, or else the shared memory must be declared in the function itself.
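A sketch of the first option (the helper function is hypothetical): the device function has no automatic view of the kernel's shared memory, so the kernel hands it the pointer.

```cuda
// Device function receives the shared-memory pointer as an argument.
__device__ float sum_buffer(const float *shbuf, int n) {
    float s = 0.0f;
    for (int i = 0; i < n; ++i) s += shbuf[i];
    return s;
}

__global__ void kernel(float *out) {
    extern __shared__ float buf[];          // dynamic shared memory
    buf[threadIdx.x] = (float)threadIdx.x;
    __syncthreads();
    if (threadIdx.x == 0)
        *out = sum_buffer(buf, blockDim.x); // pointer passed explicitly
}
```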

These questions are all just requirements of C/C++ programming. You can’t use a variable in a function unless it is in scope there.

Forgive what’s probably a stupid question: I can declare pointers to dynamically allocated memory as extern in a header file and then use them in functions across all files that include that header, right? How/why is CUDA-dynamically-allocated memory different?

Yes, and you can do this with CUDA dynamically allocated memory pointers as well, for use in host code.

The problem here arises from the fact that a global-scope pointer variable with no explicit tagging (e.g. `__device__`) is not accessible in CUDA device code. The pointer itself lives somewhere in host memory, and host memory is not ordinarily accessible from CUDA device code.

If you want to make it accessible in device code, you must pass it explicitly to device code via a kernel parameter, or else declare it as a `__device__` variable (which will cause the pointer itself to live in device memory, have a device-accessible address, and no longer be directly accessible from host code).
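The second option can be sketched as follows (variable names are illustrative): the pointer value is stored in a file-scope `__device__` symbol, which the host can only set via cudaMemcpyToSymbol.

```cuda
#include <cstdio>

// File-scope __device__ pointer: kernels can use it without a parameter,
// but the host cannot assign to it directly.
__device__ float *g_buf;

__global__ void fill(int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) g_buf[i] = (float)i;
}

int main() {
    const int n = 256;
    float *d_ptr;
    cudaMalloc(&d_ptr, n * sizeof(float));

    // Copy the pointer *value* into the device-side symbol.
    cudaMemcpyToSymbol(g_buf, &d_ptr, sizeof(d_ptr));

    fill<<<1, n>>>(n);
    cudaDeviceSynchronize();

    float h0;
    cudaMemcpy(&h0, d_ptr, sizeof(float), cudaMemcpyDeviceToHost);
    printf("%f\n", h0);
    cudaFree(d_ptr);
    return 0;
}
```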