Consistency of function pointers

Each compilation unit in CUDA results in a CUmodule (the Driver API type). The CUDA Runtime hides CUmodules from the developer. Functions (__device__ or __global__) in separate CUmodules cannot call each other through function pointers. Separate compilation (relocatable device code) can be used to statically link two compilation units, after which device functions in one unit can be called from the other.
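For contrast, here is a minimal sketch of the case that does work: the kernel and the device functions it calls through a pointer are all defined in the same compilation unit. The names BinaryOp, add, mul, d_add, d_mul, apply and CHECK are hypothetical and only there for illustration. The pointer values are read back from __device__ variables because the host cannot take the address of a device function directly.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (line %d)\n",             \
                    cudaGetErrorString(err_), __LINE__);              \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical function pointer type for the illustration.
typedef float (*BinaryOp)(float, float);

__device__ float add(float a, float b) { return a + b; }
__device__ float mul(float a, float b) { return a * b; }

// Device-side variables that hold the device addresses of the functions.
// Taking &add in host code would not give a usable device address.
__device__ BinaryOp d_add = add;
__device__ BinaryOp d_mul = mul;

// The kernel and the functions it calls indirectly are defined in the
// same compilation unit.
__global__ void apply(BinaryOp op, float a, float b, float *out)
{
    *out = op(a, b);
}

int main()
{
    // Copy the device function addresses to the host so they can be
    // passed back to the kernel as ordinary arguments.
    BinaryOp h_add = nullptr, h_mul = nullptr;
    CHECK(cudaMemcpyFromSymbol(&h_add, d_add, sizeof(BinaryOp)));
    CHECK(cudaMemcpyFromSymbol(&h_mul, d_mul, sizeof(BinaryOp)));

    float *d_out = nullptr;
    CHECK(cudaMalloc(&d_out, sizeof(float)));

    float result = 0.0f;

    apply<<<1, 1>>>(h_add, 3.0f, 4.0f, d_out);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost));
    printf("add: %f\n", result);   // prints "add: 7.000000"

    apply<<<1, 1>>>(h_mul, 3.0f, 4.0f, d_out);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost));
    printf("mul: %f\n", result);   // prints "mul: 12.000000"

    CHECK(cudaFree(d_out));
    return 0;
}
```

If the device functions lived in a second .cu file instead, my understanding is that you would build both files with relocatable device code, for example nvcc -rdc=true a.cu b.cu -o app, so that the device linker combines them. Passing a pointer obtained from one separately loaded module to a kernel in another is the case the rest of this answer warns against.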

There are multiple reasons why cross CUmodule functions cannot be reliably called using function pointers:

  1. Each CUmodule has its own separate constant area. When a kernel is launched, it is configured to reference its own CUmodule's constants. If you call a device function in a different CUmodule through a function pointer, that device function will read the wrong constants.

  2. Each global function is launched with a specific launch configuration. If the device function references a higher register than the launch configuration allocated, the SM will throw an exception.

  3. Each global function has a shared memory map. If you call a device function in a different module, the compiler will not have allocated shared memory for it correctly. Since shared memory is statically mapped, this will corrupt shared memory.

There are several other reasons, but I think this list gives you enough reasons not to try this.

I am not aware of any documentation that lists this restriction.