I have a PTX file that’s successfully loaded by cuModuleLoad(). If I run cuModuleGetFunction() on that module with an existing kernel name, I get CUDA_ERROR_NOT_FOUND. However, if I run cuModuleGetGlobal() on the very same loaded module, it works like a charm. How’s that possible? Both symbols (the kernel and the global memory variable) can be found in the PTX file, so why does cuModuleGetFunction() fail? FYI, I use the “pure” kernel name as a string literal during lookup, not the mangled version found in the PTX. Using CUDA 3.1 on 64-bit Linux.
In that case, they indeed get mangled names. Is it documented anywhere that kernels are compiled as C++? If not, it should probably be added to the driver API section on kernel execution. Anyhow, I declared my kernels as extern "C" and now the function lookup works as expected.
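For anyone who hits the same error, here is a minimal sketch of the fix. It is not my actual code: the file name example.cu, the kernel name scale, and the CHECK macro are made up for illustration, and the context setup assumes a recent toolkit (the primary-context calls are not available in CUDA 3.1, where cuCtxCreate() would be used instead).

```cuda
// example.cu (hypothetical file name)
// Build the PTX:   nvcc -ptx example.cu -o example.ptx
// Build the host:  nvcc example.cu -o example -lcuda
#include <cuda.h>
#include <cstdio>

// Without extern "C", nvcc compiles the kernel as C++ and the PTX entry
// gets a mangled name (something like _Z5scalePff), so a lookup of the
// plain name "scale" would return CUDA_ERROR_NOT_FOUND.
extern "C" __global__ void scale(float *data, float factor)
{
    data[threadIdx.x] *= factor;
}

// Hypothetical helper: print the failing call and bail out of main().
#define CHECK(call)                                                  \
    do {                                                             \
        CUresult r_ = (call);                                        \
        if (r_ != CUDA_SUCCESS) {                                    \
            std::fprintf(stderr, "%s failed with CUresult %d\n",     \
                         #call, (int)r_);                            \
            return 1;                                                \
        }                                                            \
    } while (0)

int main()
{
    CUdevice   dev;
    CUcontext  ctx;
    CUmodule   mod;
    CUfunction fn;

    CHECK(cuInit(0));
    CHECK(cuDeviceGet(&dev, 0));

    // Assumption: a toolkit new enough to have the primary-context API.
    CHECK(cuDevicePrimaryCtxRetain(&ctx, dev));
    CHECK(cuCtxSetCurrent(ctx));

    // Load the PTX produced by the first build command above.
    CHECK(cuModuleLoad(&mod, "example.ptx"));

    // The plain name resolves because the kernel has C linkage.
    CHECK(cuModuleGetFunction(&fn, mod, "scale"));
    std::printf("lookup of \"scale\" succeeded\n");

    CHECK(cuModuleUnload(mod));
    CHECK(cuDevicePrimaryCtxRelease(dev));
    return 0;
}
```

If you would rather keep C++ linkage (for templates or overloads, say), the alternative is to pass the mangled name exactly as it appears after .entry in the PTX file to cuModuleGetFunction().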