I’m trying to import a kernel function from an executable that contains both host and device code. The CUDA driver API has functions to load kernel code from a device code file (a cubin, fatbin, or PTX file), but not directly from an executable that embeds the device code. I know I can use nvcc to generate these device code files separately, but I really want to load the kernel from the executable itself.
For example, with the vectorAdd sample, I want to open the executable and extract the VecAdd() kernel function from it.
Note: I noticed that with the newest version of nvcc, the symbol for the VecAdd kernel is emitted with C++ name mangling. This causes a problem when you use the driver API to load the “VecAdd” kernel, because the symbol is actually “_Z6VecAddPfS_S_i”. So the lookup fails if you ask for the “normal” kernel name.