I’ve recently moved my CUDA code over to a newer version of our software’s framework, and the new makefile keeps everything important from the previous one. The problem is that when I use dlopen (with RTLD_LAZY) to load my plugin, the plugin cannot resolve its external reference to the CUDA function. The cu_o was linked into the main application with -rdynamic and --export-dynamic, so I’m not entirely sure why it can’t resolve the symbol.
Not a CUDA problem in particular, but possibly some of you have come across this problem?
On RHEL 64-bit I compile shared libraries using:
nvcc --compiler-options '-fPIC' -o mykernel.so -shared mykernel.cu
If you don’t already link libcuda to your application, you need to make sure to dlopen() that library before loading your shared library. (At least our loader is not smart enough to pull in libcuda for me.)
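For what it's worth, here is a minimal sketch of that load order. The helper name open_plugin_with_dependency is made up for illustration and is not part of CUDA or any framework mentioned here. It opens the dependency with RTLD_GLOBAL so that libraries loaded afterwards can see its symbols, then opens the plugin with RTLD_NOW so that an unresolved function symbol is reported at load time rather than at the first call. On older glibc you may need to link with -ldl.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>

// Hypothetical helper (illustrative only): preload a dependency globally,
// then open the plugin with eager symbol resolution.
// Returns the plugin handle, or nullptr with `error` set to the loader's message.
void* open_plugin_with_dependency(const char* dependency_path,
                                  const char* plugin_path,
                                  std::string& error) {
    assert(dependency_path != nullptr && plugin_path != nullptr);
    error.clear();

    // RTLD_GLOBAL makes the dependency's symbols available to libraries
    // that are loaded after it.
    void* dependency = dlopen(dependency_path, RTLD_NOW | RTLD_GLOBAL);
    if (dependency == nullptr) {
        const char* message = dlerror();
        error = message ? message : "unknown dlopen error";
        return nullptr;
    }

    // RTLD_NOW binds all symbols immediately, so a missing function symbol
    // fails here instead of at the first call (as it would with RTLD_LAZY).
    void* plugin = dlopen(plugin_path, RTLD_NOW);
    if (plugin == nullptr) {
        const char* message = dlerror();
        error = message ? message : "unknown dlopen error";
        dlclose(dependency);
        return nullptr;
    }

    // The dependency handle is intentionally left open for the process lifetime.
    return plugin;
}
```

You would call it with something like the libcuda file name on your system as the first argument and mykernel.so as the second. If it returns nullptr, the error string usually names the exact symbol or file the loader could not find.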
What is the output of ldd on your plugin?
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00002ba485c99000)
libm.so.6 => /lib/libm.so.6 (0x00002ba485fa4000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00002ba486226000)
libc.so.6 => /lib/libc.so.6 (0x00002ba486434000)
/lib64/ld-linux-x86-64.so.2 (0x0000555555554000)
::EDIT::
I’ve managed to get it to work by directly dlopen’ing the cu_o file, but that’s such a hack that I’d like to know how to get the linker to make it work the way I want it to.
Try to add the cuda and cudart libraries when you are linking your plugin.
Also, check the symbol names with nm.
My workaround backfired and ended up not working for me.
nm EXECFILE | grep "LoadKDTree"
produced this:
000000000041adc4 T _Z10LoadKDTreePfS_S_S_S_S_jjjjj
So the symbol exists, but the shared library must not understand the mangled form. Not entirely sure how to fix this.
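To see what that mangled name actually encodes, you can run it through c++filt or demangle it in code. Below is a small sketch using abi::__cxa_demangle from <cxxabi.h>, which is available with GCC and Clang. The demangle wrapper is my own helper, not something from the original code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Illustrative helper: returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged if the demangler rejects it.
std::string demangle(const char* symbol) {
    assert(symbol != nullptr);
    int status = 0;
    // With null buffer arguments, the result is malloc'ed and must be freed.
    char* demangled = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
    if (status != 0 || demangled == nullptr) {
        return symbol;
    }
    std::string result(demangled);
    std::free(demangled);
    return result;
}
```

For the symbol above, demangle("_Z10LoadKDTreePfS_S_S_S_S_jjjjj") gives LoadKDTree(float*, float*, float*, float*, float*, float*, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int). A name in that form means the definition was compiled with C++ linkage, so a caller that asks for plain LoadKDTree will not match it.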
Add extern "C" to your kernels.
They’re already prototyped in an extern "C" { } block.
It is working for me.
With extern "C" in my source code:
extern "C" void myfunc(…)
After compilation:
nvcc -c -O3 my.cu
The symbol is correct:
nm my.o |grep -y myfunc
00000640 T _myfunc
Ya… well… it doesn’t do that for me, heh. :(
::EDIT::
Sorry, nvcc gives me the mangled symbols and I get those symbols in my main executable, but the gcc-compiled shared library that uses the external doesn’t get the name mangling, so when it’s dlopen’ed it can’t find the right symbol.
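One possible cause of that kind of mismatch is that the .cu file defining the function never sees the extern "C" prototype, so the definition is emitted with a C++-mangled name while the plugin references the plain C name. I can't tell from the thread whether that is what happened here, so treat the following as a sketch of the usual pattern only. The function plugin_entry and its signature are invented for illustration. The idea is that the prototype lives in one header that both the .cu file and the plugin include.

```cpp
#include <cassert>

// --- Would live in a shared header included by BOTH the .cu file and the plugin ---
#ifdef __cplusplus
extern "C" {
#endif
// Hypothetical entry point; the name and signature are illustrative only.
unsigned int plugin_entry(const float* data, unsigned int count);
#ifdef __cplusplus
}
#endif

// --- Would live in the .cu file, after including the header ---
// Because the extern "C" declaration above is visible here, this definition
// inherits C linkage and is exported as plain `plugin_entry`, not a mangled name.
unsigned int plugin_entry(const float* data, unsigned int count) {
    assert(data != nullptr || count == 0);
    unsigned int positives = 0;
    // Stand-in body: count the strictly positive values.
    for (unsigned int i = 0; i < count; ++i) {
        if (data[i] > 0.0f) {
            ++positives;
        }
    }
    return positives;
}
```

After rebuilding, nm on the object file should list the symbol without the _Z prefix if the declaration was picked up.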