Background: I use FindCUDA.cmake, which “compiles” each .cu file with nvcc -cuda file*.cu -o file*_output.cpp. The resulting file*_output.cpp files are then fed into the main compilation phase along with all the “normal” C++ files in the build system.
I’m having a problem combining --device-emulation with this approach in CUDA 2.2. A minimal example:
test_cu.cu
#include <stdio.h>
extern "C" void test_cu()
{
    printf("hello\n");
}
test_main.cc
extern "C" void test_cu();
int main()
{
    test_cu();
    return 0;
}
The following compilation works:
nvcc -c test_cu.cu #produces test_cu.o
g++ -o test test_main.cc test_cu.o -L/opt/cuda/lib -lcudart
This works too (mimicking what FindCUDA.cmake does):
nvcc -cuda test_cu.cu #produces test_cu.cu.cpp
g++ -o test test_main.cc test_cu.cu.cpp -L/opt/cuda/lib -lcudart
This does not work:
nvcc --device-emulation -cuda test_cu.cu #produces test_cu.cu.c
g++ -o test test_main.cc test_cu.cu.c -L/opt/cuda/lib -lcudart
/tmp/ccQEWcrT.o: In function `main':
test.cc:(.text+0x5): undefined reference to `test_cu'
/tmp/ccH2SOpm.o: In function `__sti____cudaRegisterAll_39_tmpxft_00003c3d_00000000_4_test_cpp1_ii_test_cu()':
test.cu.c:(.text+0xa8): undefined reference to `__cudaRegisterFatBinary(void*)'
test.cu.c:(.text+0xb9): undefined reference to `atexit(void (*)())'
/tmp/ccH2SOpm.o: In function `__cudaUnregisterBinaryUtil()':
test.cu.c:(.text+0xdc): undefined reference to `__cudaUnregisterFatBinary(void**)'
I suspect the problem is that a .c file (test_cu.cu.c) is generated instead of a .cpp file. Somehow, the C code generation path is taken when --device-emulation is specified.
Feel free to tell me that using nvcc -cuda is incorrect as well (I didn’t write FindCUDA.cmake). I’m debating modifying FindCUDA.cmake to use the “nvcc -c” form and then just link the resulting object files, but that would require a lot of additional care in handling compiler flags for shared vs. static builds, etc. It is a can of worms I don’t want to open if I don’t have to.