CUDA 2.2 beta --device-emulation problem

Background: I use FindCUDA.cmake which “compiles” all .cu files with nvcc -cuda file*.cu -o file*_output.cpp. The file*_output.cpp files are then sent into the main compilation phase of all “normal” c++ files in the build system.

I’m having a problem combining --device-emulation with this approach in CUDA 2.2. Here is a minimal example:

test_cu.cu

#include <stdio.h>

extern "C" void test_cu()
	{
	printf("hello\n");
	}

test_main.cc

extern "C" void test_cu();

int main()
	{
	test_cu();
	return 0;
	}

The following compilation works:

nvcc -c test_cu.cu   #produces test_cu.o

g++ -o test test_main.cc test_cu.o -L/opt/cuda/lib -lcudart

This works too (mimicking what FindCUDA.cmake does):

nvcc -cuda test_cu.cu   #produces test_cu.cu.cpp

g++ -o test test_main.cc test_cu.cu.cpp -L/opt/cuda/lib -lcudart

This does not work:

nvcc --device-emulation -cuda test_cu.cu   #produces test_cu.cu.c

g++ -o test test_main.cc test_cu.cu.c -L/opt/cuda/lib -lcudart

/tmp/ccQEWcrT.o: In function `main':
test.cc:(.text+0x5): undefined reference to `test_cu'
/tmp/ccH2SOpm.o: In function `__sti____cudaRegisterAll_39_tmpxft_00003c3d_00000000_4_test_cpp1_ii_test_cu()':
test.cu.c:(.text+0xa8): undefined reference to `__cudaRegisterFatBinary(void*)'
test.cu.c:(.text+0xb9): undefined reference to `atexit(void (*)())'
/tmp/ccH2SOpm.o: In function `__cudaUnregisterBinaryUtil()':
test.cu.c:(.text+0xdc): undefined reference to `__cudaUnregisterFatBinary(void**)'

I’m thinking that the generation of test_cu.cu.“c” is the problem. Somehow, the C code generation path is being run when --device-emulation is specified.

Feel free to tell me that the use of nvcc -cuda is incorrect as well (I didn’t write FindCUDA.cmake). I’m debating modifying FindCUDA.cmake to use the “nvcc -c” form and then just link the resulting object files, but that will require a lot of additional care in handling compiler flags for shared vs. static builds, among other things. It is a can of worms I don’t want to open if I don’t have to.

Thanks for the report, will check on this today.

Official answer: -deviceemu and -cuda are incompatible options and never should have worked in the first place.

Thanks for the quick answer. Time to put on my CMake hacking hat, I guess.

Update: apparently it’s not really unsupported, but you should compile your .cu.c with a C compiler and not g++. It should work then…

(sorry if I made you go rewrite all your CMake scripts already…)

No problem. It turned out to be easier than I thought to make the modification: 1) CMake wraps all the platform-dependent shared library stuff up into some convenient variables, and 2) all it really took was a small change to flag that the generated files were object files instead of source files. Convincing it that the generated sources are C instead of C++ is of similar complexity to what I already did, except that I would then have to know a priori when the generated code is C and when it is C++…

For anyone monitoring this, you can pick up the modified FindCUDA.cmake from HOOMD’s source: http://trac2.assembla.com/hoomd/browser/trunk/src/CMake/cuda . There is only one little modification in there that is specific to HOOMD (the enabling of the shared library flags with ENABLE_STATIC=OFF), but I could probably find a way to make that more generally useful if someone were to ask nicely.