I’ve ported and tested a chunk of code from C++ to C as a precursor to running it on CUDA, but I’m now having problems getting it to compile.
An example would best illustrate the problem:
I have my kernel defined in the file cudaKernel.cu.
This calls a function foo() which is declared and defined in foo.h and foo.c.
It compiles and runs fine in emulation mode but I get ‘External calls are not supported’ errors when I compile in ‘device’ mode.
BTW, this test function has almost nothing in it for test purposes, i.e. int x = 1 * 2.
I’m expecting the compiler to inline foo() with no problems but it’s beginning to look like I need to have all the functions which the kernel function depends on defined in the same file as the kernel function. Is that right?
Maybe I’m missing something but this seems nuts to me and is a big obstacle to being able to run code on CUDA or CPU using a single code line. Ditto the need to add __device__ prefixes. Why is this necessary? Can’t the compiler figure this out itself?
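
For reference, here is a rough sketch of one arrangement that might keep a single source for both targets. It assumes foo() can live entirely in foo.h rather than foo.c, so the kernel's translation unit sees the body and nothing is left as an external call. HOST_DEVICE is a made-up macro name, and the return value is my addition, since the real test function only does the multiplication. __CUDACC__ is the macro nvcc defines when it compiles a source file, so a plain C compiler never sees the qualifiers.

```c
/* foo.h: hypothetical shared header, included by cudaKernel.cu and by CPU code */
#ifndef FOO_H
#define FOO_H

/* HOST_DEVICE is an illustrative name, not something CUDA provides.
   Under nvcc it expands to the execution space qualifiers; under an
   ordinary C compiler it expands to nothing. */
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

/* Defined here (not just declared) so the full body is visible in every
   file that includes this header, including the one holding the kernel. */
HOST_DEVICE static inline int foo(void)
{
    int x = 1 * 2;
    return x; /* assumption: the original test function may not return x */
}

#endif /* FOO_H */
```

With this layout, cudaKernel.cu would just #include "foo.h" and call foo() from the kernel, and the CPU build would include the same header unchanged. I haven't confirmed this is the recommended approach, so treat it as a sketch.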