My CUDA code had been compiling fine until nvcc started complaining that it didn’t know how to deal with function pointers, so I added -arch=sm_21 to the nvcc command. (I’m compiling for an installed GTX 470, running CUDA 4.0 on Ubuntu 10.10.) After that, a file which had previously compiled fine started to give me
“… Error: External calls are not supported (found non-inlined call…”
I’ve triple-checked it and everything looks good, and indeed this file still compiles fine when I remove -arch=sm_21. (The same error occurs for -arch=sm_20.) I’ve looked around and found hints that the two issues are related, but not enough to resolve the problem. The code structure is as follows (file.cu is the file being compiled, and contains a device function called by a kernel in another file, not shown):
file.cu:
#include "prototypes.h"
__device__ void fileFunction()
{
...
brokenFunction(); // <- line number given in the error
...
}
prototypes.h:
__device__ void brokenFunction();
functions.cu:
#include "prototypes.h"
__device__ void brokenFunction(){
}
and the terminal instruction that fails: nvcc -O3 -arch=sm_21 -c file.cu
brokenFunction is the function given by the Error: External calls… notification.
Ok so it looks like device functions don’t like to live in separate files, even with noinline. Is there any way around this? Any way at all? My program is going to become a mess if I have to condense it into one file…
CUDA has no linker on the device side, so the functions indeed have to live within one file. The common way to achieve this is to #include the source code (instead of just the prototype), i.e. in your example, file.cu would start with #include "functions.cu".
Horrible for C-style pedants, I know, but that’s how things are at the moment.
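To make that concrete, here is a minimal sketch of what file.cu looks like to the compiler once functions.cu has been pulled in. It is only an illustration under some assumptions: the include guard, the int *out parameter, the value 42, the kernel and the runOnDevice() host helper are all made up for the example (the brokenFunction() in the question is empty and takes no arguments, and the real kernel lives in another file). In the actual project the marked section would stay in functions.cu and be replaced here by #include "functions.cu".

```cuda
#include <cuda_runtime.h>

// ---- begin: what #include "functions.cu" would paste in ----
// The include guard is an assumption; it keeps the definition from being
// pasted twice if functions.cu is included more than once in one file.
#ifndef FUNCTIONS_CU
#define FUNCTIONS_CU
// Hypothetical body and parameter, added so the result can be observed.
__device__ void brokenFunction(int *out)
{
    *out = 42;
}
#endif
// ---- end: contents of functions.cu ----

// The caller now sees the full definition, so no external call is needed.
__device__ void fileFunction(int *out)
{
    brokenFunction(out);
}

// Hypothetical kernel, placed in the same translation unit for the sketch.
__global__ void kernel(int *out)
{
    fileFunction(out);
}

// Hypothetical host helper: launches the kernel once and returns the value
// written on the device, or -1 if a CUDA call fails.
int runOnDevice()
{
    int *d_out = NULL;
    int h_out = -1;
    if (cudaMalloc((void **)&d_out, sizeof(int)) != cudaSuccess)
        return -1;
    kernel<<<1, 1>>>(d_out);
    if (cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess)
        h_out = -1;
    cudaFree(d_out);
    return h_out;
}
```

With this layout only file.cu is passed to nvcc (the same nvcc -O3 -arch=sm_21 -c file.cu as before); functions.cu is not compiled on its own for the device code to be found.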
I followed your advice and it compiles (!!) but I am now receiving Undefined Reference errors during linking. I’ve tried peppering my code with extern "C" but this hasn’t worked out so far… here is my current configuration: