Calling device function returns "calling a host function..." error

I am currently trying to port parts of an old OpenMP code to CUDA, and I’ve worked out all the bugs so far except for this one. I have two device functions. We’ll call them func1(…) and func2(…). Both return doubles. When I call them within a kernel, I get the following error during compilation:

error: calling a host function from a device/global function is only allowed in device emulation mode

It seems like the compiler is simply not detecting these two device functions. How do I fix this?

If it’s relevant, the two device functions are defined in a separate source file. I am definitely passing that file to the compiler on the command line, but that doesn’t seem to be enough.

You need to either have the functions in the same file as the kernel that calls them, or #include the file that defines them so that everything ends up in a single file as far as the compiler is concerned. CUDA has no linker on the device side, so device functions in different files don’t see each other, even if the files are compiled at the same time.
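
As a rough sketch of the single-file layout described above (not the original poster's code): the bodies of func1 and func2 below are made-up placeholders, and the header name device_funcs.cuh in the comment is hypothetical. The point is only that the device function definitions and the kernel that calls them end up in the same file once the preprocessor has run.

```cuda
// kernel.cu: device functions and the kernel live in one translation unit.
#include <cstdio>
#include <cuda_runtime.h>

// In a multi-file layout, these two definitions would sit in a header
// (hypothetical name: device_funcs.cuh) that this file pulls in with
//   #include "device_funcs.cuh"
// The bodies are placeholders, not the original poster's functions.
__device__ double func1(double x) { return 2.0 * x; }
__device__ double func2(double x) { return x + 1.0; }

// The kernel can call func1 and func2 because their bodies are visible here.
__global__ void kernel(double *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = func1(i) + func2(i);
    }
}

int main()
{
    const int n = 8;
    double h_out[n];
    double *d_out = nullptr;

    if (cudaMalloc(&d_out, n * sizeof(double)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    kernel<<<1, n>>>(d_out, n);

    // cudaMemcpy waits for the kernel to finish before copying the results.
    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(double),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < n; ++i) {
        printf("out[%d] = %g\n", i, h_out[i]);
    }
    return 0;
}
```

With this layout only kernel.cu is passed to nvcc; the file holding the device functions is not compiled on its own. Toolkits from CUDA 5.0 onward can also compile device code separately (nvcc -rdc=true), which lifts the same-file restriction, but that option postdates the compiler behavior discussed in this thread.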

OK, thanks! Do you happen to know if this is mentioned anywhere in NVIDIA’s documentation? I don’t recall ever reading such a requirement, and it seems like information they should include for n00bs like myself.

It’s deeply hidden in Appendix E.1 “NVCC Specifics: __noinline__ and __forceinline__” of the Programming Guide: “[…] The function body must still be in the same file where it is called. […]”
