How can I separate device functions and kernel functions into different files?

I would like to run a complicated model in CUDA, and I want to organize the code the usual C/C++ way, with declarations in header files and implementations in separate source files.

But when I try this, putting the kernel function and the device function in separate files and connecting them through a header file, I get this error:

Error: External calls are not supported

So, is it possible to keep the kernel function and the device function in separate files?


In CUDA releases before 5.0 there is no device code linker, so everything a kernel uses (device functions, constant definitions, texture definitions, etc.) must be in the same compilation unit as the kernel itself. This means that you can put device function code in another file if you like, but you must include that file into the file containing your kernel code at compilation time. A prototype alone isn’t sufficient: the compiler inlines device functions, so their full definitions must be visible when the kernel is compiled. (CUDA 5.0 and later add separate compilation through relocatable device code, enabled with nvcc's -rdc=true option, which lifts this restriction.)
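
As a rough illustration of the include-the-definition approach, here is a minimal sketch. The file names device_funcs.cuh and kernel.cu, the function square, the kernel square_kernel, and the host helper square_on_device are all made up for this example and do not come from the question. The header part is pasted into the same listing so the sketch compiles as a single file; in a real layout it would sit in its own file and be pulled in with #include.

```cuda
// ---- device_funcs.cuh (hypothetical file name) ----
// The full definition lives in the header, not just a prototype, so the
// compiler sees the function body in every .cu file that includes it.
#ifndef DEVICE_FUNCS_CUH
#define DEVICE_FUNCS_CUH

__device__ inline float square(float x) {
    return x * x;
}

#endif  // DEVICE_FUNCS_CUH

// ---- kernel.cu (hypothetical file name) ----
// In a real two-file layout this file would begin with:
//     #include "device_funcs.cuh"
// The header is pasted above only so that this sketch compiles as one file.
#include <cuda_runtime.h>
#include <vector>

__global__ void square_kernel(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = square(in[i]);  // definition is in this compilation unit
    }
}

// Host helper (assumed name): copies the input to the GPU, runs the kernel,
// and returns the results. Returns an empty vector if a CUDA call fails.
std::vector<float> square_on_device(const std::vector<float>& host_in) {
    const int n = static_cast<int>(host_in.size());
    std::vector<float> host_out(n);
    if (n == 0) return host_out;

    const size_t bytes = n * sizeof(float);
    float* dev_in = nullptr;
    float* dev_out = nullptr;
    if (cudaMalloc(&dev_in, bytes) != cudaSuccess) return {};
    if (cudaMalloc(&dev_out, bytes) != cudaSuccess) {
        cudaFree(dev_in);
        return {};
    }

    cudaMemcpy(dev_in, host_in.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    square_kernel<<<blocks, threads>>>(dev_in, dev_out, n);

    // This blocking copy waits for the kernel to finish before reading back.
    cudaError_t status =
        cudaMemcpy(host_out.data(), dev_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev_in);
    cudaFree(dev_out);
    if (status != cudaSuccess) return {};
    return host_out;
}
```

With the two parts split into their own files, something like `nvcc kernel.cu -o app` should be enough to build, since the device function body reaches the compiler through the include and no device-side linking is needed.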