I am working on a class that uses CUDA and I need a couple of custom kernels (i.e. __global__ void functions), but I can’t figure out where to place these functions.
All of the examples I have seen online (e.g. http://devblogs.nvidia.com/parallelforall/separate-compilation-linking-cuda-device-code/) put these kernels in the main.cpp file, but I cannot do this with my project.
How can I include them alongside my class and compile everything into a .o file that I can then link into the main executable?
EDIT: To clarify a bit, my program is laid out as follows:
CUDA class (used in the main program), compiled to a .o file with nvcc
Main program, compiled with icc and linked against that .o file and other libraries
I need the CUDA class to be able to launch kernels (i.e. __global__ void functions), but I don’t know where to place them. I cannot place them in the main program.
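To make the layout concrete, here is a rough sketch of what I am after, not something I have confirmed is the recommended approach. It assumes the kernel can sit in the same .cu file as the class implementation, so nvcc compiles both into one object file. The names cuda_class.cu, scale_kernel and CudaScaler are made up for illustration.

```cuda
// cuda_class.cu (hypothetical file name)
// Assumed build step: nvcc -c cuda_class.cu -o cuda_class.o
#include <cuda_runtime.h>
#include <cstddef>

// The kernel is defined in the same .cu file as the class that launches it,
// so it never has to appear in the main program.
__global__ void scale_kernel(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// In a real project this declaration would presumably live in a plain C++
// header (e.g. cuda_class.h) with no CUDA-specific syntax, so that the
// icc-compiled main program can include it.
class CudaScaler {
public:
    // Scales n host floats in place on the GPU; returns false on any CUDA error.
    bool scale(float* host, int n, float factor);
};

bool CudaScaler::scale(float* host, int n, float factor) {
    if (n <= 0) {
        return true;  // nothing to do; also avoids a zero-block launch
    }
    float* dev = nullptr;
    std::size_t bytes = static_cast<std::size_t>(n) * sizeof(float);
    if (cudaMalloc(&dev, bytes) != cudaSuccess) {
        return false;
    }
    bool ok = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;  // round up to cover all n elements
        scale_kernel<<<blocks, threads>>>(dev, factor, n);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    return ok;
}
```

The main program would then only see the class declaration and call CudaScaler::scale like any other member function. I assume the final link step would also need the CUDA runtime library (cudart) in addition to cuda_class.o.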