CUDA: How do I run CUDA kernel functions that aren't in the SDK directory?
I am currently modifying a program that is already implemented in C++ and runs correctly on the CPU. I plan to write a CUDA kernel function in a separate file so that a small portion of the program runs on the GPU.
Which header files should I include, and what compilation issues might I run into? I would appreciate any recommendations.
There shouldn't be anything special. Just compile the object file with nvcc (declare the C++-callable functions extern "C") and link that object file into the executable. You only need to include "cuda_runtime_api.h" if you want files compiled by the normal C++ compiler to perform cudaMalloc or other CUDA operations.
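To illustrate, a minimal sketch of that split (file and function names here are just examples, not from the original program):

```cuda
// kernel.cu -- compiled with: nvcc -c kernel.cu -o kernel.o
__global__ void scale_kernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

// C-linkage wrapper so host code compiled by the ordinary C++
// compiler can call it without seeing the <<<...>>> launch syntax.
extern "C" void launch_scale(float *d_data, float factor, int n)
{
    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    scale_kernel<<<blocks, threads>>>(d_data, factor, n);
}
```

On the host side you would just declare `extern "C" void launch_scale(float*, float, int);`, compile with something like `g++ -c main.cpp`, and link everything together with `g++ main.o kernel.o -o app -lcudart` (include `cuda_runtime_api.h` in main.cpp only if it calls cudaMalloc, cudaMemcpy, and friends itself).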
I had much the same problem in the beginning. But when I looked carefully, there is nothing extra you need just to run a kernel. The thing about the SDK examples is that they all use helper macros like CUT_SAFE_CALL, which comes from the cutil library but is not needed. I think it's mainly there for running the program in device emulation so you can do some debugging.
CUT_SAFE_CALL isn't needed unless you are using the cutil library. But CUDA_SAFE_CALL, defined in cutil.h, is VERY VERY USEFUL, since it tells you exactly which CUDA call is failing, which explains why your app does nothing (if there is a problem, that is). And it is safe to use in release builds, because it is a no-op when NDEBUG is defined. I usually just copy CUDA_SAFE_CALL and the ??_CHECK_ERROR macros from cutil.h into every new code file I write so I can use them without needing to link to cutil.
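As a rough sketch of what such a copied macro can look like (this mirrors the pattern in cutil.h rather than being an exact copy; the error message text is my own):

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime_api.h>

// Wraps a CUDA runtime call and aborts with file/line and the
// runtime's error string if it fails; compiles down to the bare
// call when NDEBUG is defined, like the cutil.h version.
#ifdef NDEBUG
#  define CUDA_SAFE_CALL(call) (call)
#else
#  define CUDA_SAFE_CALL(call)                                        \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",              \
                    __FILE__, __LINE__, cudaGetErrorString(err));     \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)
#endif

// Usage:
//   CUDA_SAFE_CALL(cudaMalloc((void **)&d_ptr, bytes));
```

The do/while(0) wrapper just makes the macro behave like a single statement after an `if` without braces.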