Is it possible to compile CUDA kernels in a .cu file that are directly callable by multiple different applications after statically linking?

I am getting errors when I try to call some of my own CUDA kernels that are used by two separate applications.

I declare my CUDA kernels in a shared header using the __global__ qualifier and define them in a similarly named “.cu” file. Both applications compile and link fine, but when I call the CUDA kernels I get the “invalid configuration argument” error.

Yes, it works if I write a non-CUDA wrapper to call the CUDA kernels but I want to call the kernels directly. Is this even possible unless they are in the same file or something silly like that?


I don’t know if I have fully understood your question, but I don’t seem to have any difficulty with something like what you describe:

# cat
#include <cstdio>
__global__ void k() {printf("hello from kernel\n");}
# cat t1.cuh
__global__ void k();
# cat
#include "t1.cuh"

int main(){
# nvcc -o test
# compute-sanitizer ./test
hello from kernel
========= ERROR SUMMARY: 0 errors

Naturally, I could build another application, similar to test, that also uses the bits from

You might want to give a short but complete example of what is not working for you, just as I have provided a short but complete example of something that does seem to work.
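
As a starting point for such a short but complete example, here is a minimal sketch. It is written as a single translation unit so it can be compiled as-is, with comments marking where the three pieces would be split into the shared header, the kernel's .cu file, and each application's .cu file. The error checks after the launch are my addition, not part of the example above; they are there because a launch failure such as “invalid configuration argument” is only reported if you ask for it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// ---- Part 1: would live in the shared header (t1.cuh in the example above) ----
__global__ void k();

// ---- Part 2: would live in the kernel's own .cu file ----
__global__ void k() { printf("hello from kernel\n"); }

// ---- Part 3: would live in each application's .cu file ----
int main() {
    // Launch one block of one thread; both values must be non-zero
    // and within the device limits.
    k<<<1, 1>>>();

    // Launch-time problems (e.g. "invalid configuration argument")
    // are reported here.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Errors raised while the kernel runs are reported here.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

When the parts are split across files, the file containing the <<<...>>> launch must itself be compiled by nvcc as CUDA source. As an assumption on my part (the build command above is cut off), something like nvcc -rdc=true -o app app.cu kernels.cu, with app.cu and kernels.cu as hypothetical file names, enables separate compilation of device code across the two files. Note that “invalid configuration argument” usually points at the launch configuration itself, for example a zero-sized grid or more threads per block than the device allows, rather than at linking, so it is worth printing the grid and block dimensions each application actually passes.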
