Hi,
I am writing a header-only wrapper library around cuFFT and other FFT libraries, and I would like to support load and store callbacks. My idea was to use NVRTC to compile the callback at run time, load the resulting CUBIN via the CUDA Driver Module API, obtain the __device__ function pointer, and pass it to the cufftXtSetCallback(...) function.
I tried to modify the cuFFT callback sample to do this, but it didn’t work: the plan execution fails with error code 6 (CUFFT_EXEC_FAILED).
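For concreteness, the flow described above can be sketched roughly as follows. This is an illustration of the approach, not the exact code from the modified sample: the callback `scale_load`, the exported pointer `scale_load_ptr`, the `CHECK` macro, and the `sm_70` target are all hypothetical choices, and `nvrtcGetCUBIN` assumes CUDA 11.1 or newer. The callback uses `float2` directly, which has the same layout as `cufftComplex`.

```cuda
#include <cuda.h>
#include <cufft.h>
#include <cufftXt.h>
#include <nvrtc.h>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical helper: abort on any non-success status code.
#define CHECK(call, ok)                                           \
    do {                                                          \
        long long rc_ = static_cast<long long>(call);             \
        if (rc_ != static_cast<long long>(ok)) {                  \
            std::fprintf(stderr, "%s -> %lld\n", #call, rc_);     \
            std::exit(1);                                         \
        }                                                         \
    } while (0)

// Hypothetical load callback plus an unmangled device-side pointer to it.
static const char* kSource = R"src(
typedef float2 (*load_fn_t)(void*, size_t, void*, void*);
__device__ float2 scale_load(void* in, size_t offset, void*, void*) {
    float2 v = static_cast<float2*>(in)[offset];
    v.x *= 0.5f; v.y *= 0.5f;
    return v;
}
extern "C" { __device__ load_fn_t scale_load_ptr = scale_load; }
)src";

int main() {
    // 1. Compile the callback to a CUBIN with NVRTC (sm_70 is an assumption).
    nvrtcProgram prog;
    CHECK(nvrtcCreateProgram(&prog, kSource, "callback.cu", 0, nullptr, nullptr),
          NVRTC_SUCCESS);
    const char* opts[] = {"--gpu-architecture=sm_70"};
    CHECK(nvrtcCompileProgram(prog, 1, opts), NVRTC_SUCCESS);
    size_t cubinSize = 0;
    CHECK(nvrtcGetCUBINSize(prog, &cubinSize), NVRTC_SUCCESS);
    std::vector<char> cubin(cubinSize);
    CHECK(nvrtcGetCUBIN(prog, cubin.data()), NVRTC_SUCCESS);
    CHECK(nvrtcDestroyProgram(&prog), NVRTC_SUCCESS);

    // 2. Load the CUBIN as a separate module in the primary context.
    CHECK(cuInit(0), CUDA_SUCCESS);
    CUdevice dev;
    CUcontext ctx;
    CHECK(cuDeviceGet(&dev, 0), CUDA_SUCCESS);
    CHECK(cuDevicePrimaryCtxRetain(&ctx, dev), CUDA_SUCCESS);
    CHECK(cuCtxSetCurrent(ctx), CUDA_SUCCESS);
    CUmodule mod;
    CHECK(cuModuleLoadData(&mod, cubin.data()), CUDA_SUCCESS);

    // 3. Read the device function pointer back to the host.
    CUdeviceptr symbol;
    size_t symbolBytes = 0;
    CHECK(cuModuleGetGlobal(&symbol, &symbolBytes, mod, "scale_load_ptr"),
          CUDA_SUCCESS);
    void* callback = nullptr;
    CHECK(cuMemcpyDtoH(&callback, symbol, sizeof(callback)), CUDA_SUCCESS);

    // 4. Attach the pointer to a plan and execute.
    const int n = 1024;
    CUdeviceptr data;
    CHECK(cuMemAlloc(&data, n * sizeof(cufftComplex)), CUDA_SUCCESS);
    CHECK(cuMemsetD8(data, 0, n * sizeof(cufftComplex)), CUDA_SUCCESS);
    cufftHandle plan;
    CHECK(cufftPlan1d(&plan, n, CUFFT_C2C, 1), CUFFT_SUCCESS);
    CHECK(cufftXtSetCallback(plan, &callback, CUFFT_CB_LD_COMPLEX, nullptr),
          CUFFT_SUCCESS);
    cufftComplex* d = reinterpret_cast<cufftComplex*>(data);
    // This is the call where the failure described above shows up.
    cufftResult rc = cufftExecC2C(plan, d, d, CUFFT_FORWARD);
    std::printf("cufftExecC2C returned %d\n", static_cast<int>(rc));

    cufftDestroy(plan);
    cuMemFree(data);
    cuModuleUnload(mod);
    cuDevicePrimaryCtxRelease(dev);
    return 0;
}
```

Like the callback sample, a sketch like this would need to be linked against the static cuFFT library, for example with something along the lines of `nvcc sketch.cu -lcufft_static -lculibos -lnvrtc -lcuda`.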
After reading the documentation once again, I think the problem is that the callback function and the cuFFT library code must reside in the very same CUDA module, and that this is the reason why static linking is required when using callbacks. Do I understand it correctly?
I am pretty sure the JIT callbacks will solve this problem. However, I’d like to ask whether there is a solution I have missed. It would be great to support callbacks even for the older versions of cuFFT.
Thanks very much!
David