cuFFT callbacks with NVRTC

Hi,

I am writing a header-only wrapper library around cuFFT and other FFT libraries, and I want to include support for load and store callbacks. My idea was to use NVRTC to compile the callback at run time, load the resulting CUBIN via the CUDA Driver Module API, obtain the __device__ function pointer, and pass it to cufftXtSetCallback(...).

I tried modifying the cuFFT callback sample this way, but it didn’t work: plan execution fails with error code 6 (CUFFT_EXEC_FAILED).

After reading the documentation again, I think the problem is that the callback function and the cuFFT library must reside in the same CUDA module, which is why static linking is required when using callbacks. Do I understand this correctly?

I am pretty sure the JIT callbacks will solve this problem. However, I’d like to ask if there is a solution which I have missed? It would be great to support callbacks even for the older versions of cuFFT.

Thanks very much!

David

Hi David,

Apologies for the late reply. Your understanding is correct; for “legacy” (i.e. non-LTO) callbacks, the callback device function needs to be device-linked with the static cuFFT binary (libcufft_static.a) so they reside in the same module.
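For readers who have not set up legacy callbacks before, here is a minimal sketch of that arrangement. The callback name `scale_load`, the file name, and the build lines in the comments are illustrative assumptions, not taken from the cuFFT sample. The key point is that the callback is compiled as relocatable device code and linked against libcufft_static.a.

```cuda
// legacy_callback.cu: sketch of a legacy (non-LTO) load callback.
// Assumed build lines (file names are hypothetical):
//   nvcc -dc -o legacy_callback.o legacy_callback.cu
//   nvcc -o legacy_callback legacy_callback.o -lcufft_static -lculibos
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>
#include <cufftXt.h>

// Hypothetical callback: scales each input element by the float in callerInfo.
__device__ cufftComplex scale_load(void *dataIn, size_t offset,
                                   void *callerInfo, void *sharedPtr) {
    float s = *static_cast<const float *>(callerInfo);
    cufftComplex v = static_cast<const cufftComplex *>(dataIn)[offset];
    v.x *= s;
    v.y *= s;
    return v;
}

// Device-side copy of the function pointer, read back by the host below.
__device__ cufftCallbackLoadC d_scale_load = scale_load;

int main() {
    const int n = 1024;
    const float h_scale = 0.5f;
    cufftComplex *d_data = nullptr;
    float *d_scale = nullptr;
    cudaMalloc(&d_data, n * sizeof(cufftComplex));
    cudaMemset(d_data, 0, n * sizeof(cufftComplex));
    cudaMalloc(&d_scale, sizeof(float));
    cudaMemcpy(d_scale, &h_scale, sizeof(float), cudaMemcpyHostToDevice);

    cufftHandle plan;
    if (cufftPlan1d(&plan, n, CUFFT_C2C, 1) != CUFFT_SUCCESS) return 1;

    // Fetch the device function pointer; it is only valid because the
    // callback and the static cuFFT kernels are device-linked together.
    cufftCallbackLoadC h_cb = nullptr;
    cudaMemcpyFromSymbol(&h_cb, d_scale_load, sizeof(h_cb));

    cufftResult r = cufftXtSetCallback(plan, reinterpret_cast<void **>(&h_cb),
                                       CUFFT_CB_LD_COMPLEX,
                                       reinterpret_cast<void **>(&d_scale));
    if (r != CUFFT_SUCCESS) { std::printf("set callback failed: %d\n", r); return 1; }

    r = cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);
    cudaDeviceSynchronize();
    std::printf("exec result: %d\n", static_cast<int>(r));

    cufftDestroy(plan);
    cudaFree(d_scale);
    cudaFree(d_data);
    return r == CUFFT_SUCCESS ? 0 : 1;
}
```

A function pointer obtained from a separately loaded module (for example one produced by NVRTC and loaded through the Driver API) does not meet this same-module requirement, which matches the execution failure described above.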

We just shipped LTO callbacks in cuFFT as part of the CUDA Toolkit 12.6 Update 2; as you say, LTO callbacks do not require separate device linking, so they are supported in the dynamic library and work with NVRTC.
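As a rough illustration of the NVRTC route, the sketch below compiles a callback to LTO IR at run time and registers it with cufftXtSetJITCallback before the plan is initialized. The callback name `scale_load`, the file name, and the build line are assumptions for illustration. Please check the exact cufftXtSetJITCallback signature and the expected callback prototype against the cufftXt.h shipped with your toolkit.

```cuda
// lto_callback_nvrtc.cu: sketch only; verify names against your cufftXt.h.
// Assumed build line: nvcc lto_callback_nvrtc.cu -lcufft -lnvrtc
#include <cstdio>
#include <string>
#include <vector>
#include <cuda_runtime.h>
#include <cufftXt.h>
#include <nvrtc.h>

// Hypothetical callback source. float2 has the same layout as cufftComplex;
// extern "C" keeps the symbol name unmangled so it can be passed by name.
static const char *kSource = R"(
extern "C" __device__ float2 scale_load(void *dataIn, unsigned long long offset,
                                        void *callerInfo, void *sharedPtr) {
    float s = *static_cast<const float *>(callerInfo);
    float2 v = static_cast<const float2 *>(dataIn)[offset];
    v.x *= s;
    v.y *= s;
    return v;
}
)";

int main() {
    // Target the architecture of device 0.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    std::string arch = "--gpu-architecture=compute_" +
                       std::to_string(prop.major * 10 + prop.minor);
    const char *opts[] = {arch.c_str(), "-dlto"};

    // Compile the callback to LTO IR instead of a CUBIN.
    nvrtcProgram prog;
    nvrtcCreateProgram(&prog, kSource, "scale_load.cu", 0, nullptr, nullptr);
    if (nvrtcCompileProgram(prog, 2, opts) != NVRTC_SUCCESS) {
        std::printf("NVRTC compilation failed\n");
        return 1;
    }
    size_t ltoSize = 0;
    nvrtcGetLTOIRSize(prog, &ltoSize);
    std::vector<char> lto(ltoSize);
    nvrtcGetLTOIR(prog, lto.data());
    nvrtcDestroyProgram(&prog);

    const int n = 1024;
    const float h_scale = 0.5f;
    cufftComplex *d_data = nullptr;
    float *d_scale = nullptr;
    cudaMalloc(&d_data, n * sizeof(cufftComplex));
    cudaMemset(d_data, 0, n * sizeof(cufftComplex));
    cudaMalloc(&d_scale, sizeof(float));
    cudaMemcpy(d_scale, &h_scale, sizeof(float), cudaMemcpyHostToDevice);

    // The LTO callback is registered after cufftCreate and before cufftMakePlan*.
    cufftHandle plan;
    cufftCreate(&plan);
    cufftResult r = cufftXtSetJITCallback(plan, "scale_load", lto.data(), lto.size(),
                                          CUFFT_CB_LD_COMPLEX,
                                          reinterpret_cast<void **>(&d_scale));
    if (r != CUFFT_SUCCESS) { std::printf("set JIT callback failed: %d\n", r); return 1; }

    size_t workSize = 0;
    r = cufftMakePlan1d(plan, n, CUFFT_C2C, 1, &workSize);
    if (r != CUFFT_SUCCESS) { std::printf("make plan failed: %d\n", r); return 1; }

    r = cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);
    cudaDeviceSynchronize();
    std::printf("exec result: %d\n", static_cast<int>(r));

    cufftDestroy(plan);
    cudaFree(d_scale);
    cudaFree(d_data);
    return r == CUFFT_SUCCESS ? 0 : 1;
}
```

Unlike the legacy path, nothing here is linked against libcufft_static.a; the callback is handed to cuFFT as LTO IR and linked into the FFT kernels during plan creation.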

> I’d like to ask if there is a solution which I have missed? It would be great to support callbacks even for the older versions of cuFFT.

Sadly no, not currently. This could change in the future; for the time being, however, we recommend switching to LTO callbacks.

Thank you for reaching out!
