Does anyone have a device-callable FFT library? Dynamic parallelism has been out for over six months now, so I had hoped there would be some news on whether cuFFT is going to get a device-callable version (potentially with some extra limitations). So far, every time I have asked I have been met with silence, or with a quick follow-up question and then silence. Only the actual FFT execution would need to be callable from the device; plan creation can be done by the host well before execution is needed.
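To make the plan/execute split concrete, here is a rough sketch of the shape I have in mind. This is not cuFFT and not any existing API: `Cpx`, `FftPlan`, `make_plan` and `fft_exec` are hypothetical names, and the transform is a plain single-threaded radix-2 FFT for power-of-two sizes. The `HD` macro is only there so the same source builds as ordinary C++ and, under nvcc, marks the execute step as callable from a kernel.

```cpp
#include <cmath>
#include <vector>

// Plain C++ by default; under nvcc the execute step becomes device-callable.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

struct Cpx { float re, im; };  // hypothetical stand-in for a float2-style complex type

// Hypothetical "plan": everything the host can precompute ahead of time.
struct FftPlan {
    int n;                      // transform length, assumed to be a power of two
    std::vector<Cpx> twiddles;  // w[k] = exp(-2*pi*i*k/n) for k in [0, n/2)
};

// Host side: build the twiddle table once, well before execution.
FftPlan make_plan(int n) {
    FftPlan p{n, std::vector<Cpx>(n / 2)};
    const double pi = 3.14159265358979323846;
    for (int k = 0; k < n / 2; ++k) {
        double a = -2.0 * pi * k / n;
        p.twiddles[k] = {(float)std::cos(a), (float)std::sin(a)};
    }
    return p;
}

// Execution only: in-place forward FFT, no allocation and no host-only calls.
// On a GPU, `tw` would have to point at a device copy of the twiddle table.
HD void fft_exec(Cpx* x, const Cpx* tw, int n) {
    // Bit-reversal permutation.
    for (int i = 1, j = 0; i < n; ++i) {
        int bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) { Cpx t = x[i]; x[i] = x[j]; x[j] = t; }
    }
    // Butterfly passes over sub-transforms of length 2, 4, ..., n.
    for (int len = 2; len <= n; len <<= 1) {
        int half = len / 2, stride = n / len;
        for (int s = 0; s < n; s += len) {
            for (int k = 0; k < half; ++k) {
                Cpx w = tw[k * stride];
                Cpx a = x[s + k], b = x[s + k + half];
                Cpx t = {b.re * w.re - b.im * w.im, b.re * w.im + b.im * w.re};
                x[s + k] = {a.re + t.re, a.im + t.im};
                x[s + k + half] = {a.re - t.re, a.im - t.im};
            }
        }
    }
}
```

In a real kernel each thread (or a child grid) would call something like `fft_exec` on its own slice of data, with the twiddle table copied to device memory by the host beforehand. A proper library version would obviously be parallel and far better tuned; the point is only that nothing in the execute step needs the host.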
I would have thought NVIDIA would be pushing dynamic parallelism, since it plays to the strengths of the newer architectures, yet I cannot find any information at all about when (or even if) they will provide device-callable versions of their computation libraries. This is a shame, because the strength of the libraries is one of the biggest benefits of using CUDA over OpenCL.