FP16 and CUFFT

Hi all.

I’m looking forward to testing the new 16-bit floating point type in the CUDA 7.5 release candidate. My applications make extensive use of CUFFT, but I cannot see how the half or half2 types can be used here. I imagine that it would be possible to load/store the input and output via custom callbacks, but I was expecting a cufftHalf type and associated CUFFT calls to be added to the RC API.
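For what it’s worth, here is roughly how I’d expect a load callback to look if the input were stored as half2, widened to single precision on the fly. This is just a sketch I haven’t tested against the RC — loadHalf2 and attachCallback are my own names, and I’m assuming the half conversion intrinsics from cuda_fp16.h are usable in callback device code:

```cuda
#include <cufft.h>
#include <cufftXt.h>
#include <cuda_fp16.h>

// Load callback: CUFFT invokes this once per input element. The buffer
// holds half2 values (interleaved real/imag in FP16), which we widen
// to the cufftComplex the transform expects.
__device__ cufftComplex loadHalf2(void *dataIn, size_t offset,
                                  void *callerInfo, void *sharedPtr)
{
    half2 h = ((half2 *)dataIn)[offset];
    cufftComplex c;
    c.x = __low2float(h);   // real part
    c.y = __high2float(h);  // imaginary part
    return c;
}

__device__ cufftCallbackLoadC d_loadHalf2Ptr = loadHalf2;

// Host side: copy the device function pointer down and attach it to the
// plan before executing it.
void attachCallback(cufftHandle plan)
{
    cufftCallbackLoadC h_ptr;
    cudaMemcpyFromSymbol(&h_ptr, d_loadHalf2Ptr, sizeof(h_ptr));
    cufftXtSetCallback(plan, (void **)&h_ptr, CUFFT_CB_LD_COMPLEX, NULL);
}
```

As I understand it, callbacks require separate compilation (nvcc -dc) and linking against the static CUFFT library, same as for the 8-bit case.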

Has anyone tried this with the 7.5 RC? I’ve experimented with using the callbacks to read 8-bit input data into my FFTs, and I see performance improvements in most cases, though there is some variation with FFT length…