cuFFT function cufftExecR2C: how to get the full result instead of only the first half?

Function cufftExecR2C has this in its description:

cufftExecR2C() (cufftExecD2Z()) executes a single-precision (double-precision) real-to-complex, implicitly forward, cuFFT transform plan. cuFFT uses as input data the GPU memory pointed to by the idata parameter. This function stores the nonredundant Fourier coefficients in the odata array.

As a result, the output contains only the first half of the result (N/2 + 1 complex coefficients for an N-point transform). For example, for N = 5 the output is this:

15 + 0
-2.5 + 3.44095
-2.5 + 0.812299
0 + 0 // second half is zero-filled
0 + 0 // it's symmetric with the first half, is there any parameter to fill it too?

Expected:

    (+1.500e+01,+0.000e+00)
    (-2.500e+00,+3.441e+00)
    (-2.500e+00,+8.123e-01)
    (-2.500e+00,-8.123e-01)
    (-2.500e+00,-3.441e+00)

We can write an extra kernel to fill the second half of the output, but that’s an extra step. Is there any built-in method in cuFFT to fill the whole array instead of just the first half?

Thanks!

There is no built-in method in cuFFT to provide any other kind of output from the R2C transform. You could just use a C2C transform if you want the “full” output. This will require reformatting your input data, of course: each real sample has to become a complex value with a zero imaginary part.
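To illustrate the input change for the C2C route, here is a minimal host-side sketch in plain C++. The helper name `pack_real_as_complex` is hypothetical (it is not part of cuFFT), and the sketch assumes you stage the data on the host before copying it to the device. `cufftComplex` is an interleaved pair of floats, which is the same layout `std::complex<float>` uses, so the packed buffer can be copied to the GPU as-is.

```cpp
#include <complex>
#include <vector>

// std::complex<float> is stored as {real, imag}, matching the interleaved
// float pair that cufftComplex uses.
static_assert(sizeof(std::complex<float>) == 2 * sizeof(float),
              "unexpected std::complex<float> layout");

// Hypothetical helper (not a cuFFT function): widen real samples to complex
// values with a zero imaginary part, so they can be fed to a C2C transform.
std::vector<std::complex<float>> pack_real_as_complex(const std::vector<float>& real) {
    std::vector<std::complex<float>> out;
    out.reserve(real.size());
    for (float x : real) {
        out.emplace_back(x, 0.0f);  // imaginary part is zero for real input
    }
    return out;
}
```

After copying the packed buffer to the device, a forward `cufftExecC2C` call on it should return all N coefficients, at the cost of twice the input memory.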

Hmm, strange that they didn’t add one as a quality-of-life improvement.

A C2C transform is probably slower (are there any benchmark results online?). So I guess I have to write a separate kernel.
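For reference, the fill step itself is small. Below is a CPU sketch in plain C++ of the index mapping such a kernel would use, based on the Hermitian symmetry of a real-input DFT: X[k] = conj(X[N - k]). The function name `fill_hermitian_half` is made up for illustration, and the sketch assumes a 1D transform whose output buffer already has room for all N complex values.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a cuFFT function): given a length-N buffer whose
// first N/2 + 1 entries hold the R2C output, fill the remaining entries
// using the Hermitian symmetry X[k] = conj(X[N - k]).
void fill_hermitian_half(std::vector<std::complex<float>>& spectrum) {
    const std::size_t n = spectrum.size();
    // Entries 0 .. n/2 are already valid; only the upper part is written.
    for (std::size_t k = n / 2 + 1; k < n; ++k) {
        // n - k always lands in 1 .. n/2, so reads never touch written slots.
        spectrum[k] = std::conj(spectrum[n - k]);
    }
}
```

With the five-point output from my post, this turns the two zero entries into (-2.5, -0.812299) and (-2.5, -3.44095). In a GPU version each thread would handle one k in that range; since reads and writes hit disjoint halves of the array, it can run in place.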

Yes, in my experience, typically a C2C transform is slower than an equivalent R2C. If you have a specific case in mind, it should be trivial for you to benchmark the actual difference.

Benchmark data is linked from the cuFFT landing page. Look for the “Learn More” link and click on it. The cuFFT data starts on slide 16/17, but I don’t see anything there that compares R2C to an equivalent C2C.