Real to real transform with CUFFT?

How do I set up a real to real transform with CUFFT?

EDIT: To clarify, I’ve previously worked with FFTPACK. With that library, I can perform a real to real forward or backward transform on an array of 64 reals and get an array of 64 reals back. I’d like to be able to do the same thing with CUFFT.

EDIT2: Furthermore, I’d like to know how to do this for 2D FFTs.

CUFFT is modeled on the FFTW library, so one way to do a Real to Real is to do a Real-to-Complex (R2C) transform. That yields an array of complex values (about half the length of the real time data), where the real part of each element (.x) holds the cos() value and the imaginary part (.y) holds the sin() value.

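    // Allocate device memory for the real data (ary_sz floats);
    // the host-to-device copy (cudaMemcpy) is not shown here
    float *Td;
    int size = ary_sz * sizeof(float);
    CUDA_SAFE_CALL(cudaMalloc((void**)&Td, size));

    // Allocate device memory for the complex signal;
    // R2C writes ary_sz/2 + 1 complex values, so ary_sz/2 would be one too few
    cufftComplex *d_signal;
    int mem_size = sizeof(cufftComplex) * (ary_sz/2 + 1);
    CUDA_SAFE_CALL(cudaMalloc((void**)&d_signal, mem_size));

    // CUFFT plans (planI is declared for the inverse transform but not created here)
    cufftHandle planF, planI;
    CUFFT_SAFE_CALL(cufftPlan1d(&planF, ary_sz, CUFFT_R2C, 1));

    // Transform signal (forward, real to complex)
    CUFFT_SAFE_CALL(cufftExecR2C(planF, Td, d_signal));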

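The complex FFT data for element k is accessed by using:

d_signal[k].x [for cos() values]

d_signal[k].y [for sin() values]

Note that d_signal points to device memory, so copy it back to the host (or read it inside a kernel) before indexing it.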
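If you want to sanity-check what the R2C call hands back, a naive host-side reference can help. The sketch below is plain C++ and does not call CUFFT at all: `Complex` is a hypothetical stand-in for cufftComplex, and `r2c_reference` is a made-up helper name. It assumes the usual unnormalized forward convention (negative sign on the sine term), which as far as I know is what CUFFT's forward transform uses, and it returns the N/2 + 1 non-redundant elements for N real inputs.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for cufftComplex: x is the real part, y the imaginary part.
struct Complex { float x; float y; };

// Naive O(n^2) reference for an unnormalized forward real-to-complex DFT.
// Returns the n/2 + 1 non-redundant elements of the spectrum.
std::vector<Complex> r2c_reference(const std::vector<float>& in)
{
    const int n = static_cast<int>(in.size());
    const double two_pi = 6.283185307179586;
    std::vector<Complex> out(n / 2 + 1);
    for (int k = 0; k <= n / 2; ++k) {
        double re = 0.0;
        double im = 0.0;
        for (int j = 0; j < n; ++j) {
            const double angle = two_pi * k * j / n;
            re += in[j] * std::cos(angle);  // cosine sum goes to .x
            im -= in[j] * std::sin(angle);  // negated sine sum goes to .y
        }
        out[k].x = static_cast<float>(re);
        out[k].y = static_cast<float>(im);
    }
    return out;
}
```

Comparing this against the values copied back from d_signal for a small N (say 8 or 16) should show quickly whether the sign and scaling conventions match what you expect.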

Thanks for your answer, but I think you’ll find that the second output dimension is actually (N / 2 + 1). Thus, if I use the CUFFT R2C transform, I end up with, for example, 64 x 33 complex numbers. For the solver I’m implementing, I need to be able to recover 64 x 64 real numbers (for N = 64).
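As a rough illustration of that 64 x 33 layout: for real input, the columns a 2D R2C transform leaves out are complex conjugates of columns it keeps, so the full N x N complex spectrum can be rebuilt on the host. The sketch below assumes row-major storage with the halved dimension last, which is my understanding of how CUFFT lays out 2D R2C output. `Complex` and `expand_hermitian` are hypothetical names, not part of CUFFT.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for cufftComplex: x is the real part, y the imaginary part.
struct Complex { float x; float y; };

// Expand an nx x (ny/2 + 1) half spectrum of a real 2D signal into the full
// nx x ny spectrum, using X[i][j] = conj(X[(nx - i) % nx][(ny - j) % ny]).
std::vector<Complex> expand_hermitian(const std::vector<Complex>& half, int nx, int ny)
{
    const int nyh = ny / 2 + 1;
    if (nx <= 0 || ny <= 0 ||
        half.size() != static_cast<std::size_t>(nx) * static_cast<std::size_t>(nyh)) {
        throw std::invalid_argument("half must hold nx * (ny/2 + 1) elements");
    }
    std::vector<Complex> full(static_cast<std::size_t>(nx) * ny);
    for (int i = 0; i < nx; ++i) {
        for (int j = 0; j < ny; ++j) {
            if (j < nyh) {
                // Stored column: copy straight across.
                full[i * ny + j] = half[i * nyh + j];
            } else {
                // Missing column: conjugate of the mirrored stored element.
                Complex c = half[((nx - i) % nx) * nyh + (ny - j)];
                c.y = -c.y;
                full[i * ny + j] = c;
            }
        }
    }
    return full;
}
```

This only recovers the redundant complex values; it does not by itself give an FFTPACK-style N x N real array.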

With a 1D transform, I can take advantage of the fact that N is even so that the first and last complex numbers in the N / 2 + 1 array are purely real. Thus I can recover 64 real numbers, which match the output from an FFTPACK real to real transform. Right now, I’m not able to do something similar with a 2D transform without doing two passes of 1D transforms and handling the transposing and extraction of the real numbers myself.
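The 1D extraction described above can be sketched on the host as follows. This assumes even N and the packed ordering I understand FFTPACK's real forward transform to produce (first the purely real element 0, then real/imaginary pairs, then the purely real element N/2), so it is worth checking against actual FFTPACK output. `Complex`, `pack_halfcomplex`, and `unpack_halfcomplex` are hypothetical names, not CUFFT or FFTPACK routines.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for cufftComplex: x is the real part, y the imaginary part.
struct Complex { float x; float y; };

// Pack n/2 + 1 complex values (n even) into n reals:
// [Re X0, Re X1, Im X1, ..., Re X(n/2-1), Im X(n/2-1), Re X(n/2)]
std::vector<float> pack_halfcomplex(const std::vector<Complex>& spec, int n)
{
    if (n < 2 || n % 2 != 0 || spec.size() != static_cast<std::size_t>(n / 2 + 1)) {
        throw std::invalid_argument("n must be even and spec must hold n/2 + 1 values");
    }
    std::vector<float> out(n);
    out[0] = spec[0].x;              // element 0 is purely real
    for (int k = 1; k < n / 2; ++k) {
        out[2 * k - 1] = spec[k].x;
        out[2 * k] = spec[k].y;
    }
    out[n - 1] = spec[n / 2].x;      // element n/2 is purely real for even n
    return out;
}

// Inverse of pack_halfcomplex: rebuild the n/2 + 1 complex values,
// e.g. before handing them to a complex-to-real transform.
std::vector<Complex> unpack_halfcomplex(const std::vector<float>& packed)
{
    const int n = static_cast<int>(packed.size());
    if (n < 2 || n % 2 != 0) {
        throw std::invalid_argument("packed length must be even and at least 2");
    }
    std::vector<Complex> spec(n / 2 + 1);
    spec[0].x = packed[0];
    spec[0].y = 0.0f;
    for (int k = 1; k < n / 2; ++k) {
        spec[k].x = packed[2 * k - 1];
        spec[k].y = packed[2 * k];
    }
    spec[n / 2].x = packed[n - 1];
    spec[n / 2].y = 0.0f;
    return spec;
}
```

For N = 64 this turns the 33 complex values from a 1D R2C transform into 64 reals and back again. The 2D case would still need this applied along each dimension in turn.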