I’d like to FFT data from two interleaved real-valued signals that are to be cross-correlated by the FFT method. The input data look like d_in = [x0 y0 x1 y1 … xn-1 yn-1]. The output should be d_out = [X0Re X0Im Y0Re Y0Im … ] for sequential memory access in later processing.
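
To make the "later processing" concrete, here is a minimal host-side sketch of the step I have in mind, assuming the interleaved output layout above. The name cross_spectrum is hypothetical, std::complex<float> stands in for cufftComplex, and conjugating Y rather than X is just one common convention.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: d_out holds [X0 Y0 X1 Y1 ...] as complex values.
// Returns S[k] = X[k] * conj(Y[k]); the inverse FFT of S would give the
// cross-correlation (up to scaling and the chosen sign convention).
std::vector<std::complex<float>> cross_spectrum(
        const std::vector<std::complex<float>> &d_out) {
    std::vector<std::complex<float>> s(d_out.size() / 2);
    for (std::size_t k = 0; k < s.size(); ++k) {
        // X[k] and Y[k] are adjacent in memory, so each step reads one
        // contiguous pair.
        s[k] = d_out[2 * k] * std::conj(d_out[2 * k + 1]);
    }
    return s;
}
```

On the GPU the same loop body would run once per thread, with neighbouring threads reading neighbouring pairs.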

I tried cufftPlanMany() with input and output strides of 2, an input dist of 2*(2*Lfft) and an output dist of 2*(Lfft+1), then called cufftExecR2C() twice. The first call, starting at d_in, transformed the “x” data and worked. The second call, starting one element later at d_in+1 to transform the “y” data, fails with a CUFFT_INVALID_VALUE error.
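
The addressing I expect from that plan can be written down on the host. This sketch assumes the 1D advanced-layout rule from the cuFFT documentation, where element i of batch b sits at b*dist + i*stride. Layout and element_index are made-up names, and the offset argument models starting at d_in (0) or d_in+1 (1).

```cpp
#include <cstddef>

// Hypothetical model of one side (input or output) of the plan.
struct Layout {
    std::size_t stride;  // step between consecutive elements of one signal
    std::size_t dist;    // step between the first elements of two batches
};

// Index, in elements, of sample or bin i in batch b for the signal whose
// first element sits at `offset` from the start of the buffer.
constexpr std::size_t element_index(Layout l, std::size_t offset,
                                    std::size_t b, std::size_t i) {
    return offset + b * l.dist + i * l.stride;
}

// Input side, counted in cufftReal elements: two interleaved signals,
// 2*Lfft real samples per signal and batch.
constexpr Layout input_layout(std::size_t Lfft) {
    return Layout{2, 2 * (2 * Lfft)};
}

// Output side, counted in cufftComplex elements: Lfft+1 bins per signal.
constexpr Layout output_layout(std::size_t Lfft) {
    return Layout{2, 2 * (Lfft + 1)};
}
```

For Lfft = 4, for example, element_index(input_layout(4), 1, 0, 3) is 7, i.e. y3 is read from d_in[7], which matches the interleaved layout.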

The same error happens with the following addition to the CUDA 7.5 example “simpleCUFFT.cu”:

Complex *d_signal;

checkCudaErrors(cudaMalloc((void **)&d_signal, mem_size+32));

Complex *d_signal_o;

checkCudaErrors(cudaMalloc((void **)&d_signal_o, mem_size+32));

cufftHandle plan_r2c;

checkCudaErrors(cufftPlan1d(&plan_r2c, new_size, CUFFT_R2C, 1));

// 1st out-of-place FFT, works

checkCudaErrors(cufftExecR2C(plan_r2c, ((cufftReal *)d_signal)+0, ((cufftComplex *)d_signal_o)+0));

// 2nd out-of-place FFT, error 4(CUFFT_INVALID_VALUE)

checkCudaErrors(cufftExecR2C(plan_r2c, ((cufftReal *)d_signal)+1, ((cufftComplex *)d_signal_o)+1));
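
The only difference I can see between the two calls is the byte alignment of the pointers, which can be checked on the host. This is a sketch under assumptions: Real and Complex2 are stand-ins for cufftReal and cufftComplex (float and an 8-byte-aligned float2), the 256-byte-aligned static buffer imitates what cudaMalloc returns, and the helper names are made up. It shows only the pointer arithmetic, not whether cuFFT actually enforces such alignment.

```cpp
#include <cstddef>
#include <cstdint>

typedef float Real;                          // stand-in for cufftReal
struct alignas(8) Complex2 { float x, y; };  // stand-in for cufftComplex

// Imitates a device allocation, which is aligned to at least 256 bytes.
alignas(256) static unsigned char buffer[64];

// True when p sits on a multiple of `alignment` bytes.
inline bool is_aligned(const void *p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Input pointer of the 2nd call: base + 1 real element = base + 4 bytes.
inline const Real *second_call_input() {
    return reinterpret_cast<const Real *>(buffer) + 1;
}

// Output pointer of the 2nd call: base + 1 complex element = base + 8 bytes.
inline const Complex2 *second_call_output() {
    return reinterpret_cast<const Complex2 *>(buffer) + 1;
}
```

Here second_call_input() is 4-byte aligned but not 8-byte aligned, while second_call_output() is still 8-byte aligned, so if there is an alignment requirement, only the input pointer would violate it.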

The CUFFT_INVALID_VALUE error also occurs when a ‘load’ callback function is used to fetch the cufftReal-aligned input data.

There is no mention of alignment requirements for FFT input data in http://docs.nvidia.com/cuda/cufft/index.html#data-layout.

Should something like the above actually work…?

Or is all cuFFT processing natively aligned to ‘float2’?