I’d like to run FFTs on two interleaved real-valued signals that will then be cross-correlated using the FFT method. The input data look like d_in = [x0 y0 x1 y1 … x(n-1) y(n-1)]. The output should be d_out = [X0Re X0Im Y0Re Y0Im … ] so that later processing can access memory sequentially.
I tried cufftPlanMany() with input and output strides of 2, an input dist of 2*(2*Lfft) and an output dist of 2*(Lfft+1), then called cufftExecR2C() twice. The first cufftExecR2C(), starting at d_in, transformed the “x” data. This worked. The second cufftExecR2C(), starting one element later at d_in+1, however, produces a CUFFT_INVALID_VALUE error.
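To make the intended layout concrete, here is a small host-side sketch of the indexing that the stride/dist parameters describe, using the 1D advanced-layout formula from the cuFFT documentation (element i of batch b sits at b * dist + i * stride). The helper names layout_index and gather_signal are made up for illustration and are not part of cuFFT. It also assumes the transform length is 2*Lfft real points, which is what the dist values above suggest but which is not stated explicitly.

```cpp
#include <cstddef>
#include <vector>

// Element index of sample i in batch b for a strided 1D layout,
// following the cuFFT advanced data layout formula:
//   data[b * dist + i * stride]
// (hypothetical helper, not a cuFFT function)
std::size_t layout_index(std::size_t b, std::size_t i,
                         std::size_t stride, std::size_t dist) {
    return b * dist + i * stride;
}

// Gather n samples of one signal from an interleaved buffer.
// base = 0 selects the "x" samples, base = 1 selects the "y" samples,
// mirroring the d_in and d_in+1 start addresses of the two exec calls.
// (hypothetical helper, not a cuFFT function)
std::vector<float> gather_signal(const std::vector<float>& d_in,
                                 std::size_t base, std::size_t n,
                                 std::size_t stride) {
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = d_in[base + layout_index(0, i, stride, 0)];
    }
    return out;
}
```

With Lfft = 2 (so 4 real points per transform under the assumption above), the input dist is 2*(2*Lfft) = 8 reals and the output dist is 2*(Lfft+1) = 6 complex values, so the second batch starts at layout_index(1, 0, 2, 8) = 8 on the input side.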
The same error happens with the following addition to the CUDA 7.5 example “simpleCUFFT.cu”:
Complex *d_signal;
checkCudaErrors(cudaMalloc((void **)&d_signal, mem_size+32));
Complex *d_signal_o;
checkCudaErrors(cudaMalloc((void **)&d_signal_o, mem_size+32));
cufftHandle plan_r2c;
checkCudaErrors(cufftPlan1d(&plan_r2c, new_size, CUFFT_R2C, 1));
// 1st out-of-place FFT, works
checkCudaErrors(cufftExecR2C(plan_r2c, ((cufftReal *)d_signal)+0, ((cufftComplex *)d_signal_o)+0));
// 2nd out-of-place FFT, input offset by one cufftReal: error 4 (CUFFT_INVALID_VALUE)
checkCudaErrors(cufftExecR2C(plan_r2c, ((cufftReal *)d_signal)+1, ((cufftComplex *)d_signal_o)+1));
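For reference, the byte offsets involved in the second call can be checked on the host. The sketch below uses stand-in types (Real, Complex) in place of cufftReal and cufftComplex so that it compiles without the CUDA toolkit, and the helpers is_aligned and byte_offset are made up for illustration. It only shows the pointer arithmetic: it does not establish what alignment cuFFT actually requires, which is the open question here.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-ins for cufftReal (float) and cufftComplex (two floats).
// Hypothetical host-side types, used only to reproduce the sizes.
using Real = float;
struct Complex { float x, y; };

// True when pointer p sits on a multiple of `alignment` bytes.
// (hypothetical helper)
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Byte offset of element n of type T from the start of a buffer.
// (hypothetical helper)
template <typename T>
std::size_t byte_offset(std::size_t n) {
    return n * sizeof(T);
}
```

Starting from a buffer aligned like a cudaMalloc allocation, the input pointer ((cufftReal *)d_signal)+1 is 4 bytes in and therefore not a multiple of sizeof(cufftComplex) = 8, while the output pointer ((cufftComplex *)d_signal_o)+1 is 8 bytes in and stays 8-byte aligned. So only the input side of the failing call breaks ‘float2’ alignment.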
The CUFFT_INVALID_VALUE error also occurs when a ‘load’ callback function is used to fetch the cufftReal-aligned input data.
There is no mention of alignment requirements for FFT input data in http://docs.nvidia.com/cuda/cufft/index.html#data-layout.
Should something like the above actually work…?
Or is all cuFFT processing natively aligned to ‘float2’?