I’m trying to perform a 2D FFT by applying two batched 1D FFTs. To do that I run two different cuFFT plans, each with a batch count > 1, on the same device memory, and the second execution fails with an error.
Stripped down, the code I’m running is:
cufftPlan1d(&planx, sizeX, CUFFT_C2C, sizeY);
cufftPlan1d(&plany, sizeY, CUFFT_C2C, sizeX);
cufftExecC2C(planx, device, device, CUFFT_FORWARD);
// Apply transpose kernel
cufftExecC2C(plany, device, device, CUFFT_FORWARD); // <= this returns error 6 - CUFFT_EXEC_FAILED
If I create the plans with a batch count of 1 and execute them in a loop, it works, but replacing the loop with a single call that uses the larger batch count fails.
It also works if I execute the same plan twice (changing the second call to planx, or the first to plany).
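For reference, here is a minimal CPU sketch of the computation the two batched plans and the transpose are meant to perform. It uses a naive O(n²) DFT rather than cuFFT, so it is only useful for checking results on small sizes. The names dftRows, transpose and dft2dTransposed are made up for this illustration, and the sketch assumes a row-major array of sizeY rows by sizeX columns whose result is left in transposed order.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<float>;

// Naive forward DFT applied in place to `batch` contiguous rows of length n.
// Same memory layout as a batched 1D C2C transform (n points, `batch` transforms).
void dftRows(std::vector<Complex>& data, int n, int batch)
{
    assert(data.size() == static_cast<std::size_t>(n) * batch);
    const double twoPi = 6.283185307179586476925;
    std::vector<Complex> out(n);
    for (int b = 0; b < batch; ++b) {
        Complex* row = data.data() + static_cast<std::size_t>(b) * n;
        for (int k = 0; k < n; ++k) {
            std::complex<double> acc(0.0, 0.0);
            for (int j = 0; j < n; ++j) {
                // Forward transform uses exp(-2*pi*i*j*k/n).
                const double angle = -twoPi * ((static_cast<long long>(j) * k) % n) / n;
                acc += std::complex<double>(row[j]) * std::polar(1.0, angle);
            }
            out[k] = Complex(acc);
        }
        for (int k = 0; k < n; ++k) row[k] = out[k];
    }
}

// Out-of-place transpose: src is rows x cols, the result is cols x rows.
std::vector<Complex> transpose(const std::vector<Complex>& src, int rows, int cols)
{
    std::vector<Complex> dst(src.size());
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            dst[static_cast<std::size_t>(c) * rows + r] = src[static_cast<std::size_t>(r) * cols + c];
    return dst;
}

// 2D forward DFT of a sizeY x sizeX row-major array via two batched 1D passes.
// The output is left transposed: sizeX rows of length sizeY.
void dft2dTransposed(std::vector<Complex>& data, int sizeX, int sizeY)
{
    dftRows(data, sizeX, sizeY);            // corresponds to planx: n = sizeX, batch = sizeY
    data = transpose(data, sizeY, sizeX);   // corresponds to the transpose kernel
    dftRows(data, sizeY, sizeX);            // corresponds to plany: n = sizeY, batch = sizeX
}
```

Note that after the transpose the buffer holds sizeX rows of length sizeY, which is the layout the second plan (length sizeY, batch sizeX) expects, so the total element count is sizeX * sizeY for both passes.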
Any idea what I’m doing wrong?
Tested on 64-bit Linux with CUDA 2.3 and 3.0.
thanks