Problem running two cuFFT plans on the same memory

I’m trying to perform a 2D FFT by applying two 1D FFTs. To do that I run two different cuFFT plans, each with a batch count > 1, on the same memory, and the second execution returns an error for some reason.

Stripped down a bit, what I’m running is:

cufftPlan1d(&planx, sizeX, CUFFT_C2C, sizeY);
cufftPlan1d(&plany, sizeY, CUFFT_C2C, sizeX);

cufftExecC2C(planx, device, device, CUFFT_FORWARD);

// Apply transpose kernel

cufftExecC2C(plany, device, device, CUFFT_FORWARD); // <= this returns error 6 - CUFFT_EXEC_FAILED

If I run the plans in a loop with a batch count of 1, it works, but replacing the loop with a single call that uses the larger batch count fails.
If I run the same plan twice (changing the second call to planx or the first to plany), things work as well.

Any idea what I’m doing wrong?

Tested this under 64-bit Linux with CUDA 2.3 and 3.0.


How large is your data block and what are sizeX and sizeY?

You can use cuMemGetInfo (see CudaReferenceManual.pdf) to check memory usage.

If you use batch mode and the dimension is not a power of 2, cufftPlan1d() will sometimes allocate a larger memory block.
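A minimal sketch of that memory check, in case it helps. It uses cudaMemGetInfo, the runtime API counterpart of cuMemGetInfo, so it assumes a toolkit that provides that call. The sizes 1000 and 600 are placeholders I made up, not the values from the question, and whether the numbers show anything useful depends on when your cuFFT version allocates its work area. It also prints the return codes of cufftPlan1d, since a failed allocation at plan creation could show up later as an execution error.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>

// Return the free device memory in bytes, or 0 if the query fails.
static size_t freeBytes()
{
    size_t freeMem = 0, totalMem = 0;
    if (cudaMemGetInfo(&freeMem, &totalMem) != cudaSuccess) return 0;
    return freeMem;
}

int main()
{
    // Placeholder sizes (assumptions); substitute the real sizeX and sizeY.
    const int sizeX = 1000;
    const int sizeY = 600;

    // Allocate the data block first; this also initializes the CUDA context,
    // so the memory queries below only measure what the plans add.
    cufftComplex *device = NULL;
    if (cudaMalloc((void **)&device, sizeof(cufftComplex) * sizeX * sizeY) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }

    cufftHandle planx, plany;

    size_t before = freeBytes();
    cufftResult rx = cufftPlan1d(&planx, sizeX, CUFFT_C2C, sizeY);
    size_t afterX = freeBytes();
    cufftResult ry = cufftPlan1d(&plany, sizeY, CUFFT_C2C, sizeX);
    size_t afterY = freeBytes();

    // Signed differences, in case the free amount moves the other way.
    printf("planx: result %d, memory taken %lld bytes\n",
           (int)rx, (long long)before - (long long)afterX);
    printf("plany: result %d, memory taken %lld bytes\n",
           (int)ry, (long long)afterX - (long long)afterY);
    printf("free memory left: %lld bytes\n", (long long)afterY);

    // Destroy only the plans that were actually created.
    if (rx == CUFFT_SUCCESS) cufftDestroy(planx);
    if (ry == CUFFT_SUCCESS) cufftDestroy(plany);
    cudaFree(device);
    return 0;
}
```

Build with something like nvcc check.cu -lcufft. If the two plans together leave very little free memory, or one of the results is nonzero, that would point toward the allocation issue rather than the transpose.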