cuFFT: what is the maximum size for a 2D FFT?

Hello all. I am having a problem with cuFFT. I am trying to perform a 2D complex-to-complex (C2C) FFT on 8192 x 8192 data. It returns ExecFailed. What is the maximum size for a 2D FFT? Thank you.

My card: GTX 470
CUDA: 3.0

Hi, the maximum size of a 2D FFT in CUFFT is 16384 per dimension, as described in the CUFFT Library documentation, so I can tell you this is not your problem. Maybe it is the setup of your 2D CUFFT: something about the plan, etc.

What is the set-up of your 2D CUFFT?

8192 x 8192 ComplexToComplex

OK, there’s no problem with that. What’s your plan configuration?

Well, it seems the problem is insufficient memory. The strange thing is that plan creation succeeds but execution fails. If I allocate too much memory on the GPU, execution fails even though plan creation succeeds.
My card has 1280 MB of memory. If I create two arrays of 8192 x 8192 x 8 bytes (2 floats per element), the FFT fails. If I create one array of 8192 x 8192 x 8 bytes, the FFT succeeds. It looks like execution allocates some memory dynamically and fails if the available memory is insufficient or slow (like AGP/PCI memory; I mean the total amount of available memory is more than the 1280 MB that CUDA reports).

I think that “8192 x 8192 x 8 (2 floats)” is the number of bytes required to store a complex, single-precision array, i.e., 536870912 bytes (512 MB). Two such arrays would occupy 1073741824 bytes (1024 MB), which would take up most of the 1280 MB available on the GPU.

The cuFFT Library User’s Guide says, with reference to the cufftEstimate2d() function, that

During plan execution, CUFFT requires a work area for temporary storage of
intermediate results. This call returns an estimate for the size of the work area required,
given the specified parameters, and assuming default plan settings.

Furthermore, with reference to the cufftGetSize2d() function, the guide says that

This call gives a more accurate estimate of the work area size required for a plan than
cufftEstimate2d(), given the specified parameters, and taking into account any plan
settings that may have been made.

You could use these two functions to get an idea of the extra memory cuFFT needs to store intermediate results. That extra memory is perhaps the reason why the cuFFT execution fails in the two-array case.