CUFFT 1D Memory Usage Inconsistencies

I was able to find memory usage figures for CUFFT 2D transforms, but not for 1D, and I’ve run into an issue that confuses me. I am doing complex 1D FFTs on approximately 256 MB of input data. The data is split into FFTs of a given size and processed in batch mode, with FFT sizes ranging from 1k to 8M. All runs finish successfully except when the FFT size is 4k or 8k. Those cases return CUFFT_ALLOC_FAILED.

I investigated further by doing the same FFT sizes on about half as much input data, to see how much memory those FFT plans are using internally. Here are my results:

FFTSIZE = 1024
NUM_FFTS = 17476
Memory used = 0 bytes

FFTSIZE = 2048
NUM_FFTS = 8738
Memory used = 143196160 bytes

FFTSIZE = 4096
NUM_FFTS = 4369
Memory used = 286359552 bytes

FFTSIZE = 8192
NUM_FFTS = 2184
Memory used = 286261248 bytes

FFTSIZE = 16384
NUM_FFTS = 1092
Memory used = 131072 bytes

And so on. The memory usage for the larger FFT sizes (> 16k) grows, but never gets as large as the ~273 MB used for the 4k and 8k FFTs. Is this spike in memory usage normal? What is the reason for it? The documentation I’ve found implies that these transforms should all be using the same algorithm, since they will not fit into shared memory (with the possible exception of the 1k case, depending on how it is handled) and they are all powers of 2. Any help would be greatly appreciated.

Also, is there any way to consistently determine how much memory will be used by a CUFFT call before actually initializing it? It’d be nice to be able to split this into two chunks and process them separately when the memory doesn’t allow them to be processed in batch mode all at once.
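To make the chunking idea concrete, here is a rough host-side sketch of how the batch could be shrunk until a plan is expected to fit. It rests on assumptions: `WorkSizeEstimator`, `largestFittingBatch`, and `chunkCount` are hypothetical names, and the estimator is whatever source of work-area sizes is available. Later cuFFT releases provide `cufftEstimate1d` and `cufftGetSize1d`, and the CUDA runtime provides `cudaMemGetInfo` for the free-memory budget, so those could supply the two numbers if the installed version has them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical callback: estimated work-area bytes for `batch` transforms
// of length `nx`. How the estimate is obtained is left to the caller.
using WorkSizeEstimator = std::function<std::size_t(int nx, int batch)>;

// Halve the batch until the estimated work area fits in budgetBytes.
// Returns 0 if not even a single transform fits.
int largestFittingBatch(int nx, int totalBatch, std::size_t budgetBytes,
                        const WorkSizeEstimator& estimate) {
    assert(nx > 0 && totalBatch > 0);
    int batch = totalBatch;
    while (batch > 0 && estimate(nx, batch) > budgetBytes) {
        batch /= 2;
    }
    return batch;
}

// Number of chunks needed to cover totalBatch transforms (the last chunk
// may be smaller than batchPerChunk).
int chunkCount(int totalBatch, int batchPerChunk) {
    assert(totalBatch > 0 && batchPerChunk > 0);
    return (totalBatch + batchPerChunk - 1) / batchPerChunk;
}
```

Halving is only one possible policy. It keeps the number of estimator calls small, at the cost of sometimes picking a smaller batch than would strictly fit.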

Thanks!
Bryan

I’m still struggling with this problem. It doesn’t seem intuitive that 4k and 8k FFTs would use so much memory. The memory the plans use internally is 2x the size of the input.
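The 2x figure can be checked against the numbers posted above. The sketch below assumes single-precision complex data at 8 bytes per element (the post does not state the precision), and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Assumption: single-precision complex input, 8 bytes per element.
constexpr std::size_t kBytesPerComplex = 8;

// Size of the batched input in bytes.
std::size_t inputBytes(std::size_t fftSize, std::size_t numFfts) {
    assert(fftSize > 0);
    return fftSize * numFfts * kBytesPerComplex;
}

// Reported plan memory divided by the input size.
double planToInputRatio(std::size_t planBytes, std::size_t fftSize,
                        std::size_t numFfts) {
    assert(numFfts > 0);
    return static_cast<double>(planBytes) /
           static_cast<double>(inputBytes(fftSize, numFfts));
}

// Print the ratio for each measurement listed in the post.
void printRatios() {
    struct Row { std::size_t fftSize, numFfts, planBytes; };
    const Row rows[] = {
        {1024, 17476, 0},
        {2048, 8738, 143196160},
        {4096, 4369, 286359552},
        {8192, 2184, 286261248},
        {16384, 1092, 131072},
    };
    for (const Row& r : rows) {
        std::printf("FFTSIZE %zu: input %zu bytes, plan/input = %.4f\n",
                    r.fftSize, inputBytes(r.fftSize, r.numFfts),
                    planToInputRatio(r.planBytes, r.fftSize, r.numFfts));
    }
}
```

Under that 8-byte assumption, the 4k and 8k plans come out at about 2x the input size, the 2k plan at about 1x, and the 16k plan at a tiny fraction of it.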

Does anyone have any guesses as to why this is happening? Or can someone point me to some resources so I can better understand the reasoning behind this?

Thanks!