What is the maximum size for a CUDA FFT?

Today I tried the simpleCUFFT sample and experimented with changing the size of the input signal.
The test FAILED when I changed the signal size to 5000; it still passed with a signal size of 4000.

#define SIGNAL_SIZE 5000

Does anyone know why this happens? Is this a size constraint of CUDA FFT, or is it caused by something else? I'm a novice CUDA user.

Are there any ideas?

Did you find an answer for your question? I would like to learn the answer also.

  • Is it related to the graphics card's memory?

  • A signal of 5000? Is each sample a 4-byte float or an 8-byte value?

This is my experience with a GeForce 8800 GTX (768 MB):

Forum post with memory usage of CUFFT

I did a simple test to get the free memory at every important phase by adding calls to cuMemGetInfo( &theFree, &theTotal ). What puzzles me is that after creating the FFT plan, the free memory does not seem to change any more. Can anyone give me an explanation?

My card is a Quadro FX 5600.

Below is my test result:

the Total Memory is :1610285056

Input data size           : 1018 x 1018

Convolution kernel size   : 7 x 7

Padded image size         : 1024 x 1024

Aligned padded image size : 1024 x 1024

Allocating memory...

Generating random input data...

Before Creating FFT plan!

the Free Memory is :1530855424  the Total Memory is :1610285056

Creating FFT plan for 1024 x 1024...

After Creating FFT plan!

the Free Memory is :1522466816  the Total Memory is :1610285056

Uploading to GPU and padding convolution kernel and input data...

...initializing padded kernel and data storage with zeroes...

...copying input data and convolution kernel from host to CUDA arrays

...binding CUDA arrays to texture references

...padding convolution kernel

...padding input data array

After padding input data, starting FFT transformation!

the Free Memory is :1522466816  the Total Memory is :1610285056

Transforming convolution kernel...

finish FFT kernel transformation,start running GPU FFT convolution

the Free Memory is :1522466816  the Total Memory is :1610285056

Running GPU FFT convolution...

finish input data transformation, start multiplying operation

the Free Memory is :1522466816  the Total Memory is :1610285056

finish multiplying operation, and start inverse FFT!

the Free Memory is :1522466816  the Total Memory is :1610285056

GPU time: 11.077584 msecs. //93.551444 MPix/s

finish inverse FFT!

the Free Memory is :1522466816  the Total Memory is :1610285056

Reading back GPU FFT results...

Checking GPU results...

...running reference CPU convolution

CPU time: 3587.733154 msecs. //0.288852 MPix/s

the performance diversity is :323.873244

...comparing the results

L2 norm: 2.053509E-007


Shutting down...

Press ENTER to exit...
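For what it's worth, the two readings around plan creation in the log above do differ, and by a round number. Here is a minimal sketch in plain C that checks this. The helper names are made up for illustration, and the 8 bytes per element is an assumption: single-precision complex data, i.e. two 4-byte floats per value.

```c
#include <assert.h>

/* Bytes by which free memory dropped between two readings. */
unsigned long long mem_drop(unsigned long long free_before,
                            unsigned long long free_after)
{
    assert(free_before >= free_after);
    return free_before - free_after;
}

/* Size in bytes of one nx-by-ny buffer of single-precision complex values.
   Assumption: 8 bytes per element (two 4-byte floats). */
unsigned long long complex_buffer_bytes(unsigned long long nx,
                                        unsigned long long ny)
{
    return nx * ny * 8ULL;
}
```

With the values from the log, mem_drop(1530855424ULL, 1522466816ULL) gives 8388608, which is the same as complex_buffer_bytes(1024, 1024). So the drop during plan creation equals the size of one 1024 x 1024 complex buffer. That the plan holds a work buffer of that size is only a guess from the numbers, not something the log proves. The later readings staying flat would fit the data buffers having been allocated in the "Allocating memory..." step, before the first reading was taken.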

Next is the free memory and total memory for different input data sizes:

input size      free memory (bytes)    total memory (bytes)
5120×5120       297664512              1610285056
4096×4096       750583808              1610285056
3584×3584       966721536              1610285056
3072×3072       1103101952             1610285056
2048×2048       1388183552             1610285056
1024×1024       1522335744             1610285056
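One way to read these figures is to turn them into memory in use (total minus free) and express that in units of one image-sized complex buffer. A rough sketch in plain C follows. The function names are hypothetical, and the 8 bytes per element again assumes single-precision complex values.

```c
#include <assert.h>

/* Memory in use on the card: total minus free, in bytes. */
unsigned long long used_bytes(unsigned long long total,
                              unsigned long long free_mem)
{
    assert(total >= free_mem);
    return total - free_mem;
}

/* Used memory counted in whole n-by-n single-precision complex buffers.
   Assumption: 8 bytes per element; integer division rounds down. */
unsigned long long used_in_buffers(unsigned long long total,
                                   unsigned long long free_mem,
                                   unsigned long long n)
{
    assert(n > 0);
    return used_bytes(total, free_mem) / (n * n * 8ULL);
}
```

For the 5120 row, used_bytes(1610285056ULL, 297664512ULL) is 1312620544 bytes. used_in_buffers returns 6 for every size from 2048 to 5120, and 10 for 1024, where the fixed overhead weighs more. So the usage scales like roughly six to seven image-sized complex buffers, though which allocations those are cannot be told from the table alone.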

Ah, replied to your other post… :whistling: