I have been using CUDA programming for my graduate research, and I have a question about 3D Fast Fourier Transforms (FFTs). Does anyone know the most efficient way to compute large 3D FFTs (with sizes ranging from 7 million to 450 million points)?

Thanks for your help,
Nihshanka Debroy

Well, the largest non-Tesla card has 1 GB of memory, which (if you could utilize 100% of it, which you can't in most cases) would give you 128M elements for a single-precision, in-place transform. The largest Tesla card has 4 GB of memory, which bumps you up to 512M elements (max). So I'd say you at least need to go with a Tesla, and even then you may run into memory problems with your very largest datasets.

Also, remember that you'll need at least the same amount of memory on the host side to hold the results when you transfer them back, and I'd recommend at least twice that (or more if you can afford it).
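For the transform itself, cuFFT supports 3D sizes directly via `cufftPlan3d`. A minimal host-side sketch of a single-precision, in-place, complex-to-complex 3D FFT (the 256^3 size is an assumption for illustration, and error checking is abbreviated; running it requires a CUDA-capable GPU):

```c
#include <cufft.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    int nx = 256, ny = 256, nz = 256;  /* ~16.8M points, assumed size */
    size_t bytes = sizeof(cufftComplex) * (size_t)nx * ny * nz;

    /* Allocate device memory; 8 bytes per complex element. */
    cufftComplex *data;
    if (cudaMalloc((void **)&data, bytes) != cudaSuccess) {
        fprintf(stderr, "out of device memory\n");
        return 1;
    }
    /* ... copy the input volume to `data` with cudaMemcpy ... */

    /* Plan and execute an in-place forward C2C 3D transform. */
    cufftHandle plan;
    cufftPlan3d(&plan, nx, ny, nz, CUFFT_C2C);
    cufftExecC2C(plan, data, data, CUFFT_FORWARD);

    /* ... copy the result back to the host with cudaMemcpy ... */
    cufftDestroy(plan);
    cudaFree(data);
    return 0;
}
```

Note that plan creation itself allocates workspace on the device, so the usable memory for data is somewhat less than the card's total.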

Actually, you can get cards with 1.5 GB and 4 GB in the Quadro line, but they don't really have a price advantage over Tesla. ;)