FFT GFLOPS results with nice graph! For different sizes and batches.

Hi!

I’ve been doing some FFTs on an FX 4800 this morning, and the results are shown in the attached graph.

The FFT is done using CUFFT with toolkit 2.3 for complex single precision, i.e. 8 bytes per element.
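For reference, setting up a batched single-precision complex transform in CUFFT 2.3 looks roughly like this (a minimal sketch, not my actual benchmark code; n is the transform length and m is the batch count, and error checking is left out):

```c
/* Minimal sketch: a batched 1D complex-to-complex single-precision plan,
 * as exposed by CUFFT in toolkit 2.3. n and m are assumed parameters. */
#include <cufft.h>

cufftHandle make_batched_plan(int n, int m)
{
    cufftHandle plan;
    /* cufftComplex is two floats, i.e. 8 bytes per element */
    cufftPlan1d(&plan, n, CUFFT_C2C, m);
    return plan;
}
```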

The time measured is the time taken for a transform followed by its inverse. The number of floating-point operations is taken to be m * 2 * (5 * n * ln(n)), where m is the number of batches, n is the number of elements per batch, and the factor of 2 comes from there being two transforms.
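In case it helps, here is roughly how the measurement works (a simplified sketch only: the plan and device buffer are assumed to exist already, and error checking is omitted):

```c
/* Time one forward transform followed by its inverse with CUDA events,
 * then apply the operation count described above: m * 2 * (5 * n * ln(n)).
 * d_data is assumed to hold n * m cufftComplex elements on the device. */
#include <math.h>
#include <cuda_runtime.h>
#include <cufft.h>

float fft_gflops(cufftHandle plan, cufftComplex *d_data, int n, int m)
{
    cudaEvent_t start, stop;
    float ms;

    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);  /* in-place */
    cufftExecC2C(plan, d_data, d_data, CUFFT_INVERSE);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&ms, start, stop);

    /* Operation count as defined above (natural log, two transforms) */
    double flops  = (double)m * 2.0 * 5.0 * (double)n * log((double)n);
    double gflops = flops / (ms * 1.0e-3) / 1.0e9;

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return (float)gflops;
}
```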

As can be seen, for batches of 128 elements apiece or more, the GFLOPS attained are mostly a function of the total number of elements, not of how they are split between batch count and transform length.

Any comments? Do you think the results are reasonable? And do you think I’m uncool, seeing as I use Excel and not, say, Matlab?

Cheers!

Ian

Excel is for such n00bs! ;)

Damn you Jimmy!

Shoe flies across office (10-point landing)

You’re eating lunch by yourself today!

That sort of makes sense, doesn’t it? The GPU is pretty much the embodiment of Gustafson’s Law, and that is what your results show. Larger input datasets in cuFFT mean more blocks per FFT, which is usually good for GPU throughput.

And yes, Excel is unspeakably uncool (as well as ugly as hell and really unsuited to just about any serious scientific endeavour). Matlab is passé as well. Something like Python’s matplotlib is what the cool kids are using these days.

EDIT: brain moving slightly faster than fingers in one spot.


Hmmm, I’m gonna have lunch with some cool Python-programming kids instead! :D