MATLAB CUFFT performance?


Does anybody have experience with CUFFT performance in MATLAB?

I’ve written a MEX file to execute the FFT on the GPU.
It’s like the fft2 sample from NVIDIA’s MATLAB plugin, but only for 1D transforms.

I’ve measured only a 2x speedup (vector size: 100*1024), which I think is a little slow.
I use a Tesla C870. bandwidthTest reports approx. 2 GB/s.
The host processor is an Intel Core 2 Duo at 2.2 GHz.

You can see my results in the attachment.

Can anybody comment?


I guess your timing includes every step: the transfer to the MEX file, the conversion to float, the CUFFT call, the conversion back to double, and the transfer back to MATLAB.

I guess the conversion is one reason the speedup is so small. If you want, I can test your script on an 8800 GTX.
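To make the conversion step concrete, here is a minimal sketch in plain C of the packing and unpacking that such a wrapper would do around the transform. It assumes MATLAB hands the MEX file separate real and imaginary double arrays and that the GPU side wants interleaved single-precision complex data. The type `complex_f32` and both function names are made up for illustration and are not taken from the attached code.

```c
#include <stddef.h>

/* Hypothetical interleaved single-precision complex type (assumption:
   the GPU transform expects real and imaginary parts side by side). */
typedef struct { float re, im; } complex_f32;

/* Pack split double arrays into interleaved single precision.
   imag may be NULL for purely real input. */
void pack_split_to_interleaved(const double *real, const double *imag,
                               complex_f32 *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i].re = (float)real[i];
        out[i].im = imag ? (float)imag[i] : 0.0f;
    }
}

/* Unpack interleaved single precision back into split double arrays. */
void unpack_interleaved_to_split(const complex_f32 *in,
                                 double *real, double *imag, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        real[i] = (double)in[i].re;
        imag[i] = (double)in[i].im;
    }
}
```

Both loops touch every element once on the CPU, before and after the GPU does any work, so their cost is part of every timed call.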

Yes, you are right, but I’m interested in practical results,
because later, in “production” work, these steps (MEX call, conversion, memcopy, …) also need to be done.

So what I’m interested in is the real, practical advantage of using the Tesla to speed up the FFT compared with the native MATLAB fft.

My scripts and the MEX code are attached.

Thanks for testing!

I understand why you timed it the way you did; I do the same for my simulations. I am now prototyping real-time processing with MATLAB and CUDA, where (luckily) I will not have this problem in practice.

I’ll test tomorrow when I am back at work.

Sorry, I did not get around to it today, unfortunately. I’ll try to do it tomorrow.

I was looking for a 1D FFT script when I came across this topic. My system is a quad-core Xeon processor (E5405) and an 8800GTX. The performance gain I saw leveled off at 2x as well.

I think the overhead (memcopies, packing and casting the data) is really big…
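For a rough feel for the memcopy part alone, here is a back-of-the-envelope sketch in C using the figures quoted earlier in the thread (100*1024 elements, roughly 2 GB/s). It assumes single-precision complex data at 8 bytes per element, takes 2 GB/s to mean 2e9 bytes per second, and ignores per-call latency, so treat it as an estimate only. The function names are made up for illustration.

```c
/* Bytes occupied by n single-precision complex elements (two floats each). */
double complex_f32_bytes(double n_elements)
{
    return n_elements * 2.0 * sizeof(float);
}

/* Seconds needed to copy `bytes` once at a sustained bandwidth
   given in bytes per second (latency is ignored). */
double copy_time_s(double bytes, double bandwidth_bytes_per_s)
{
    return bytes / bandwidth_bytes_per_s;
}

/* Host-to-device plus device-to-host copy of the same buffer. */
double round_trip_time_s(double n_elements, double bandwidth_bytes_per_s)
{
    return 2.0 * copy_time_s(complex_f32_bytes(n_elements),
                             bandwidth_bytes_per_s);
}
```

With `round_trip_time_s(100 * 1024, 2e9)` the buffer is 819200 bytes, which gives about 0.41 ms per direction and about 0.82 ms for the round trip, before any packing or casting on the CPU.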

Just today we were running some performance tests using the CUDA FFT 1.1 example from the NVIDIA CUDA website. Strangely, we got slower times on the 8800 GTX than on the 7600 GS! Not by much, but still, and there is a huge step between the 7600 GS and the 8800 GTX.
The cards are installed in different machines, but both are Core 2 Duo systems with 4 GB of RAM.
I will move the 7600 GS to the second computer to check, but it’s very strange.

CUDA doesn’t work on the GeForce 7 series. How are you running CUDA FFT on a 7600 GS?

Apparently it is working. I don’t know how; I was surprised too. But I got two different plots from speed_fft: one for the standard MATLAB fft and the other for the CUDA FFT.
I compiled the MEX files earlier and they run on my Gainward 7600 GS.
It is a mystery to me as well.
I haven’t yet put the 7600 GS into the second machine in place of the 8800 GTX to find out what is slowing down the 8800 GTX, but I will post the results once I have.

How would the compiled CUDA FFT example from the NVIDIA website behave if it found an unsupported card? Would I get an error?

Don’t waste your time, it is not going to work.
The calls are just failing and returning right away. Speed_fft is not doing error checking.

Trust me, I wrote the Matlab plugins … :-)
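To illustrate why a missing check makes a failed run look fast, here is a minimal sketch in plain C. Everything in it is a stand-in: `gpu_status`, `fake_gpu_fft` and `checked_fft` are hypothetical names, not part of CUDA or of speed_fft. In a real MEX file the status would be the value returned by the CUDA runtime or CUFFT call (a `cudaError_t` compared against `cudaSuccess`, or a `cufftResult` compared against `CUFFT_SUCCESS`).

```c
#include <stdio.h>

/* Hypothetical status type standing in for cudaError_t / cufftResult. */
typedef enum { GPU_OK = 0, GPU_NO_DEVICE = 1 } gpu_status;

/* Hypothetical stand-in for a GPU call: it fails at once when no
   supported device is present and leaves the output untouched. */
gpu_status fake_gpu_fft(int device_supported, const float *in,
                        float *out, int n)
{
    if (!device_supported)
        return GPU_NO_DEVICE;
    for (int i = 0; i < n; ++i)
        out[i] = in[i];          /* placeholder for the real transform */
    return GPU_OK;
}

/* Checked wrapper: reports the failure instead of silently continuing.
   Returns 0 on success and -1 on failure. */
int checked_fft(int device_supported, const float *in, float *out, int n)
{
    gpu_status st = fake_gpu_fft(device_supported, in, out, n);
    if (st != GPU_OK) {
        fprintf(stderr, "GPU call failed with status %d\n", (int)st);
        return -1;
    }
    return 0;
}
```

Without the check, the failing call returns almost immediately, the output buffer keeps whatever it held before, and a timing script records a very short run.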

I believe you, then. So that’s why it was faster: it was doing nothing on my 7600 GS :D.
Thanks very much for the response.