check GPU usage

ferrarirt · September 21, 2009, 6:25pm

Is there a command to check GPU usage?

I have a piece of scientific software (not written by me) that I have linked against the CUFFT library, and I want to check that it is actually using the GPU. The trouble is that I see no speedup (or slowdown), so I cannot tell whether the software is using its internal FFT or the CUDA FFT.

Thanks!

YDD · September 21, 2009, 6:38pm

You might be able to run it through the visual profiler and see whether any CUDA kernels get called… However, this is not an easy problem (although one I’d like to solve, so that I could auto-kill CPU-only jobs on our GPU cluster :angry: ). When you say ‘no speedup’, do you literally mean ‘no speedup’ or ‘no discernible speedup’? It could be that the FFT is such an insignificant portion of the runtime that it wouldn’t matter if the GPU made it infinitely fast (see Amdahl’s Law).

Additionally, when you say you linked it yourself, did you at least change the FFT calls to point at CUFFT?

ferrarirt · September 21, 2009, 7:53pm

First, thank you for the response. I see no discernible speedup; the time difference is well within the standard timing errors. We chose to try CUFFT because our software can use FFTW, and CUFFT claims to be completely compatible with the FFTW standard. The hope was that we could get the software to work with CUFFT without rewriting any code (not that I have any problem rewriting some code, it would just be great if it worked “out of the box”). So far, I have only recompiled the software and told the configure file that the FFT libraries are in /usr/local/cuda/lib64/. It did not complain, so I hoped that it worked.

mfatica · September 21, 2009, 8:14pm

CUFFT is not compatible with FFTW. It uses a similar approach (creation of a plan, then execution of the plan), but the calls are different.
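As a rough illustration of "similar approach, different calls", here is a minimal sketch of a 1D double-precision complex transform with cuFFT, with the approximate FFTW counterpart of each step in the comments. The transform length and the impulse input are arbitrary choices for the example, and it assumes a CUDA toolkit with double-precision cuFFT support.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cufft.h>

int main() {
    const int n = 1024;  // transform length, arbitrary for illustration
    const std::size_t bytes = sizeof(cufftDoubleComplex) * n;

    // Host input: a unit impulse stored as double-precision complex values.
    std::vector<cufftDoubleComplex> host(n);
    for (int i = 0; i < n; ++i) {
        host[i].x = (i == 0) ? 1.0 : 0.0;  // real part
        host[i].y = 0.0;                   // imaginary part
    }

    // Unlike FFTW, cuFFT operates on device memory, so the data must be
    // copied to the GPU first. FFTW has no counterpart for this step.
    cufftDoubleComplex* dev = nullptr;
    if (cudaMalloc(reinterpret_cast<void**>(&dev), bytes) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);

    // FFTW: plan = fftw_plan_dft_1d(n, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
    cufftHandle plan;
    if (cufftPlan1d(&plan, n, CUFFT_Z2Z, 1) != CUFFT_SUCCESS) {
        std::fprintf(stderr, "cufftPlan1d failed\n");
        cudaFree(dev);
        return 1;
    }

    // FFTW: fftw_execute(plan);
    // In cuFFT the buffers and the direction are passed at execution time.
    // Passing the same pointer twice requests an in-place transform.
    if (cufftExecZ2Z(plan, dev, dev, CUFFT_FORWARD) != CUFFT_SUCCESS) {
        std::fprintf(stderr, "cufftExecZ2Z failed\n");
        cufftDestroy(plan);
        cudaFree(dev);
        return 1;
    }

    // Copy the result back to the host.
    cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);

    // FFTW: fftw_destroy_plan(plan);
    cufftDestroy(plan);
    cudaFree(dev);

    std::printf("X[0] = (%g, %g)\n", host[0].x, host[0].y);
    return 0;
}
```

Because the symbol names differ (cufft* versus fftw_*), a program written against FFTW never calls into the CUFFT library just because that library is on the link line. The explicit host-to-device copies are also a cost that FFTW does not have, which matters for small transforms.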

ferrarirt · September 21, 2009, 8:34pm

Ah, I was badly mistaken. So I will have to write some wrapper functions to make it work?

YDD · September 22, 2009, 1:23pm

Yes. They shouldn’t be too difficult - as mfatica says, both libraries follow the same plan-then-execute approach, so each FFTW call maps fairly directly onto a CUFFT call.
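One possible shape for such wrappers, sketched under assumptions: the application calls a small plan/execute/destroy interface, and each FFT library is plugged in behind it. Every name here (FftBackend, reference_backend, forward_fft) is hypothetical, and the only backend shown is a naive CPU DFT standing in for a real library. A CUFFT backend would fill the same three slots with cufftPlan1d, the device copies plus cufftExecZ2Z, and cufftDestroy.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical wrapper interface: the application only ever calls these
// entry points, so swapping FFT libraries means swapping one table.
struct FftBackend {
    const char* name;                                         // printed at startup to show which FFT is active
    void* (*plan)(std::size_t n);                             // create a plan for length n
    void  (*execute)(void* plan, const cplx* in, cplx* out);  // forward transform, out-of-place
    void  (*destroy)(void* plan);                             // release the plan
};

// Stand-in backend: a naive O(n^2) forward DFT on the CPU.
namespace reference {
struct Plan { std::size_t n; };

void* plan(std::size_t n) { return new Plan{n}; }

void execute(void* p, const cplx* in, cplx* out) {
    const std::size_t n = static_cast<Plan*>(p)->n;
    const double pi = std::acos(-1.0);
    for (std::size_t k = 0; k < n; ++k) {
        cplx sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double angle =
                -2.0 * pi * static_cast<double>(j * k) / static_cast<double>(n);
            sum += in[j] * std::polar(1.0, angle);
        }
        out[k] = sum;  // in and out must not alias for this backend
    }
}

void destroy(void* p) { delete static_cast<Plan*>(p); }
}  // namespace reference

const FftBackend reference_backend = {
    "reference DFT (CPU)", reference::plan, reference::execute, reference::destroy};

// Run one forward transform through whichever backend is selected.
std::vector<cplx> forward_fft(const FftBackend& backend, const std::vector<cplx>& in) {
    std::vector<cplx> out(in.size());
    void* p = backend.plan(in.size());
    backend.execute(p, in.data(), out.data());
    backend.destroy(p);
    return out;
}
```

Printing backend.name at startup would also answer the original question directly: the program itself reports which FFT it is running, rather than leaving it to be inferred from timings.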