CUDA performance measure

coderunner · January 30, 2009, 10:03pm

Hi,

I have a question about how CUDA performance gains are calculated. I see people quoting 10x, 100x, and so on.

How are they calculating those values?

In my case:

CUDA execution (includes memory allocation, memory copy from GPU to CPU) : 1.16 ms

CPU execution : 30.01 ms.

However, printing the output (copied from GPU to CPU, or generated on the CPU) takes 50 ms in either case.

How many fold improvement did I achieve going from CPU to GPU?

If I count only the execution of the program: ~30 fold

If I also count printing the output to a file: ~1.5 fold.

Please tell me how I should measure the performance.

Thanks for your time

PS: I would also like to put the same question to NVIDIA's developers.

kristleifur · January 30, 2009, 10:44pm

Are you outputting to the console? The terminal program etc. can have a BIG effect on output speed! Also, some shells “sync” your program to the console output, so if your terminal is displaying slowly, the program is halted in the meantime.

If you’re on Linux, try the mrxvt terminal. It is ridiculously fast. You could also redirect the output straight to a text file to see if that helps the speed. Or try piping to tee /dev/null, which desynchronises processing and allows your program to continue faster.
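To see how much of the run is really output, it helps to time the two phases separately. Here is a minimal sketch in Python, not a CUDA benchmark: `compute` and `write_output` are hypothetical stand-ins for the calculation and the printing step, and the in-memory buffer is only an assumption so the example runs anywhere.

```python
import io
import time

def timed(fn, *args):
    """Return (result, elapsed milliseconds) for one call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - start) * 1000.0

def compute(n):
    # Hypothetical stand-in for the calculation being benchmarked.
    return [i * i for i in range(n)]

def write_output(values, stream):
    # Write everything in one call rather than one print per value.
    stream.write("\n".join(map(str, values)) + "\n")

values, compute_ms = timed(compute, 1000)
buffer = io.StringIO()  # swap for open("out.txt", "w") to time a real file
_, output_ms = timed(write_output, values, buffer)

print(f"compute: {compute_ms:.3f} ms, output: {output_ms:.3f} ms")
```

Swapping the buffer for the console, a file, or a pipe should show how much of the 50 ms depends on where the output goes.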

Also, you can overlap the output processing with more CUDA processing, or “pipeline” the processing - while the main CPU outputs problem piece no. 1, CUDA starts processing problem piece no. 2, so more work gets done. Edit: What I mean is, if you have many pieces to process, start a separate CPU thread to manage the output, if you can.
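A rough sketch of that pipelining idea, in plain Python so it runs without a GPU. `process_piece` is a hypothetical placeholder for the CUDA work on one piece, and the `sink` list stands in for the file or console. In CPython the two stages only truly overlap when the compute step releases the interpreter lock (for example while waiting on a device), so treat this as the structure only, not a measurement.

```python
import queue
import threading

def process_piece(piece):
    # Hypothetical stand-in for launching CUDA work on one piece.
    return [x * x for x in piece]

def writer(results, sink):
    # Runs on a separate CPU thread and drains results until it sees None.
    while True:
        item = results.get()
        if item is None:
            break
        sink.append(" ".join(map(str, item)))

def run_pipeline(pieces):
    results = queue.Queue(maxsize=2)  # small buffer between the two stages
    sink = []                         # stands in for a file or the console
    t = threading.Thread(target=writer, args=(results, sink))
    t.start()
    for piece in pieces:
        # The writer can handle earlier pieces while this one is computed.
        results.put(process_piece(piece))
    results.put(None)                 # sentinel: no more work
    t.join()
    return sink

lines = run_pipeline([[1, 2], [3, 4], [5, 6]])
print(lines)
```

The bounded queue keeps the compute stage from running far ahead of the writer, which matters if the results are large.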

Ailleur · January 31, 2009, 8:26pm

You're accelerating the calculations, not the act of writing to a file. When I present results, I show what has actually been accelerated and what that acceleration factor is.

It all depends on what is asked of you. If you need to accelerate the whole piece of code, including writing to a file, by 30x, then you will have to accelerate the computing portion, which you have done, and then work on what is most likely taking the most time: writing to the file. For the latter, CUDA can do nothing for you.
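To put numbers on this, here is a small worked sketch using the timings from the first post. The `speedup` helper is just an illustrative name. It reports both figures and the ceiling that the 50 ms of output puts on the end-to-end speedup even if the computation took no time at all (the usual Amdahl's law argument).

```python
def speedup(baseline_ms: float, accelerated_ms: float) -> float:
    """Speedup factor = baseline time / accelerated time."""
    return baseline_ms / accelerated_ms

# Timings quoted in the original post (milliseconds).
gpu_ms = 1.16     # CUDA run, including allocation and GPU-to-CPU copy
cpu_ms = 30.01    # CPU-only run
output_ms = 50.0  # printing the result, the same for both versions

compute_speedup = speedup(cpu_ms, gpu_ms)
end_to_end_speedup = speedup(cpu_ms + output_ms, gpu_ms + output_ms)

# Upper bound on the end-to-end figure if the compute part took zero time.
max_end_to_end = speedup(cpu_ms + output_ms, output_ms)

print(f"compute only: {compute_speedup:.1f}x")
print(f"end to end:   {end_to_end_speedup:.2f}x")
print(f"upper bound:  {max_end_to_end:.2f}x")
```

With these inputs the compute-only ratio comes out near 26x and the end-to-end ratio near 1.56x, with a ceiling of about 1.6x while the output step stays at 50 ms. Whichever figure gets reported, it should say which parts of the run it covers.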
