You often see people claiming something like a 300x speedup on some ports, but… that's not always entirely honest. The CPU implementation is usually single-threaded and fairly unoptimized, because the implementor's focus was on the GPU rather than on producing an optimized CPU version.
If you want straight numbers to compare, you can always use the peak GFLOP/s figures, but it's important to note that the peak GFLOP/s for GPUs usually assume a MAD (multiply-and-add) instruction every cycle. Realistically, it might be half that if you only have a multiply or an add to do per cycle, or even worse once you account for transfer latencies and other overheads. On the other hand, thanks to caching and branch prediction, the CPU is more likely to actually run near its rated GFLOP/s, at least on a single core. I'm not sure whether SSE is factored into the GFLOP/s rating for CPUs, but it's something to check if you want a truly comparable benchmark.
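To make that concrete, here is a minimal sketch of where those peak numbers come from (roughly cores × clock × FLOPs per cycle). The core counts and clocks below are made-up placeholders for illustration, not the specs of any particular card or CPU:

```c
/* Sketch of how "peak GFLOP/s" figures are usually derived.
 * All device specs below are illustrative placeholders. */
#include <stdio.h>

int main(void)
{
    /* GPU: assume every core issues one MAD (2 FLOPs) per cycle. */
    double gpu_cores     = 448;    /* hypothetical core count     */
    double gpu_clock_ghz = 1.15;   /* hypothetical shader clock   */
    double gpu_peak = gpu_cores * gpu_clock_ghz * 2.0;

    /* CPU: peak usually assumes full-width SSE (4 single-precision
     * lanes) with an add and a multiply retired every cycle. */
    double cpu_cores     = 4;
    double cpu_clock_ghz = 3.0;
    double cpu_peak = cpu_cores * cpu_clock_ghz * 4.0 * 2.0;

    printf("GPU peak: %6.1f GFLOP/s (if every cycle is a MAD)\n", gpu_peak);
    printf("CPU peak: %6.1f GFLOP/s (if SSE is fully used)\n", cpu_peak);
    printf("Drop the MAD/SSE assumptions and both numbers shrink fast.\n");
    return 0;
}
```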
In short, it's really hard to compare one to the other, even for embarrassingly parallel algorithms. The best indicator is to try it and see, as in the sketch below.
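A minimal "try and see" comparison might look like the following. SAXPY is just a placeholder workload, the GPU timing deliberately includes the PCIe transfers, and the single-threaded CPU loop is exactly the kind of naive baseline criticized above, so take whatever ratio comes out with the same grain of salt:

```cuda
/* Time the same SAXPY on the CPU and on the GPU, counting the
 * host<->device copies against the GPU. Illustrative only. */
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <cuda_runtime.h>

__global__ void saxpy_gpu(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

static void saxpy_cpu(int n, float a, const float *x, float *y)
{
    for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 24;                      /* 16M elements */
    size_t bytes = n * sizeof(float);
    float *x = (float*)malloc(bytes), *y = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    /* CPU timing (single-threaded, so it understates a good CPU port). */
    clock_t t0 = clock();
    saxpy_cpu(n, 2.0f, x, y);
    double cpu_ms = 1000.0 * (clock() - t0) / CLOCKS_PER_SEC;

    /* GPU timing, including the transfers both ways. */
    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    cudaMemcpy(dx, x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y, bytes, cudaMemcpyHostToDevice);
    saxpy_gpu<<<(n + 255) / 256, 256>>>(n, 2.0f, dx, dy);
    cudaMemcpy(y, dy, bytes, cudaMemcpyDeviceToHost);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float gpu_ms = 0.0f;
    cudaEventElapsedTime(&gpu_ms, start, stop);

    printf("CPU: %.1f ms   GPU (incl. transfers): %.1f ms\n", cpu_ms, gpu_ms);

    cudaFree(dx); cudaFree(dy); free(x); free(y);
    return 0;
}
```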
I am one of those guilty of focusing more on the GPU side of things than on properly optimizing my 'gold' comparison methods :P So SSE is something I was unsure about. Forgive my ignorance about CPU specs.
One of the latest issues of the German c't magazine featured a pretty critical review and benchmark of the Tesla C2050's double-precision performance, comparing it against a dual-Xeon setup.
The Tesla outperformed the Xeons substantially only for very carefully chosen problem (matrix) sizes; see the sketch below.
For single precision, they found you're better off with a GTX 480 ;)
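If you want to see that size sensitivity yourself, a minimal sketch like the following sweeps DGEMM sizes on the GPU and prints achieved GFLOP/s. It assumes cuBLAS is available and only measures the GPU side; for a fair comparison you would run the same sizes through an optimized, multi-threaded CPU BLAS (e.g. MKL), not a naive triple loop:

```cuda
/* Sweep DGEMM sizes on the GPU and report achieved GFLOP/s.
 * Compare against an optimized multi-threaded CPU BLAS over the
 * same sizes to get the other column of the table. */
#include <cstdio>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main(void)
{
    cublasHandle_t handle;
    cublasCreate(&handle);

    const int sizes[] = {256, 512, 1000, 1024, 2048, 4000, 4096};
    const double alpha = 1.0, beta = 0.0;

    for (int s = 0; s < (int)(sizeof(sizes) / sizeof(sizes[0])); ++s) {
        int n = sizes[s];
        size_t bytes = (size_t)n * n * sizeof(double);

        double *dA, *dB, *dC;
        cudaMalloc(&dA, bytes);
        cudaMalloc(&dB, bytes);
        cudaMalloc(&dC, bytes);
        cudaMemset(dA, 0, bytes);   /* contents don't matter for timing */
        cudaMemset(dB, 0, bytes);

        /* Warm-up call so the timed run isn't paying init costs. */
        cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                    &alpha, dA, n, dB, n, &beta, dC, n);
        cudaDeviceSynchronize();

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);
        cudaEventRecord(start);
        cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                    &alpha, dA, n, dB, n, &beta, dC, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        double gflops = 2.0 * n * n * (double)n / (ms * 1e6);
        printf("n = %5d : %7.2f ms  %8.1f GFLOP/s\n", n, ms, gflops);

        cudaEventDestroy(start); cudaEventDestroy(stop);
        cudaFree(dA); cudaFree(dB); cudaFree(dC);
    }

    cublasDestroy(handle);
    return 0;
}
```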
The article is in German and not freely available. I don't think the results come as a huge surprise to anyone familiar with the matter. But maybe some more public debunking of the 100X GPU vs. CPU myth is still needed.