Performance/equivalence between the CPU and GPU?

Hello

I would like to ask, if it’s possible, for a comparison between:

The CPU: an Intel® Core™2 Quad Q6600 @ 2.4 GHz

The GPU: a GeForce GTS 250 with 128 cores

So can we say that the GPU is equivalent to, for example, 8 of these CPUs, or something like that?

Of course, we are talking about a comparison for heavy computation, not for small data!

You could benchmark it yourself.

You often see people claiming something like a 300x speedup on some ports, but… that’s not always totally honest. The CPU implementation is usually single-threaded and fairly unoptimized, because the implementer’s focus is on the GPU rather than on producing an optimized CPU version.

If you want straight numbers to compare, you can always use the peak GFLOPS figures, but in that case it’s important to note that the peak GFLOPS quoted for GPUs usually assumes a MAD (multiply-add) instruction every cycle. Realistically, it might be half that if you only have a multiply or an add to do each cycle, or even worse once you factor in transfer latencies and other overheads. On the other hand, thanks to caching and branch prediction, the CPU is more likely to run at its rated GFLOPS, at least on a single core. I am unsure whether SSE is worked into the GFLOPS rating for CPUs, but it is something to look into if you want a truly comparable benchmark.
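
To make that concrete, here is a back-of-the-envelope sketch (in Python, just for the arithmetic) of what the "paper" peak single-precision numbers look like for these two parts. The FLOPs-per-cycle figures and the GTS 250 shader clock below are my assumptions, not vendor-verified specs, so treat the result as a rough indicator only.

```python
# Rough peak single-precision GFLOP/s estimates -- assumptions, not datasheet values:
#   Q6600:   4 cores x 2.4 GHz x 8 SP FLOPs/cycle (one SSE add + one SSE mul per cycle)
#   GTS 250: 128 CUDA cores x ~1.836 GHz shader clock x 2 FLOPs/cycle (one MAD per cycle)

def peak_gflops(cores, clock_ghz, flops_per_cycle):
    return cores * clock_ghz * flops_per_cycle

cpu = peak_gflops(cores=4,   clock_ghz=2.4,   flops_per_cycle=8)   # ~77 GFLOP/s
gpu = peak_gflops(cores=128, clock_ghz=1.836, flops_per_cycle=2)   # ~470 GFLOP/s

print(f"Q6600 peak:   ~{cpu:.0f} GFLOP/s")
print(f"GTS 250 peak: ~{gpu:.0f} GFLOP/s")
print(f"paper ratio:  ~{gpu / cpu:.1f}x")
```

As argued above and below, the real ratio on an actual workload can land far below that paper number.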

In short, it’s really hard to compare one to the other, even when you assume embarrassingly parallel algorithms. The best indicator is to try it and see.

SSE is included in CPU FLOP estimates (why would they leave it out?).

I am one of those guilty of focusing more on the GPU side of things rather than fully optimizing my ‘gold’ comparison implementations :P So SSE is something I was unsure about. Forgive my ignorance of CPU specs.

One of the latest issues of the German c’t magazine ran a pretty critical review and benchmark of the Tesla C2050’s double-precision performance, comparing it against a dual-Xeon setup.

The Tesla outperformed the Xeons substantially only for very carefully chosen problem (matrix) sizes.
For single precision, they found you’re better off with a GTX 480 ;)
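
For anyone who wants to reproduce that kind of sweep themselves, here is a minimal CPU-side sketch in Python/NumPy that times double-precision GEMM across several matrix sizes (the sizes are illustrative only). A GPU counterpart would time the equivalent cuBLAS calls at the same sizes, including host-to-device transfers.

```python
# Time double-precision matrix multiplication (DGEMM) at several sizes
# and report the achieved GFLOP/s. The size list is illustrative only.
import time
import numpy as np

for n in (256, 512, 1024, 2048, 4096):
    a = np.random.rand(n, n)          # float64 operands
    b = np.random.rand(n, n)
    t0 = time.perf_counter()
    c = a @ b                         # DGEMM via whatever BLAS NumPy links against
    dt = time.perf_counter() - t0
    gflops = 2.0 * n**3 / dt / 1e9    # GEMM performs roughly 2*n^3 flops
    print(f"n={n:5d}  {dt * 1e3:8.1f} ms  {gflops:6.1f} GFLOP/s")
```

Comparing such curves for the CPU and the GPU across sizes is exactly where the "carefully chosen problem sizes" effect shows up.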

Could you give a link to this review?

The article is in German and not freely available. I don’t think the results come as a huge surprise to anyone familiar with the matter, but maybe some more public debunking of the 100x GPU vs. CPU myth is still needed.
