Double precision performance

hakan1 · May 21, 2011, 2:32pm

Hi,
I’m just about to get started with CUDA and am deciding which graphics card to buy. My first thought was the GeForce range (lots of cores for cheap!), but then I found that double precision is throttled in this series, and in some other models as well. So, before I make my decision, I just want to make sure I have this right. AFAIK this is the DP performance as a fraction of SP performance:

Tesla series: 1/2 (full performance)
Quadro 4000-6000: 1/2 (full performance)
Quadro 600-2000: 1/12
GTX 5xx: 1/8

It has been remarkably hard to find these figures (or I’ve been looking in the wrong places…)!
If anyone could confirm these figures it would be greatly appreciated!

Thanks!

tera · May 21, 2011, 4:46pm

Unless you know you are limited by the double precision performance of a single GPU, I’d recommend buying a consumer card first to familiarize yourself with CUDA and to find out your specific needs. There are remarkably few problems that are actually limited by double precision throughput, because double precision data also needs twice the memory bandwidth, so many kernels hit the bandwidth limit before the arithmetic limit. And even then, the double precision performance per $ is still better for the consumer cards.
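To make the bandwidth argument concrete, here is a rough roofline-style estimate in Python. It is only a sketch: the peak and bandwidth numbers are approximate published figures for a GTX 580-class card and are my assumption, not measurements, and `attainable_gflops` is a hypothetical helper name. It estimates the double precision rate of a DAXPY-like kernel (`y = a*x + y`) as the smaller of the compute ceiling and bandwidth times arithmetic intensity.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline estimate: the lower of the compute ceiling and the bandwidth ceiling."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)

# Assumed, approximate figures for a GTX 580-class card (not measured).
SP_PEAK_GFLOPS = 1581.0
DP_PEAK_GFLOPS = SP_PEAK_GFLOPS / 8   # the 1/8 DP:SP ratio listed in the first post
BANDWIDTH_GB_S = 192.0

# DAXPY per element: 2 flops (multiply + add) and 3 doubles moved
# (read x, read y, write y), so 2 flops per 24 bytes.
daxpy_intensity = 2 / (3 * 8)

daxpy_gflops = attainable_gflops(DP_PEAK_GFLOPS, BANDWIDTH_GB_S, daxpy_intensity)

# Intensity (flops per byte) above which the throttled DP peak becomes the limit.
ridge_point = DP_PEAK_GFLOPS / BANDWIDTH_GB_S

print(f"DAXPY estimate: {daxpy_gflops:.1f} GFLOP/s of {DP_PEAK_GFLOPS:.1f} peak")
print(f"DP compute-bound only above ~{ridge_point:.2f} flops/byte")
```

Under these assumptions the kernel is limited by memory bandwidth at a small fraction of even the throttled double precision peak, so the DP ratio would not matter for it.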

hakan1 · May 21, 2011, 8:43pm

Ultimately I will use CUDA for scientific computing, so at some point double precision will be needed. But for my purposes right now, I’m only interested in learning CUDA, so I’m definitely leaning towards a consumer card. I guess for single precision performance, the GTX 580 will be similar to the Quadro 6000?

tera · May 22, 2011, 12:23am

Actually the GTX 580 achieves about 50% more single precision GFLOP/s and 33% higher bandwidth than the Quadro 6000. It has only a quarter of the memory though. It’s definitely more than enough for learning CUDA.

hakan1 · May 22, 2011, 9:23am

Yeah, the memory won’t be a problem. Is memory bandwidth throttled as well for DP on the GTX series?

tera · May 22, 2011, 10:35am

No, you get the full memory bandwidth (33% more than Quadro 6000). So memory bandwidth limited double precision calculations are actually faster on the GTX 580.

I don’t think it would even be technically possible to throttle memory bandwidth depending on whether the data is single or double precision, as the memory controller has no information about what the data is used for.
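As a back-of-the-envelope check, the time for a purely bandwidth-bound kernel can be estimated from the bytes it moves. The Python sketch below assumes approximate peak bandwidths of 192 GB/s (GTX 580) and 144 GB/s (Quadro 6000); these are spec-sheet figures I am assuming, real kernels reach somewhat less, and `stream_time_ms` is a hypothetical helper.

```python
def stream_time_ms(n_elements, bytes_per_element, arrays_touched, bandwidth_gb_s):
    """Lower-bound time (ms) to move the data once at the given bandwidth."""
    total_bytes = n_elements * bytes_per_element * arrays_touched
    return total_bytes / (bandwidth_gb_s * 1e9) * 1e3

N = 50_000_000            # elements per array (arbitrary example size)
GTX580_BW = 192.0         # GB/s, assumed peak
QUADRO6000_BW = 144.0     # GB/s, assumed peak

# A DAXPY-like kernel touches 3 arrays: read x, read y, write y.
gtx_dp = stream_time_ms(N, 8, 3, GTX580_BW)         # doubles on GTX 580
quadro_dp = stream_time_ms(N, 8, 3, QUADRO6000_BW)  # doubles on Quadro 6000
gtx_sp = stream_time_ms(N, 4, 3, GTX580_BW)         # floats on GTX 580

print(f"GTX 580, double:     {gtx_dp:.2f} ms")
print(f"Quadro 6000, double: {quadro_dp:.2f} ms")
print(f"GTX 580, float:      {gtx_sp:.2f} ms")
```

With these assumed numbers the double precision run takes twice as long as the float run on the same card simply because it moves twice the bytes, and the Quadro 6000 takes about a third longer than the GTX 580, whatever the DP arithmetic ratio is.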
