Double precision GFlops of Kepler

mjg1 · April 2, 2012, 7:40am

Hi everyone,

Does anyone know the double-precision performance, in GFLOP/s, of the GTX680 and GT640M?

Many thanks!

tera · April 2, 2012, 10:07am

Table 5-1 of the Programming Guide shows it is 8/192 (or 1/24) of the single precision throughput, so 128 GFLOP/s for the GTX680 and “up to” 20 GFLOP/s for the GT640M.
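As a quick check of that arithmetic, here is a minimal sketch. It assumes the 3090 GFLOP/s single-precision figure for GK104 that is quoted from the GTX680 whitepaper later in this thread; the function and constant names are made up for illustration.

```python
# Ratio from Table 5-1 as cited above: 8 double-precision results per clock
# per multiprocessor versus 192 single-precision ones.
DP_PER_SP = 8 / 192  # = 1/24


def dp_peak_gflops(sp_peak_gflops, dp_per_sp=DP_PER_SP):
    """Theoretical double-precision peak derived from a single-precision peak."""
    return sp_peak_gflops * dp_per_sp


# Assumption: 3090 GFLOP/s single precision (whitepaper figure for GK104).
gtx680_dp = dp_peak_gflops(3090)
print(f"GTX680: about {gtx680_dp:.0f} GFLOP/s double precision")
```

This is a theoretical peak only; it says nothing about what a real kernel achieves.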

pasoleatis · April 2, 2012, 11:20am

Do you know what the theoretical double-precision throughput per watt is, compared to the 470 or 480 series?

tera · April 2, 2012, 11:35am

Maybe the documented double precision throughput divided by the documented power consumption? Shouldn’t be too difficult to look up these values.

pasoleatis · April 2, 2012, 5:48pm

0.88 GFLops/W for 580
0.66 GFLops/W for 680
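The 680 figure can be reproduced roughly as follows. This is only a sketch: the 195 W board power is an assumption taken from the public spec sheet rather than from this thread, and the result shifts with whichever power and throughput figures you plug in.

```python
def gflops_per_watt(gflops, watts):
    """Theoretical throughput divided by rated board power."""
    return gflops / watts


# 3090 GFLOP/s single precision scaled by the 8/192 ratio from Table 5-1.
GTX680_DP_GFLOPS = 3090 * 8 / 192
# Assumption: rated board power of the GTX680 in watts.
GTX680_TDP_WATTS = 195

ratio = gflops_per_watt(GTX680_DP_GFLOPS, GTX680_TDP_WATTS)
print(f"GTX680: {ratio:.2f} GFLOP/s per watt (double precision)")
```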

pszilard · April 3, 2012, 1:20am

Yeah, it’s quite crippled. What I find strange, though, is that the throughput of integer operations has also been cut back: 32-bit integer add is only 87.5%, shift/compare is 1/12th (!), and mul/mad is 1/24th (!!!). This will hurt kernels that are heavy on integer arithmetic.

I guess I’ll start replacing integer multiplications by 2 or 3 with additions…
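The identity behind that trick is sketched below in Python for brevity, with a mask emulating 32-bit unsigned wraparound so the results match what a 32-bit integer multiply would give. The helper names are hypothetical, and whether the additions are actually faster on a given GPU (or whether the compiler already performs this rewrite) would need to be measured.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wraparound


def mul2_by_add(x):
    """x * 2 computed with one addition."""
    return (x + x) & MASK32


def mul3_by_add(x):
    """x * 3 computed with two additions."""
    doubled = (x + x) & MASK32
    return (doubled + x) & MASK32


# Spot-check against an ordinary multiply, including values that overflow.
for x in (0, 1, 7, 0x7FFFFFFF, 0xFFFFFFFF):
    assert mul2_by_add(x) == (2 * x) & MASK32
    assert mul3_by_add(x) == (3 * x) & MASK32
```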

rikm · April 3, 2012, 1:21am

I’m wondering where the 8/192 came from. Which version of the Programming Guide is that? All I can find is the 4.1 version of the C Programming Guide, and I don’t see that ratio in that table. Aside from that, GeForce-GTX-680-Whitepaper-FINAL.pdf lists these GFLOPs on page 6: GT200 (Tesla) 1063, GF110 (Fermi) 1581, GK104 (Kepler) 3090, which are wildly higher than what you calculated. Granted it’s a whitepaper, but it should be in the right ballpark.

Using the same chart and to answer another question above, I calculate a GFLOP/Watt value of 6.48 for Fermi and 22.3 for Kepler.

avidday · April 3, 2012, 7:31pm

I guess you missed the part about double precision GFLops…

hamster143 · April 5, 2012, 10:10pm

Does anyone have any idea if either the DP or the integer performance will get any better in future? As things stand now, GTX680 is 1.5x slower than GTX580 on DP ops or integer multiplications, and 6x slower on integer shift/compares.

Oh, and has that ‘8’ in the table 5-1 been verified to apply to GTX680? It shows 16 double-precision ops per multiprocessor in compute capability 2.0, and, as I recall, the 16 was only applicable to Tesla/Quadro, and regular gaming cards were crippled and they only did 4 double-precision ops per multiprocessor.

DrAnderson42 · April 6, 2012, 4:22pm

NVIDIA has a long history of not discussing unannounced products, so you aren’t going to get an answer here. Your guess is as good as anyone else’s.

Sure, but only if 100% of your instructions are any of these. Real world apps have a mix of instructions. I wouldn’t overanalyze and claim that the 680 is bad for compute until we have a number of real-world CUDA application benchmarks (which we are sorely lacking…). I don’t have a 680 yet, or I would be posting such benchmarks.
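To put a rough number on the instruction-mix argument, here is a simple Amdahl-style sketch. The mix below is purely hypothetical (not a measurement), and the model assumes instruction classes do not overlap in time, which real GPUs do not guarantee.

```python
def relative_runtime(mix):
    """Estimate new runtime / old runtime for a kernel.

    mix: list of (fraction_of_old_runtime, slowdown_factor) pairs,
    where the fractions must sum to 1.
    """
    total = sum(fraction for fraction, _ in mix)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("runtime fractions must sum to 1")
    return sum(fraction * slowdown for fraction, slowdown in mix)


# Hypothetical kernel: 10% of its time in shift/compare (6x slower, the
# figure quoted in this thread), 90% in everything else (assumed unchanged).
mixed = relative_runtime([(0.10, 6.0), (0.90, 1.0)])
# Worst case: every instruction is a shift/compare.
worst = relative_runtime([(1.0, 6.0)])
print(f"mixed kernel: {mixed:.2f}x, all shift/compare: {worst:.2f}x")
```

The full 6x penalty only appears in the all-shift/compare case; a kernel with a modest share of those instructions sees a much smaller slowdown under this model.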

dcbarton · April 7, 2012, 8:52pm

I’m doing a lot of testing now and it’s very, very bad. Too bad. Much worse than it needs to be. I’m using sm_30 in their latest toolkit. I think they’re trying to push all compute folks onto extremely expensive and profitable cards. If true, AMD, here I come!
