GTX285 vs C1060 vs GTX480 GFLOP/s ?

madmaze · June 24, 2010, 11:43pm

So I've been working with a GTX285 and a C1060 for about a year.
For some reason I assumed that the GTX285 had 933 GFLOPS and the C1060 the same, just with a slower memory clock.
I guess I never questioned it.

Today I was trying to work out the theoretical performance for a friend and I could not make sense of anything.

So my GTX285 is 240 cores at 1.48 GHz with 3 operations/cycle. That would be 1065.6 GFLOPS, is that right?
Then the C1060 is 240 cores at 1.3 GHz with 3 operations/cycle. That would be 936 GFLOPS, is that right?

So I concluded that those results are close to 933 GFLOPS… so I figured I'd double-check with the GTX480, and this is where I got completely lost.

The GTX480 has 480 cores at 1.401 GHz with 3 operations/cycle. That would be 2017.44 GFLOPS… but when I looked at the specs I could only find something about 1350 GFLOPS…

What am I missing, or where is my reasoning wrong?

seibert · June 25, 2010, 1:09am

FLOP counting is a little confusing because of the dual-issue capabilities. All CUDA cores can complete one instruction (at least the basics) per clock cycle. This includes a single precision floating point multiply-add, which counts as two operations. In addition, all of the compute capability 1.x devices had the ability, in principle, to dual-issue a multiply instruction that was executed by another part of the multiprocessor. That’s where the third operation comes from in the peak GFLOPS estimate.

In the original CUDA GPUs, there was a problem and the dual issue often did not happen even when there was a multiply instruction available. Later GPUs fixed this (not sure if it was G92 or GT200), but the dual issue multiply was still of limited use, except as a gimmick to inflate the peak GFLOPS numbers. In Fermi, it seems that they have removed it.

So to reproduce the NVIDIA calculation of peak GFLOPS, you multiply shader clock * CUDA cores * 3 for pre-Fermi and shader clock * cores * 2 for Fermi. Personally, I would not get too caught up in the difference. I mostly compare GPUs looking just at clock * cores (i.e., instruction throughput) and memory bandwidth.
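
As a rough sketch of that arithmetic, here is a small Python snippet that applies the two multipliers to the three cards in this thread. The helper name `peak_gflops` and the constants are made up for illustration, and the core counts and clocks are simply the figures quoted in the question above, not values checked against a datasheet.

```python
# Peak single-precision GFLOPS = cores * shader clock (GHz) * FLOPs per core per cycle.
# Core counts and clocks are the figures quoted in this thread (assumed, not verified).

def peak_gflops(cores, shader_clock_ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS (hypothetical helper, not an NVIDIA API)."""
    return cores * shader_clock_ghz * flops_per_cycle

PRE_FERMI = 3  # multiply-add (2 FLOPs) plus the dual-issued multiply (1 FLOP)
FERMI = 2      # multiply-add only

cards = [
    ("GTX285", 240, 1.48, PRE_FERMI),
    ("C1060", 240, 1.30, PRE_FERMI),
    ("GTX480", 480, 1.401, FERMI),
]

for name, cores, clock, flops in cards:
    print(f"{name}: {peak_gflops(cores, clock, flops):.1f} GFLOPS")
```

With the factor of 2, the GTX480 comes out at roughly 1345 GFLOPS, which lines up with the ~1350 GFLOPS figure from the specs rather than the 2017 you get with a factor of 3.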
