Confusion about GFlops of c1060/c2050

Hardy616 · November 27, 2010, 10:46pm

Hello guys,

I found two peak GFLOPS figures published on the Nvidia Tesla Wikipedia page. Take the C2050, for example: one figure is 1288 GFLOPS and the other is 1030. What exactly is the difference? My guess is that 1030 GFLOPS is something like the raw rate and 1288 is derived from it. Any ideas?

Thanks,
Hardy

seibert · November 28, 2010, 2:42pm

The GPUs have multiple functional units that can be active at the same time. If you look at the column titles, the higher figure is labeled “MUL+ADD+SF”, by which I believe they mean the multiply-add (MAD) instruction dual-issued with a special function instruction (__expf(), __cosf(), etc.). The second column says “MUL+ADD”, so the special function contribution has been removed.

Of course, reaching either of these peak GFLOPS figures requires that the only instructions available for execution are MADs (or MADs paired with special function instructions), with no other bottlenecks (such as waiting on global memory reads).
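
To make the arithmetic concrete, here is a rough sketch of how the two C2050 figures could be reproduced. The core count (448) and shader clock (1.15 GHz) are the commonly quoted C2050 specifications and are assumptions here, not numbers from this thread; `peak_gflops` is just a helper name made up for illustration. A MAD counts as 2 floating-point operations per core per cycle; the 2.5 implied by the higher figure is simply back-computed from 1288 and is not a confirmed description of how the special function units are counted.

```python
# Rough peak-GFLOPS arithmetic for the Tesla C2050.
# ASSUMPTIONS (not stated in the thread): 448 CUDA cores, 1.15 GHz shader clock.

def peak_gflops(cores, shader_clock_ghz, flops_per_core_per_cycle):
    """Theoretical peak = cores x clock (GHz) x flops issued per core per cycle."""
    return cores * shader_clock_ghz * flops_per_core_per_cycle

C2050_CORES = 448          # assumed core count
C2050_CLOCK_GHZ = 1.15     # assumed shader clock in GHz

# "MUL+ADD" column: one multiply-add per core per cycle = 2 flops.
mad_only = peak_gflops(C2050_CORES, C2050_CLOCK_GHZ, 2)

# "MUL+ADD+SF" column: back out the flops per core per cycle implied by 1288.
implied_rate = 1288 / (C2050_CORES * C2050_CLOCK_GHZ)
mad_plus_sf = peak_gflops(C2050_CORES, C2050_CLOCK_GHZ, implied_rate)

print(f"MAD only:      {mad_only:.1f} GFLOPS")
print(f"Implied rate:  {implied_rate:.2f} flops/core/cycle")
print(f"MAD + SF:      {mad_plus_sf:.1f} GFLOPS")
```

Under these assumptions the MAD-only number comes out at about 1030 GFLOPS, which matches the lower column, and the higher column corresponds to roughly 25% more operations per cycle on top of that.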

Hardy616 · November 29, 2010, 3:40am

That makes sense. Thanks a lot!