Will double precision exceed 500 GFLOPS?

I hope someone can help me verify the statement below, which I found on the internet.
“NVIDIA would improve GPU’s DP performance to 1/2 current SP performance in the next GPU architecture.”


Where did you read that?

It is actually quite easy to do (conceptually, that is; I don’t know about practically). In the current generation, NVIDIA has put 1 DP unit in each multiprocessor. If they add 3 more, each multiprocessor would have 4 DP units alongside its 8 SP units, which gives a DP:SP ratio of 1:2.
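To make the arithmetic concrete, here is a minimal sketch of the unit counting. It assumes 8 SP units per multiprocessor, which matches the 8:1 SP/DP ratio quoted elsewhere in this thread; `dp_to_sp_ratio` and `SP_UNITS` are names made up for illustration. It only counts units and ignores clocks, issue rates and anything else that affects real throughput.

```python
from fractions import Fraction

# Assumption: 8 SP units per multiprocessor (the 8:1 SP/DP ratio from this thread).
SP_UNITS = 8


def dp_to_sp_ratio(dp_units: int, sp_units: int = SP_UNITS) -> Fraction:
    """Return the DP:SP unit ratio for one multiprocessor (hypothetical helper)."""
    return Fraction(dp_units, sp_units)


current = dp_to_sp_ratio(1)       # 1 DP unit today
proposed = dp_to_sp_ratio(1 + 3)  # add 3 more DP units

print(current, proposed)  # 1/8 1/2
```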

I think so, thank you.

But it seems there isn’t any official NVIDIA announcement about this plan.

Just brainstorming about DP performance…

While NV COULD just add 3 more DP units and get about 1/2 SP performance, I think this is a bad idea.
Why spend the transistors on DP when most users would rather just have more processors? Sure, it’s all a tradeoff and a fine balance, and there’s no perfect ratio of registers to shared memory to SPs to DP units to texture units, but the current 8:1 SP/DP ratio is probably “about right”.

But just brainstorming some more, perhaps there’s some architectural magic that would allow it, such as getting two SP units to work together to process one DP operation. That would give the “1/2 SP performance” and would in fact allow you to REMOVE the single DP unit.
Is this possible? Who knows. It may be, but the tricky part is making the DP behavior IEEE-754 compliant, like the current DP hardware.
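As a rough illustration of why ganging narrower units together is at least conceivable: the product of two 53-bit double significands can be assembled from narrower partial products. The sketch below is plain integer arithmetic, not a description of any NVIDIA or AMD hardware; `split_multiply` and the 27-bit split are my own illustrative choices. It also does nothing about rounding, exponents or denormals, which is exactly the IEEE-754 part that makes this tricky.

```python
def split_multiply(a: int, b: int, half_bits: int = 27) -> int:
    """Multiply two non-negative integers using four narrower partial products.

    Hypothetical helper: each operand is split into a low part of
    `half_bits` bits and a high part holding the remaining bits.
    """
    mask = (1 << half_bits) - 1
    a_lo, a_hi = a & mask, a >> half_bits
    b_lo, b_hi = b & mask, b >> half_bits

    # Four partial products, each no wider than the split halves.
    lo_lo = a_lo * b_lo
    lo_hi = a_lo * b_hi
    hi_lo = a_hi * b_lo
    hi_hi = a_hi * b_hi

    # Shift each partial product into place and sum.
    return (hi_hi << (2 * half_bits)) + ((lo_hi + hi_lo) << half_bits) + lo_lo


# Two 53-bit values, the width of a double's significand (52 stored bits + 1 implicit).
x = (1 << 53) - 1
y = (1 << 52) + 12345
print(split_multiply(x, y) == x * y)  # True
```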

The AMD RV770 does something like this, but I believe it combines 4 SP units to do 1 DP operation, and the DP results aren’t fully IEEE-754 compliant (no denormals, at least).
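For anyone wondering what “no denormals” costs in practice, here is a small sketch. It models a hypothetical flush-to-zero unit in software; `is_denormal` and `flush_to_zero` are illustrative names, and this is not a claim about exactly how the RV770 behaves. It assumes the host Python uses IEEE-754 doubles with gradual underflow, which is the usual case.

```python
import math
import sys

# Smallest positive *normal* double; anything smaller in magnitude (and nonzero) is denormal.
SMALLEST_NORMAL = sys.float_info.min


def is_denormal(x: float) -> bool:
    """True if x is a nonzero value below the normal range."""
    return x != 0.0 and abs(x) < SMALLEST_NORMAL


def flush_to_zero(x: float) -> float:
    """Model a hypothetical DP unit without denormal support: denormals become signed zero."""
    return math.copysign(0.0, x) if is_denormal(x) else x


tiny = SMALLEST_NORMAL / 2   # representable only as a denormal
print(is_denormal(tiny))     # True
print(flush_to_zero(tiny))   # 0.0
print(flush_to_zero(1.5))    # 1.5
```

With denormals the value `tiny` survives as a small nonzero number; on the flush-to-zero model it silently becomes zero, so a subtraction `a - b` can return zero even though `a != b`.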