GTX 690 SLI vs Tesla C2075: which one is faster for single and double precision floating point?

I am going to buy either a GTX 690 SLI setup or a Tesla C2075 for image processing and computer vision work.
Right now I use CUDA on a GeForce 8800 GTS.

GeForce GTX 690 (New architecture)
From http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-690/specifications ,
One GeForce GTX 690 has 3072 CUDA cores, so GTX 690 SLI would have 6144 CUDA cores.
Base clock: 915 MHz
Memory: 2048 MB GDDR5 per GPU (one GTX 690 has 2 GPUs)
Memory bandwidth: 384 GB/sec
Peak single precision floating point performance: 5,621 GFLOPS (from http://hexus.net/tech/reviews/graphics/38805-nvidia-geforce-gtx-690 )
Thus GTX 690 SLI = 11,242 GFLOPS.

Tesla C2075 (Old architecture)
From http://www.nvidia.com/docs/IO/43395/NV-DS-Tesla-C2075.pdf ,
Tesla C2075 has 448 CUDA cores.
Core clock: 1.15 GHz
Memory: 6 GB GDDR5
Memory bandwidth: 144 GB/sec
Peak single precision floating point performance: 1030 GFLOPS
Peak double precision floating point performance: 515 GFLOPS
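
As a sanity check, the single precision peaks quoted above match the usual back-of-the-envelope formula: CUDA cores x clock x 2, counting one fused multiply-add per core per clock as two floating point operations. The sketch below is only an illustration: the function name `peak_sp_gflops` is made up here, and the factor of 2 is an assumption about how the vendor figures are derived.

```python
def peak_sp_gflops(cuda_cores, clock_mhz, flops_per_core_per_clock=2):
    """Theoretical peak single precision GFLOPS.

    Assumes every core retires one fused multiply-add (2 FLOPs) per clock,
    which is an upper bound that real kernels rarely reach.
    """
    return cuda_cores * clock_mhz * flops_per_core_per_clock / 1000.0


# Core counts and clocks are the ones listed in the specs above.
gtx690 = peak_sp_gflops(3072, 915)    # one GTX 690 card (2 GPUs)
c2075 = peak_sp_gflops(448, 1150)     # one Tesla C2075

print(f"GTX 690:     {gtx690:.0f} GFLOPS")
print(f"GTX 690 SLI: {2 * gtx690:.0f} GFLOPS")
print(f"Tesla C2075: {c2075:.0f} GFLOPS")
```

These are theoretical peaks only; memory bandwidth and how well the work divides across GPUs decide what a real kernel achieves.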

According to the programming guide (and I don’t recall this being refuted), the GTX 690’s double precision performance is 1/24th of its single precision performance. That works out to about 468 GFLOPS for a pair of GTX 690s, versus 515 GFLOPS for the C2075.
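
The double precision comparison can be written out as follows. This is a rough sketch: the 1/24 ratio is the figure quoted from the programming guide above, the peak numbers are the ones from the original post, and `dp_from_sp` is just an illustrative helper name.

```python
def dp_from_sp(sp_gflops, dp_to_sp_ratio):
    """Estimate peak double precision GFLOPS from the single precision peak."""
    return sp_gflops * dp_to_sp_ratio


GTX690_SP = 5621   # GFLOPS per GTX 690 card, from the original post
C2075_DP = 515     # GFLOPS, from the Tesla C2075 data sheet figure above

# Assumption: double precision runs at 1/24 of the single precision rate.
gtx690_pair_dp = dp_from_sp(2 * GTX690_SP, 1 / 24)

print(f"2x GTX 690 double precision: {gtx690_pair_dp:.0f} GFLOPS")
print(f"Tesla C2075 double precision: {C2075_DP} GFLOPS")
print("C2075 ahead in double precision:", C2075_DP > gtx690_pair_dp)
```

Keep in mind that CUDA exposes each GPU as a separate device, so reaching the combined figure for two cards means splitting the work across four GPUs yourself.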

In almost any practical situation, though, I’d expect a pair of 690s to be faster. The C2075 does have its own advantages, such as error-correcting (ECC) memory.

Definitely the GTX 690. I’ve done image processing with CUDA for almost 4 years and I’ve never needed double precision.

Thank you everybody for the answers.
So I should buy the GTX 690 SLI.