I plan to buy either two GeForce GTX 690 cards in SLI or a Tesla C2075 for image processing and computer vision work.
I currently use CUDA on a GeForce 8800 GTS.
GeForce GTX 690 (New architecture)
From http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-690/specifications ,
One GeForce GTX 690 has 3072 CUDA cores, so two GTX 690 cards in SLI would have 6144 CUDA cores.
Base clock: 915 MHz
Memory: 2048 MB GDDR5 per GPU (one GTX 690 has 2 GPUs)
Memory bandwidth: 384 GB/s
Peak performance: 5,621 GFLOPS (from http://hexus.net/tech/reviews/graphics/38805-nvidia-geforce-gtx-690 )
Thus two GTX 690 cards in SLI = 11,242 GFLOPS.
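The GFLOPS figures above can be sanity-checked from the core count and clock. This is only a rough sketch: it assumes each CUDA core retires one fused multiply-add (counted as 2 FLOPs) per cycle at the base clock, and that work splits perfectly across all GPUs. The function name `peak_gflops` is made up for illustration.

```python
# Theoretical peak = CUDA cores x clock x FLOPs per core per cycle.
# Assumption (not from the spec pages): one fused multiply-add,
# i.e. 2 FLOPs, per core per cycle at the base clock.
FLOPS_PER_CORE_PER_CYCLE = 2


def peak_gflops(cuda_cores, clock_mhz):
    """Return the theoretical single-precision peak in GFLOPS."""
    return cuda_cores * clock_mhz * FLOPS_PER_CORE_PER_CYCLE / 1000.0


one_card = peak_gflops(3072, 915)   # one GTX 690 (both of its GPUs)
two_cards = peak_gflops(6144, 915)  # two GTX 690 cards

print(f"One GTX 690:  {one_card:.2f} GFLOPS")
print(f"Two GTX 690s: {two_cards:.2f} GFLOPS")
```

The result for one card lands within rounding of the quoted 5,621 GFLOPS, which suggests the quoted number is a theoretical peak of this kind rather than a measured throughput.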
Tesla C2075 (Old architecture)
From http://www.nvidia.com/docs/IO/43395/NV-DS-Tesla-C2075.pdf ,
Tesla C2075 has 448 CUDA cores.
Core clock: 1.15 GHz
Memory: 6 GB GDDR5
Memory bandwidth: 144 GB/s
Peak single-precision floating-point performance: 1030 GFLOPS
Peak double-precision floating-point performance: 515 GFLOPS
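To put the two options on the same footing, the Tesla numbers can be reproduced from its core count and clock and compared with the quoted GTX 690 figure. Again a sketch under assumptions: 2 FLOPs per core per cycle for single precision, and double precision at half the single-precision rate (inferred only from the 1030 vs 515 figures above). The helper name is hypothetical.

```python
# Figures are copied from the post; the per-cycle FLOP count is an assumption.
def peak_gflops(cuda_cores, clock_ghz, flops_per_cycle=2):
    """Theoretical peak in GFLOPS for a given core count and clock in GHz."""
    return cuda_cores * clock_ghz * flops_per_cycle


tesla_sp = peak_gflops(448, 1.15)  # single precision
tesla_dp = tesla_sp / 2            # assumed half rate, matching 515 vs 1030

gtx690_sp_quoted = 5621            # GFLOPS for one GTX 690, as quoted above
sp_ratio = gtx690_sp_quoted / tesla_sp

print(f"Tesla C2075 SP peak: {tesla_sp:.1f} GFLOPS")
print(f"Tesla C2075 DP peak: {tesla_dp:.1f} GFLOPS")
print(f"One GTX 690 vs Tesla C2075 (SP peak): {sp_ratio:.2f}x")
```

These are paper numbers only. They say nothing about memory capacity per GPU (2 GB vs 6 GB), double-precision throughput on the GTX 690, or how well a given image processing kernel scales across several GPUs.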