For GPGPU computing purposes we were deciding between GTX Titans and Tesla C2075s (both in roughly the same price range).
The specs, however, favored the GTX Titan, with ECC memory as the sole exception:
- GTX Titan: 4500 GFLOPS (single), 1300-1500 GFLOPS (double), compute capability 3.5 (dynamic parallelism)
- Tesla C2075: 1030 GFLOPS (single), 515 GFLOPS (double), compute capability 2.0 (no dynamic parallelism)
For the specs, see
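The quoted peak numbers follow directly from cores x clock x FLOPs per core per cycle. A quick sanity check (core counts and clocks taken from NVIDIA's published spec sheets, so treat them as assumptions):

```python
# Peak GFLOPS = shader cores * clock (GHz) * FLOPs per core per cycle (2 for FMA).
def peak_gflops(cores, clock_ghz, flops_per_cycle=2):
    return cores * clock_ghz * flops_per_cycle

# GTX Titan (GK110): 2688 cores at 837 MHz base clock.
titan_sp = peak_gflops(2688, 0.837)   # ~4500 GFLOPS single
titan_dp = titan_sp / 3               # GK110 can run DP at 1/3 the SP rate

# Tesla C2075 (Fermi): 448 cores at 1150 MHz shader clock.
c2075_sp = peak_gflops(448, 1.15)     # ~1030 GFLOPS single
c2075_dp = c2075_sp / 2               # Fermi Tesla runs DP at 1/2 the SP rate

print(f"Titan: {titan_sp:.0f} SP / {titan_dp:.0f} DP GFLOPS")
print(f"C2075: {c2075_sp:.0f} SP / {c2075_dp:.0f} DP GFLOPS")
```

So both cards' advertised single- and double-precision figures are internally consistent with their core counts and clocks.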
But after receiving our first Titan, we noticed that its double-precision performance is lower than the C2075's!
Single-precision GFLOPS in AIDA64: 4543 (as expected)
Double-precision GFLOPS in AIDA64: 222.3 (instead of ~1300!)
We see the same in MATLAB's gpuBench(), which tests matrix multiplication, linear system solves, and FFTs:
              MTimes_D  Backslash_D  FFT_D   MTimes_S  Backslash_S  FFT_S
Tesla C2075     333.84       246.11  73.36     696.37       435.56  163.04
GF GTX TITAN    223.68        82.34  77.05    3635.97       179.13  252.21
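To put a number on the discrepancy, here is the single-to-double ratio computed from the MTimes columns of the table above (the spec sheets imply roughly 3:1 for the Titan and 2:1 for the C2075):

```python
# gpuBench MTimes scores from the table above (single vs. double precision).
scores = {
    "Tesla C2075":  {"MTimes_S": 696.37,  "MTimes_D": 333.84},
    "GF GTX TITAN": {"MTimes_S": 3635.97, "MTimes_D": 223.68},
}

for gpu, s in scores.items():
    ratio = s["MTimes_S"] / s["MTimes_D"]
    print(f"{gpu}: SP/DP ratio = {ratio:.1f}:1")

# The C2075 lands near its expected 2:1, but the Titan shows ~16:1
# instead of the ~3:1 its 1300-1500 DP GFLOPS spec would imply.
```

The AIDA64 numbers give an even larger gap (4543 / 222.3, about 20:1), so the Titan's measured double-precision rate is far below its advertised peak on both benchmarks.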
I spoke with the developers of the MATLAB gpuBench benchmark, and they had made the same observations, as had many people contacting them, so it doesn't seem to be system-dependent.
Can anyone explain these double-precision performance differences?