C2050 with ECC slower than C1060


I just installed my new C2050 and immediately ran a benchmark to compare it against my good old C1060.
Expecting around a 50% improvement (double the number of cores), I was deeply disappointed to find that it was actually slower than the C1060.

After about an hour I found out about turning off ECC, and indeed, with ECC turned off the benchmark ran as expected: 50% faster than the C1060.

My question is: why is there such an overhead for ECC? I would understand a small slowdown, but a 100% slowdown (relative to the ECC-off runtime) seems excessive…

Am I missing something?


Your code is probably completely memory bound.
Still, we do need more details about how ECC affects speed in the different parts of the memory system: cache, memory reads and writes, etc.
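
One way to check how much of the slowdown is raw memory bandwidth is to time a kernel that does nothing but copy, once with ECC on and once with ECC off. The following is only a minimal sketch, not anyone's actual benchmark from this thread: the kernel name copyKernel, the buffer size, the block size and the repetition count are all arbitrary assumptions. It reports the ECC state through the ECCEnabled field of cudaDeviceProp.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                              \
    do {                                                         \
        cudaError_t err = (call);                                \
        if (err != cudaSuccess) {                                \
            fprintf(stderr, "%s (line %d)\n",                    \
                    cudaGetErrorString(err), __LINE__);          \
            exit(EXIT_FAILURE);                                  \
        }                                                        \
    } while (0)

// Purely memory-bound kernel (hypothetical name): each thread does one
// 4-byte read and one 4-byte write, and no arithmetic.
__global__ void copyKernel(const float* in, float* out, size_t n)
{
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

int main()
{
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    printf("%s, ECC %s\n", prop.name, prop.ECCEnabled ? "on" : "off");

    const size_t n = 1 << 23;   // 8M floats, 32 MB per buffer (assumed size)
    const int block = 256;      // assumed block size
    const int grid = (int)((n + block - 1) / block);  // 32768 blocks
    const int reps = 20;        // assumed repetition count

    float *in = NULL, *out = NULL;
    CHECK(cudaMalloc(&in, n * sizeof(float)));
    CHECK(cudaMalloc(&out, n * sizeof(float)));
    CHECK(cudaMemset(in, 0, n * sizeof(float)));

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    copyKernel<<<grid, block>>>(in, out, n);  // warm-up launch, not timed
    CHECK(cudaGetLastError());

    CHECK(cudaEventRecord(start));
    for (int r = 0; r < reps; ++r)
        copyKernel<<<grid, block>>>(in, out, n);
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));

    // Each repetition reads n floats and writes n floats.
    double bytes = 2.0 * n * sizeof(float) * reps;
    printf("Effective bandwidth: %.1f GB/s\n", bytes / (ms * 1.0e6));

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(in));
    CHECK(cudaFree(out));
    return 0;
}
```

If the copy bandwidth drops by roughly the same fraction as the real benchmark when ECC is enabled, the benchmark is most likely bandwidth limited. A much larger drop in the real code would point at its access pattern or at something other than ECC.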

Even so, a 100% performance hit is a bit excessive for ECC in my opinion…

Can anyone from NVIDIA give some more details about this?

For me, running code that is also mostly memory bound with ECC enabled caused a ~10-15% performance loss.

I switched it off and got that performance back.


Any thoughts on why my performance hit was 10 times worse than yours?

I don't know :(… I just thought it might be worth noting, so as not to blame ECC right away.

Did you try turning ECC back on and rerunning the test?

Did you change anything else (specifically the thread-block size) that might have changed the performance? Fermi didn't give me any performance boost until I increased the thread-block size to double what it was on the C1060…
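
To see whether the thread-block size matters for a given kernel, the same launch can be timed at several block sizes on each card. This is just a rough sketch under assumptions: scaleKernel is a made-up stand-in for the real kernel, and the problem size and the list of block sizes are arbitrary. Error checking is left out to keep it short.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Stand-in kernel (hypothetical): scales every element in place.
__global__ void scaleKernel(float* data, float factor, size_t n)
{
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main()
{
    // 2M floats (assumed size). Kept small so that the grid stays below
    // the 65535-block limit per grid dimension even at 64 threads per block.
    const size_t n = 1 << 21;
    const int reps = 50;                        // assumed repetition count
    const int blockSizes[] = {64, 128, 256, 512};
    const int numSizes = sizeof(blockSizes) / sizeof(blockSizes[0]);

    float* data = NULL;
    cudaMalloc(&data, n * sizeof(float));
    cudaMemset(data, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    for (int s = 0; s < numSizes; ++s) {
        const int block = blockSizes[s];
        const int grid = (int)((n + block - 1) / block);

        scaleKernel<<<grid, block>>>(data, 1.0f, n);  // warm-up, not timed

        cudaEventRecord(start);
        for (int r = 0; r < reps; ++r)
            scaleKernel<<<grid, block>>>(data, 1.0f, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("block size %3d: %.3f ms per launch\n", block, ms / reps);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(data);
    return 0;
}
```

Running this on both the C1060 and the C2050 (and with ECC on and off) shows whether the best block size differs between the cards. A trivial kernel like this one may not react to block size the way a real kernel does, so substituting the actual kernel gives a more meaningful answer.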