GeForce 570 vs. Tesla c2050

Hey y’all,

Is there any way I can predict how much faster my CUDA applications will run on a Tesla C2050 as opposed to the GTX 480/570s we use right now? My boss thinks that using multiple 570s is the way to go rather than investing in one or more C2050s. The application largely uses double precision and is not memory bound as far as I know. We don’t need ECC since our applications usually don’t run more than a few hours (most runs are less than 100 ms), so it is strictly the increase in double precision performance we are looking at. I know about the double precision “throttling” on the GeForce cards, so by a naive guesstimate my apps should run 4 times faster on the Tesla, but surely it cannot be that simple…


It depends on how well your problem scales over multiple GPUs and whether you can make use of the extra memory on the Tesla. You also need to find out for sure how close you are to the memory bandwidth limit - even if you’re not at the limit already, you might well be within a factor of 4 of it.

If you can do it with your code, try to measure the difference between running it in double precision and in single precision (perhaps a redefinition of the types throughout the code helps, or maybe you can just check the most important part). If the execution time difference is less than a factor of 2.5 or so, you are probably not in a regime where you can benefit from the vastly increased double precision throughput.


You might also try out your code on a Tesla M2050 in Amazon’s EC2 cloud service.