GTX 480 vs Tesla C2050/C2070 for CUDA GPU Application Development: Advice needed regarding GPU platform

I am interested in purchasing a desktop PC for engineering analysis software development at home primarily using Fortran 95/2003 on Linux. I would like to learn more about both NVIDIA CUDA technologies and graphics programming. I did a lot of GL development in the 90’s on SGI workstations and so am thinking OpenGL might be a good step forward for 3D graphics development.

I have been considering configuring a “gaming PC” with an Intel Core i7 processor and a GTX 480 graphics card, but after reading about Tesla processor technology for High Performance Computing, I am wondering whether I should reconsider and configure a system with Tesla C2050/C2070 processors.

Please advise.

  • Mark Beyer

Do you need double precision for your analysis?

I usually tell people looking to jump into CUDA development to start with a GeForce card. Something like the GTX 470 is only $270, has the same power requirements as the Tesla C2050, and lets you figure out where CUDA can benefit your work with minimal investment. Then, once you have working CUDA programs in front of you, you can better evaluate whether the additional features of the Tesla are necessary.

Although there are many subtle differences between GeForce and Tesla, the two biggest Tesla features are higher performance double precision (by a factor of 4 per multiprocessor clock compared to GTX 470/480) and 3 or 6 GB of device memory. Without CUDA experience, it can be hard to tell ahead of time if you will need either of these things. So a ~$300 investment up front can help you decide if you need to spend $2500 later. And, if CUDA turns out to be not applicable to your computing needs, you have lost much less capital. (CUDA is awesome, but it doesn’t do everything.)

One reason I specifically push the GeForce for initial benchmarking is that many people assume that since they plan to use double precision, the Tesla must be faster. However, unlike most CPU applications, a significant fraction of CUDA programs are ultimately limited by device memory bandwidth or latency. (Feeding 448 CUDA cores with non-trivial operands can easily saturate even the fastest memory bus.) In these situations, the slower double precision performance of the GeForce might have a negligible effect on actual runtime. With a working CUDA program in hand, you can analyze it more carefully to decide if memory speed is the barrier, whereas this can be hard to estimate ahead of time.
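The bandwidth-versus-compute point above can be sketched with a back-of-the-envelope roofline estimate. The hardware numbers below (peak double-precision GFLOP/s and memory bandwidth for the GTX 470 and Tesla C2050) are approximate figures for illustration only, not official specs, and the kernel's arithmetic intensity is a made-up example value:

```python
# Rough roofline estimate: attainable throughput is capped by either
# peak compute or by (memory bandwidth x arithmetic intensity).
# All hardware numbers below are approximate, for illustration only.

def attainable_gflops(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    """Roofline model: min of the compute roof and the bandwidth roof."""
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)

# Assumed Fermi-era double-precision specs (approximate):
cards = {
    "GTX 470":     {"peak_dp": 136.0, "bw": 134.0},  # GeForce DP rate capped
    "Tesla C2050": {"peak_dp": 515.0, "bw": 144.0},
}

# Example: a streaming kernel doing ~1 flop per 8-byte double it loads,
# i.e. an arithmetic intensity of 0.125 flops/byte (strongly memory-bound).
intensity = 0.125

for name, spec in cards.items():
    perf = attainable_gflops(spec["peak_dp"], spec["bw"], intensity)
    print(f"{name}: ~{perf:.1f} GFLOP/s attainable")
```

For a kernel this memory-bound, both cards land within roughly 10% of each other despite the Tesla's ~4x double-precision peak, which is exactly the situation described above; the Tesla's compute advantage only shows up once arithmetic intensity is high enough to escape the bandwidth roof.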