Tesla Workstation Advice

Hi everybody,

We’re trying to assemble a Tesla workstation for scientific simulations such as molecular dynamics. I’m not sure whether to choose the C2070 or the C2075. I read in one of the forum topics that the only main difference between the two is power consumption, where the C2075 is more efficient. However, the maximum display resolution of the C2070 (2560x1600) is higher than that of the C2075 (1600x1200).

Can you guide me in choosing between the two?
I was thinking of using the Tesla card as the workstation’s main graphics card too. Does that affect the computing ability of the card? Do you suggest having a separate card as the workstation VGA?

And finally, is there any difference between Linux distros in how they handle a Tesla on a workstation, and if so, what do you suggest as the best Linux distro? (I was thinking of Ubuntu myself.)

I’d really appreciate it if you could help me and share your experiences.

Happy (almost) New Year

The C2075 uses a slightly newer minor respin of the GPU chip (GF110) than the C2070 (GF100). Main differences are power consumption (as you mentioned) and production yield (more interesting for Nvidia than for you). I suspect you will hardly be able to buy a C2070 anymore.

Tesla cards (like all current CUDA devices) can only execute one context at a time, including any GUI you are running. That means that if you are doing computation on the same GPU that runs your graphics display (which is allowed), the card will have to switch contexts to service both tasks. However, the card can only do cooperative multitasking, switching contexts between operations. If your CUDA kernel takes 0.5 seconds to run, then your display will freeze for that time. After the kernel finishes, the context will switch back to the graphics display, and the screen will update. If your kernels are very short, then the display will still be responsive, but your overall CUDA throughput will be lower because of all the context switching overhead. Even worse, you can’t run cuda-gdb on a display device, because a kernel stopped in the debugger would prevent the context from switching back to the graphics display. The graphics display driver also has a watchdog timer that keeps the display from freezing indefinitely by terminating any CUDA operation that takes longer than a few seconds.

If possible, it is usually best to have a dedicated display device, or to run your CUDA workstation without X. (Either access the computer remotely over SSH or use only the text console locally.)

I’m not aware of any significant CUDA differences between distributions. The most important thing is to make sure that you pick a distribution release that is supported by the CUDA toolkit you are using. Changes in gcc can make newer Linux distributions not work, so this often forces you away from the latest release. For example, CUDA 4.1 only supports up through Ubuntu 11.04, but not 11.10. You can sometimes make the newer distribution work, but it is usually easier just to pick one that has been tested.
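If you want to script that check, here is a minimal Python sketch. It only tests the upper bound from the example above (CUDA 4.1 tested up through Ubuntu 11.04). The function names are made up for illustration, and the toolkit release notes remain the real authority, since not every older release is necessarily supported either.

```python
def release_tuple(release):
    """Turn a release string such as '11.04' into a tuple of ints, (11, 4)."""
    return tuple(int(part) for part in release.strip().split("."))


def is_within_tested_range(release, newest_tested):
    """Return True if `release` is not newer than `newest_tested`.

    Comparing integer tuples avoids the trap of comparing strings,
    where '9.10' would sort after '11.04'.
    """
    return release_tuple(release) <= release_tuple(newest_tested)


# Upper bound taken from the CUDA 4.1 example in this thread.
NEWEST_TESTED_UBUNTU = "11.04"

if __name__ == "__main__":
    for candidate in ("10.10", "11.04", "11.10"):
        ok = is_within_tested_range(candidate, NEWEST_TESTED_UBUNTU)
        print(candidate, "within tested range" if ok else "newer than tested")
```

On Ubuntu, `lsb_release -rs` prints the release string you would pass in as `release`.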

Thanks for your priceless info, guys.
About a year ago, when I started with CUDA, I was using my desktop graphics card, which was an old, simple GPU, a 9800GT. But I didn’t have freezing problems with that card, which was being used as both VGA and computation engine. Was that because of the watchdog settings? (I didn’t set anything at that time; I just installed the CUDA toolkit and started coding.)

About the difference between the C2070 and C2075: I think I can find a C2070 too, so do you recommend the C2070 over the C2075 since it’s got higher resolution? And is the power efficiency of the C2075 something I would actually benefit from and notice? (It’s going to be a workstation with one Tesla card.)

It probably means your kernels executed in a short enough time that you did not notice. Many CUDA applications execute kernels that only require a few milliseconds to run, which would not have an impact on interactive GUI performance.

The job of the watchdog is to terminate long-running kernels so the display can update, so if you have long-running kernels (> few seconds) your choice is either the kernel aborting (watchdog on) or the display freezing (watchdog off). Neither is a desirable outcome. With short kernels, you don’t have to worry about this.
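One common way to stay on the "short kernels" side is to split a long job into several launches on the host. Below is a rough Python sketch of just the planning step, not of any CUDA API. The function name and the throughput figure are hypothetical, and you would have to measure items per millisecond for your own kernel.

```python
def plan_launches(total_items, items_per_ms, budget_ms):
    """Split `total_items` into (offset, count) chunks.

    Each chunk is sized so that, at the measured throughput `items_per_ms`,
    one launch should finish within roughly `budget_ms` milliseconds.
    """
    if total_items < 0:
        raise ValueError("total_items must not be negative")
    if items_per_ms <= 0 or budget_ms <= 0:
        raise ValueError("throughput and budget must be positive")

    # Always process at least one item per launch so the loop makes progress.
    chunk = max(1, int(items_per_ms * budget_ms))

    launches = []
    offset = 0
    while offset < total_items:
        count = min(chunk, total_items - offset)
        launches.append((offset, count))
        offset += count
    return launches


if __name__ == "__main__":
    # Hypothetical numbers: 1,000,000 items, 500 items/ms, 100 ms per launch.
    for offset, count in plan_launches(1_000_000, 500, 100):
        print(f"launch kernel on items [{offset}, {offset + count})")
```

Each (offset, count) pair would correspond to one kernel launch, giving the display a chance to update between launches. The price is the extra context-switching overhead mentioned earlier in the thread, so a budget that is too small will hurt throughput.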

Another thing to keep in mind when sharing a GUI with the CUDA device is that the 3D-compositing window managers that are now popular with many Linux distributions can cause a noticeable slowdown with CUDA programs that run lots of short kernels. If you don’t have a display-only GPU, you might want to use a lightweight desktop environment.