I have purchased a new Fermi graphics card and I want to know which OS is better for running CUDA programs. I sometimes have to use a shell script to run a program multiple times. Can anybody help me in this regard? One more thing: can I run CUDA if I have Windows 7 and install Linux using VirtualBox?
Both Windows and Linux can run CUDA. You should consider other factors, such as the development environment. I love Visual Studio, so I choose Windows.
And I don’t think you can run CUDA from inside VirtualBox.
I would take the contrarian view and suggest that if you can live without Nsight and your new Fermi isn’t a Tesla or Quadro, Linux is probably to be preferred over WDDM Windows. If you have a Fermi Tesla and can use the TCC driver, then there is less to choose between them.
CUDA doesn’t work in virtualization environments, so VirtualBox and VMware are out of the question.
The most important thing is to use an OS you are familiar with. You don’t want to struggle with basic OS issues while also trying to debug CUDA.
Based on the comments and complaints in the forums, I think there is a general vote against the Windows display driver. WDDM introduces a lot of performance quirks that go away once you switch to the TCC Windows driver or Linux (or go back in time and use Windows XP).
The Windows driver has a huge undocumented problem. Unless you have two video cards and the Fermi card is set up as a compute-only device, the driver will not allow you to run any kernel longer than 5 seconds (and, in some situations, even sequences of kernels adding up to 5 seconds). It really bites when your program is terminated by the driver 2 hours into an 8-hour computation, just because some kernel exceeded the execution time limit.
For any real work, you have to go with a dual-GPU setup or use Linux.
Hmm, given that the watchdog has been around since the dawn of CUDA, I assumed it was documented. However, a quick search doesn’t reveal it in the documentation. Did I miss it somewhere?
To be fair, Linux also has a watchdog if you are running X. You just have the option of not running the GUI in Linux, unlike Windows.
It’s been around since the beginning, but somehow that fact never made it into the programming guide. Also, since X and the Linux kernel are both available with full source code, it should be straightforward to disable the watchdog there.
Given that the GUI desktop is unresponsive while a CUDA kernel is running, I think stopping X is the better approach for long-running kernels. :) I just mentioned the Linux watchdog for completeness.
The watchdog is baked into the NVIDIA driver, so I don’t think you can just disable it there. You can disable the watchdog in Win7/Vista (search for “TDR vista,” then change the appropriate registry keys to disable TDR or increase the delay to much longer than 2s), and if you boot to a console and don’t run X, you get no watchdog on Linux either.
It’s worth prodding the documentation folks to put an explanation of the watchdog in the platform release notes. I did not realize that the various people wandering into the forums confused by the watchdog had no other source for that information.
Windows release notes, halfway down: “Individual kernels are limited to a 2-second runtime by Windows Vista. [should that be all Windows versions, or just Vista?] Kernels that run for longer than 2 seconds will trigger the Timeout Detection and Recovery (TDR) mechanism. For more information, see http://www.microsoft.com/whdc/device/display/wddm_timeout.mspx.”
Linux release notes: “Individual GPU program launches are limited to a run time of less than 5 seconds on a GPU with a display attached. Exceeding this time limit usually causes a launch failure reported through the CUDA driver or the CUDA runtime. GPUs without a display attached are not subject to the 5 second runtime restriction. For this reason it is recommended that CUDA be run on a GPU that is NOT attached to a display and does not have the Windows desktop [???] extended onto it. In this case, the system must contain at least one NVIDIA GPU that serves as the primary graphics adapter.”
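For anyone who wants to check which of their GPUs are subject to the limit described in those release notes, here is a minimal sketch. It assumes the CUDA runtime API and reads the `kernelExecTimeoutEnabled` field of `cudaDeviceProp`, which is the same value the `deviceQuery` SDK sample prints as "Run time limit on kernels". Treat it as an illustration, not an official diagnostic.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Lists every CUDA device and reports whether the driver enforces a
// run time limit (watchdog) on kernels launched on it.
int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "Device %d: cudaGetDeviceProperties failed: %s\n",
                         dev, cudaGetErrorString(err));
            continue;
        }
        // Nonzero means kernels on this device can be killed by the watchdog,
        // typically because a display is attached to it.
        std::printf("Device %d (%s): run time limit on kernels: %s\n",
                    dev, prop.name,
                    prop.kernelExecTimeoutEnabled ? "yes" : "no");
    }
    return 0;
}
```

Build it with something like `nvcc watchdog_query.cu -o watchdog_query` (the file name is just a placeholder). A device that reports "no" is the one to pick for long-running kernels.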
But it really should be in the programming guide. Or at least in the reference manual in the section about the error code cudaErrorLaunchTimeout. Release notes are the last place anyone looks.
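Until the documentation catches up, a program can at least detect the watchdog itself by checking for `cudaErrorLaunchTimeout` after a kernel finishes. The sketch below assumes the CUDA runtime API; the `scale` kernel and the problem size are hypothetical stand-ins for a real workload, and `cudaDeviceSynchronize` assumes CUDA 4.0 or later (older toolkits used `cudaThreadSynchronize`).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for a real, long-running kernel.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *d_data = NULL;

    cudaError_t err = cudaMalloc((void **)&d_data, bytes);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemset(d_data, 0, bytes);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(d_data, n, 2.0f);

    // cudaGetLastError catches launch configuration problems;
    // cudaDeviceSynchronize waits for the kernel and reports execution errors.
    err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    if (err == cudaErrorLaunchTimeout) {
        // The watchdog killed the kernel. Later CUDA calls in this process
        // generally fail as well, so report and exit instead of continuing.
        std::fprintf(stderr, "Kernel exceeded the display watchdog limit: %s\n",
                     cudaGetErrorString(err));
        return 2;
    }
    if (err != cudaSuccess) {
        std::fprintf(stderr, "Kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaFree(d_data);
    std::printf("Kernel completed without hitting the watchdog.\n");
    return 0;
}
```

Using a distinct exit code for the timeout case is only a suggestion; it lets a wrapper script tell a watchdog kill apart from other failures when the program is run many times in a batch.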