Running CUDA programs without starting X server

Hello, I have an Ubuntu 9.10 machine with a GTX 480 card. Since I have only one video card in the machine, when I run a kernel that takes longer than about 10 seconds, the watchdog seems to kill it and I get a launch timeout.

I have booted Ubuntu into text mode, so there is no X server and therefore no watchdog. The problem is that, even though the nvidia driver seems to be loaded (lsmod | grep nvidia), CUDA programs do not work: they cannot find any CUDA-capable device.

Do I need to load an additional driver or something?

Thanks!

This is explained in the CUDA Toolkit release notes.
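
For reference, the relevant part of the release notes says that without X the /dev/nvidia* device nodes are not created automatically, so they have to be created by hand after the module is loaded. A minimal sketch of that idea follows; it only prints the mknod commands rather than running them. The helper names count_nvidia_devices and print_mknod_cmds are made up for illustration, and it assumes lspci lists GPUs as "VGA compatible controller" or "3D controller".

```shell
#!/bin/sh
# Sketch: print the mknod commands needed for the NVIDIA device nodes.
# count_nvidia_devices and print_mknod_cmds are hypothetical helper names.

# Count NVIDIA GPUs in lspci-style text read from stdin.
count_nvidia_devices() {
    grep -i nvidia | grep -c -E "3D controller|VGA compatible controller"
}

# Print one mknod command per GPU plus one for the control node.
# Major number 195 is the NVIDIA character device; minor 255 is nvidiactl.
print_mknod_cmds() {
    n="$1"
    i=0
    while [ "$i" -lt "$n" ]; do
        echo "mknod -m 666 /dev/nvidia$i c 195 $i"
        i=$((i + 1))
    done
    echo "mknod -m 666 /dev/nvidiactl c 195 255"
}

# Typical use, as root, after the module is loaded:
#   /sbin/modprobe nvidia
#   print_mknod_cmds "$(lspci | count_nvidia_devices)" | sh
```

Printing the commands first makes it easy to check the device count before anything is created under /dev.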

N.

Thank you, that was it. I should have RTFM-ed more :">

That works, but every run of CUDA code takes about 5 seconds! Something is missing here. X loads something, but it is not a kernel module.
I ran lsmod > modules_1.log while idle and lsmod > modules_2.log while a CUDA program was running, and diff modules_1.log modules_2.log gave me only:
diff modules_1.log modules_2.log
14c14
< nvidia 11201625 0
---
> nvidia 11201625 56
What could be missing? It is some initialization of the device, I suppose. Maybe I need a permanently running “CUDA kick starter”, i.e. some code that runs frequently and does nothing except keep the device active.
(I do not mean the performance level; that could stay at the minimum.)
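
What is described here sounds like what the driver's persistence mode is meant for: with no X server or other client attached, the driver can tear down device state when the last program exits, so the next program pays for initialization again. A hedged sketch, assuming a driver whose nvidia-smi supports the -pm option and that it is run as root; enable_persistence and DRY_RUN are hypothetical names used only for this example:

```shell
#!/bin/sh
# Sketch: keep the NVIDIA driver initialized when no X server is running.
# enable_persistence is a hypothetical helper name; set DRY_RUN=1 to only
# print the command instead of running it.

enable_persistence() {
    cmd="nvidia-smi -pm 1"   # turn persistence mode on for all GPUs (needs root)
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

# Example, as root, once after boot:
#   enable_persistence
```

On older drivers without that option, leaving nvidia-smi running in loop mode in the background (for example nvidia-smi -l 60) is a workaround that is sometimes used to the same effect.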

Under Debian I just press Ctrl-Alt-F1 to switch to a text console. There I launch my CUDA program, and it is not killed by the watchdog after a few seconds. When the program finishes, I go back to my X desktop by pressing Ctrl-Alt-F7.

The watchdog kills an individual kernel when it takes more than 5 seconds. I run CUDA programs for days without having them killed. A cuFFT call, for example, consists of many kernel launches, so it has little chance of being killed even for very large matrices.

Regarding the original question: at my workplace we have 2 computers used for CUDA that do not run an X server.

No! I mean the hang before the kernel runs. It takes 5-7 seconds to start my program, nvprof, or nvidia-smi (any device-related program). After that, inside my program, kernels run normally: there is no hang before each kernel launch.
Moreover, while my program is running, nvidia-smi also starts without delay. So some initialization is happening before the first kernel runs.
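
To put a number on that startup delay, one option is to time a device-related command from the shell. A small sketch, assuming a POSIX shell and a date command that supports +%s (whole seconds only); elapsed_seconds is a hypothetical helper name:

```shell
#!/bin/sh
# elapsed_seconds (hypothetical helper): run a command, discard its output,
# and print how many whole seconds it took.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example (requires the NVIDIA driver to be installed):
#   elapsed_seconds nvidia-smi
```

Running it once with nothing else using the GPU and once while another CUDA program is already running should show whether the delay only occurs when no other process is holding the device.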

I would really appreciate any advice. I ran lsmod while device-related programs were running, but nothing changed except the nvidia module: its use count was 0 before the run and 56 while my program was running.

I’d like to add some new results:
If I use the script that creates the CUDA device nodes (the same one as above, provided by Nico) and then start X, there is no hang for nvidia-smi, but there is a 6-second hang in cudaSetDevice(). I have 4 physical 690 cards -> 8 logical devices in my system.