Ubuntu 9.04
CUDA driver 190.18
CUDA toolkit 2.3
The installation of CUDA went fine.
When I run the CUDA SDK sample convolutionSeparable, I get an error:
cudaSafeCall() Runtime API error in file <main.cpp>, line 215: no CUDAdevice is available.
The other samples in the SDK cannot be executed either.
How can I deal with this problem?
Help!
PS: I have not installed X Window. Should I install it?
No, but that (at least indirectly) is the source of your problem. If you read the toolkit release notes you will see this:
o In order to run CUDA applications, the CUDA module must be
loaded and the entries in /dev created. This may be achieved
by initializing X Windows, or by creating a script to load the
kernel module and create the entries.
If you keep reading it will tell you what must be done…
Add it to a script right at the end of the boot process, like rc.local or something. It doesn’t need to be done early. After all, CUDA is userspace, and until you can log in nothing is going to need it.
It is run by root. You don’t have to change anything: all init scripts are run by process 1 (init, hence the name) as root. My guess is that the script (which at boot runs with a much more limited set of paths and environment variables) doesn’t find modprobe or mknod. Hard-code the paths to those as you did with lspci and it should work.
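For reference, here is a minimal sketch of the kind of script the release notes describe, with every tool called by absolute path. It is not the exact script from the notes. The paths to modprobe, mknod and lspci are assumptions for a typical Ubuntu install (check them with `which`), and `RUN_SETUP` is a hypothetical switch I added so that the file does nothing unless asked to.

```shell
#!/bin/sh
# Sketch only: load the NVIDIA kernel module and create the /dev entries
# without starting X. Tool paths are assumptions; verify them on your system.
MODPROBE=/sbin/modprobe
MKNOD=/bin/mknod
LSPCI=/usr/bin/lspci

# Count NVIDIA GPUs in lspci-style output read from stdin.
count_nvidia_gpus() {
    grep -i nvidia | grep -c -E '3D controller|VGA compatible controller'
}

# Load the module, then create one node per GPU plus the control node.
# 195 is the character-device major number used by the NVIDIA driver.
setup_cuda_devices() {
    "$MODPROBE" nvidia || return 1
    n=$("$LSPCI" | count_nvidia_gpus)
    i=0
    while [ "$i" -lt "$n" ]; do
        [ -e "/dev/nvidia$i" ] || "$MKNOD" -m 666 "/dev/nvidia$i" c 195 "$i"
        i=$((i + 1))
    done
    [ -e /dev/nvidiactl ] || "$MKNOD" -m 666 /dev/nvidiactl c 195 255
}

# RUN_SETUP is a hypothetical switch so the file can be sourced safely.
if [ "${RUN_SETUP:-0}" = "1" ]; then
    setup_cuda_devices
fi
```

Saved under some name of your choosing, it could be called from the boot script with `RUN_SETUP=1` set in front of the command. It has to run as root, since mknod and modprobe both need it.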
For the second time: rc.local is run as root. Just because it doesn’t work doesn’t mean it isn’t being run as root. I have a whole cluster of headless, stateless compute nodes which set up CUDA this way. They have no operating system installed and download the OS image and set themselves up from scratch every time they are rebooted. It really does work.
Reboot into the original “no CUDA devices” state and then try running the exact rc.local script you have installed by hand, as root, and see what it does. Maybe even add some echo messages to the script to see where it gets to when it fails.
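The echo messages could look something like the sketch below. The helper names `log_step` and `check_tool` and the log location are made up for illustration; nothing here comes from the CUDA release notes.

```shell
#!/bin/sh
# Sketch only: trace helpers for debugging a boot-time script.
# LOG is a hypothetical log location; change it to suit your system.
LOG=${LOG:-/tmp/rc-local-cuda.log}

# Append a timestamped message to the log.
log_step() {
    echo "$(date '+%H:%M:%S') $*" >> "$LOG"
}

# Record whether a tool exists at the given absolute path.
check_tool() {
    if [ -x "$1" ]; then
        log_step "found $1"
    else
        log_step "MISSING $1"
        return 1
    fi
}
```

Calling `check_tool /sbin/modprobe` (or whatever path you hard-coded) before each step shows in the log which tool the script could not find. To get closer to the bare boot-time environment when testing by hand, running the script as root with `env -i /bin/sh -x /etc/rc.local` clears the environment and traces each command, though it will not match the real boot environment exactly.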
I had modified /etc/rc.local instead of /etc/init.d/rc.local.
When I added the script to /etc/init.d/rc.local, everything worked.
You are right: init executes rc.local with root privileges.
Thanks a lot for your help. You are so kind.
By the way, a whole cluster of headless, stateless compute nodes that have no operating system installed, download the OS image, and set themselves up from scratch every time they are rebooted sounds great. Could you explain it briefly? Hehe, just curious. :rolleyes:
The arrangement is based on Perceus and uses a homemade operating system image that includes the CUDA drivers and runtime support. Nodes boot over gigabit Ethernet using standard PXE and get sent a small PXE boot image containing a kernel and a provisioning client for the node to boot. Once up, the provisioning client contacts an administrative server process running elsewhere on the network, which sends down the running operating system image over the wire. It gets uncompressed into a ramdisk and bootstrapped. The rest of the image is served via NFS. The whole boot/setup process takes about 30 seconds. If you want to do an operating system reinstall or upgrade, just reboot…