No recognised CUDA devices!

Hello all,

I’ve got a system running 32-bit Ubuntu 7.10 with three graphics cards installed. Two of them are GTX 280s (which I want to use for CUDA, in the 2nd and 3rd PCIe slots) and the third is (now - see previous post!) a GeForce 7300 GT (for display only, in the 1st PCIe slot).

Drivers (177.67), CUDA toolkit (2.0_ubuntu7.10) and SDK (2.02.0807.1535) are all installed and the paths are set up as required. All the SDK examples compile quite happily, but when I run any of them (e.g. ./bandwidthTest) I get:

NVIDIA: could not open the device file /dev/nvidia2 (Input/output error)
device 0:Device emulation (CPU)

ls /dev/nv* gives:
/dev/nvidia0 /dev/nvidia1 /dev/nvidia2 /dev/nvidiactl

ls -l /dev/nv* gives:
crw-rw-rw- 1 root root 195, 0 2008-10-06 13:49 /dev/nvidia0
crw-rw-rw- 1 root root 195, 1 2008-10-06 13:49 /dev/nvidia1
crw-rw-rw- 1 root root 195, 2 2008-10-06 13:49 /dev/nvidia2
crw-rw-rw- 1 root root 195, 255 2008-10-06 13:49 /dev/nvidiactl

If I run the code as root (sudo ./bandwidthTest) I get the error:
./bandwidthTest: error while loading shared libraries: libcudart.so.2: cannot find shared object file: No such file or directory
But this library file IS in the /usr/local/cuda/lib directory.

LD_LIBRARY_PATH is set to /usr/local/cuda/lib so what’s going on???

Has anyone experienced similar problems, and does anyone have any pointers on how I might resolve this issue? Is there some step that I’m missing in order to correctly set up a multi-GPU CUDA system on Linux?

Thanks for your help,

GWG

If libcudart.so.2 is in /usr/local/cuda/lib, then why are you setting LD_LIBRARY_PATH to /usr/local/lib?

Oops, sorry that was a typo in my post! LD_LIBRARY_PATH is correctly set to /usr/local/cuda/lib and PATH has /usr/local/cuda correctly added to it on my system.
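
On the sudo error specifically: sudo normally does not pass LD_LIBRARY_PATH through to the command it runs, so a binary started with sudo may not see /usr/local/cuda/lib even when your own shell does. A small sketch to check this, assuming the library path from this thread (path_contains is just a helper name made up for the example):

```shell
#!/bin/sh
# Return 0 if directory $1 is an entry in the colon-separated list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

CUDA_LIB=/usr/local/cuda/lib   # path taken from the thread

# Report whether the current environment exposes the CUDA library directory.
if path_contains "$CUDA_LIB" "${LD_LIBRARY_PATH:-}"; then
    echo "LD_LIBRARY_PATH includes $CUDA_LIB"
else
    echo "LD_LIBRARY_PATH does not include $CUDA_LIB"
fi
```

Save it under any name (say check_ld.sh, a made-up name) and run it once as yourself and once with sudo sh check_ld.sh. If only the sudo run reports the directory as missing, sudo env LD_LIBRARY_PATH=/usr/local/cuda/lib ./bandwidthTest hands the variable to the program explicitly.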

I encountered similar problems with openSUSE 10.3 (64-bit).

It can be (temporarily) solved by calling nvidia-xconfig.
BUT, after rebooting the system the problem is back.

Because it worked on an identical system, it seems to be an installation issue.
The one I installed works fine (as root in user-home, default installation paths/settings).

The one our admin installed doesn’t (custom installation paths, root in root-home). It seems the installer didn’t get the custom settings right (or our admin didn’t, who did me a favor by installing the whole thing although he had better things to do).

Solved it - thanks to this forum and the Nvidia README!

It’s not pretty, and it’s probably common knowledge, but it may be of use to others:

Step 1
Lazy configuration of xorg.conf for all the available devices:
nvidia-xconfig -a
But CUDA still only runs in emulation mode because it cannot find any enabled devices …

Step 2
Softboot all available devices by adding:
Option "Int10Module" "on"
to each device section of xorg.conf

This creates horrible virtual memory errors as all cards fight it out …
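
To confirm that every Device section actually received the Int10Module option from Step 2, something like the following can be used. It is only a sketch: devices_missing_int10 is a made-up helper name, /etc/X11/xorg.conf is assumed as the usual location, and it only checks that the option line is present, not what value it has.

```shell
#!/bin/sh
# Print the Identifier of each Device section in an xorg.conf-style file
# that has no uncommented Option "Int10Module" line.
devices_missing_int10() {
    awk '
        /^[ \t]*Section[ \t]+"Device"/ { in_dev = 1; has = 0; id = "(no identifier)"; next }
        in_dev && /^[ \t]*Identifier[ \t]/ { id = $2; gsub(/"/, "", id) }
        in_dev && /^[ \t]*Option[ \t]+"Int10Module"/ { has = 1 }
        in_dev && /^[ \t]*EndSection/ { if (!has) print id; in_dev = 0 }
    ' "$1"
}

XORG_CONF=${XORG_CONF:-/etc/X11/xorg.conf}   # assumed location, override if yours differs
if [ -r "$XORG_CONF" ]; then
    devices_missing_int10 "$XORG_CONF"
fi
```

No output means every Device section in the file carries the option.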

Step 3
Solve memory errors by adding
uppermem 524288
pci=nommconf
and
vmalloc=256MB
to grub.conf
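
After rebooting, it is worth checking that the kernel really booted with the two kernel parameters from Step 3. A rough sketch, assuming a Linux system that exposes /proc/cmdline (has_kernel_param is a made-up helper name). As far as I know uppermem is a GRUB command rather than a kernel parameter, so it will not show up here.

```shell
#!/bin/sh
# Return 0 if parameter $1 appears as a whole word in the command line $2.
has_kernel_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Command line the running kernel was booted with (empty if unavailable).
CMDLINE=$(cat /proc/cmdline 2>/dev/null)

for param in pci=nommconf vmalloc=256MB; do
    if has_kernel_param "$param" "$CMDLINE"; then
        echo "$param: present"
    else
        echo "$param: missing"
    fi
done
```

If a parameter shows as missing, the grub entry that was edited is probably not the one that actually got booted.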

deviceQuery shows me I now have two working CUDA GPUs reading and writing at 1.7 and 1.5MB/s :)

Cheers everyone,

GWG