CUDA Driver and Runtime version mismatch problem

Hi, guys,

I’m trying to run CUDA on my new Dell Studio XPS x1340 laptop, which has an nVidia GeForce G210M graphics card. I followed the instructions strictly and tried twice, but it still didn’t work. Please help. The software configuration is below.

OS: Ubuntu 9.10 32-bit
Display Driver: devdriver_3.1_linux_32_256.40.run installed successfully.
CUDA Toolkit for Ubuntu Linux 9.10: cudatoolkit_3.1_linux_32_ubuntu9.10.run installed without error. The CUDA binary and library paths were added to $PATH and $LD_LIBRARY_PATH.
GPU Computing SDK code samples: gpucomputingsdk_3.1_linux.run installed without error. Examples compiled without error.

Now when I ran deviceQuery, I got the following error message.

CUDA Device Query (Runtime API) version (CUDART static linking)
NVIDIA: could not open the device file /dev/nvidia0 (Input/output error).
cudaGetDeviceCount FAILED CUDA Driver and Runtime version may be mismatched.
FAILED

Please let me know if you have any clues how this could happen. I’ll appreciate your help very much!

Jim

I ran into the same problem on Fedora 13: ./deviceQuery doesn’t work, but sudo ./deviceQuery does.

Try running “setenforce 0” as root to put SELinux into permissive mode and see whether that solves the problem.

Thank you very much for the reply!

But I got the same error message after I ran “setenforce 0”.

When I executed “sudo ./deviceQuery”, I was told the shared library “libcudart.so.3” could not be found.

Any clues?

Jim

That last one is easy.

While you are root, run:

LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib64

export LD_LIBRARY_PATH

(If you’re running 32-bit, drop the 64 off the end.) All this assumes you let CUDA install itself in /usr/local.

Edit:

Thinking about that a bit longer… if you’re using sudo to run the command, you might have to put it all in one command line, something like this:

sudo sh -c "export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib64; ./deviceQuery"

Or something to that basic effect. The issue is that the command sudo runs needs an environment that looks in /usr/local/cuda/lib64 for the shared object libcudart.so.3, you see.

I hope that’s clear.

KWL
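A minimal sketch of the same idea, assuming the toolkit was installed under /usr/local/cuda as described above. The helper name cuda_lib_dir is made up for illustration. It picks lib64 when that directory exists and falls back to lib otherwise, which covers the 32-bit case in this thread; a 32-bit binary on a 64-bit install would still need lib chosen by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the CUDA library directory under a prefix.
# Prefers lib64 (64-bit installs) and falls back to lib (32-bit installs).
# The default prefix /usr/local/cuda is an assumption about the install.
cuda_lib_dir() {
    prefix="${1:-/usr/local/cuda}"
    if [ -d "$prefix/lib64" ]; then
        printf '%s\n' "$prefix/lib64"
    else
        printf '%s\n' "$prefix/lib"
    fi
}
```

It could then be used as: sudo env LD_LIBRARY_PATH="$(cuda_lib_dir)" ./deviceQuery. Going through env sets the variable for that one command only, so root’s own shell environment is left alone.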

Thanks! Now the library was found but I got the same CUDA Driver and Runtime version mismatch problem.

Jim

Well, right. Me too. That’s why I’m lurking about. I hope someone will explain what I’m missing, or, that I’ll find a post that will make sense out of the bits and pieces I’ve found so far.

Based on what I’ve read around here and elsewhere, it seems that the nvidia drivers that are packaged up for use on Linux distros, Fedora 13 x86_64 in my case, are not adequate to drive the CUDA interface. I tried using the nvidia drivers straight from Nvidia, but I’m afraid that hosed my system quite thoroughly, and it took me a couple of hours to get it all back together. I’ve also read that there may be an issue with the GCC 4.4 that ships with current Linux distros, and that one needs to revert to 4.1 or so. (But the compiler problem may only be an issue if the examples won’t compile, as opposed to won’t run; it is unclear to me at this point.)

(I’m running a pair of GeForce 210 cards in my Tyan S2895/K8WE workstation.)

I can say with some certainty that in my case SELinux isn’t in the way; I get the same behavior in both permissive and enforcing modes. It isn’t a permission problem either, because the root user has the same problems that my regular user does.

So, I think I need to figure out how to uninstall the packaged nvidia drivers and then get the “official” driver from Nvidia installed without crushing my system.

KWL

The basic idea is that the different CUDA toolkit releases (and so the runtime API library releases) are tied to minimum driver versions. The driver API, on the other hand, is self-contained: all the libraries needed to make it work are shipped with the driver itself (assuming the driver package you are using hasn’t had them “unbundled” or put in odd places). This also applies to OpenCL for driver releases with OpenCL support.

The relationship for CUDA toolkits on Linux goes something like this:

Cuda 3.1 - 256 series driver or newer

Cuda 3.0 - 195 series driver or newer

Cuda 2.3 - 190 series driver or newer

Cuda 2.2 - 185 series driver or newer

Cuda 2.1 - 180 series driver or newer
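The list above can be turned into a quick check. This is only a sketch: the function names are made up for illustration, and the table covers just the releases listed here. The driver version string (for example 256.40) is passed in by hand; on a machine with the NVIDIA kernel module loaded it can be read from /proc/driver/nvidia/version.

```shell
#!/bin/sh
# Minimum driver series for each CUDA toolkit release, as listed above.
# Prints nothing and returns 1 for releases outside that list.
min_driver_series() {
    case "$1" in
        3.1) echo 256 ;;
        3.0) echo 195 ;;
        2.3) echo 190 ;;
        2.2) echo 185 ;;
        2.1) echo 180 ;;
        *)   return 1 ;;
    esac
}

# Succeeds when a driver version such as "256.40" meets the minimum
# series for the given toolkit release; returns 2 for unknown releases.
driver_ok_for_toolkit() {
    driver_version="$1"
    toolkit="$2"
    required="$(min_driver_series "$toolkit")" || return 2
    series="${driver_version%%.*}"   # "256.40" -> "256"
    [ "$series" -ge "$required" ]
}
```

For example, driver_ok_for_toolkit 256.40 3.1 succeeds, while driver_ok_for_toolkit 190.53 3.1 fails because a 190 series driver is older than the 256 series that CUDA 3.1 needs.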

An additional layer of complexity is that gcc 4.4 support only appeared in CUDA 3.1, so if you are tied to an older driver/toolkit combination, you will need to install a “legacy” compiler (probably gcc 4.1 on Red Hat 5 and clones, or on Fedora) to get everything to work. I haven’t tried CUDA on a modern Fedora box, so I can’t really provide any specific advice about getting the official NVIDIA package installed, but in general you should drop out of X, remove all of the existing driver rpms (including any vestigial nv or nouveau installations if you have them), and then run the installer. If you had a perfectly good X11 configuration on the NVIDIA proprietary drivers before doing the install, just refuse the installer’s offer to set up X11 for you and it should just work.

As an aside, it is kind of funny to see a very familiar name (in relation to the travails of a certain Utah software company, now in Chapter 11 bankruptcy) pop up in a very unfamiliar place…

I found a strange solution (sorry for the misleading reply at 2F):

Run ./deviceQuery as root, and then run CUDA programs as a normal user.

It works but I have to “sudo ./deviceQuery” every time after reboot.

I think it’s a bug.
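One possible explanation, which is an assumption and not confirmed by anyone in this thread: when X is not running on the NVIDIA driver, nothing loads the kernel module or creates the /dev/nvidia* nodes at boot, and the first CUDA program run as root does both as a side effect. If that is the cause, a boot-time script run as root could stand in for the manual sudo ./deviceQuery. The sketch below only prints the commands so they can be reviewed first. The function name and the GPU count argument are made up for illustration; major number 195 and minor 255 for nvidiactl match the ls -l listing further down the thread.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands that would load the
# NVIDIA module and create the device nodes for the given number of GPUs.
# Assumes character-device major 195, with minor 255 for nvidiactl.
nvidia_node_commands() {
    gpu_count="$1"
    echo "modprobe nvidia"
    i=0
    while [ "$i" -lt "$gpu_count" ]; do
        echo "mknod -m 666 /dev/nvidia$i c 195 $i"
        i=$((i + 1))
    done
    echo "mknod -m 666 /dev/nvidiactl c 195 255"
}
```

Calling nvidia_node_commands 2 prints one modprobe line, two mknod lines for /dev/nvidia0 and /dev/nvidia1, and one for /dev/nvidiactl. Note that mknod fails on nodes that already exist, so a real startup script would need to check for them first.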

I am having this same mismatch problem. I have tried running deviceQuery both as a normal user and as root but both return

./deviceQuery Starting…
CUDA Device Query (Runtime API) version (CUDART static linking)
NVIDIA: could not open the device file /dev/nvidia0 (Input/output error).
cudaGetDeviceCount FAILED CUDA Driver and Runtime version may be mismatched.
FAILED
Press to Quit…

I am running Ubuntu 9.10. I have uninstalled all of the Ubuntu NVIDIA packages and installed the 256.40 dev driver. I am running toolkit version 3.1.

My machine is a Dell Studio XPS 13 with a 512MB nVidia GeForce 210M (which shows up as a 9400M, due to the lack of SLI support, I think).

My version of gcc is 4.4.1.

Any pointers to the solution would be much appreciated.

Thanks,

John.

I have exactly the same problem on my workstation, which has a GTX295 (dual GPUs) and a Tesla C1060, and therefore 3 CUDA devices. It’s on Ubuntu 9.10 (i386) as well. But I found that it is more a problem with the driver, as the system only recognizes one of the two GTX295 cores and the C1060, while the other GTX295 core is missing. It works fine when I remove the Tesla C1060 and leave only the GTX in the machine.

CUDA has nothing to do with SLI, and you don’t need to link cards up through SLI in order to use CUDA devices concurrently. So on your system you should be able to see both the 9400M and your 210M, unless you have hybrid-SLI power saving turned on (you should find that setting somewhere in the BIOS, I guess).

Note that I have absolutely no problem with the same hardware configuration on Fedora 10 64-bit.

Thanks Brian,

Yes, it is hybrid SLI. I simply tried to mention everything I could think of in case any of it was relevant. Both seem to be visible:

02:00.0 VGA compatible controller: nVidia Corporation Device 0a74 (rev a2)

03:00.0 VGA compatible controller: nVidia Corporation C79 [GeForce 9400M G] (rev b1)

I’m clueless at this point as to what the problem is.

ls -l /dev/nvidia*

crw-rw-rw- 1 root root 195, 0 2010-07-28 14:08 /dev/nvidia0

crw-rw-rw- 1 root root 195, 1 2010-07-28 14:08 /dev/nvidia1

crw-rw-rw- 1 root root 195, 255 2010-07-28 14:08 /dev/nvidiactl

So there should be no problem with the permissions.

??

Have you looked to see if SELinux is getting invoked? You can try, as root, running setenforce 0 and then trying again.

KWL
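Before toggling SELinux it may help to check whether it is active at all; a stock Ubuntu install, for instance, may not ship the SELinux tools. A small sketch, assuming the standard getenforce utility; the function name selinux_mode is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report the SELinux mode, or say so when the
# getenforce tool is not installed on this system.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce    # prints Enforcing, Permissive, or Disabled
    else
        echo "getenforce not found"
    fi
}
```

If selinux_mode reports Enforcing, running setenforce 0 as root switches to permissive mode until the next reboot. If it reports anything else, SELinux is unlikely to be what blocks /dev/nvidia0.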

That is a problem in itself. There has never been, to the best of my knowledge, any Hybrid SLI support in the Linux drivers. So you will probably have to disable it at the BIOS level if you have not already done so.

I have the same problem on Ubuntu 10.04 (CUDA 3.1, devdriver 256, gcc 4.4.3), but only if I compile my CUDA code as 32-bit; compiled as 64-bit, everything is OK. I would still prefer 32-bit, though: I don’t have any Fermi card with more than 4 GB, and compiling as 64-bit results in higher register usage and, for my application, lower performance.

Alexis.

Hi,

I had the same problem on SLES 11. First I used the nvidia driver from the nvidia repository and installed it via zypper. Then I got the same error as you. I uninstalled that driver, used the dev driver from the CUDA download page, and everything went smoothly.

HtH

Bernd
