Cuda 6.5 deviceQuery fails even though Nvidia driver for Tesla is properly installed

steelpeart · August 31, 2014, 4:42am

We have an HP SL250s Gen8 with two Tesla M2090 GPU blades installed. This is running RHEL6.5 but the site that installed the base OS updated their kernel via RHN to the following:

2.6.32-431.23.3.el6.x86_64

The driver was installed after the kernel update and is the latest available Tesla driver - just downloaded today.

CUDA was installed via the “yum” repo method using cuda-repo-rhel6-6.5-14.x86_64.rpm, and no errors were noted at installation time.

After we installed the samples and ran “make”, we tried our usual deviceQuery, only to get this output:

[root@mn318 release]# ./deviceQuery
./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

FATAL: Could not open ‘/lib/modules/2.6.32-431.23.3.el6.x86_64/kernel/drivers/video/nvidia-uvm.ko’: No such file or directory
cudaGetDeviceCount returned 30
→ unknown error
Result = FAIL

In the aforementioned directory, only “nvidia” is present.

Any ideas on this one? Many thanks in advance!
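For anyone who wants to repeat the directory check described above, here is a minimal sketch in Python. It assumes the modules live under kernel/drivers/video for the running kernel, as the error message suggests; the function name find_nvidia_modules and its parameters are hypothetical and only for illustration.

```python
import os
import platform


def find_nvidia_modules(kernel_release=None, modules_root="/lib/modules"):
    """Return the sorted nvidia*.ko file names found for one kernel release.

    Assumption: the driver places its modules in kernel/drivers/video,
    which is the path shown in the deviceQuery error above.
    """
    release = kernel_release or platform.release()
    video_dir = os.path.join(modules_root, release, "kernel", "drivers", "video")
    if not os.path.isdir(video_dir):
        return []
    return sorted(
        name
        for name in os.listdir(video_dir)
        if name.startswith("nvidia") and name.endswith(".ko")
    )


if __name__ == "__main__":
    found = find_nvidia_modules()
    print("nvidia modules found:", found)
    if "nvidia-uvm.ko" not in found:
        print("nvidia-uvm.ko is missing for kernel", platform.release())
```

If nvidia-uvm.ko is absent from the list, that matches the symptom in this post: the installed driver did not provide the module that the CUDA 6.5 runtime tries to load.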

Robert_Crovella · August 31, 2014, 12:48pm

Your driver install is broken. It’s not clear which GPU driver you installed. “is the latest available Tesla driver - just downloaded today” doesn’t tell me anything. I don’t know where you downloaded it from or which driver it was. Also, last time I checked, there are no standalone Tesla drivers available yet that support CUDA 6.5 - you have to use the one that comes with the CUDA 6.5 runfile installer (340.29). (I tried using the driver wizard just now on nvidia.com to select the “latest” 64-bit Linux driver for M2090, and I got 331.89, which is not acceptable for CUDA 6.5). I suggest downloading the 64-bit Linux CUDA 6.5 runfile installer and using that.

Note the first question in the FAQ here:

"Q: Are the latest NVIDIA drivers included in the CUDA Toolkit installers?
A: For convenience, the installer packages on this page include NVIDIA drivers which support application development for all CUDA-capable GPUs supported by this release of the CUDA Toolkit. If you are deploying applications on NVIDIA Tesla products in a server or cluster environment, please use the latest recommended Tesla driver that has been qualified for use with this version of the CUDA Toolkit. If a recommended Tesla driver is not yet available, please check back in a few weeks. "

So use this runfile installer:

http://developer.download.nvidia.com/compute/cuda/6_5/rel/installers/cuda_6.5.14_linux_64.run

and its included driver, until a proper (340.xx or higher) driver is separately available for Tesla products.

steelpeart · September 1, 2014, 6:00am

Indeed, the older driver you just mentioned is the one I’m trying to run. I will try installing using the *.run file and including the driver as well.

The procedure I used was to first remove cuda the way I installed it, using yum remove cuda. From there, I tried doing the *.run script, but there were some complaints about the earlier driver. The log file included a way to remove the driver components using yum remove commands.

Once those were run, I then re-ran the cuda installation script and included the driver.

nvidia-smi -p reports that I am now using 340.29.

Robert_Crovella · September 1, 2014, 12:33pm

I can’t tell if you’re still having trouble.

Is deviceQuery working now?

CUDA 6.5 requires a proper install of a 340.xx or newer driver.
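As a rough illustration of that requirement, the sketch below compares a driver version string against the 340 minimum stated in this thread. The helper names and the MIN_DRIVER_FOR_CUDA_6_5 constant are hypothetical, and the version string is assumed to be dot-separated integers such as 340.29.

```python
# Minimum driver branch for CUDA 6.5, taken from the statement above
# ("340.xx or newer"). The constant name is hypothetical.
MIN_DRIVER_FOR_CUDA_6_5 = (340,)


def parse_driver_version(text):
    """Split a version string such as '340.29' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports_cuda_6_5(version_text):
    """Return True if the driver version is 340.xx or newer."""
    return parse_driver_version(version_text) >= MIN_DRIVER_FOR_CUDA_6_5


if __name__ == "__main__":
    for version in ("331.89", "340.29"):
        verdict = "ok" if driver_supports_cuda_6_5(version) else "too old"
        print(version, verdict)
```

With the two versions mentioned in this thread, 331.89 from the download wizard fails the check and 340.29 from the runfile installer passes it.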

steelpeart · September 3, 2014, 2:35am

Sorry - late reply due to U.S. holiday:

YES, and thank you very much. This was indeed the one and only problem: the ‘latest driver’ available via the download wizard wasn’t as new as the driver embedded in the cuda*.run installation file!

A surprising problem, really, and unexpected. I would think that the latest drivers need to also make it into the standalone download wizard without the need to install Cuda.

However, everything is now working as expected, and the samples are built and running well.
