"Error: incompatible CUDA driver version" with nvprof, nvvp (RPM CUDA installation)

When I try to run my application with nvprof, I get the error:

======== Error: incompatible CUDA driver version.

I’ve installed CUDA 5.5, 6.5, and 7.0 RC on this system at different times. When I go back to CUDA 5.5 and rebuild my application, nvprof works. However, I get the error with the other CUDA versions. My application executes just fine on its own; only nvprof complains.

Details about my system:

  • OS: RHEL 6.5
  • CUDA installed using RPM packages in all cases
  • GTX 580, GTX 680, and GTX 980 GPUs

Each time I uninstall CUDA, then reinstall a new version, I’m very careful to remove all traces of the old installation, then reboot, run ldconfig, and rebuild my application. The nvprof I’m using in each case is the one installed with CUDA (I don’t have a CUDA 5.5 nvprof sitting on the system somewhere).

I also tried installing CUDA 6.5 (and driver) using the .run file installation; nvprof worked. Then I proceeded to uninstall CUDA (and driver), and reinstall CUDA 6.5 using the RPM packages; nvprof had the error again.

To summarize:

  • CUDA 5.5, RPM installation: works
  • CUDA 6.5, RPM installation: error
  • CUDA 6.5, .run installation: works
  • CUDA 7.0 RC, RPM installation: error

I use CUDA RPMs because I need my development system to match our production systems.

Any ideas about what could be wrong? Or is this a bug in the RPM version of CUDA?
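
Since the error mentions the driver version, one quick check is which CUDA version the driver library itself reports. Below is a minimal sketch, assuming Python is available and the driver library can be loaded as libcuda.so.1 (the library name and both helper names are assumptions for illustration). cuDriverGetVersion is the driver API call that returns the version as 1000*major + 10*minor.

```python
import ctypes


def decode_cuda_version(value):
    """Split CUDA's integer version (1000*major + 10*minor) into (major, minor)."""
    return value // 1000, (value % 1000) // 10


def driver_cuda_version(library="libcuda.so.1"):
    """Ask the installed driver library which CUDA version it supports.

    The default library name is an assumption; adjust it for your system.
    """
    lib = ctypes.CDLL(library)  # raises OSError if the library cannot be loaded
    version = ctypes.c_int(0)
    status = lib.cuDriverGetVersion(ctypes.byref(version))
    if status != 0:  # 0 is CUDA_SUCCESS
        raise RuntimeError("cuDriverGetVersion failed with CUresult %d" % status)
    return decode_cuda_version(version.value)
```

Calling driver_cuda_version() on the affected machine and comparing the result with the toolkit version that nvprof came from shows whether the driver is older than the toolkit. It does not by itself explain the nvprof error.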

For the CUDA 6.5 RPM installation that fails, are you following the package manager installation instructions in the Linux Getting Started Guide exactly:

http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux/index.html#redhat-installation

For the step that involves installation of the repository meta-data, are you downloading the correct files from the NVIDIA website:

https://developer.nvidia.com/cuda-downloads-geforce-gtx9xx

i.e. this (for GTX 980):

http://developer.download.nvidia.com/compute/cuda/6_5/rel/installers/cuda-repo-rhel6-6-5-prod-6.5-19.x86_64.rpm

My suspicion is that your RPM installation is using some repository other than the correct one.

After a lot of troubleshooting, the solution was really simple: I needed to install all of the x86_64 RPM packages. I’d been doing a minimal CUDA installation of only the absolutely necessary packages, relying on the dependencies encoded in the RPMs to tell me which ones I needed (in a product that’s going to be regulated by the FDA, you get into the habit of avoiding installation of any software you don’t need). Apparently nvprof has an unlisted dependency on one or more of the packages I had skipped. After adding the following packages, it worked:

  • cuda-7.0-18.x86_64
  • cuda-7-0-7.0-18.x86_64
  • cuda-documentation-7-0-7.0-18.x86_64
  • cuda-drivers-346.29-0.x86_64
  • cuda-minimal-build-7-0-7.0-18.x86_64
  • cuda-nvidia-kmod-common-346.29-0.x86_64
  • cuda-runtime-7-0-7.0-18.x86_64
  • cuda-samples-7-0-7.0-18.x86_64
  • cuda-toolkit-7-0-7.0-18.x86_64
  • gpu-deployment-kit-346.29-0.x86_64
  • xorg-x11-drv-nvidia-devel-346.29-1.el6.x86_64
  • xorg-x11-drv-nvidia-gl-346.29-1.el6.x86_64
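
To check another machine against the list above, here is a minimal sketch in Python. It assumes `rpm -qa` prints packages in the same name-version-release.arch form as the list. The names REQUIRED, missing_packages, and installed_rpms are made up for illustration.

```python
import subprocess

# Package names copied from the list above (CUDA 7.0 RPMs, driver 346.29).
REQUIRED = [
    "cuda-7.0-18.x86_64",
    "cuda-7-0-7.0-18.x86_64",
    "cuda-documentation-7-0-7.0-18.x86_64",
    "cuda-drivers-346.29-0.x86_64",
    "cuda-minimal-build-7-0-7.0-18.x86_64",
    "cuda-nvidia-kmod-common-346.29-0.x86_64",
    "cuda-runtime-7-0-7.0-18.x86_64",
    "cuda-samples-7-0-7.0-18.x86_64",
    "cuda-toolkit-7-0-7.0-18.x86_64",
    "gpu-deployment-kit-346.29-0.x86_64",
    "xorg-x11-drv-nvidia-devel-346.29-1.el6.x86_64",
    "xorg-x11-drv-nvidia-gl-346.29-1.el6.x86_64",
]


def missing_packages(required, installed):
    """Return the required package names that are absent from installed."""
    installed_set = set(installed)
    return [name for name in required if name not in installed_set]


def installed_rpms():
    """Return every installed package as reported by `rpm -qa`."""
    output = subprocess.check_output(["rpm", "-qa"], universal_newlines=True)
    return output.split()
```

Running missing_packages(REQUIRED, installed_rpms()) on the target system returns whichever of these packages are not installed. An empty list means all twelve are present.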

Thanks, txbob, for your advice.