nvidia-smi: Failed to initialize NVML: Driver/library version mismatch

After our last kernel and CUDA update, we are left with mismatched driver/library versions.

rpm -qa | grep cuda | grep runtime

cuda-runtime-6-5-6.5-14.x86_64
cuda-runtime-10-0-10.0.130-1.x86_64

Red Hat Enterprise Linux Server release 6.10 (Santiago)
Kernel 2.6.32-754.9.1.el6.x86_64 on an x86_64

nvidia-smi

Failed to initialize NVML: Driver/library version mismatch

nvidia-bug-report.log.gz (2.57 MB)

A 340 legacy driver was installed over the current 410 driver, so now you have a half 410/half 340 setup, which doesn’t work. Please uninstall both and then reinstall the 410 driver. Don’t forget to run dracut -f afterwards to regenerate the initrd.
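For anyone who wants to confirm this kind of half-and-half state before reinstalling, here is a minimal sketch that compares the nvidia kernel module currently loaded with the one installed on disk. The function names are made up for illustration, and it assumes the NVRM line in /proc/driver/nvidia/version carries the driver version as its first dotted number. It only looks at the kernel module side, so treat a match as a hint rather than proof that the userspace libraries agree as well.

```shell
#!/bin/sh
# Hypothetical helper: print the first dotted version number
# (e.g. 410.79) found in the text read from stdin.
driver_version_from() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Hypothetical helper: compare the module loaded in the running kernel
# with the module file that modinfo finds on disk.
check_nvidia_module() {
    loaded=""
    if [ -r /proc/driver/nvidia/version ]; then
        # Assumption: the NVRM line holds the loaded module's version.
        loaded="$(grep NVRM /proc/driver/nvidia/version | driver_version_from)"
    fi
    on_disk="$(modinfo -F version nvidia 2>/dev/null)"

    echo "loaded module : ${loaded:-not loaded}"
    echo "module on disk: ${on_disk:-not found}"

    # Succeeds only when both are present and identical.
    [ -n "$loaded" ] && [ "$loaded" = "$on_disk" ]
}
```

Calling check_nvidia_module after the reinstall and reboot should print the same version twice; if the two differ, an old module is probably still being loaded, for example from a stale initrd.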

Thank you, everything is working fine now:

To fix the problem:

Uninstalled CUDA 6.5:

yum remove cuda-license-6-5-6.5-14.x86_64 (removed most of the 6-5 rpms)

Uninstalled CUDA 10.0:

yum remove cuda-license-10-0-10.0.130-1.x86_64 (removed most of the 10-0 rpms)

yum remove cuda-xxx.rpm ; removed any CUDA rpms that were left over

rpm -qa | grep cuda ; to check for leftover cuda rpms

yum install cuda-repo-rhel6-10.0.130-1.x86_64.rpm

yum install cuda

dracut -f ; to clean up initrd

Rebooted

Looks good. nvidia-smi is working.
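In case it helps someone else, the main commands from the steps above can be collected into a small dry-run script. This is only a sketch: the run and fix_plan names are invented here, the package names are the ones from this particular machine, and the manual cleanup of leftover rpms is left out because it depends on what rpm -qa reports. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 (as root) to really run the commands.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# The sequence used in this thread; package names are specific
# to this RHEL 6 box and will differ elsewhere.
fix_plan() {
    run yum remove cuda-license-6-5-6.5-14.x86_64 &&
    run yum remove cuda-license-10-0-10.0.130-1.x86_64 &&
    run yum install cuda-repo-rhel6-10.0.130-1.x86_64.rpm &&
    run yum install cuda &&
    run dracut -f &&
    run reboot
}
```

Running fix_plan as-is lists the six commands in order, which makes it easy to review them before executing anything. Each step is chained with &&, so with DRY_RUN=0 the sequence stops at the first command that fails.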

rpm -qa | grep cuda
cuda-cusolver-10-0-10.0.130-1.x86_64
cuda-samples-10-0-10.0.130-1.x86_64
cuda-tools-10-0-10.0.130-1.x86_64
cuda-repo-rhel6-10.0.130-1.x86_64
cuda-cublas-dev-10-0-10.0.130-1.x86_64
cuda-libraries-dev-10-0-10.0.130-1.x86_64
cuda-runtime-10-0-10.0.130-1.x86_64
cuda-cudart-10-0-10.0.130-1.x86_64
cuda-curand-10-0-10.0.130-1.x86_64
cuda-nvtx-10-0-10.0.130-1.x86_64
cuda-driver-dev-10-0-10.0.130-1.x86_64
cuda-nvgraph-dev-10-0-10.0.130-1.x86_64
cuda-npp-dev-10-0-10.0.130-1.x86_64
cuda-nvprof-10-0-10.0.130-1.x86_64
cuda-gdb-10-0-10.0.130-1.x86_64
cuda-memcheck-10-0-10.0.130-1.x86_64
cuda-10-0-10.0.130-1.x86_64
cuda-license-10-0-10.0.130-1.x86_64
cuda-nvgraph-10-0-10.0.130-1.x86_64
cuda-npp-10-0-10.0.130-1.x86_64
cuda-cuobjdump-10-0-10.0.130-1.x86_64
cuda-nvvp-10-0-10.0.130-1.x86_64
cuda-compiler-10-0-10.0.130-1.x86_64
cuda-demo-suite-10-0-10.0.130-1.x86_64
cuda-nvml-dev-10-0-10.0.130-1.x86_64
cuda-cufft-dev-10-0-10.0.130-1.x86_64
cuda-nvjpeg-dev-10-0-10.0.130-1.x86_64
cuda-nvcc-10-0-10.0.130-1.x86_64
cuda-nsight-10-0-10.0.130-1.x86_64
cuda-command-line-tools-10-0-10.0.130-1.x86_64
cuda-nvdisasm-10-0-10.0.130-1.x86_64
cuda-cufft-10-0-10.0.130-1.x86_64
cuda-nvjpeg-10-0-10.0.130-1.x86_64
cuda-misc-headers-10-0-10.0.130-1.x86_64
cuda-libraries-10-0-10.0.130-1.x86_64
cuda-gpu-library-advisor-10-0-10.0.130-1.x86_64
cuda-10.0.130-1.x86_64
cuda-cusparse-dev-10-0-10.0.130-1.x86_64
cuda-cusolver-dev-10-0-10.0.130-1.x86_64
cuda-nvrtc-dev-10-0-10.0.130-1.x86_64
cuda-documentation-10-0-10.0.130-1.x86_64
cuda-visual-tools-10-0-10.0.130-1.x86_64
cuda-toolkit-10-0-10.0.130-1.x86_64
cuda-cusparse-10-0-10.0.130-1.x86_64
cuda-nvrtc-10-0-10.0.130-1.x86_64
cuda-nsight-compute-10-0-10.0.130-1.x86_64
cuda-cudart-dev-10-0-10.0.130-1.x86_64
cuda-curand-dev-10-0-10.0.130-1.x86_64
cuda-nvprune-10-0-10.0.130-1.x86_64
cuda-cublas-10-0-10.0.130-1.x86_64
cuda-cupti-10-0-10.0.130-1.x86_64
cuda-drivers-410.79-1.x86_64
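A quick way to double-check a list like the one above is to pull out the release tag (6-5, 10-0, ...) that the versioned packages carry in their names and see how many distinct tags remain. The sketch below assumes the cuda-NAME-MAJOR-MINOR-VERSION naming seen in this thread; cuda_release_tags is a name invented for the example. Packages without a release tag, such as cuda-drivers or cuda-repo-rhel6, are simply ignored.

```shell
#!/bin/sh
# Print the distinct CUDA release tags (such as 10-0) found in the
# package names read from stdin, one per line.
cuda_release_tags() {
    sed -nE 's/^cuda-(.*-)?([0-9]+-[0-9]+)-[0-9].*$/\2/p' | sort -u
}
```

Used as rpm -qa | cuda_release_tags, a clean system should print a single line. Seeing more than one tag means packages from an older release are still installed.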