After our last kernel and CUDA update, we are left with mismatched driver/library versions.
rpm -qa | grep cuda | grep runtime
cuda-runtime-6-5-6.5-14.x86_64
cuda-runtime-10-0-10.0.130-1.x86_64
Red Hat Enterprise Linux Server release 6.10 (Santiago)
Kernel 2.6.32-754.9.1.el6.x86_64 on an x86_64
nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
nvidia-bug-report.log.gz (2.57 MB)
A 340 legacy driver was installed over the current 410 driver, so you now have a half-410/half-340 setup, which doesn't work. Please uninstall both and then reinstall the 410 driver. Don't forget to run dracut -f to clean the initrd.
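For anyone checking for the same kind of mix, here is a minimal sketch that lists which CUDA release series show up among the installed cuda-runtime packages. The function name cuda_runtime_series is made up for illustration, and the sketch assumes package names look like the ones in the rpm output above. It only inspects names, so it hints at a mixed install but says nothing about which driver kernel module is actually loaded.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct CUDA release series (e.g. 6-5, 10-0)
# found among cuda-runtime package names read from stdin.
cuda_runtime_series() {
    sed -n 's/^cuda-runtime-\([0-9][0-9]*-[0-9][0-9]*\)-.*/\1/p' | sort -u
}

# Example using the two package names from the original post.
printf '%s\n' \
    'cuda-runtime-6-5-6.5-14.x86_64' \
    'cuda-runtime-10-0-10.0.130-1.x86_64' | cuda_runtime_series
```

On a live system the input would come from rpm -qa | cuda_runtime_series. More than one line of output suggests two toolkit generations are installed side by side.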
Thank you, everything is working fine now:
To fix the problem:
Uninstalled CUDA 6-5:
yum remove cuda-license-6-5-6.5-14.x86_64 (removed most 6-5 rpms)
Uninstalled CUDA 10-0:
yum remove cuda-license-10-0-10.0.130-1.x86_64 (removed most 10-0 rpms)
yum remove cuda-xxx.rpm ; remove any CUDA rpms that were left over
rpm -qa | grep cuda ; to check for leftover CUDA rpms
yum install cuda-repo-rhel6-10.0.130-1.x86_64.rpm
yum install cuda
dracut -f ; to clean up initrd
Rebooted
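The steps above can be sketched as a small script. This is only an outline under some assumptions: the run wrapper, the DRY_RUN variable and the reinstall_cuda function are names invented here, the package names are the ones from this thread, and the repo rpm is assumed to be in the current directory. By default it only prints the commands. The leftover check (rpm -qa | grep cuda) and the reboot are left as manual steps.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command.
# Set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reinstall_cuda() {
    # Remove both CUDA generations (the license package pulls most rpms with it).
    run yum remove cuda-license-6-5-6.5-14.x86_64
    run yum remove cuda-license-10-0-10.0.130-1.x86_64
    # Reinstall CUDA 10.0 from the local repo rpm.
    run yum install cuda-repo-rhel6-10.0.130-1.x86_64.rpm
    run yum install cuda
    # Rebuild the initrd so it no longer carries stale driver modules.
    run dracut -f
}

reinstall_cuda
```

Reviewing the printed plan first seemed safer than running yum remove unattended, which is why dry-run is the default here.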
Looks good. nvidia-smi is working.
rpm -qa | grep cuda
cuda-cusolver-10-0-10.0.130-1.x86_64
cuda-samples-10-0-10.0.130-1.x86_64
cuda-tools-10-0-10.0.130-1.x86_64
cuda-repo-rhel6-10.0.130-1.x86_64
cuda-cublas-dev-10-0-10.0.130-1.x86_64
cuda-libraries-dev-10-0-10.0.130-1.x86_64
cuda-runtime-10-0-10.0.130-1.x86_64
cuda-cudart-10-0-10.0.130-1.x86_64
cuda-curand-10-0-10.0.130-1.x86_64
cuda-nvtx-10-0-10.0.130-1.x86_64
cuda-driver-dev-10-0-10.0.130-1.x86_64
cuda-nvgraph-dev-10-0-10.0.130-1.x86_64
cuda-npp-dev-10-0-10.0.130-1.x86_64
cuda-nvprof-10-0-10.0.130-1.x86_64
cuda-gdb-10-0-10.0.130-1.x86_64
cuda-memcheck-10-0-10.0.130-1.x86_64
cuda-10-0-10.0.130-1.x86_64
cuda-license-10-0-10.0.130-1.x86_64
cuda-nvgraph-10-0-10.0.130-1.x86_64
cuda-npp-10-0-10.0.130-1.x86_64
cuda-cuobjdump-10-0-10.0.130-1.x86_64
cuda-nvvp-10-0-10.0.130-1.x86_64
cuda-compiler-10-0-10.0.130-1.x86_64
cuda-demo-suite-10-0-10.0.130-1.x86_64
cuda-nvml-dev-10-0-10.0.130-1.x86_64
cuda-cufft-dev-10-0-10.0.130-1.x86_64
cuda-nvjpeg-dev-10-0-10.0.130-1.x86_64
cuda-nvcc-10-0-10.0.130-1.x86_64
cuda-nsight-10-0-10.0.130-1.x86_64
cuda-command-line-tools-10-0-10.0.130-1.x86_64
cuda-nvdisasm-10-0-10.0.130-1.x86_64
cuda-cufft-10-0-10.0.130-1.x86_64
cuda-nvjpeg-10-0-10.0.130-1.x86_64
cuda-misc-headers-10-0-10.0.130-1.x86_64
cuda-libraries-10-0-10.0.130-1.x86_64
cuda-gpu-library-advisor-10-0-10.0.130-1.x86_64
cuda-10.0.130-1.x86_64
cuda-cusparse-dev-10-0-10.0.130-1.x86_64
cuda-cusolver-dev-10-0-10.0.130-1.x86_64
cuda-nvrtc-dev-10-0-10.0.130-1.x86_64
cuda-documentation-10-0-10.0.130-1.x86_64
cuda-visual-tools-10-0-10.0.130-1.x86_64
cuda-toolkit-10-0-10.0.130-1.x86_64
cuda-cusparse-10-0-10.0.130-1.x86_64
cuda-nvrtc-10-0-10.0.130-1.x86_64
cuda-nsight-compute-10-0-10.0.130-1.x86_64
cuda-cudart-dev-10-0-10.0.130-1.x86_64
cuda-curand-dev-10-0-10.0.130-1.x86_64
cuda-nvprune-10-0-10.0.130-1.x86_64
cuda-cublas-10-0-10.0.130-1.x86_64
cuda-cupti-10-0-10.0.130-1.x86_64
cuda-drivers-410.79-1.x86_64
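As a follow-up check, a minimal sketch for pulling the driver version out of the cuda-drivers package name, so it can be compared against what the running system reports. The function name cuda_drivers_version is hypothetical, and the parsing assumes the name format shown in the list above.

```shell
#!/bin/sh
# Hypothetical helper: extract the driver version from a cuda-drivers
# package name read from stdin (e.g. cuda-drivers-410.79-1.x86_64).
cuda_drivers_version() {
    sed -n 's/^cuda-drivers-\([0-9.]*\)-.*/\1/p'
}

# Example using the package name from the list above; prints 410.79
printf 'cuda-drivers-410.79-1.x86_64\n' | cuda_drivers_version
```

On a live system this would be fed from rpm -qa | cuda_drivers_version. The result should match the version shown by nvidia-smi and the loaded kernel module version in /proc/driver/nvidia/version. If they differ, the mismatch is probably back.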