I have a CentOS 8 system with an NVIDIA A10 device. At first, I successfully installed the NVIDIA driver with CUDA Toolkit version 12.6, and the command “nvidia-smi” worked. I then wanted to switch to another driver version, so I removed the installed driver, downloaded nvidia-driver-local-repo-rhel8-550.54.15-1.0-1.x86_64.rpm, and followed the driver installation instructions. Now “nvidia-smi” does not work; it reports “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running”. I checked the system log; here are some of the messages:
" kernel: NVRM: The NVIDIA GPU 0000:00:06.0 (PCI ID: 10de:2236)#012NVRM: installed in this system is not supported by the#012NVRM: NVIDIA 470.256.02 driver release.#012NVRM: Please see 'Appendix A - Supported NVIDIA GPU Products'#012NVRM: in this release's README, available on the operating system#012NVRM: specific graphics driver download page at www.nvidia.com."
I don’t know what the reason behind this issue is. I downloaded the driver package based on my graphics card model.
Hi @cyflhn, welcome to the NVIDIA developer forums.
It looks like something on your system, possibly an automatic process, fell back to an older driver version.
Can you run sudo nvidia-bug-report.sh
and attach the resulting output here?
Thanks!
nvidia-bug-report (1).log.gz (389.8 KB)
Here is nvidia bug report output. @MarkusHoHo
It seems you still have residual files from (by now) 4 different drivers on your system. I see log entries for
- 470.103.01
- 470.256.02
- 525.147.05
- 560.35.03
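For anyone wanting to produce a list like this themselves, here is a minimal sketch. The function name `list_nvidia_versions` is made up for this example, and it assumes driver versions always look like three dot-separated number groups (e.g. 470.256.02), so it may occasionally pick up unrelated numbers on lines that mention NVIDIA.

```shell
# Hypothetical helper: print the distinct NVIDIA driver versions found in
# log text read from stdin. Assumes versions look like 470.256.02.
list_nvidia_versions() {
    # Keep only lines that mention the driver, then pull out version strings.
    grep -i -E 'nvidia|NVRM' \
        | grep -o -E '[0-9]{3}\.[0-9]{2,3}\.[0-9]{2}' \
        | sort -u
}
```

It could be fed from the kernel log with `journalctl -k | list_nvidia_versions`, or from the unpacked bug report with `gunzip -c nvidia-bug-report.log.gz | list_nvidia_versions`.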
First of all, in the current situation you should make sure to completely remove all traces of the NVIDIA drivers.
I don’t know how to do that on RedHat, but on Debian it would be along the lines of:
sudo apt purge '^nvidia-.*' -y
sudo apt purge '^libnvidia-.*' -y
sudo apt autoremove
followed by a reboot.
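On an RPM-based system such as CentOS 8, the equivalent might look roughly like the sketch below. This is untested on the system in question: `purge_nvidia_rhel` and the `APPLY` variable are names invented for this example, and the package patterns are assumptions that should be checked against `rpm -qa | grep -i nvidia` first. By default it only prints the commands; set `APPLY=1` to actually run them.

```shell
# Sketch of an RPM-based cleanup. Prints the commands unless APPLY=1 is set.
# purge_nvidia_rhel and APPLY are hypothetical names used only in this example.
purge_nvidia_rhel() {
    run() {
        if [ "${APPLY:-0}" = "1" ]; then
            "$@"
        else
            echo "+ $*"
        fi
    }
    # Remove the driver module stream first, then any leftover packages.
    run sudo dnf module remove --all nvidia-driver
    run sudo dnf module reset nvidia-driver
    # Package name patterns are assumptions; adjust to what rpm -qa reports.
    run sudo dnf remove 'nvidia-*' 'kmod-nvidia*'
    run sudo dnf autoremove
}
```

Calling `purge_nvidia_rhel` shows what would be executed, and `APPLY=1` makes it take effect. If one of the earlier drivers was installed from a .run file rather than from packages, dnf will not know about it; in that case the `nvidia-uninstall` tool shipped with the runfile installer is the usual way to remove it. Reboot afterwards, as above.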
And then you should check again which driver to use. If I look for A10 on RedHat8 I find this: