I’ve tried installing nvidia-driver-455 on my server (Nvidia P5000), both from the official repos and manually, without success. Any help that saves me from forcing an OS rollback is much appreciated!
System Details:
Description: Ubuntu 18.04.5 LTS
uname -a yields Linux cs-bric1 5.4.0-58-generic #64~18.04.1-Ubuntu SMP Wed Dec 9 17:11:11 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
cat /proc/version gives Linux version 5.4.0-58-generic (buildd@lgw01-amd64-040) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #64~18.04.1-Ubuntu SMP Wed Dec 9 17:11:11 UTC 2020
Dec 20 00:09:19 cs-bric1 kernel: [14121567.982463] NVRM: API mismatch: the client has the version 455.45.01, but
Dec 20 00:09:19 cs-bric1 kernel: [14121567.982463] NVRM: this kernel module has the version 435.21. Please
Dec 20 00:09:19 cs-bric1 kernel: [14121567.982463] NVRM: make sure that this kernel module and all NVIDIA driver
Dec 20 00:09:19 cs-bric1 kernel: [14121567.982463] NVRM: components have the same version.
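For anyone reading along, the two versions in that message are the key detail: the user-space client is at 455.45.01 while the loaded kernel module is still 435.21. A minimal sketch for pulling both numbers out of the kernel log follows. The function name `nvrm_mismatch_versions` is made up for illustration, and it assumes the message wording matches the lines above.

```shell
#!/usr/bin/env bash
# Read kernel log text on stdin and print the client and kernel-module
# versions from an NVRM "API mismatch" message.
# (Hypothetical helper; assumes the message wording shown in this thread.)
nvrm_mismatch_versions() {
    local log client module
    log=$(cat)
    # Version after "client has the version", without trailing punctuation.
    client=$(printf '%s\n' "$log" \
        | sed -n 's/.*client has the version \([0-9.]*[0-9]\).*/\1/p' \
        | head -n 1)
    # Version after "kernel module has the version".
    module=$(printf '%s\n' "$log" \
        | sed -n 's/.*kernel module has the version \([0-9.]*[0-9]\).*/\1/p' \
        | head -n 1)
    printf 'client=%s module=%s\n' "$client" "$module"
}
```

On a live system it could be fed with something like `dmesg | grep NVRM | nvrm_mismatch_versions` (reading `dmesg` may need sudo).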
Please cleanly uninstall and then reinstall the driver.
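A rough sketch of what a clean uninstall and reinstall could look like on Ubuntu is below. This is an assumption about the usual sequence, not an official procedure: the helper names `run` and `clean_reinstall` are made up, the package name and the optional .run file path are whatever you pass in, and it only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Print each command (DRY_RUN=1, the default) or actually execute it (DRY_RUN=0).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# clean_reinstall PACKAGE [RUNFILE]
# Hypothetical helper: removes a manual (.run) install if a runfile is given,
# purges the packaged driver, then installs PACKAGE from the repos.
clean_reinstall() {
    local pkg runfile
    pkg=$1
    runfile=${2:-}
    if [ -n "$runfile" ]; then
        # Remove files placed by the manual installer.
        run sudo sh "$runfile" --uninstall
    fi
    # Remove every packaged nvidia-* component, then orphaned dependencies.
    run sudo apt-get purge -y '^nvidia-.*'
    run sudo apt-get autoremove -y
    # Install one driver version from the repos and reboot so the
    # matching kernel module gets loaded.
    run sudo apt-get install -y "$pkg"
    run sudo reboot
}
```

For example, `clean_reinstall nvidia-driver-450` prints the plan; review it, then rerun with `DRY_RUN=0` if it looks right for your system.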
I ran sudo apt purge ^nvidia and only installed nvidia-utils-435 to get this bug report (google drive).
Next I installed nvidia-driver-450 from the official repos and ran the bug report again (google drive).
Before this I also ran sudo ./NVIDIA-Linux-x86_64-450.80.02.run --uninstall to remove any manually installed traces.
nvidia-smi still fails with NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running
Is there any other way to make sure the kernel, OS, and driver versions all match?
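One way to check this, sketched under some assumptions: compare the version of the nvidia module that is currently loaded with the version of the module installed on disk for the running kernel. The function names are made up for illustration, and `/sys/module/nvidia/version` is only present while the module is loaded.

```shell
#!/usr/bin/env bash
# Print "match" if all given version strings are identical, else "mismatch".
compare_versions() {
    local first v
    first=$1
    for v in "$@"; do
        if [ "$v" != "$first" ]; then
            echo "mismatch"
            return 1
        fi
    done
    echo "match"
}

# Hypothetical report: loaded module vs. module on disk for the running kernel.
nvidia_version_report() {
    local loaded ondisk
    loaded=$(cat /sys/module/nvidia/version 2>/dev/null)
    ondisk=$(modinfo -F version nvidia 2>/dev/null)
    echo "loaded module : ${loaded:-none}"
    echo "on-disk module: ${ondisk:-none}"
    compare_versions "${loaded:-none}" "${ondisk:-none}"
}
```

Running `nvidia_version_report` after a reboot should show the same version twice. The user-space side can be checked separately with `dpkg -l | grep -i nvidia`, which lists the installed package versions.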
I followed one of your earlier posts and nvidia-smi seems to be working now! Possibly my attempts to install the driver manually had left behind a setting that blocked the repo driver from loading.
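The post does not say which setting was involved. As a guess at where to look, leftover module configuration usually lives in the modprobe directories, so a small sketch like the following lists the files there that mention nvidia. The function name is made up, and the directories in the usage line are the typical Ubuntu locations, given as an assumption.

```shell
#!/usr/bin/env bash
# List files under the given directories whose contents mention "nvidia"
# (case-insensitive), e.g. blacklist or options lines.
# (Hypothetical helper for illustration.)
find_nvidia_modprobe_entries() {
    local dir
    for dir in "$@"; do
        # Skip directories that do not exist on this system.
        [ -d "$dir" ] || continue
        grep -rIl -i 'nvidia' "$dir" 2>/dev/null
    done
    return 0
}
```

Typical usage would be `find_nvidia_modprobe_entries /etc/modprobe.d /lib/modprobe.d`, followed by reading each listed file before deciding whether to remove anything.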