I recently installed CUDA 11.7 for Ubuntu via the network install (apt install cuda),
and everything seems to be working except “nvidia-smi”.
The X display comes up (from Xorg.0.log):
[ 17.626] (II) NVIDIA dlloader X Driver 515.48.07 Fri May 27 03:23:48 UTC 2022
[ 17.659] (II) NVIDIA GLX Module 515.48.07 Fri May 27 03:22:01 UTC 2022
At boot, the kernel reports that it has loaded the right module:
[ 7.122168] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 515.48.07 Fri May 27 03:26:43 UTC 2022
[ 7.217695] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 515.48.07 Fri May 27 03:18:00 UTC 2022
Driver:
$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 515.48.07 Fri May 27 03:26:43 UTC 2022
GCC version: gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
nvidia-persistenced is running.
nvidia-settings runs, and I can make changes. It confirms I am running 515.48.07.
However, when I run /usr/bin/nvidia-smi I get an error:
$ /usr/bin/nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
$
dmesg also shows a mismatch between 510.73.08 and 515.48.07:
[ 7600.385831] NVRM: API mismatch: the client has the version 510.73.08, but
NVRM: this kernel module has the version 515.48.07. Please
NVRM: make sure that this kernel module and all NVIDIA driver
NVRM: components have the same version.
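For anyone comparing logs, here is a small sketch that pulls the two version numbers out of an "API mismatch" message like the one above. It only parses text on stdin; the function name nvrm_versions is made up for this example and is not part of any NVIDIA tool.

```shell
# Sketch: extract the client and kernel-module versions from NVRM
# "API mismatch" text read on stdin. nvrm_versions is a hypothetical name.
nvrm_versions() {
    # Keep only the "<who> has the version X.Y.Z" fragments, then print
    # the first word (client / kernel) and the version number.
    grep -oE '(client|kernel module) has the version [0-9]+(\.[0-9]+)+' |
        awk '{ print $1, $NF }' | sort -u
}
```

It could be fed from the kernel log with "dmesg | nvrm_versions" (dmesg may need sudo on some systems).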
I don't think I have ever had 510 installed on this system. I verified the integrity of /usr/bin/nvidia-smi against the .deb and confirmed that the binary contains the 515.48.07 string (other attempts to get the version number of the binary fail):
$ strings /usr/bin/nvidia-smi | egrep '515|510'
515.48.07
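Since the error mentions NVML, one further check (a sketch only, on the assumption that the stale 510 component might be a userspace library rather than the nvidia-smi binary itself) is to list which versioned libnvidia-ml.so files are present. The default directory below is the usual Ubuntu x86_64 library path, which is an assumption, and the function name nvml_lib_versions is made up for this example.

```shell
# Sketch: print the version suffix of every libnvidia-ml.so.* file in a
# directory. nvml_lib_versions is a hypothetical name; the default path is
# an assumption about where Ubuntu places the library.
nvml_lib_versions() {
    dir="${1:-/usr/lib/x86_64-linux-gnu}"
    for f in "$dir"/libnvidia-ml.so.*; do
        [ -e "$f" ] || continue          # glob did not match anything
        name=${f##*/}                    # strip the directory part
        echo "${name#libnvidia-ml.so.}"  # keep only the suffix, e.g. 1 or 515.48.07
    done | sort -u
}
```

Running "nvml_lib_versions" with no argument checks the default directory, and "ldconfig -p | grep libnvidia-ml" shows which copies the dynamic linker cache knows about. A 510.73.08 entry in either listing would be a candidate for the mismatching client.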
Any idea what this 510 client could be?
nvidia-bug-report.log.gz (226.5 KB)