Hello, I am having a problem with nvidia-smi. I know this has been posted about several times before, but I can’t find an answer that solves my problem.
Everything was set up and running fine, but after we rebooted the machine nvidia-smi can no longer find the NVIDIA driver.
nvidia-smi returns “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running”.
My GPUs, as reported by lspci | grep -i nvidia, are:
17:00.0 VGA compatible controller: NVIDIA Corporation Device 2204 (rev a1)
17:00.1 Audio device: NVIDIA Corporation Device 1aef (rev a1)
65:00.0 VGA compatible controller: NVIDIA Corporation Device 2204 (rev a1)
65:00.1 Audio device: NVIDIA Corporation Device 1aef (rev a1)
dpkg -l | grep nvidia gives:
ii gpustat 0.6.0-1 all pretty nvidia device monitor
ii libnvidia-cfg1-455:amd64 455.45.01-0ubuntu1 amd64 NVIDIA binary OpenGL/GLX configuration library
ii libnvidia-common-450 450.172.01-0ubuntu1 all Shared files used by the NVIDIA libraries
ii libnvidia-common-455 455.45.01-0ubuntu1 all Shared files used by the NVIDIA libraries
ii libnvidia-common-460 460.106.00-0ubuntu1 all Shared files used by the NVIDIA libraries
rc libnvidia-compute-450:amd64 450.51.05-0ubuntu1 amd64 NVIDIA libcompute package
ii libnvidia-compute-455:amd64 455.45.01-0ubuntu1 amd64 NVIDIA libcompute package
ii libnvidia-container-tools 1.8.1-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.8.1-1 amd64 NVIDIA container runtime library
ii libnvidia-decode-455:amd64 455.45.01-0ubuntu1 amd64 NVIDIA Video Decoding runtime libraries
ii libnvidia-encode-455:amd64 455.45.01-0ubuntu1 amd64 NVENC Video Encoding runtime library
ii libnvidia-extra-455:amd64 455.45.01-0ubuntu1 amd64 Extra libraries for the NVIDIA driver
ii libnvidia-fbc1-455:amd64 455.45.01-0ubuntu1 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
ii libnvidia-gl-455:amd64 455.45.01-0ubuntu1 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii libnvidia-ifr1-455:amd64 455.45.01-0ubuntu1 amd64 NVIDIA OpenGL-based Inband Frame Readback runtime library
ii libnvidia-ml-dev 10.1.243-3 amd64 NVIDIA Management Library (NVML) development files
rc nvidia-compute-utils-450 450.51.05-0ubuntu1 amd64 NVIDIA compute utilities
ii nvidia-compute-utils-455 455.45.01-0ubuntu1 amd64 NVIDIA compute utilities
ii nvidia-container-runtime 3.8.1-1 all NVIDIA container runtime
ii nvidia-container-toolkit 1.8.1-1 amd64 NVIDIA container runtime hook
ii nvidia-cuda-dev 10.1.243-3 amd64 NVIDIA CUDA development files
ii nvidia-cuda-doc 10.1.243-3 all NVIDIA CUDA and OpenCL documentation
ii nvidia-cuda-gdb 10.1.243-3 amd64 NVIDIA CUDA Debugger (GDB)
ii nvidia-cuda-toolkit 10.1.243-3 amd64 NVIDIA CUDA development toolkit
rc nvidia-dkms-450 450.51.05-0ubuntu1 amd64 NVIDIA DKMS package
ii nvidia-dkms-455 455.45.01-0ubuntu1 amd64 NVIDIA DKMS package
ii nvidia-driver-455 455.45.01-0ubuntu1 amd64 NVIDIA driver metapackage
rc nvidia-kernel-common-450 450.51.05-0ubuntu1 amd64 Shared files used with the kernel module
ii nvidia-kernel-common-455 455.45.01-0ubuntu1 amd64 Shared files used with the kernel module
ii nvidia-kernel-source-455 455.45.01-0ubuntu1 amd64 NVIDIA kernel source package
ii nvidia-modprobe 510.47.03-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
ii nvidia-opencl-dev:amd64 10.1.243-3 amd64 NVIDIA OpenCL development files
ii nvidia-prime 0.8.14 all Tools to enable NVIDIA's Prime
ii nvidia-profiler 10.1.243-3 amd64 NVIDIA Profiler for CUDA and OpenCL
ii nvidia-settings 510.47.03-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
ii nvidia-utils-455 455.45.01-0ubuntu1 amd64 NVIDIA driver support binaries
ii nvidia-visual-profiler 10.1.243-3 amd64 NVIDIA Visual Profiler for CUDA and OpenCL
ii screen-resolution-extra 0.18build1 all Extension for the nvidia-settings control panel
ii xserver-xorg-video-nvidia-455 455.45.01-0ubuntu1 amd64 NVIDIA binary Xorg driver
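The list above mixes several driver branches (455 for most packages, plus 450, 460 and 510 for a few), which may or may not be relevant. Here is a minimal sketch for summarising this from the dpkg output, assuming driver packages can be recognised by a three-digit major version. nvidia_branches is a hypothetical helper name, not an existing tool:

```shell
# Hypothetical helper: count installed ("ii") NVIDIA packages per driver branch.
# Assumption: driver packages have a three-digit major version (e.g. 455.45.01),
# which excludes CUDA toolkit (10.x) and container (1.x, 3.x) packages.
# Reads `dpkg -l` style lines on stdin; prints "<branch> <count>" per line.
nvidia_branches() {
    awk '$1 == "ii" && $2 ~ /nvidia/ && $3 ~ /^[0-9][0-9][0-9]\./ {
             split($3, v, ".")
             count[v[1]]++
         }
         END { for (b in count) print b, count[b] }' | sort -n
}

# Usage on a live system:
#   dpkg -l | nvidia_branches
```

Packages in the "rc" state (removed, config files remaining) are skipped on purpose, since only installed packages should matter here.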
I can’t work out why it would break just on reboot, without updating/installing anything.
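One thing I understand can cause this after a reboot is the machine coming up on a kernel for which the DKMS module was never built, though I have not confirmed that this is the case here. Below is a rough sketch of that check, assuming the driver is managed by DKMS (nvidia-dkms-455 is in the list above). module_built_for is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if `dkms status` style text on stdin reports an
# nvidia module as installed for the given kernel release.
module_built_for() {
    kernel="$1"
    grep -i 'nvidia' | grep -F -- "$kernel" | grep -q 'installed'
}

# Usage on a live system (assumes dkms and lsmod are available):
#   uname -r                          # kernel the machine booted into
#   dkms status | module_built_for "$(uname -r)" \
#       && echo "nvidia module built for running kernel" \
#       || echo "no nvidia DKMS module for running kernel"
#   lsmod | grep -i nvidia            # is the module actually loaded?
```

If the helper fails for the running kernel, that would point at a kernel change rather than a driver change, but this is only a sketch and not a confirmed diagnosis.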
Thanks!