Hi, my workstation came with driver version 460.80 installed. I installed CUDA and cuDNN myself, and was able to check that a few things worked. The last thing I checked was a few simple TensorFlow operations on each of the 4 GPU devices individually.
I came back today, started the machine, and now nvidia-smi says:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Although I’ve tried to follow instructions carefully, I’m a newbie at being a Linux sysadmin and I fear I may have done something wrong while installing CUDA/cuDNN.
Here’s the output of dpkg -l | grep -i nvidia, in case it helps diagnose issues. This is an Ubuntu (20.04) box.
$ dpkg -l | grep -i nvidia
ii cuda-nsight-compute-11-2 11.2.0-1 amd64 NVIDIA Nsight Compute
ii cuda-nsight-systems-11-2 11.2.0-1 amd64 NVIDIA Nsight Systems
ii cuda-nvtx-11-2 11.2.67-1 amd64 NVIDIA Tools Extension
ii libaccinj64-10.1:amd64 10.1.243-3 amd64 NVIDIA ACCINJ Library (64-bit)
ii libcublas10:amd64 10.1.243-3 amd64 NVIDIA cuBLAS Library
ii libcublaslt10:amd64 10.1.243-3 amd64 NVIDIA cuBLASLt Library
ii libcudart10.1:amd64 10.1.243-3 amd64 NVIDIA CUDA Runtime Library
ii libcufft10:amd64 10.1.243-3 amd64 NVIDIA cuFFT Library
ii libcufftw10:amd64 10.1.243-3 amd64 NVIDIA cuFFTW Library
ii libcuinj64-10.1:amd64 10.1.243-3 amd64 NVIDIA CUINJ Library (64-bit)
ii libcupti-dev:amd64 10.1.243-3 amd64 NVIDIA CUDA Profiler Tools Interface development files
ii libcupti-doc 10.1.243-3 all NVIDIA CUDA Profiler Tools Interface documentation
ii libcupti10.1:amd64 10.1.243-3 amd64 NVIDIA CUDA Profiler Tools Interface runtime library
ii libcurand10:amd64 10.1.243-3 amd64 NVIDIA cuRAND Library
ii libcusolver10:amd64 10.1.243-3 amd64 NVIDIA cuSOLVER Library
ii libcusolvermg10:amd64 10.1.243-3 amd64 NVIDIA cuSOLVERmg Library
ii libcusparse10:amd64 10.1.243-3 amd64 NVIDIA cuSPARSE Library
ii libnppc10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives core runtime library
ii libnppial10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives lib for Image Arithmetic and Logic
ii libnppicc10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives lib for Image Color Conversion
ii libnppicom10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives lib for Image Compression
ii libnppidei10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives lib for Image Data Exchange and Initialization
ii libnppif10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives lib for Image Filters
ii libnppig10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives lib for Image Geometry transforms
ii libnppim10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives lib for Image Morphological operations
ii libnppist10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives lib for Image Statistics
ii libnppisu10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives lib for Image Support
ii libnppitc10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives lib for Image Threshold and Compare
ii libnpps10:amd64 10.1.243-3 amd64 NVIDIA Performance Primitives for signal processing runtime library
ii libnvgraph10:amd64 10.1.243-3 amd64 NVIDIA Graph Analytics library (nvGRAPH)
ii libnvidia-cfg1-460:amd64 460.80-0ubuntu0.20.04.2 amd64 NVIDIA binary OpenGL/GLX configuration library
ii libnvidia-common-460 460.91.03-0ubuntu0.20.04.1 all Shared files used by the NVIDIA libraries
ii libnvidia-compute-460:amd64 460.80-0ubuntu0.20.04.2 amd64 NVIDIA libcompute package
ii libnvidia-decode-460:amd64 460.80-0ubuntu0.20.04.2 amd64 NVIDIA Video Decoding runtime libraries
ii libnvidia-encode-460:amd64 460.80-0ubuntu0.20.04.2 amd64 NVENC Video Encoding runtime library
ii libnvidia-extra-460:amd64 460.80-0ubuntu0.20.04.2 amd64 Extra libraries for the NVIDIA driver
ii libnvidia-fbc1-460:amd64 460.80-0ubuntu0.20.04.2 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
ii libnvidia-gl-460:amd64 460.80-0ubuntu0.20.04.2 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii libnvidia-ifr1-460:amd64 460.80-0ubuntu0.20.04.2 amd64 NVIDIA OpenGL-based Inband Frame Readback runtime library
ii libnvidia-ml-dev 10.1.243-3 amd64 NVIDIA Management Library (NVML) development files
ii libnvjpeg10:amd64 10.1.243-3 amd64 NVIDIA JPEG library (nvJPEG)
ii libnvrtc10.1:amd64 10.1.243-3 amd64 CUDA Runtime Compilation (NVIDIA NVRTC Library)
ii libnvtoolsext1:amd64 10.1.243-3 amd64 NVIDIA Tools Extension Library
ii libnvvm3:amd64 10.1.243-3 amd64 NVIDIA NVVM Library
rc linux-modules-nvidia-460-5.8.0-43-generic 5.8.0-43.49~20.04.1 amd64 Linux kernel nvidia modules for version 5.8.0-43
ii linux-modules-nvidia-460-5.8.0-59-generic 5.8.0-59.66~20.04.1 amd64 Linux kernel nvidia modules for version 5.8.0-59
ii linux-modules-nvidia-460-generic-hwe-20.04 5.8.0-59.66~20.04.1 amd64 Extra drivers for nvidia-460 for the generic-hwe-20.04 flavour
ii linux-objects-nvidia-460-5.8.0-59-generic 5.8.0-59.66~20.04.1 amd64 Linux kernel nvidia modules for version 5.8.0-59 (objects)
ii linux-signatures-nvidia-5.8.0-59-generic 5.8.0-59.66~20.04.1 amd64 Linux kernel signatures for nvidia modules for version 5.8.0-59-generic
ii nsight-compute 10.1.243-3 amd64 NVIDIA Nsight Compute
ii nsight-compute-2020.3.0 2020.3.0.18-1 amd64 NVIDIA Nsight Compute
ii nsight-systems 10.1.243-3 amd64 NVIDIA Nsight Systems
ii nvidia-compute-utils-460 460.80-0ubuntu0.20.04.2 amd64 NVIDIA compute utilities
ii nvidia-cuda-dev 10.1.243-3 amd64 NVIDIA CUDA development files
ii nvidia-cuda-doc 10.1.243-3 all NVIDIA CUDA and OpenCL documentation
ii nvidia-cuda-gdb 10.1.243-3 amd64 NVIDIA CUDA Debugger (GDB)
ii nvidia-cuda-toolkit 10.1.243-3 amd64 NVIDIA CUDA development toolkit
ii nvidia-driver-460 460.80-0ubuntu0.20.04.2 amd64 NVIDIA driver metapackage
ii nvidia-kernel-common-460 460.80-0ubuntu0.20.04.2 amd64 Shared files used with the kernel module
ii nvidia-kernel-source-460 460.80-0ubuntu0.20.04.2 amd64 NVIDIA kernel source package
ii nvidia-modprobe 460.27.04-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
ii nvidia-opencl-dev:amd64 10.1.243-3 amd64 NVIDIA OpenCL development files
ii nvidia-prime 0.8.15.3~0.20.04.1 all Tools to enable NVIDIA's Prime
ii nvidia-profiler 10.1.243-3 amd64 NVIDIA Profiler for CUDA and OpenCL
ii nvidia-settings 470.57.01-0ubuntu0.20.04.1 amd64 Tool for configuring the NVIDIA graphics driver
ii nvidia-utils-460 460.80-0ubuntu0.20.04.2 amd64 NVIDIA driver support binaries
ii nvidia-visual-profiler 10.1.243-3 amd64 NVIDIA Visual Profiler for CUDA and OpenCL
ii screen-resolution-extra 0.18build1 all Extension for the nvidia-settings control panel
ii xserver-xorg-video-nvidia-460 460.80-0ubuntu0.20.04.2 amd64 NVIDIA binary Xorg driver
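One further check that might narrow this down. The listing above shows installed ("ii") kernel modules only for 5.8.0-59-generic, so a rough sketch like the one below could tell whether the currently running kernel has a matching linux-modules-nvidia package. That a kernel mismatch is the cause is only an assumption and has not been confirmed here; the helper name has_nvidia_modules_for is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of any NVIDIA or Ubuntu tool):
# reads `dpkg -l` output on stdin and succeeds only if an installed ("ii")
# linux-modules-nvidia-* package exists whose name ends with the given
# kernel release string.
has_nvidia_modules_for() {
    kernel="$1"
    awk -v k="$kernel" '
        $1 == "ii" && index($2, "linux-modules-nvidia-") == 1 {
            # Compare the tail of the package name with the kernel release.
            n = length($2) - length(k)
            if (n > 0 && substr($2, n + 1) == k) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Intended use on the affected machine (not run here):
#   dpkg -l | has_nvidia_modules_for "$(uname -r)" \
#       && echo "modules installed for running kernel" \
#       || echo "no nvidia modules for running kernel"
```

If the check fails, that would suggest the machine booted into a kernel other than the one the modules were built for; if it succeeds, the cause is presumably elsewhere.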