NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running

I followed this advice, but it didn’t work out:

I want to update the driver so I can use CUDA 12.2. I am working on a remote server with a GPU, which I access over ssh. It worked with CUDA 11.6 and its preinstalled driver. After installing the new NVIDIA driver, nvidia-smi no longer works.

I verified that the system has gcc installed and a CUDA-capable GPU.

What I tried:

  1. sudo apt install nvidia-driver-535
    Didn’t work. (I cancelled the installation the first time I ran it and got an error during the installation; I am not sure whether this caused the issue.)

  2. Installed CUDA 12.2. I thought maybe nvidia-smi needed CUDA to work, but from what I see online, nvidia-smi doesn’t require CUDA.

sudo apt-get purge nvidia*
sudo apt-get autoremove
sudo apt-get autoclean
sudo rm -rf /usr/local/cuda*

Then installed the latest NVIDIA driver: nvidia-driver-550
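
After a purge and reinstall like this, it can help to confirm which NVIDIA packages are actually installed and whether the kernel module loaded after the reboot. The following is only a rough sketch, assuming a Debian/Ubuntu system with `lsmod` and `dpkg` available; the helper names `has_nvidia_module` and `list_installed_nvidia` are made up for illustration and are not part of any NVIDIA tool.

```shell
#!/usr/bin/env bash
# Sketch: summarize the driver state after a purge/reinstall.

# has_nvidia_module (hypothetical helper): reads lsmod-style text on stdin
# and succeeds only if a module named exactly "nvidia" is listed.
has_nvidia_module() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# list_installed_nvidia (hypothetical helper): reads `dpkg -l` text on stdin
# and prints fully installed ("ii") packages whose name contains "nvidia".
list_installed_nvidia() {
    awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }'
}

# Query the live system only when the tools exist.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | has_nvidia_module; then
        echo "nvidia kernel module: loaded"
    else
        echo "nvidia kernel module: not loaded"
    fi
fi

if command -v dpkg >/dev/null 2>&1; then
    dpkg -l 2>/dev/null | list_installed_nvidia
fi
```

If packages are listed but the module is not loaded, the driver was installed but the kernel did not accept it, which points toward the kernel log rather than toward the package manager.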

Note: I did a sudo reboot each time.
Help, I don’t know how to fix this!

nvidia-bug-report.log (81.8 MB)

I tried logging dmesg. Which driver can I use?
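
For anyone reading the kernel log for the same problem, a filter along these lines narrows it to the driver's own messages. This is a minimal sketch: it relies on the NVIDIA kernel module prefixing its log lines with `NVRM:`, and the function name `nvidia_kernel_lines` is only an illustrative choice.

```shell
#!/usr/bin/env bash
# Sketch: show only the kernel log lines that mention the NVIDIA driver.

# nvidia_kernel_lines (hypothetical helper): reads kernel log text on stdin
# and prints lines containing "NVRM" or "nvidia", case-insensitively.
# The `|| true` keeps a "no match" result from counting as a failure.
nvidia_kernel_lines() {
    grep -iE 'nvrm|nvidia' || true
}

if command -v dmesg >/dev/null 2>&1; then
    # On systems that restrict the kernel log, run this as root
    # (for example via sudo); otherwise dmesg may print nothing.
    dmesg 2>/dev/null | nvidia_kernel_lines | tail -n 20
fi
```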

It’s a VM on a vGPU host, so you need to use the GRID driver. You should get it from whoever set up the host.
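
To check from inside the machine whether it is a guest at all, something like the sketch below may help. It assumes `systemd-detect-virt` and `lspci` are installed; `describe_virt` is a made-up helper name. It only reports whether the system is virtualized and which NVIDIA devices are visible. It cannot confirm that the GPU is a vGPU, so the host administrator remains the authority on which driver to install.

```shell
#!/usr/bin/env bash
# Sketch: report whether this system is a guest and list NVIDIA PCI devices.

# describe_virt (hypothetical helper): turns the output of
# systemd-detect-virt into a short hint. The tool prints "none"
# when it detects no virtualization.
describe_virt() {
    case "$1" in
        none|"") echo "bare metal or unknown" ;;
        *)       echo "virtualized ($1)" ;;
    esac
}

if command -v systemd-detect-virt >/dev/null 2>&1; then
    describe_virt "$(systemd-detect-virt 2>/dev/null)"
fi

if command -v lspci >/dev/null 2>&1; then
    # Show the NVIDIA devices visible to this system, with vendor/device IDs.
    lspci -nn 2>/dev/null | grep -i nvidia || true
fi
```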