NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver

Hi,

I am using Ubuntu 18.04 Desktop LTS (Kernel - 5.3.0-40-generic) and a Tesla V100 GPU.


I have been facing driver and CUDA issues for a while now.
I had issues installing CUDA 10.2, so I reinstalled CUDA and the Nvidia drivers a few times, but without success.
Yesterday I removed/purged the Nvidia drivers and CUDA and then did the following:

  1. I installed the Nvidia driver below using the mentioned process.

  2. I installed CUDA 10.2 (.deb local) using the below process, changing only the last command to “sudo apt-get install cuda-toolkit-10-2”.

After doing this, everything worked properly, but as soon as I rebooted, the command “nvidia-smi” started giving me the error “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”, and I am not able to find/load my Nvidia driver.
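
In case it helps with diagnosis, below is a minimal sketch of how the state after the reboot could be checked: it prints the running kernel release and reports whether a kernel module named `nvidia` appears in `/proc/modules`. The helper names `loaded_modules` and `nvidia_module_loaded` are made up for illustration, and the script assumes a Linux system where `/proc/modules` is readable. It only reports whether the module is loaded, not why it failed to load.

```python
import os
import platform


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Each non-empty line starts with the module name, followed by
    whitespace-separated fields (size, use count, dependents, ...).
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def nvidia_module_loaded(proc_modules_text):
    """True if a module named exactly 'nvidia' is listed in the text."""
    return "nvidia" in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    # Kernel release the machine actually booted into.
    print("Running kernel:", platform.release())
    path = "/proc/modules"
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()
        print("nvidia module loaded:", nvidia_module_loaded(text))
    else:
        print(path, "not found (probably not a Linux system)")
```

The printed kernel release can be compared with the kernel the driver was installed under (5.3.0-40-generic in my case).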

I tried a few solutions and also reinstalled it, but I am still facing the same error.
I have generated and attached the Nvidia bug report (I changed the extension from .gz to .log to upload the file).
I would really appreciate any help on this. Thanks in advance.

Regards,
Sushant
nvidia-bug-report.log.log (98.5 KB)