Nvidia-smi has failed because it couldn't communicate with nvidia-driver

nvidia-bug-report.log.gz (1.0 MB)
Hello everyone, I have a problem with the NVIDIA driver. Until last week everything worked, but it suddenly stopped. Whenever I run nvidia-smi I get “Nvidia-smi has failed because it couldn’t communicate with nvidia-driver”.

I tried reinstalling the NVIDIA driver several times, using different methods, without success. I also noticed that the path /proc/driver/nvidia/gpu no longer exists.

Could someone help me, please?

Hi there @andrea.cartolano and welcome to the NVIDIA developer forums.

You are trying to install on Linux SRV-Base 5.15.0-1064-azure which seems like an Azure instance? You might need to talk to Azure support to get some help since there are several issues with your system.

For one, I think you might not have the kernel headers installed, or not the correct ones. The install process failed to build the required kernel module:

'make' -j12 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.15.0-1063-azure modules.........(bad exit status: 2)
ERROR (dkms apport): binary package for nvidia: 515.105.01 not found
Error! Bad return status for module build on kernel: 5.15.0-1063-azure (x86_64)
Consult /var/lib/dkms/nvidia/515.105.01/build/make.log for more information.
-> error.
ERROR: Failed to install the kernel module through DKMS. No kernel module was installed; please try installing again without DKMS, or check the DKMS logs for more information.
ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
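
One way to test the kernel-headers theory is sketched below. It is read-only and assumes the usual Ubuntu layout, where /lib/modules/<release>/build points at the headers for that kernel. Note that the DKMS error names 5.15.0-1063-azure while the system reports 5.15.0-1064-azure, so the sketch checks both. The helper names `headers_path` and `check_headers` are made up for this example, and the suggested package name follows the common `linux-headers-<release>` convention, which should be confirmed on the instance.

```shell
#!/bin/sh
# Read-only checks; nothing here installs or removes anything.

# Print the path where the build tree (headers) for a kernel release
# is normally found on Ubuntu. (Hypothetical helper for this sketch.)
headers_path() {
    printf '/lib/modules/%s/build\n' "$1"
}

# Report whether headers appear to be present for a kernel release.
# A dangling "build" symlink counts as missing, because -e follows symlinks.
check_headers() {
    if [ -e "$(headers_path "$1")" ]; then
        echo "headers found for $1"
    else
        echo "headers missing for $1 (candidate package: linux-headers-$1)"
    fi
}

check_headers "$(uname -r)"          # the kernel that is running now
check_headers "5.15.0-1063-azure"    # the kernel named in the DKMS error

# Show what DKMS believes is registered or built, if dkms is installed.
if command -v dkms >/dev/null 2>&1; then
    dkms status || true
fi
```

If headers are reported missing for the kernel you actually boot, installing the matching headers package before retrying the driver install is the usual next step. The file /var/lib/dkms/nvidia/515.105.01/build/make.log mentioned in the error should show the exact compile failure.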

Most importantly, there were attempts to install with two different methods which are likely to cause conflicts:

> An alternate method of installing the NVIDIA driver was detected. (This is usually a package provided by your distributor.) A driver installed via that method may integrate better with your system than a driver installed by nvidia-installer.
>
> Please review the message provided by the maintainer of this alternate installation method and decide how to proceed:
>
> The NVIDIA driver provided by Ubuntu can be installed by launching the "Software & Updates" application, and by selecting the NVIDIA driver from the "Additional Drivers" tab.

If you are unfamiliar with Linux system internals and how kernel modules are built, loaded, and used, you should avoid mixing installation methods.

My recommendation would be to purge all remnants of NVIDIA drivers and use the recommended method above to do a clean re-install.
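
As a rough sketch of what that cleanup could look like from a terminal: the script below only detects traces of each install method and prints candidate commands for review. It does not run them. The function names are invented for this example. The assumptions are that the .run installer left its `nvidia-uninstall` tool on the PATH and that distribution packages are named `nvidia-*` or `libnvidia-*`. The `ubuntu-drivers` tool is assumed here as the command-line counterpart of the "Additional Drivers" tab. Please check all of this against the Ubuntu and Azure documentation before running anything.

```shell
#!/bin/sh
# Read-only: detects traces of each install method and prints suggestions.
# It does not remove or install anything.

# Succeeds if the .run installer's uninstaller is on PATH.
has_runfile_install() {
    command -v nvidia-uninstall >/dev/null 2>&1
}

# Succeeds if dpkg lists at least one fully installed ("ii") package
# whose name starts with nvidia- or libnvidia-.
has_package_install() {
    command -v dpkg >/dev/null 2>&1 || return 1
    dpkg -l 'nvidia-*' 'libnvidia-*' 2>/dev/null | grep -q '^ii'
}

# Turn the two yes/no findings into a one-line summary.
describe_state() {
    case "$1,$2" in
        yes,yes) echo "mixed: both runfile and package installs detected" ;;
        yes,no)  echo "runfile install only" ;;
        no,yes)  echo "package install only" ;;
        *)       echo "no NVIDIA driver install detected" ;;
    esac
}

runfile=no; has_runfile_install && runfile=yes
pkg=no;     has_package_install && pkg=yes
describe_state "$runfile" "$pkg"

# Candidate cleanup commands, printed for review rather than executed.
cat <<'EOF'
sudo nvidia-uninstall                             # only if the runfile install is present
sudo apt-get purge '^nvidia-.*' '^libnvidia-.*'   # remove distribution packages
sudo apt-get autoremove
sudo apt-get install linux-headers-$(uname -r)    # headers for the running kernel
sudo ubuntu-drivers autoinstall                   # reinstall via the Ubuntu-provided method
sudo reboot
EOF
```

Printing the commands instead of executing them is deliberate: a purge on a remote Azure instance is hard to undo, so it is safer to read each line and run them one at a time.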

Thanks!

Thank you for your reply.

Yes, as I mentioned earlier, I tried different methods to reinstall the NVIDIA driver because none of them worked. I will investigate with Azure support.
