nvidia-smi stops working after reboot on Ubuntu 18.04

mvkac4app · August 22, 2018, 7:26am

I have Ubuntu 18.04 installed on an ASUS laptop with a GeForce 940MX GPU. I have tried everything, both the proprietary drivers and the NVIDIA .run file, to install the CUDA drivers. Finally, I was able to install the NVIDIA and CUDA drivers using the CUDA .run file.

OpenGL and nvidia-xconfig were not installed during this installation.
Also, Secure Boot was disabled prior to installation.

Now, nvidia-smi works right after this installation, but whenever I reboot the system it gives the error: “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”

It would be really helpful if experts could comment on how to overcome this issue. Thanks in advance.

generix · August 22, 2018, 11:01am

This is probably an Optimus laptop, so you shouldn’t use the .run installer. Use the driver from the Ubuntu graphics-drivers PPA, then download the .deb and install cuda-toolkit.

mcNvidia · September 6, 2019, 4:02pm

Is there a better answer to this?

I had to use the run installer to prevent the installation of “drm” (direct rendering machine?), which apparently allows the Intel onboard graphics card to be used by the graphical user interface (gdm? lightdm? xorg?).
The GPU memory should be untouched by the GUI, since the NVIDIA GPU is used for deep learning and is needed at its full capacity.

Therefore I had to

sudo ./NVIDIA-Linux-x86_64-430.40.run --no-open-gl-files --no-drm

generix · September 6, 2019, 4:38pm

There most likely is.
But since I know nothing about your hardware or your distro, I’ll have to use my psychic powers again and say this:
ubuntu 18.04+headless_390+intel iGPU after prime-select intel lost contact to GeFORCE 1050ti - Linux - NVIDIA Developer Forums

mcNvidia · September 6, 2019, 4:45pm

Looks good, now I just need to find out how to write such an /etc/X11/xorg.conf for my case with two monitors.

(same config as mentioned above: 18.04, Intel onboard for display, Nvidia RTX 2080 for CUDA)

generix · September 6, 2019, 5:04pm

It’s 2019; you haven’t needed to specify any monitors in xorg.conf since 2007, except in very specific cases. Just setting the device to use should be enough, which that snippet does.
Just connect the monitors and use GNOME’s control center to arrange them, unless you have set ‘nomodeset’ and killed the Intel GPU.

mcNvidia · September 6, 2019, 5:11pm

Hmm, the snippet is working (I can log in to the GUI after rebooting), but nvidia-smi is still not working (same error as before: “NVIDIA-SMI has failed because it couldn’t communicate …”).

generix · September 6, 2019, 5:12pm

Please run nvidia-bug-report.sh as root and attach the resulting .gz file to your post. Hovering the mouse over an existing post of yours will reveal a paperclip icon.
https://devtalk.nvidia.com/default/topic/1043347/announcements/attaching-files-to-forum-topics-posts/

mcNvidia · September 6, 2019, 5:13pm

I did add these lines to my /etc/modprobe.d/blacklist.conf before installing via nvidia-runfile:

blacklist nouveau
blacklist lbm-nouveau
alias nouveau off
alias lbm-nouveau off
options nouveau modeset=0

is that a problem?
nvidia-bug-report.log.gz (147 KB)
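
One way to confirm that blacklist entries like the ones above are actually active in a modprobe.d file is a small check like the following. This is only a sketch: the function name is_blacklisted is made up for illustration, and it only tells you whether the line is present, not whether the NVIDIA kernel module itself is installed.

```shell
#!/bin/sh
# Return 0 if a modprobe.d-style file contains an active "blacklist <module>" line.
# Commented-out lines do not match because the pattern is anchored at line start.
# is_blacklisted is a hypothetical helper name, not part of any NVIDIA tool.
is_blacklisted() {
    mod="$1"
    conf="$2"
    grep -Eq "^[[:space:]]*blacklist[[:space:]]+${mod}[[:space:]]*$" "$conf"
}
```

For example: is_blacklisted nouveau /etc/modprobe.d/blacklist.conf && echo "nouveau is blacklisted"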

generix · September 6, 2019, 5:28pm

There’s no kernel driver installed.
Did you already uninstall the .run installer driver using the --uninstall option?
Afterwards, you should install the nvidia driver from the repo, either through the software & drivers application or with
sudo apt install nvidia-driver-430
The post I linked to only works once that has been done, of course.
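
A quick way to see the "no kernel driver" situation for yourself is to look at the list of loaded kernel modules. The sketch below assumes the usual /proc/modules format (module name in the first column); the function names and the optional listing-path argument are made up for illustration and exist only to make the check easy to try on a saved listing.

```shell
#!/bin/sh
# Return 0 if the named kernel module appears in a /proc/modules-style listing.
# The second argument (listing path) is a hypothetical hook for testing;
# it defaults to /proc/modules.
module_loaded() {
    mod="$1"
    listing="${2:-/proc/modules}"
    # The module name is the first whitespace-separated field on each line.
    awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$listing"
}

# Print a one-line summary of which GPU kernel module, if any, is loaded.
report_driver_state() {
    listing="${1:-/proc/modules}"
    if module_loaded nvidia "$listing"; then
        echo "nvidia module loaded"
    elif module_loaded nouveau "$listing"; then
        echo "nouveau loaded instead of nvidia"
    else
        echo "no NVIDIA kernel module loaded"
    fi
}
```

Running report_driver_state with no arguments after a reboot reads the live module list. If it reports that no NVIDIA module is loaded, nvidia-smi has nothing to talk to, which would be consistent with the error quoted earlier in the thread.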

mcNvidia · September 6, 2019, 5:39pm

As I mentioned before, installing the driver via sudo apt install nvidia-driver-430 results in the NVIDIA GPU being used for the GUI / X server.

I can’t have that.

My configuration (before rebooting) was:

Intel onboard graphics for GUI
Nvidia RTX 2080 for CUDA computations only

Therefore I had to install it via the runfile like this:

sudo ./NVIDIA-Linux-x86_64-430.40.run --no-open-gl-files --no-drm

where the options --no-open-gl-files and --no-drm prevent the system from allocating memory on the Nvidia GPU for the X server (a problem I had before).

Is there a way to install the PPA drivers with the above-mentioned options (no-open-gl and no-drm)?

As far as I understand it, drm (direct rendering manager?) is especially guilty of this behaviour of borrowing memory from the Nvidia GPU for the X server, so it needs to be suppressed.

generix · September 6, 2019, 5:45pm

Are you kidding me?

uninstall .run
install repo driver
follow the linked topic: ubuntu 18.04+headless_390+intel iGPU after prime-select intel lost contact to GeFORCE 1050ti - Linux - NVIDIA Developer Forums
Last answer.

mcNvidia · September 6, 2019, 5:52pm

I’m not. I’ve done the repo driver installation before, with the described (undesired) effect.

However, I just found a solution here: https://gist.github.com/wangruohui/bc7b9f424e3d5deb0c0b8bba990b1bc5

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt install nvidia-headless-435 nvidia-utils-435

"IMPORTANT:

The nvidia-headless-435 contains only the driver, while the full nvidia-driver-435 package contain everything including display component such OpenGL libraries. If you hope to connect the display to a NVIDIA display card, install the full package, otherwise, install only the driver.

The nvidia-utils-435 package provide utilities such as nvidia-smi.

Reboot. If the installation is successful, command nvidia-smi will show all NVIDIA GPUs."

Still, thanks for your quick replies.
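
The package choice described in the quoted note can be sketched as a small helper that prints the apt command instead of running it, so it can be reviewed first. The function names and the role labels "compute" and "display" are invented for this sketch; only the package name patterns (nvidia-headless-N, nvidia-utils-N, nvidia-driver-N) come from the post above.

```shell
#!/bin/sh
# Print the apt package names for a given driver branch and role.
# role "compute" -> headless driver plus utilities (nvidia-smi);
# role "display" -> the full driver package including display components.
# Function and role names are hypothetical, chosen for this sketch.
nvidia_packages() {
    version="$1"
    role="$2"
    case "$role" in
        compute) echo "nvidia-headless-$version nvidia-utils-$version" ;;
        display) echo "nvidia-driver-$version" ;;
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac
}

# Print (not run) the install command so it can be reviewed first.
print_install_command() {
    pkgs=$(nvidia_packages "$1" "$2") || return 1
    echo "sudo apt install $pkgs"
}
```

Calling print_install_command 435 compute prints the same install line used in the post, while print_install_command 435 display prints the full-driver variant for a machine whose monitor is attached to the NVIDIA card.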
