Ubuntu Server 20.04 LTS randomly shuts down after driver installation

Hello all,

I have a GPU box running Ubuntu Server 20.04 LTS with four 2080 Tis. Today, after upgrading my system, I purged my previous versions of the CUDA toolkit and NVIDIA driver to install newer versions. I can successfully install any of the drivers listed by ubuntu-drivers devices and can confirm this by running nvidia-smi after a reboot. But now my computer randomly shuts down without warning! I have encountered this problem before and have always been able to fix it by installing another driver version listed by ubuntu-drivers devices, but this time no luck.

Really at a loss here. I do not know how to fix this or why it happens. Does anyone know of a solution?
The drivers in question are 418-server, 450, 450-server, 455, and 460.

If the system is shutting down spontaneously, this points to the PSU being broken or insufficient.
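
One way to gather evidence for or against a power problem is to check whether the previous shutdown was recorded at all. The sketch below is a heuristic, not a definitive test: it assumes the login records list the newest entry first, and the function name classify_previous_shutdown is made up for illustration.

```shell
# Heuristic sketch: reads the output of `last -x shutdown reboot` on stdin
# (newest entry first). The first "reboot" record is the current boot. If the
# record right below it is not a "shutdown" record, the machine most likely
# lost power or hard-reset instead of halting cleanly.
# The function name is hypothetical, chosen for this example.
classify_previous_shutdown() {
    awk '
        $1 == "reboot" || $1 == "shutdown" { n++; kind[n] = $1 }
        END {
            if (n < 2 || kind[1] != "reboot") print "unknown"
            else if (kind[2] == "shutdown")   print "clean"
            else                              print "unclean"
        }'
}
```

Usage: last -x shutdown reboot | classify_previous_shutdown. An "unclean" result only says that no orderly shutdown was logged; it does not say whether the PSU, the mainboard, or the driver was responsible. Running journalctl -b -1 -p err shows errors from the previous boot, provided the journal is stored persistently.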

I do not think it is the PSU. It was working just fine before the Linux kernel update. I have even removed two of the GPUs from the box, and it still spontaneously shuts down after installing the new NVIDIA drivers.

I can confirm that it is definitely NOT the PSU. Manually installing the old driver (440) resolved the issue. I still don’t know what is wrong with the drivers listed in ubuntu-drivers-common, but something is seriously messed up.

A system shutdown can only be triggered by either the PSU or the mainboard. Did you already check for a BIOS update?

Hi, I’m having a similar problem… just assembled a Linux 20.4 headless server to use exclusively as a PLEX Media Server.
It works fine without a GPU, but when I installed a GTX 1060 for better transcoding, the system started to shut down without warning…
I’ve installed the new drivers from nvidia, but it keeps happening.
Only removing the GPU from the motherboard makes it work again.
I’ll try to update the BIOS later (not at home right now)… Is there anything else to do?
I have neither a dummy cable nor a monitor connected to it. Is this a problem?
Thanks

You should make sure that no Xserver is running.
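
A quick way to check this on a headless box is to look for a running X process. This is only a sketch: the helper name process_running is made up for illustration, and it assumes pgrep (from procps) is installed.

```shell
# Prints "yes" if a process with exactly this name is running, "no" otherwise.
# The helper name is hypothetical; pgrep -x matches the process name exactly.
process_running() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo yes
    else
        echo no
    fi
}
```

Usage: process_running Xorg (on some setups the process is named X instead). If an X server is running, systemctl get-default shows whether the machine boots into graphical.target; sudo systemctl set-default multi-user.target switches it to a text-only boot.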

I’m having a similar problem. Here’s the description: Random system crashes - Linux Mint Forums
The system was shutting down instantly while idling or during low-intensity tasks. I suspect it might have something to do with the GPU or its drivers as well.