Hi all!
I recently purchased the following rig:
2 x NVIDIA RTX 3090
AMD Ryzen 9 3900XT
ASRock X570 Motherboard
Antec Signature 1300W Power Supply
I installed NVIDIA driver 455.38, CUDA 11.1, and PyTorch 1.7.1 on Ubuntu 20.04 and tried running deep learning benchmarks.
The problem is that everything runs fine if I use either the first GPU or the second GPU on its own, but the moment I run both of them together, the PC just shuts off. I suspected a power-related issue, but I installed Windows and ran all the GPU benchmarks there, and everything ran totally fine. Now I suspect a driver-related issue. Has anyone else faced something similar? Or is there something I can do to narrow down the issue? Thanks!
While the driver stack is very similar, Windows and Linux do use the GPUs differently enough that total system power might be different between the two. The symptoms you describe do sound very much like an inadequate power supply. Can you please try using nvidia-smi -pl to set a lower power limit on your GPUs to see if that makes the problem go away?
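For anyone wanting to script that test, here is a minimal sketch of applying the same cap to both cards. The helper names and the 250 W figure are hypothetical (the thread does not say which limit was used); the only nvidia-smi options relied on are -i to select a GPU and -pl to set its power limit in watts. Changing the limit normally requires root.

```python
import subprocess


def build_power_limit_cmd(gpu_index, watts):
    """Return the nvidia-smi argument list that caps one GPU's power limit.

    gpu_index and watts are illustrative parameters, not values from the thread.
    """
    if watts <= 0:
        raise ValueError("power limit must be a positive number of watts")
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


def apply_power_limit(gpu_indices, watts):
    """Run nvidia-smi once per GPU; raises CalledProcessError if a call fails."""
    for gpu_index in gpu_indices:
        subprocess.run(build_power_limit_cmd(gpu_index, watts), check=True)
```

Calling apply_power_limit([0, 1], 250) would cap both GPUs, after which the two-GPU benchmark can be rerun. If the shutdowns stop, that points toward power delivery rather than the driver.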
Hi! Thanks, setting the power limit does solve my issue! But according to my calculation, my PSU would trip only if the GPUs drew more than 1800 W, and I thought that would be plenty, but I guess not. Is there any documentation or guide I can go through to pick a better PSU?
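A rough way to sanity-check a power budget like the one described above is sketched below. Everything in it is an assumption rather than a measurement from this rig: 350 W is the nominal board power of a reference RTX 3090, 105 W is the rated TDP of the Ryzen 9 3900XT, and gpu_spike_factor is a purely illustrative multiplier for short transient peaks above the rated board power, which is one commonly suggested reason a PSU can shut off even when the sustained total looks comfortable.

```python
def headroom_watts(psu_watts, gpu_watts, gpu_count, cpu_watts,
                   other_watts=0.0, gpu_spike_factor=1.0):
    """Return PSU rating minus estimated peak draw (negative means over budget).

    gpu_spike_factor is a hypothetical multiplier for brief GPU transients;
    1.0 means "use the nominal board power as-is".
    """
    peak_draw = gpu_watts * gpu_spike_factor * gpu_count + cpu_watts + other_watts
    return psu_watts - peak_draw


# Nominal figures only: two 350 W GPUs plus a 105 W CPU on a 1300 W PSU.
nominal = headroom_watts(1300, 350, 2, 105)

# Same system with an illustrative 2x transient factor on the GPUs.
with_spikes = headroom_watts(1300, 350, 2, 105, gpu_spike_factor=2.0)
```

With nominal figures the budget shows plenty of headroom, while the illustrative transient factor pushes it negative. That is only a sketch of why a calculation based on sustained draw may not match what the PSU's protection circuitry sees; real spike behavior depends on the specific cards and PSU.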
I don’t have a lot of wisdom to share on picking a PSU since I’m mostly a software guy. I’d recommend finding a good buyer’s guide from a tech press site.
One thing that might be worth trying is swapping the cables around to make sure the load is balanced equally across the circuitry within the PSU. The PSU’s owner’s manual might have recommendations on how to properly balance load.