I have a laptop with an RTX 2070 Max-Q GPU and an Intel i7-9750H CPU running Arch Linux, and I’m using PRIME offloading to switch between the iGPU and the dGPU. Normally, when the dGPU is suspended, the laptop draws around 8-12 W, but lately I’ve noticed that the dGPU sometimes stays continuously powered on and the laptop draws upwards of 40 W. This usually starts when I unplug the charger.
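For anyone trying to reproduce this, here is a minimal sketch of how the dGPU's runtime power state can be read from sysfs. It assumes the GPU sits at PCI address 0000:01:00.0 (as in the logs in this post). The function name `gpu_power_state` and the `SYSFS_ROOT` override are made up for illustration and are not part of any NVIDIA tool.

```shell
#!/bin/sh
# Assumed PCI address of the dGPU; adjust for other systems.
GPU_ADDR="${GPU_ADDR:-0000:01:00.0}"
# Hypothetical override so the function can be tried against a fake directory tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys/bus/pci/devices}"

# Print the kernel's runtime power-management view of one PCI device.
gpu_power_state() {
    dev="$SYSFS_ROOT/$1/power"
    if [ ! -d "$dev" ]; then
        echo "device $1 not found"
        return 1
    fi
    # control: "auto" lets the kernel runtime-suspend the device, "on" keeps it powered.
    # runtime_status: the current state reported by the kernel, e.g. "active" or "suspended".
    echo "control=$(cat "$dev/control") runtime_status=$(cat "$dev/runtime_status")"
}

# Report the state, tolerating a missing device so the script still exits cleanly.
if ! gpu_power_state "$GPU_ADDR"; then
    echo "could not read power state for $GPU_ADDR" >&2
fi
```

If `runtime_status` reads `active` while nothing is using the dGPU, that would match the high power draw described above.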
As soon as I unplug the charger, I see the following in dmesg:
[Oct 6 21:25] NVRM: GPU at PCI:0000:01:00: GPU-f23b1fb2-f19c-5d9c-f24c-f2a68d1bceab
[ +0.000066] NVRM: GPU Board Serial Number:
[ +0.000002] NVRM: Xid (PCI:0000:01:00): 61, pid=423, 0d20(3270) 00000000 00000000
and nvidia-smi reports that the infoROM is corrupted:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.04    Driver Version: 455.23.04    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 207...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   59C    P3    ERR! / N/A |      5MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       606      G   /usr/lib/Xorg                       4MiB |
+-----------------------------------------------------------------------------+
WARNING: infoROM is corrupted at gpu 0000:01:00.0
So far I’m not sure what is causing this issue or how to resolve it. Attached is a copy of the log generated by nvidia-bug-report.sh.
nvidia-bug-report.log.gz (873.0 KB)