While the GPU is in D3Cold (Video Memory: Off), it randomly fails to wake up.
When this happens, the application that tries to wake the GPU freezes until the timeouts expire, and the following dmesg messages appear in the log:
[ 417.180294] NVRM: gpuWaitForGfwBootComplete_TU102: failed to wait for GFW_BOOT: (progress 0x9)
[ 417.180299] NVRM: kgspWaitForGfwBootOk_TU102: failed to wait for GFW boot complete: 0x55 VBIOS version 94.04.43.00.9F
[ 417.180300] NVRM: kgspWaitForGfwBootOk_TU102: (the GPU may be in a bad state and may need to be reset)
[ 423.189349] NVRM: _kgspLogXid119: ********************************* GSP Timeout **********************************
[rest in the nvidia-bug-report attachment]
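To check quickly whether a given kernel log contains this failure, the two most distinctive strings from the messages above can be matched with grep. This is only a minimal sketch: the function name `has_gsp_wake_failure` is made up for illustration, and the pattern covers only the messages quoted in this report, so other driver versions may word them differently.

```shell
# Hypothetical helper (name chosen for this example).
# Reads kernel log text on stdin; returns 0 if it contains either of the
# failure signatures quoted in this report, 1 otherwise.
has_gsp_wake_failure() {
    grep -Eq 'failed to wait for GFW_BOOT|GSP Timeout'
}
```

Possible usage: `sudo dmesg | has_gsp_wake_failure && echo "GPU wake-up failure logged"`.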
Reproduction steps
- Install the NVIDIA driver (open-dkms variant)
- Configure udev rules based on Chapter 22 of the driver README
- Configure the following driver options:
options nvidia_drm modeset=1 fbdev=1
options nvidia "NVreg_DynamicPowerManagement=0x02" "NVreg_DynamicPowerManagementVideoMemoryThreshold=256"
then run
sudo mkinitcpio -P
sudo update-grub
and fully reboot the system
- Manually remove the audio PCI device of the card using:
echo 1 | sudo tee /sys/bus/pci/devices/0000:01:00.1/remove
- Wait for the card to go into suspended/D3Cold mode
- Run
prime-run glxgears
or
sudo nvidia-smi
to wake the card up
Repeat steps 5 and 6 until the bug triggers. Usually the GPU wakes up successfully several times before this happens.
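The wait-then-wake cycle could be automated roughly as follows. This is a sketch under stated assumptions, not something taken from the attached bug report: it assumes the GPU is PCI function 0000:01:00.0 (inferred from the audio function 0000:01:00.1 above), it treats the kernel's `power/runtime_status` value `suspended` as "the card is asleep" (this does not by itself distinguish D3hot from D3Cold), and it assumes a failed wake-up makes the wake command exit with a non-zero status. The function and variable names are made up for illustration.

```shell
# ASSUMPTION: GPU PCI address inferred from the audio function 0000:01:00.1.
GPU_DEV="${GPU_DEV:-/sys/bus/pci/devices/0000:01:00.0}"
# Command used to wake the GPU; "prime-run glxgears" would need to be stopped
# by hand, so nvidia-smi is the default here.
WAKE_CMD="${WAKE_CMD:-sudo nvidia-smi}"

# Print the kernel's runtime PM status for the GPU ("active", "suspended", ...).
runtime_status() {
    cat "$GPU_DEV/power/runtime_status" 2>/dev/null || echo unknown
}

# Poll once per second until the GPU reports "suspended".
# $1 = maximum number of seconds to wait (default 120); returns 1 on timeout.
wait_for_suspend() {
    limit="${1:-120}"
    waited=0
    while [ "$(runtime_status)" != "suspended" ]; do
        if [ "$waited" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Repeat "wait for suspend, then wake" up to $1 times (default 50).
# Returns 1 at the first failing wake-up, 2 if the GPU never suspends,
# and 0 if every iteration succeeded.
repro_loop() {
    max="${1:-50}"
    i=1
    while [ "$i" -le "$max" ]; do
        if ! wait_for_suspend 120; then
            echo "GPU did not suspend within 120 s (iteration $i)"
            return 2
        fi
        # Unquoted on purpose so that "sudo nvidia-smi" splits into words.
        if ! $WAKE_CMD >/dev/null 2>&1; then
            echo "wake-up failed on iteration $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max iterations"
}
```

Calling `repro_loop 100` from a terminal would run up to 100 cycles and stop at the first failed wake-up, at which point dmesg can be checked for the messages quoted above.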
Hardware
Notebook: Acer Nitro AN515-45
Graphics card: GeForce RTX 3080 Mobile 8GB
No external monitor attached
Happens when running both on AC power and battery
OS info
Manjaro (unstable branch, fully updated as of 18/10/2024)
Kernel: 6.11.4-1-MANJARO
Driver: nvidia-open-dkms 560.35.03
Desktop environment: KDE Plasma 6.2.1.1 running under Wayland
nvidia-bug-report attached
nvidia-bug-report.log.gz (558.4 KB)