System freezes traced to nvidia_drm module on Arch Linux (GTX 1650 Ti Mobile + Ryzen 4800H)

Hello, NVIDIA Forum,

I’m using Arch Linux and have been experiencing severe system instability.
After carefully isolating possible causes, I found strong evidence that the issue is related to the nvidia_drm kernel module (or an interaction involving it).
I’m opening this post to seek advice and to discuss the reasoning process. I’d appreciate suggestions and explanations so I can also learn from your experience.

Below, I’ve explained the problem, my system specifications, possible triggering mechanisms, what I’ve already tried, and why I suspect the nvidia_drm module.

The Problem

The system suddenly freezes and becomes completely unresponsive — even SysRq keys do not work.
At that point, my laptop’s integrated display turns gray, while my external HDMI monitor shows a solid green screen (see attached photos).
When this happens, the only recovery method is a hard shutdown via the power button.
After rebooting, journalctl shows no relevant entries from the boot that froze.

System Specifications

Laptop: ASUS TUF A15 (FA506II-BQ200)
CPU: AMD Ryzen 7 4800H (8 C / 16 T, integrated Vega GPU)
dGPU: NVIDIA GeForce GTX 1650 Ti Mobile (TU117M [10de:1f95])
RAM: 16 GB (2 × 8 GB) DDR4, 3200 MHz
Kernel: Linux 6.17.4-arch2-1
Drivers: nvidia-dkms 580.95.05 (proprietary)
Display Server: Wayland (KDE Plasma)
Bootloader: systemd-boot
Kernel Parameters (current): idle=nomwait modprobe.blacklist=nvidia_drm

Possible Triggering Mechanisms

From what I have observed, the freezes only occur while the system is under light load, for example when browsing, editing text, or reading PDFs.
Interestingly, the problem never appears while playing GPU-intensive games; the system remains stable even under heavy, sustained gaming loads.
This suggests that the issue may be related to power-management transitions or runtime PM behavior between the integrated and discrete GPUs.
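
One way this hypothesis could be checked is by sampling the dGPU's runtime PM state during light use. The following is only a minimal sketch, assuming the standard sysfs layout (/sys/bus/pci/devices/<address>/vendor, power/control and power/runtime_status). The function name nvidia_runtime_pm is hypothetical, and the vendor ID 0x10de is taken from the [10de:1f95] device ID listed in the specifications.

```python
from pathlib import Path

# PCI vendor ID for NVIDIA, as in the [10de:1f95] ID listed above.
NVIDIA_VENDOR_ID = "0x10de"


def nvidia_runtime_pm(sysfs_root="/sys/bus/pci/devices"):
    """Return {pci_address: {"control": ..., "runtime_status": ...}}
    for every PCI device whose vendor file matches NVIDIA."""
    root = Path(sysfs_root)
    if not root.is_dir():
        # No sysfs PCI tree available (e.g. not running on Linux).
        return {}
    result = {}
    for dev in sorted(root.iterdir()):
        vendor_file = dev / "vendor"
        if not vendor_file.is_file():
            continue
        if vendor_file.read_text().strip().lower() != NVIDIA_VENDOR_ID:
            continue
        state = {}
        for name in ("control", "runtime_status"):
            attr = dev / "power" / name
            # Report a placeholder if the attribute is not exposed.
            state[name] = attr.read_text().strip() if attr.is_file() else "unavailable"
        result[dev.name] = state
    return result


if __name__ == "__main__":
    for address, state in nvidia_runtime_pm().items():
        print(address, state)
```

Running something like this at intervals while browsing or editing text, and writing the output to disk, might show whether the dGPU was suspended or changing state shortly before a freeze. It is not guaranteed to capture the last transition, since the system locks up completely.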

What I Have Tried So Far

Disabled GSP firmware
Parameter: nvidia.NVreg_EnableGpuFirmware=0
Result: No improvement — freezes persisted.

Explicitly enabled DRM modeset
Parameter: nvidia_drm.modeset=1
Result: No improvement — freezes persisted.

Disabled DRM modeset
Parameter: nvidia_drm.modeset=0
Result: No improvement — freezes persisted.

Completely blacklisted nvidia_drm module
Parameter: modprobe.blacklist=nvidia_drm
Result: System became stable. No freezes after several days of testing.
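
For anyone repeating these tests, it may help to confirm after each reboot that the intended parameters were really on the kernel command line and whether nvidia_drm was loaded. The following is a rough sketch, assuming the standard /proc/cmdline and /proc/modules files. The helper names parse_cmdline and loaded_modules are hypothetical, and the parser deliberately ignores quoted values containing spaces.

```python
from pathlib import Path


def parse_cmdline(cmdline):
    """Split a kernel command line into {key: value}; bare flags map to None.
    Simplification: quoted values containing spaces are not handled."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text
    (the module name is the first whitespace-separated field)."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


if __name__ == "__main__":
    cmdline_file = Path("/proc/cmdline")
    modules_file = Path("/proc/modules")
    if cmdline_file.is_file() and modules_file.is_file():
        params = parse_cmdline(cmdline_file.read_text())
        modules = loaded_modules(modules_file.read_text())
        # The three parameters varied in the tests listed above.
        for key in ("nvidia.NVreg_EnableGpuFirmware",
                    "nvidia_drm.modeset",
                    "modprobe.blacklist"):
            print(f"{key} = {params.get(key, '<not set>')}")
        print("nvidia_drm loaded:", "nvidia_drm" in modules)
```

A check like this would rule out the possibility that a boot entry was edited but a different entry was actually booted, which would otherwise make the comparison between the four configurations unreliable.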

Why I Suspect the nvidia_drm Module

After blacklisting nvidia_drm, the freezes stopped entirely over several days of testing.
This strongly suggests that the issue is connected to that module or its interaction with hybrid graphics (AMD iGPU + NVIDIA dGPU).

However, blacklisting the module is only a diagnostic workaround, not a fix.
With nvidia_drm blacklisted, PRIME offloading and hardware-accelerated rendering on the NVIDIA GPU are unavailable, so all rendering falls on the iGPU, which raises its thermal load and fan activity.

I’m looking for a proper fix that restores hybrid GPU functionality and PRIME offloading without bringing the freezes back.

Attachments

Photos of the frozen screens:

- https://0x0.st/K2Pi.jpeg
- https://0x0.st/K2P-.jpeg
- https://0x0.st/K2Po.jpeg

nvidia-bug-report.log.gz (878.9 KB)

Closing

Thank you for reading this far.
I’d appreciate any help, insight, or debugging ideas.
Even partial explanations about how nvidia_drm interacts with hybrid power management would be extremely valuable.