NVIDIA modeset driver hang shortly after re-enabling HDMI screen

I have a laptop with an HDMI cable attached to a KVM switch. My software stack is as follows:

  1. Linux 7.0.0 - but can be reproduced on some older 6.1x.x kernel versions
  2. sway 1.12-dev-c57daaf0 (multiple instances are running, which may be related)
  3. NVIDIA driver version 595.58.03 - but can be reproduced on 595.45.04. The issue is new, so this looks like a recent regression. I believe the newest version without the bug is 590.48.01
  4. Other applications typically running are Firefox, Discord, and alacritty. As far as I know, nothing else uses the GPU.

The freeze typically happens when I switch to this computer using the KVM switch. It does not happen immediately, but usually a few seconds after the switch. I have never seen it happen under any other circumstances.

When the freeze happens, the nvidia-modeset kernel thread is at 100% CPU, and sway is stuck waiting on a semaphore (see logs).
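
For anyone trying to confirm the same state, here is a rough sketch of how it could be captured from an SSH session or another VT while the machine is frozen. It is not part of the attached logs. The helper names (`suspect_threads`, `kernel_stack`) and the name fragments it searches for are my own assumptions, and reading the kernel stack usually requires root and a kernel that exposes `/proc/<pid>/task/<tid>/stack`.

```python
import subprocess

# Thread-name fragments to look for; chosen to match the names in this report.
SUSPECTS = ("nvidia-modeset", "sway")

# -e: all processes, -L: one row per thread, -o: explicit columns.
PS_CMD = ["ps", "-eLo", "pid,tid,stat,wchan:32,comm"]


def suspect_threads(ps_text, names=SUSPECTS):
    """Return (pid, tid, stat, wchan, comm) rows whose comm contains a suspect name."""
    rows = []
    for line in ps_text.splitlines()[1:]:  # first line is the ps header
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue  # skip blank or malformed rows
        pid, tid, stat, wchan, comm = parts
        if any(name in comm for name in names):
            rows.append((int(pid), int(tid), stat, wchan, comm.strip()))
    return rows


def kernel_stack(pid, tid):
    """Read the kernel stack of one thread; usually needs root."""
    try:
        with open(f"/proc/{pid}/task/{tid}/stack") as f:
            return f.read()
    except OSError as exc:
        return f"(unavailable: {exc})"


if __name__ == "__main__":
    out = subprocess.run(PS_CMD, capture_output=True, text=True, check=True).stdout
    for pid, tid, stat, wchan, comm in suspect_threads(out):
        print(f"{comm} pid={pid} tid={tid} stat={stat} wchan={wchan}")
        print(kernel_stack(pid, tid))
```

A thread in state `R` is running or runnable, while the `wchan` column names the kernel function a sleeping thread is blocked in, so the output should make it easy to tell the spinning thread from the waiting one.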

One more detail: the laptop screen is disabled in sway.

Two copies of the logs are attached because the driver is unresponsive. One of them was taken with safe mode.

nvidia-bug-report.log.gz (174.3 KB)

nvidia-bug-report.log.gz (115.4 KB)

Reproduced the bug on NVIDIA driver 595.71.05 with kernel 7.0.0 (x86_64).

The stack trace seems slightly different, but the symptoms and reproduction steps remain the same.

nvidia-bug-report_frozen_1.log.gz (120.0 KB)

nvidia-bug-report_safe_1.log.gz (174.1 KB)

It just happened without my doing anything related to turning the screen on or off. It seems that either the new update made the bug independent of the screen, or it simply had not happened this way before by chance.