Framebuffer output stops working since Linux kernel 6

Since my distro of choice upgraded to Linux kernel 6 and newer, virtual console output has been broken. The last known kernel that worked is 5.18.16. I am currently using the NVIDIA Linux driver 525.89.02 with the latest kernel, 6.1.14. The computer boots as expected, and for a brief moment the virtual console output appears on the screen and then stops, as if the screen freezes. A short moment later X11/GNOME loads normally, and the system operates as normal except for the virtual console.

Switching TTYs, I see the same output that was on screen just before the output stopped.

The thing is, the virtual console is working; it is just that the screen is not updating. I can press CTRL+ALT+F3, log in, and execute commands successfully, but the screen never changes. So I can log in and enter ‘sudo reboot’, and the machine will reboot shortly thereafter. I can switch back to GNOME without issue. I believe the issue might be with the framebuffer. I have tried setting modeset to 0 and 1 without success.
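One way to check the framebuffer theory is to look at /proc/fb, which lists the registered framebuffer devices one per line as a node number followed by the driver name. The following is only a minimal C sketch and not something taken from the bug report: the helper names parse_fb_line and list_framebuffers are made up for illustration, and "EFI VGA" in the comment is just the name I would expect an efifb console to report. If the list comes back empty while the console is frozen, that would suggest the firmware framebuffer device was removed.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line of /proc/fb, which has the form "<node> <driver name>",
 * for example "0 EFI VGA". Returns 1 on success, 0 on a malformed line.
 * (Hypothetical helper, written for this example.) */
int parse_fb_line(const char *line, int *node, char *name, size_t name_len)
{
    int consumed = -1;

    /* %n records how many characters were read before the driver name. */
    if (name_len == 0 || sscanf(line, "%d %n", node, &consumed) != 1)
        return 0;
    if (consumed < 0)
        return 0;

    /* Copy the rest of the line as the driver name and drop the newline. */
    snprintf(name, name_len, "%s", line + consumed);
    name[strcspn(name, "\n")] = '\0';
    return name[0] != '\0';
}

/* Print every registered framebuffer. Returns the number of entries,
 * or -1 if /proc/fb cannot be opened. */
int list_framebuffers(void)
{
    FILE *f = fopen("/proc/fb", "r");
    char line[128], name[64];
    int node, count = 0;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        if (parse_fb_line(line, &node, name, sizeof name)) {
            printf("fb%d: %s\n", node, name);
            count++;
        }
    }
    fclose(f);
    return count;
}
```

Calling list_framebuffers() from a small main() once on kernel 5.18.16 and once on 6.1.14 should make any difference between the two boots visible.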

I have attached the nvidia-bug-report file

nvidia-bug-report.log.gz (544.0 KB)

Here is an image of what is shown on the monitor when it stops working

Also, the log from my last boot:
http://0x0.st/Hzcs.txt

Are there any suggestions? I think the issue may be related to kernel bug 216303 (“Commit ee7a69aa38d87a3bbced7b8245c732c05ed0c6ec broke legacy frame buffer with NVIDIA”), but that seems to affect systems with Intel CPUs with GPUs, which doesn’t apply to me with my Threadripper.

My fix:

diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
index 41e77de1ea82..7ca0730ed1c5 100644
--- a/drivers/video/aperture.c
+++ b/drivers/video/aperture.c
@@ -294,7 +294,7 @@ int aperture_remove_conflicting_devices(resource_size_t base, resource_size_t si
 	 * ask for this, so let's assume that a real driver for the display
 	 * was already probed and prevent sysfb to register devices later.
 	 */
-	sysfb_disable();
+	// sysfb_disable();
 
 	aperture_detach_devices(base, size);

I apply it to every kernel release.

This happens with both drivers (closed source and open source).

Interesting. I hadn’t seen this particular version of this bug before, but I think you’re right that this is another configuration that triggers this kernel bug. Normally, the problem occurs when a DRM driver for a different GPU registers a framebuffer console, but in this case it looks like the vfio-pci driver taking control of the secondary GPU has the side effect of disabling the framebuffer console on the primary GPU.

There’s a kernel patch to fix the sysfb_disable() regression, although as far as I know it hasn’t been merged yet: [11/11] video/aperture: Only remove sysfb on the default vga pci device - Patchwork
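The difference between the behaviour described above and what the patch title proposes can be sketched as a toy model in plain C. This is not kernel code: struct sysfb_model and both claim_gpu_* functions are hypothetical names, and the single flag only stands in for whether the firmware framebuffer that sysfb registers is still available to the console.

```c
#include <stdbool.h>

/* Toy stand-in for the sysfb state: true while the firmware framebuffer
 * device is still registered and can back the text console. */
struct sysfb_model {
    bool registered;
};

/* Models the unpatched path: aperture_remove_conflicting_devices() calls
 * sysfb_disable() unconditionally, so a driver claiming ANY GPU (for
 * example vfio-pci on a secondary card) removes the firmware framebuffer. */
void claim_gpu_old(struct sysfb_model *s, bool is_primary)
{
    (void)is_primary; /* the unpatched path ignores this */
    s->registered = false;
}

/* Models the patched path: sysfb is only disabled when the claimed
 * device is the primary (default VGA) PCI device. */
void claim_gpu_patched(struct sysfb_model *s, bool is_primary)
{
    if (is_primary)
        s->registered = false;
}
```

In this model, a driver claiming a secondary GPU clears the flag with claim_gpu_old() but leaves it set with claim_gpu_patched(), which matches the symptom of a console on the primary GPU that stops updating once another driver binds to the second card.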

Hi

The patch needs a backport to 6.1.8 (my kernel):

diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c
index 41e77de1ea82..5c94abdb1ad6 100644
--- a/drivers/video/aperture.c
+++ b/drivers/video/aperture.c
@@ -332,15 +332,16 @@ int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na
 	primary = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
 #endif
 
+	if (primary)
+		sysfb_disable();
+
 	for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
 		if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM))
 			continue;
 
 		base = pci_resource_start(pdev, bar);
 		size = pci_resource_len(pdev, bar);
-		ret = aperture_remove_conflicting_devices(base, size, primary, name);
-		if (ret)
-			return ret;
+		aperture_detach_devices(base, size);
 	}
 
 	/*

I’m not sure whether the backport is correct, but I am testing it and it is WORKING: no more black screen at boot, and switching TTYs also works. Everything is correct on my setup.

greetings

Any progress on this? I am experiencing EXACTLY the same behavior on a Quadro P1000 running kernel 6.1.0-9 (Debian 12) with the 525.105.17 driver.

Thanks for any help!

It’s still broken for me, @marmorsteinrm. I am running Debian Unstable here.

As a workaround, I installed “kmscon” and it seems to work. It draws very slowly, which is awkward, but it also “fixes” the bug which prevented scrolling up and down with Shift-PageUp. Because it implements some of the same interfaces as an xterm, it has some nice features. At first, I was confused because Alt-Fx didn’t seem to work for switching terminals, but I found that CTRL-ALT-Fx instead works. I’d still prefer for this bug to be fixed so I can go back to a normal (fast) terminal, though. Still – if you’re running into this, kmscon seems like a decent approach for now.

My problem is fixed in kernels 6.5.x.

greetings