I’ve been using my LG OLED55BX TV as a second monitor for several months now, without issues. But at some point I realised that only two HDMI ports on the TV support HDMI 2.1, so, since I’m trying to get 10-bit colour working, I moved my cable to one of them. Unfortunately I also upgraded my drivers at around the same time, to 390.143. Sadly I can’t tell which change did it, but since then (the last three months), if I boot with my OLED55BX off and turn it on later, the driver crashes (NULL pointer dereference). My first thought was to upgrade my kernel and drivers and then see what’s up. I’ve done that now, and with the 460.67 drivers and a 5.12.4 kernel this hard crash still happens reproducibly, with my HDMI cable in either HDMI 2.1 compatible port.
As I say, I can’t tell whether it was the move to the HDMI 2.1 port or the upgrade to >=390 drivers that did it, but my gut tells me it’s HDMI 2.1 related.
Attached: dmesg output with the crash call stack somewhere in nvidia_modeset, an nvidia-bug-report from before the crash, another nvidia-bug-report from after the crash, Xorg.log, xorg.conf, and config.gz.
PS: if anyone knows how to get 10-bit colour on Linux with Xorg, please let me know. I’ve been trying to get that working for ages!
monitor-hang-202106041402 (94.8 KB)
nvidia-bug-report-before.log.gz (66.2 KB)
nvidia-bug-report-after.log.gz (67.4 KB)
config.gz (34.0 KB)
Xorg.0.log.old (34.8 KB)
xorg.conf (2.4 KB)
More info: when booting with the LG OLED55BX on and then changing settings on the monitor, what looks like the same crash can happen. Here is an nvidia-bug-report.log.gz and dmesg output with another stack trace of a NULL pointer dereference in nvidia_modeset. Turning AMD FreeSync Premium OFF on the monitor caused this crash.
monitor-hang-202106071256 (95.0 KB)
nvidia-bug-report-202106071256.log.gz (67.7 KB)
I can confirm this on a slightly different setup:
With xubuntu-21.04 and nvidia driver 470.57.02 (as well as 470.42.01 and 465.31 before it), I see the nvidia kernel module crashing with a NULL pointer dereference.
I’m using 4K@120Hz on an LG OLED48C1. There is also an older LCD panel connected over DP, which I don’t use.
The GPU is a 3080 Ti (Gigabyte GeForce RTX™ 3080 Ti GAMING OC 12G).
[Wed Jul 21 14:36:37 2021] NVRM: GPU at PCI:0000:01:00: GPU-1b78bb48-ea35-a89d-4855-3c25ece549c9
[Wed Jul 21 14:36:37 2021] NVRM: Xid (PCI:0000:01:00): 32, pid=5113, Channel ID 00000010 intr 00800000
[Wed Jul 21 20:08:18 2021] nvidia-modeset: WARNING: GPU:0: HDMI FRL link training failed.
[Wed Jul 21 20:08:30 2021] BUG: kernel NULL pointer dereference, address: 0000000000000000
[Wed Jul 21 20:08:30 2021] #PF: supervisor read access in kernel mode
[Wed Jul 21 20:08:30 2021] #PF: error_code(0x0000) - not-present page
[Wed Jul 21 20:08:30 2021] PGD 0 P4D 0
[Wed Jul 21 20:08:30 2021] Oops: 0000 [#1] SMP NOPTI
[Wed Jul 21 20:08:30 2021] CPU: 63 PID: 3708 Comm: Xorg Tainted: P OE 5.11.0-25-generic #27-Ubuntu
[Wed Jul 21 20:08:30 2021] Hardware name: ASUS System Product Name/ROG ZENITH II EXTREME, BIOS 1402 01/15/2021
[Wed Jul 21 20:08:30 2021] RIP: 0010:_nv002189kms+0x12/0x30 [nvidia_modeset]
[Wed Jul 21 20:08:30 2021] Code: c0 48 85 d2 74 07 80 7a 08 00 0f 95 c0 f3 c3 66 0f 1f 84 00 00 00 00 00 48 8b 87 20 7f 00 00 40 84 f6 40 0f 95 c6 40 0f b6 f6 <48> 8b 38 48 8b 07 48 8b 80 10 01 00 00 e9 3c 76 76 f6 66 2e 0f 1f
[Wed Jul 21 20:08:30 2021] RSP: 0018:ffffb67150fe7ce0 EFLAGS: 00010246
[Wed Jul 21 20:08:30 2021] RAX: 0000000000000000 RBX: ffffb6715053e008 RCX: 0000000000000001
[Wed Jul 21 20:08:30 2021] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb6715053e008
[Wed Jul 21 20:08:30 2021] RBP: ffffb67140d9e058 R08: 0000000000000000 R09: 0000000000000be8
[Wed Jul 21 20:08:30 2021] R10: ffff95ab4b854008 R11: 0000000000010004 R12: ffff95ab86757008
[Wed Jul 21 20:08:30 2021] R13: 0000000000000000 R14: ffffb67140d9d008 R15: ffffb67140d9d430
[Wed Jul 21 20:08:30 2021] FS: 00007fe72be14a40(0000) GS:ffff95c9fdfc0000(0000) knlGS:0000000000000000
[Wed Jul 21 20:08:30 2021] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed Jul 21 20:08:30 2021] CR2: 0000000000000000 CR3: 00000001835a8000 CR4: 0000000000350ee0
[Wed Jul 21 20:08:30 2021] Call Trace:
[Wed Jul 21 20:08:30 2021] ? _nv002198kms+0x1ed/0x220 [nvidia_modeset]
[Wed Jul 21 20:08:30 2021] ? _nv002567kms+0x119b/0x1a80 [nvidia_modeset]
[Wed Jul 21 20:08:30 2021] ? __check_object_size.part.0+0x4a/0x150
[Wed Jul 21 20:08:30 2021] ? _nv000562kms+0x50/0x50 [nvidia_modeset]
[Wed Jul 21 20:08:30 2021] ? nvKmsIoctl+0x96/0x1d0 [nvidia_modeset]
[Wed Jul 21 20:08:30 2021] ? nvkms_ioctl+0x107/0x180 [nvidia_modeset]
[Wed Jul 21 20:08:30 2021] ? nvidia_frontend_unlocked_ioctl+0x3b/0x50 [nvidia]
[Wed Jul 21 20:08:30 2021] ? __x64_sys_ioctl+0x91/0xc0
[Wed Jul 21 20:08:30 2021] ? do_syscall_64+0x38/0x90
[Wed Jul 21 20:08:30 2021] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
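For anyone who wants to check whether their own log shows the same signature as the trace above, here is a minimal Python sketch. It is not from any of the posters and not an NVIDIA tool. The function name `summarize_oops` and the returned keys are made up for illustration, and the patterns are copied from the wording in this particular log, so other kernel or driver versions may print something different.

```python
import re

# Patterns copied from the log excerpt in this thread (assumption: your
# kernel/driver words these messages the same way).
NULL_DEREF = re.compile(r"BUG: kernel NULL pointer dereference")
RIP_LINE = re.compile(r"RIP: [0-9a-f]+:(\S+)\s+\[(\w+)\]")
FRL_FAIL = re.compile(r"HDMI FRL link training failed")


def summarize_oops(log_text):
    """Describe the first NULL-pointer oops whose RIP is inside a module.

    Returns None if no such oops is found in log_text.
    """
    frl_failed = False  # was an FRL link-training warning seen earlier?
    in_oops = False     # have we passed the "BUG: ..." line yet?
    for line in log_text.splitlines():
        if FRL_FAIL.search(line):
            frl_failed = True
        if NULL_DEREF.search(line):
            in_oops = True
            continue
        if in_oops:
            match = RIP_LINE.search(line)
            if match:
                return {
                    "symbol": match.group(1),
                    "module": match.group(2),
                    "frl_training_failed_before": frl_failed,
                }
    return None
```

You could feed it saved dmesg output, for example `summarize_oops(open("dmesg.txt").read())`, and compare the reported module against `nvidia_modeset`.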
I get a very similar problem. I have an LG CX OLED, and if I turn it off for a while (5-10 mins), the whole system freezes when I come back. I posted it in another thread, but here it is: System freeze on resume
In case it helps, the crashlog is found here: Aug 11 19:35:30 scruffy kernel: BUG: kernel NULL pointer dereference, address: 0 - Pastebin.com
Card is a 2070 (Non-super). Gigabyte Windforce, if it matters.
I’ll keep an eye on this thread. Later I might try to trigger a crash and run the bug report tool.
nvidia-bug-report.log.gz (267.1 KB)
Seems that wasn’t how the tool worked. Uploaded the bug report file.
I might have found a way to mitigate this until a fix is issued (if one is ever issued).
Before you turn on your monitor/TV, switch to a different virtual console (I use CTRL+ALT+F2). Once the TV is up, I give it a second or two just to be sure, then switch back to the main screen (CTRL+ALT+F1), and it doesn’t crash.
Did it twice now without any problems.
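If you would rather script the workaround above than press the key combos by hand, something like the following Python sketch might do. It is only a sketch under assumptions: it uses the `chvt` command from the kbd package (which usually needs root), it assumes X runs on VT 1 and VT 2 is free as in the post above, and the function names and the wait times (`tv_boot_seconds`, `settle_seconds`) are guesses made up for illustration. It defaults to a dry run that only returns the commands it would issue. Note the workaround itself has only been reported to work twice, so treat it as a mitigation, not a fix.

```python
import subprocess
import time


def switch_vt(vt, dry_run=True):
    """Switch to virtual terminal `vt` via chvt; return the command used."""
    cmd = ["chvt", str(vt)]
    if not dry_run:
        subprocess.run(cmd, check=True)  # usually requires root
    return cmd


def tv_power_on_workaround(away_vt=2, x_vt=1, tv_boot_seconds=15.0,
                           settle_seconds=2.0, dry_run=True):
    """Leave the X VT, wait while the TV is powered on, then switch back.

    away_vt / x_vt are assumptions matching CTRL+ALT+F2 / CTRL+ALT+F1
    from the post; adjust them to wherever your X server actually runs.
    """
    issued = [switch_vt(away_vt, dry_run)]
    if not dry_run:
        # Turn the TV on during this window; the extra settle time mirrors
        # the "give it a sec or two" step from the post.
        time.sleep(tv_boot_seconds + settle_seconds)
    issued.append(switch_vt(x_vt, dry_run))
    return issued
```

Calling `tv_power_on_workaround()` with the defaults just returns the two `chvt` commands, so you can check them before running it for real with `dry_run=False`.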