390 on Ubuntu 24.04

Background: Ubuntu dropped support for the 390 driver in Ubuntu 22.04.4/23.10, so I’ve been patching their last release for newer kernels and pushing it to the “nvidiaexp” PPA. So far, this has worked well, judging by the reports I’ve received.

Problem: Enter Ubuntu 24.04/Noble.
I again pushed the patched package to the PPA, and again, it rebuilt successfully for Noble. I also used a 24.04 VM to make sure that the packages installed and that the DKMS modules built correctly.
But people were reporting black screens or just a blinking cursor after installing the 390 packages on 24.04. So I got out my GTX 950, did a fresh install of Xubuntu 24.04, and ran into the same problem using the packages. I couldn’t even switch to a terminal with Ctrl+Alt+F1. I had to boot with nomodeset and purge the packages to get the system to boot again. The only thing I’ve been able to find in the logs are crashes from nvidia-smi and/or Xorg after the nvidia/nvidia-uvm/nvidia-drm modules load (successfully). (Example below)

So does anyone know if something major changed between (X)ubuntu 23.10 and 24.04 that would break 390? I know kernel 6.8 is now used, but again, the kernel modules seem okay and other distros like Arch didn’t have to do anything special for 6.8 other than a one-line patch, which I used for my package.

If anyone is interested in trying 390 on 24.04, see: launchpad DOT net/~dtl131/+archive/ubuntu/nv390test

Sample crash:
[ 7.689022] kernel: WARNING: Flushing system-wide workqueues will be prohibited in near future.
[ 7.689025] kernel: CPU: 11 PID: 1418 Comm: Xorg Tainted: P OE 6.8.0-31-generic #31-Ubuntu
[ 7.689028] kernel: Hardware name: Gigabyte Technology Co., Ltd. B450 AORUS PRO WIFI/B450 AORUS PRO WIFI-CF, BIOS F65 03/22/2024
[ 7.689030] kernel: Call Trace:
[ 7.689032] kernel: <TASK>
[ 7.689035] kernel: dump_stack_lvl+0x48/0x70
[ 7.689040] kernel: dump_stack+0x10/0x20
[ 7.689043] kernel: __warn_flushing_systemwide_wq+0x1a/0x30
[ 7.689048] kernel: os_flush_work_queue+0x59/0x80 [nvidia]
[ 7.689192] kernel: rm_disable_adapter+0x52/0xd0 [nvidia]
[ 7.689340] kernel: WARNING: kernel stack frame pointer at 000000007ad8ca60 in Xorg:1418 has bad value 00000000a2cca138
[ 7.689343] kernel: unwind stack type:0 next_sp:0000000000000000 mask:0x2 graph_idx:0
[ 7.689344] kernel: 00000000792517f0: ffffa6a382673b08 (0xffffa6a382673b08)
[ 7.689356] kernel: 00000000559c4f42: ffffffff98a59819 (show_trace_log_lvl+0x269/0x3f0)
[ 7.689359] kernel: 000000009b73088b: ffffffff9a5cfe42 (linux_banner+0x425662/0x5831d0)
[ 7.689362] kernel: 000000008a45610e: ffff90bb094d0000 (0xffff90bb094d0000)
[ 7.689365] kernel: 000000004e527769: ffffffff9a5fa453 (linux_banner+0x44fc73/0x5831d0)
[ 7.689366] kernel: 000000002038fabd: ffffa6a382673b78 (0xffffa6a382673b78)
[ 7.689369] kernel: 0000000023d01ad7: 0000000001ef0800 (0x1ef0800)
[ 7.689372] kernel: 00000000bc8f8c1e: 0000000000000002 (0x2)
[ 7.689375] kernel: 00000000d09345c4: 0000000000000001 (0x1)
[ 7.689378] kernel: 0000000016b14fba: ffffa6a382670000 (0xffffa6a382670000)
[ 7.689381] kernel: 000000003d71daf2: ffffa6a382674000 (0xffffa6a382674000)
[ 7.689384] kernel: 00000000f3304d44: 0000000000000000 …
[ 7.689385] kernel: 000000003c4a1565: ffffa6a382670000 (0xffffa6a382670000)
[ 7.689387] kernel: 000000005a0fafab: ffffa6a382674000 (0xffffa6a382674000)
[ 7.689390] kernel: 000000005cd9d942: 0000000000000000 …
[ 7.689391] kernel: 00000000ee294e42: 0000000000000002 (0x2)
[ 7.689394] kernel: 000000000b30da20: ffff90bb094d0000 (0xffff90bb094d0000)
[ 7.689397] kernel: 000000003ce512ba: 0000000000000000 …
[ 7.689397] kernel: 00000000a02e9e15: 0000000000000001 (0x1)
[ 7.689400] kernel: 0000000003a1f1a6: ffffa6a382673b70 (0xffffa6a382673b70)
[ 7.689403] kernel: 00000000e0589a65: ffffa6a382673a08 (0xffffa6a382673a08)
[ 7.689406] kernel: 00000000f7deef65: ffffffffc19112b2 (rm_disable_adapter+0x52/0xd0 [nvidia])
[ 7.689548] kernel: 00000000870882ae: 0000000000000000 …
[ 7.689549] kernel: 000000008b155357: 1759695c0557a600 (0x1759695c0557a600)
[ 7.689552] kernel: 0000000032ba7a03: 0000000000000246 (0x246)
[ 7.689555] kernel: 00000000a4cc654e: ffffffff9a5fa453 (linux_banner+0x44fc73/0x5831d0)
[ 7.689557] kernel: 00000000b3fe91f7: ffffa6a382673c48 (0xffffa6a382673c48)
[ 7.689560] kernel: 000000008e7fe9c6: ffff90bb01ef0800 (0xffff90bb01ef0800)
[ 7.689563] kernel: 0000000056875644: ffff90bb01ef0c80 (0xffff90bb01ef0c80)
[ 7.689565] kernel: 0000000026358eec: ffffa6a382673b18 (0xffffa6a382673b18)
[ 7.689568] kernel: 00000000955cc28f: ffffffff98a59ae8 (show_stack+0x58/0x70)
[ 7.689570] kernel: 00000000b29dabd4: ffffa6a382673b38 (0xffffa6a382673b38)
[ 7.689573] kernel: 00000000ac9a4c7b: ffffffff99b77cf8 (dump_stack_lvl+0x48/0x70)
[ 7.689575] kernel: 000000003f40a4ce: ffff90bb306c8000 (0xffff90bb306c8000)
[ 7.689578] kernel: 00000000593d1022: 0000000000000001 (0x1)
[ 7.689581] kernel: 000000005086e24f: ffffa6a382673b48 (0xffffa6a382673b48)
[ 7.689583] kernel: 000000007749ce04: ffffffff99b77d40 (dump_stack+0x10/0x20)
[ 7.689585] kernel: 0000000071562e04: ffffa6a382673b58 (0xffffa6a382673b58)
[ 7.689588] kernel: 00000000cdd5cdb4: ffffffff98b25fca (__warn_flushing_systemwide_wq+0x1a/0x30)
[ 7.689590] kernel: 00000000b4dba364: ffffa6a382673b70 (0xffffa6a382673b70)
[ 7.689593] kernel: 000000001d34a552: ffffffffc1252cb9 (os_flush_work_queue+0x59/0x80 [nvidia])
[ 7.689714] kernel: 000000005d09d8e1: 00000000c1892ac7 (0xc1892ac7)
[ 7.689718] kernel: 000000007ad8ca60: ffff90bb306cb000 (0xffff90bb306cb000)
[ 7.689720] kernel: 000000005d0c2516: ffffffffc19112b2 (rm_disable_adapter+0x52/0xd0 [nvidia])
[ 7.689862] kernel: 000000004150e190: 000000000000058a (0x58a)
[ 7.689866] kernel: 000000007ccd3f40: ffffffff00000785 (0xffffffff00000785)
[ 7.689868] kernel: 00000000c73a7469: 000f41fbebf5a180 (0xf41fbebf5a180)
[ 7.689871] kernel: 000000009a8ad0e8: 000f41fcdb08a240 (0xf41fcdb08a240)
[ 7.689874] kernel: 000000000c788f42: 000f41fcdb08a240 (0xf41fcdb08a240)
[ 7.689877] kernel: 000000001272f409: 000f41fc63d30e40 (0xf41fc63d30e40)
[ 7.689880] kernel: 0000000017f7591e: 0000000100000000 (0x100000000)
[ 7.689883] kernel: 000000002cb30cbb: 0000000000000000 …
[ 7.689883] kernel: 00000000f524c4e1: 0000000000000200 (0x200)
[ 7.689886] kernel: 000000003d74048b: 000000200000000b (0x200000000b)
[ 7.689889] kernel: 00000000e1e730a5: 000000000000058a (0x58a)
[ 7.689892] kernel: 00000000f07e62f1: 0000000000000000 …
[ 7.689892] kernel: 000000009b9253e3: ffffffff99f04100 (__entry_text_end+0x102549/0x10254d)
[ 7.689896] kernel: 000000003aa9a7ad: ffffffff99c37401 (down_trylock+0x21/0x40)
[ 7.689899] kernel: 00000000e2959402: 000f41fcda60c980 (0xf41fcda60c980)
[ 7.689902] kernel: 00000000c5153e66: ffff90bb01ef0800 (0xffff90bb01ef0800)
[ 7.689905] kernel: 00000000fa223003: ffff90bb01ef0800 (0xffff90bb01ef0800)
[ 7.689908] kernel: 000000003c843261: ffff90bb306c8000 (0xffff90bb306c8000)
[ 7.689910] kernel: 000000000851c7c0: ffff90bb306c8000 (0xffff90bb306c8000)
[ 7.689913] kernel: 00000000a49c85f3: ffffffffc124357c (nv_shutdown_adapter+0x1c/0xa0 [nvidia])
[ 7.690033] kernel: 000000007108b5ca: ffff90bb01ef0800 (0xffff90bb01ef0800)
[ 7.690036] kernel: 000000006cb94e24: ffff90bb306c8000 (0xffff90bb306c8000)
[ 7.690039] kernel: 000000005498bda8: ffff90bb01ef0800 (0xffff90bb01ef0800)
[ 7.690042] kernel: 00000000b70842f1: ffffa6a382673c88 (0xffffa6a382673c88)
[ 7.690045] kernel: 00000000676d05d3: ffffffffc1243b18 (nv_close_device+0x138/0x220 [nvidia])
[ 7.690164] kernel: 00000000f61853ea: ffffffff99f04110 (srso_alias_return_thunk+0x5/0xfbef5)
[ 7.690167] kernel: 0000000000561bb5: ffff90bb06b76a00 (0xffff90bb06b76a00)
[ 7.690170] kernel: 0000000090530a2e: ffff90bb01ef0800 (0xffff90bb01ef0800)
[ 7.690173] kernel: 000000008e29f2c4: ffff90bb0b43a400 (0xffff90bb0b43a400)
[ 7.690176] kernel: 000000006e244d5b: ffff90bb306c8000 (0xffff90bb306c8000)
[ 7.690179] kernel: 00000000ff8db938: ffff90bb01ef0c80 (0xffff90bb01ef0c80)
[ 7.690181] kernel: 00000000b6370714: ffffa6a382673cd0 (0xffffa6a382673cd0)
[ 7.690184] kernel: 000000008d74342c: ffffffffc1248ce1 (nvidia_close+0xd1/0x400 [nvidia])
[ 7.690305] kernel: 00000000b751c13a: 0000000000000000 …
[ 7.690305] kernel: 00000000bec5ded8: ffffffffc193a2e0 (nv_frontend_fops+0x120/0xffffffffffffde40 [nvidia])
[ 7.690444] kernel: 00000000d5c9abdd: ffff90bb18bfdf88 (0xffff90bb18bfdf88)
[ 7.690448] kernel: 0000000037853717: ffff90bb0b43a400 (0xffff90bb0b43a400)
[ 7.690451] kernel: 000000005ad2784c: ffff90bb01658da0 (0xffff90bb01658da0)
[ 7.690453] kernel: 00000000a2ffc58a: ffff90bb0fcf6240 (0xffff90bb0fcf6240)
[ 7.690456] kernel: 00000000c0a70826: ffffa6a382673cf8 (0xffffa6a382673cf8)
[ 7.690459] kernel: 00000000728d526e: ffffffffc12424b7 (nvidia_frontend_close+0x47/0x80 [nvidia])
[ 7.690578] kernel: 000000003b5c5edd: ffff90bb0b43a400 (0xffff90bb0b43a400)
[ 7.690582] kernel: 00000000729a6331: 000000000008001b (0x8001b)
[ 7.690585] kernel: 00000000486fcdc9: ffff90bb18bfdf88 (0xffff90bb18bfdf88)
[ 7.690588] kernel: 00000000b7cbbf97: ffffa6a382673d30 (0xffffa6a382673d30)
[ 7.690590] kernel: 00000000e0b068b1: ffffffff98ee2921 (__fput+0xa1/0x2e0)
[ 7.690593] kernel: 000000000d4132d9: ffff90bb18840500 (0xffff90bb18840500)
[ 7.690596] kernel: 0000000015e90def: ffff90bb094d0000 (0xffff90bb094d0000)
[ 7.690599] kernel: 0000000034c543d5: ffff90bb094d0cd4 (0xffff90bb094d0cd4)
[ 7.690601] kernel: 0000000094b689f1: 0000000000000001 (0x1)
[ 7.690604] kernel: 00000000dcc7f473: ffff90bb0772b180 (0xffff90bb0772b180)
[ 7.690607] kernel: 00000000d59f4d41: ffffa6a382673d40 (0xffffa6a382673d40)
[ 7.690610] kernel: 000000000d11b371: ffffffff98ee2bce (____fput+0xe/0x20)
[ 7.690611] kernel: 00000000f02faeb3: ffffa6a382673d68 (0xffffa6a382673d68)
[ 7.690614] kernel: 0000000048a36808: ffffffff98b319c1 (task_work_run+0x61/0xa0)
[ 7.690617] kernel: 000000009ee208cd: ffff90bb094d0000 (0xffff90bb094d0000)
[ 7.690620] kernel: 000000001764e8a2: 0000000000000001 (0x1)
[ 7.690622] kernel: 000000004920423d: 000000000000008b (0x8b)
[ 7.690625] kernel: 000000003f8c1b5e: ffffa6a382673db8 (0xffffa6a382673db8)
[ 7.690628] kernel: 00000000e5445549: ffffffff98b068b3 (do_exit+0x2b3/0x530)
[ 7.690631] kernel: 00000000da91ebc8: 000000000000000b (0xb)
[ 7.690634] kernel: 00000000bfe91183: 0000000004f03000 (0x4f03000)
[ 7.690637] kernel: 00000000a92620d6: 0000000004f03000 (0x4f03000)
[ 7.690640] kernel: 000000007a1a0177: 1759695c0557a600 (0x1759695c0557a600)
[ 7.690642] kernel: 00000000d182cf19: 000000000000008b (0x8b)
[ 7.690645] kernel: 000000008a14fd70: ffff90bb0ac0d100 (0xffff90bb0ac0d100)
[ 7.690648] kernel: 00000000f429cb56: ffff90bb094d0000 (0xffff90bb094d0000)
[ 7.690651] kernel: 000000008e2c865e: ffff90bb094d0c48 (0xffff90bb094d0c48)
[ 7.690654] kernel: 00000000155f4572: ffffa6a382673de8 (0xffffa6a382673de8)
[ 7.690656] kernel: 00000000c2fbbe36: ffffffff98b06d25 (do_group_exit+0x35/0x90)
[ 7.690659] kernel: 00000000b9bdd4e0: 000000000000000b (0xb)
[ 7.690661] kernel: 000000005955b9c1: ffffa6a382673e70 (0xffffa6a382673e70)
[ 7.690664] kernel: 000000001abf54fe: ffff90bb0ac0d100 (0xffff90bb0ac0d100)
[ 7.690667] kernel: 00000000949022bf: ffff90bb094d0c48 (0xffff90bb094d0c48)
[ 7.690670] kernel: 000000001fe07cd2: ffffa6a382673e60 (0xffffa6a382673e60)
[ 7.690673] kernel: 0000000004874162: ffffffff98b1b9c4 (get_signal+0x954/0x990)
[ 7.690676] kernel: 0000000080e1ca95: 0000000000000004 (0x4)
[ 7.690678] kernel: 0000000049efca66: ffffa6a382673f58 (0xffffa6a382673f58)
[ 7.690681] kernel: 0000000056ecdb21: 0000000000000308 (0x308)
[ 7.690684] kernel: 0000000071fc6912: ffffa6a382673e90 (0xffffa6a382673e90)
[ 7.690687] kernel: 000000003e3bfacc: ffff90bb0000000b (0xffff90bb0000000b)
[ 7.690690] kernel: 0000000036fb1ead: ffff90bb0000000b (0xffff90bb0000000b)
[ 7.690692] kernel: 00000000c5f8f2e2: 0000000098b1a17e (0x98b1a17e)
[ 7.690695] kernel: 0000000027dcee7f: 1759695c0557a600 (0x1759695c0557a600)
[ 7.690698] kernel: 00000000ecef9fde: ffffa6a382673f58 (0xffffa6a382673f58)
[ 7.690701] kernel: 00000000e04601dd: ffffa6a382673e70 (0xffffa6a382673e70)
[ 7.690704] kernel: 00000000a5ef9e2f: ffffa6a382673f58 (0xffffa6a382673f58)
[ 7.690707] kernel: 00000000959ca6b8: 0000000000000000 …
[ 7.690707] kernel: 00000000692f0459: ffffa6a382673ee0 (0xffffa6a382673ee0)
[ 7.690710] kernel: 000000004dadf567: ffffffff98a530f9 (arch_do_signal_or_restart+0x39/0x120)
[ 7.690713] kernel: 00000000de67ec5b: 0000000000000000 …
[ 7.690714] kernel: 00000000680efadc: 000000000000000b (0xb)
[ 7.690716] kernel: 000000002b7fc43b: 0000000000000001 (0x1)
[ 7.690719] kernel: 00000000230eacf5: 0000000000000308 (0x308)
[ 7.690722] kernel: 0000000033a423db: 0000000000000000 …
[ 7.690723] kernel: 000000005dd6b5be: 1759695c0557a600 (0x1759695c0557a600)
[ 7.690725] kernel: 00000000fe77e6db: 0000000000400004 (0x400004)
[ 7.690728] kernel: 000000003b7bb60f: ffff90bb094d0000 (0xffff90bb094d0000)
[ 7.690731] kernel: 0000000094e9d94b: ffffa6a382673f08 (0xffffa6a382673f08)
[ 7.690734] kernel: 000000003638473c: ffffffff99c23ffe (irqentry_exit_to_user_mode+0x1fe/0x260)
[ 7.690737] kernel: 00000000e7e64081: ffffa6a382673f58 (0xffffa6a382673f58)
[ 7.690740] kernel: 0000000023cec18c: 0000000000000308 (0x308)
[ 7.690743] kernel: 000000008b4d2ae6: 0000000000000004 (0x4)
[ 7.690745] kernel: 000000004c48f3e4: ffffa6a382673f18 (0xffffa6a382673f18)
[ 7.690748] kernel: 000000006dc2e450: ffffffff99c240b3 (irqentry_exit+0x43/0x50)
[ 7.690751] kernel: 000000007a0559ef: ffffa6a382673f48 (0xffffa6a382673f48)
[ 7.690754] kernel: 00000000013e1a8d: ffffffff99c236d4 (exc_page_fault+0x94/0x1b0)
[ 7.690756] kernel: 0000000015db396a: 0000000000000000 …
[ 7.690756] kernel: 00000000a14bcf29: ffffa6a382673f59 (0xffffa6a382673f59)
[ 7.690759] kernel: 000000006234b0ef: ffffffff99e00ba7 (asm_exc_page_fault+0x27/0x30)
[ 7.690761] kernel: 0000000060ea41a0: 00007fff27ea0bb0 (0x7fff27ea0bb0)
[ 7.690764] kernel: 0000000026ae7344: 0000000000000000 …
[ 7.690765] kernel: 00000000e24be976: 0000566102f49431 (0x566102f49431)
[ 7.690767] kernel: 0000000021c11a0d: 0000000000000004 (0x4)
[ 7.690770] kernel: 00000000805543a2: 00007fff27ea0620 (0x7fff27ea0620)
[ 7.690773] kernel: 00000000d35eee08: 00000000ffffffff (0xffffffff)
[ 7.690776] kernel: 00000000208977d6: 0000566102db8020 (0x566102db8020)
[ 7.690779] kernel: 00000000754bb8cc: 0000566102d5a348 (0x566102d5a348)
[ 7.690781] kernel: 000000002075e900: 0000000000000005 (0x5)
[ 7.690784] kernel: 0000000091b1090c: 0000000000000000 …
[ 7.690785] kernel: 000000000a36c1f0: 0000000000000003 (0x3)
[ 7.690788] kernel: 00000000374a9612: 0000000000000000 …
[ 7.690788] kernel: 00000000a51707d7: 0000000000000004 (0x4)
[ 7.690791] kernel: 0000000052f55b2e: 0000566102f49431 (0x566102f49431)
[ 7.690794] kernel: 00000000414cfc15: 0000000000000002 (0x2)
[ 7.690797] kernel: 000000009c5419bf: ffffffffffffffff (0xffffffffffffffff)
[ 7.690800] kernel: 00000000b9407cac: 00007c8d1fa9854d (0x7c8d1fa9854d)
[ 7.690803] kernel: 000000005c6902cc: 0000000000000033 (0x33)
[ 7.690806] kernel: 000000005477f100: 0000000000010206 (0x10206)
[ 7.690809] kernel: 00000000de76f053: 00007fff27ea05f8 (0x7fff27ea05f8)
[ 7.690812] kernel: 00000000037be114: 000000000000002b (0x2b)
[ 7.690817] kernel: ? __entry_text_end+0x102549/0x10254d
[ 7.690844] kernel: ? down_trylock+0x21/0x40
[ 7.690847] kernel: ? nv_shutdown_adapter+0x1c/0xa0 [nvidia]
[ 7.690968] kernel: ? nv_close_device+0x138/0x220 [nvidia]
[ 7.691088] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 7.691092] kernel: ? nvidia_close+0xd1/0x400 [nvidia]
[ 7.691214] kernel: ? nvidia_frontend_close+0x47/0x80 [nvidia]
[ 7.691334] kernel: ? __fput+0xa1/0x2e0
[ 7.691337] kernel: ? ____fput+0xe/0x20
[ 7.691339] kernel: ? task_work_run+0x61/0xa0
[ 7.691342] kernel: ? do_exit+0x2b3/0x530
[ 7.691346] kernel: ? do_group_exit+0x35/0x90
[ 7.691349] kernel: ? get_signal+0x954/0x990
[ 7.691353] kernel: ? arch_do_signal_or_restart+0x39/0x120
[ 7.691358] kernel: ? irqentry_exit_to_user_mode+0x1fe/0x260
[ 7.691362] kernel: ? irqentry_exit+0x43/0x50
[ 7.691364] kernel: ? exc_page_fault+0x94/0x1b0
[ 7.691367] kernel: ? asm_exc_page_fault+0x27/0x30
[ 7.691372] kernel: </TASK>
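
Most of the trace above is the raw stack-word dump, which makes the actual call path hard to see. For anyone who wants to compare traces, here is a minimal Python sketch that keeps only the symbolized frames. The function name `symbolized_frames` and both regexes are my own assumptions based on the line shapes in this particular log; they are not part of any NVIDIA or Ubuntu tool and may need adjusting for other log formats.

```python
import re

# Symbolized entries look like "nvidia_close+0xd1/0x400 [nvidia]".
# The trailing "[module]" part is optional (core kernel symbols have none).
FRAME_RE = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)

# Raw stack-dump lines look like "kernel: 00000000792517f0: ffffa6a3...".
RAW_WORD_RE = re.compile(r"kernel:\s+[0-9a-f]{16}:\s")


def symbolized_frames(log_text):
    """Return (symbol, module) pairs for each call-trace frame.

    Raw stack-word lines are skipped, even when they carry a symbol in
    parentheses, so the result follows the trace itself. `module` is
    None for frames that belong to the core kernel.
    """
    frames = []
    for line in log_text.splitlines():
        if RAW_WORD_RE.search(line):
            continue  # part of the raw stack dump, not a trace frame
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("sym"), match.group("mod")))
    return frames


if __name__ == "__main__":
    import sys

    # Read a saved kernel log from stdin and print one frame per line.
    for sym, mod in symbolized_frames(sys.stdin.read()):
        print(f"{sym} [{mod}]" if mod else sym)
```

Saved as, say, frames.py (the file name is arbitrary), it can be fed a kernel log on stdin, for example the output of `journalctl -k -b -1` from the failed boot. On the trace above it should reduce the output to the short list of frames from `dump_stack_lvl` through `asm_exc_page_fault`.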