535.129.03: system freezes, crashes, or goes haywire on RTX 3050

Hi,

No driver released after 535.98 works on my machine: I get freezes, display glitches, and so on.
I'm playing Diablo 4. The game works perfectly with 535.98, but a lot of issues appear as soon as I update the driver.

Please have a look at kernel.log; the stack trace of the error is clearly visible:

2023-11-05T10:11:14.789878+01:00 morrowind kernel: [  231.298095]  ? __die+0x23/0x70
2023-11-05T10:11:14.789878+01:00 morrowind kernel: [  231.298099]  ? page_fault_oops+0x171/0x4f0
2023-11-05T10:11:14.789878+01:00 morrowind kernel: [  231.298102]  ? _nv013176rm+0xc1/0x130 [nvidia]
2023-11-05T10:11:14.789878+01:00 morrowind kernel: [  231.298328]  ? exc_page_fault+0x7f/0x180
2023-11-05T10:11:14.789878+01:00 morrowind kernel: [  231.298332]  ? asm_exc_page_fault+0x26/0x30
2023-11-05T10:11:14.789879+01:00 morrowind kernel: [  231.298338]  ? _nv043160rm+0x1d/0x40 [nvidia]
2023-11-05T10:11:14.789879+01:00 morrowind kernel: [  231.298505]  _nv016185rm+0xd0/0x120 [nvidia]
2023-11-05T10:11:14.789879+01:00 morrowind kernel: [  231.298667]  _nv045221rm+0x5e9/0x690 [nvidia]
2023-11-05T10:11:14.789879+01:00 morrowind kernel: [  231.298830]  _nv045216rm+0x6c/0x80 [nvidia]
2023-11-05T10:11:14.789880+01:00 morrowind kernel: [  231.299015]  _nv045243rm+0x61/0xb0 [nvidia]
2023-11-05T10:11:14.789880+01:00 morrowind kernel: [  231.299178]  _nv043394rm+0x95/0x100 [nvidia]
2023-11-05T10:11:14.789880+01:00 morrowind kernel: [  231.299317]  _nv000681rm+0x6c/0x80 [nvidia]
2023-11-05T10:11:14.789880+01:00 morrowind kernel: [  231.299469]  rm_cleanup_file_private+0x135/0x200 [nvidia]
2023-11-05T10:11:14.789880+01:00 morrowind kernel: [  231.299619]  nvidia_close+0x157/0x300 [nvidia]
2023-11-05T10:11:14.789880+01:00 morrowind kernel: [  231.299723]  nvidia_frontend_close+0x2b/0x50 [nvidia]
2023-11-05T10:11:14.789881+01:00 morrowind kernel: [  231.299829]  __fput+0xf2/0x2a0
2023-11-05T10:11:14.789881+01:00 morrowind kernel: [  231.299832]  task_work_run+0x5a/0x90
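
For anyone triaging a trace like this one, here is a minimal Python sketch that pulls the frames out of the pasted lines. It assumes the syslog layout shown above; `parse_frames` and its field names are hypothetical helpers, not part of any NVIDIA or kernel tool. Frames that the kernel prefixes with `?` are addresses it found on the stack but could not confirm as part of the call chain, so the sketch flags them as unreliable.

```python
import re

# Matches the symbol part of a kernel call-trace line, for example:
#   "[  231.298505]  _nv016185rm+0xd0/0x120 [nvidia]"
FRAME_RE = re.compile(
    r"\]\s+(?P<guess>\? )?(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_frames(lines):
    """Return one dict per call-trace frame found in the given log lines."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            frames.append({
                "symbol": match.group("symbol"),
                "module": match.group("module"),          # None for core kernel symbols
                "reliable": match.group("guess") is None,  # "?" marks a guessed frame
            })
    return frames

# A few lines copied from the trace above.
SAMPLE = """\
2023-11-05T10:11:14.789878+01:00 morrowind kernel: [  231.298095]  ? __die+0x23/0x70
2023-11-05T10:11:14.789879+01:00 morrowind kernel: [  231.298338]  ? _nv043160rm+0x1d/0x40 [nvidia]
2023-11-05T10:11:14.789879+01:00 morrowind kernel: [  231.298505]  _nv016185rm+0xd0/0x120 [nvidia]
2023-11-05T10:11:14.789880+01:00 morrowind kernel: [  231.299469]  rm_cleanup_file_private+0x135/0x200 [nvidia]
2023-11-05T10:11:14.789880+01:00 morrowind kernel: [  231.299619]  nvidia_close+0x157/0x300 [nvidia]
2023-11-05T10:11:14.789881+01:00 morrowind kernel: [  231.299829]  __fput+0xf2/0x2a0
""".splitlines()

frames = parse_frames(SAMPLE)
reliable = [f["symbol"] for f in frames if f["reliable"]]
```

On this excerpt the reliable frames run from `__fput` through `nvidia_close` and `rm_cleanup_file_private`, which suggests the fault was reached while a process was closing its handle on the NVIDIA device.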

1°) Crash/Freeze: here is a bug report captured after reboot, plus the kernel logs:
nvidia-bug-report_after_frozen_screen.log.gz (768.3 KB)
kernel.log.tar.gz (63.1 KB)

2°) Freeze, but I was able to “recover”; here are the results:
nvidia-bug-report.log.gz (869.4 KB)

Thanks for bringing this issue to our attention. Could you please share reliable steps so that we can reproduce the issue on our end…

Launch Diablo 4 using Proton Eggroll: here
I tried 8.22 and 8.3, and the issues occur. I'm less certain about 8.22, but it may be affected too.

Do you need more information about my system?

A new crash happened:
kernel.log (123.7 KB)

Hi @amrits

Any update on the crash? Do you have any clue about the cause? Is it an NVIDIA driver issue? A Linux kernel issue? A BIOS issue?

The crash is still happening on my side. It occurs most of the time after resuming from sleep.

Steps:

  • Resume from sleep
  • Launch Diablo 4
  • The game launches; after 1 or 2 minutes the system freezes
  • Do a hard reboot and try again: it works.

Thank you

Hi @amrits,

I got another kind of crash (same configuration):

2023-12-03T13:50:01.356322+01:00 morrowind kernel: [50390.808853] BUG: unable to handle page fault for address: 0000000100000018
2023-12-03T13:50:01.356332+01:00 morrowind kernel: [50390.808858] #PF: supervisor read access in kernel mode
2023-12-03T13:50:01.356332+01:00 morrowind kernel: [50390.808860] #PF: error_code(0x0000) - not-present page
2023-12-03T13:50:01.356333+01:00 morrowind kernel: [50390.808861] PGD 0 P4D 0 
2023-12-03T13:50:01.356333+01:00 morrowind kernel: [50390.808864] Oops: 0000 [#1] PREEMPT SMP NOPTI
2023-12-03T13:50:01.356334+01:00 morrowind kernel: [50390.808866] CPU: 9 PID: 46042 Comm: brave Tainted: P           OE      6.5.0-2-amd64 #1  Debian 6.5.6-1
2023-12-03T13:50:01.356334+01:00 morrowind kernel: [50390.808868] Hardware name: ASUS System Product Name/PRIME B650-PLUS, BIOS 1811 10/07/2023
2023-12-03T13:50:01.356334+01:00 morrowind kernel: [50390.808870] RIP: 0010:_nv042956rm+0x1d/0x40 [nvidia]
2023-12-03T13:50:01.356335+01:00 morrowind kernel: [50390.809057] Code: 00 00 00 44 89 c0 c3 66 0f 1f 44 00 00 66 0f 1f 00 48 8b 47 18 48 85 c0 74 29 48 39 f>
2023-12-03T13:50:01.356335+01:00 morrowind kernel: [50390.809059] RSP: 0018:ffff9fa2471e3aa0 EFLAGS: 00010286
2023-12-03T13:50:01.356336+01:00 morrowind kernel: [50390.809061] RAX: 0000000100000000 RBX: ffff914b7ac85a48 RCX: ffff914cde5c3808
2023-12-03T13:50:01.356336+01:00 morrowind kernel: [50390.809062] RDX: ffffffffffffffd8 RSI: ffff914cd2f4d030 RDI: ffff914cde5c3830
2023-12-03T13:50:01.356336+01:00 morrowind kernel: [50390.809063] RBP: ffff914b7ac859f0 R08: ffffffffffffffd8 R09: ffff914b7ac85940
2023-12-03T13:50:01.356342+01:00 morrowind kernel: [50390.809064] R10: 00000000000380a0 R11: ffff914e4c076008 R12: 0000000000000000
2023-12-03T13:50:01.356342+01:00 morrowind kernel: [50390.809065] R13: ffff914cde5c3830 R14: ffff914b66fd8008 R15: ffff915010195808

RIP: 0010:_nv042956rm+0x1d/0x40 [nvidia]

Hi,

Same crash with driver 535.146.02:

2023-12-17T23:12:18.962014+01:00 morrowind kernel: [  156.232807] RIP: 0010:_nv043176rm+0x1d/0x40 [nvidia]
2023-12-17T23:12:18.962015+01:00 morrowind kernel: [  156.232986] Code: 00 00 00 44 89 c0 c3 66 0f 1f 44 00 00 66 0f 1f 00 48 8b 47 18 48 85 c0 74 29 48 39 f>
2023-12-17T23:12:18.962015+01:00 morrowind kernel: [  156.232987] RSP: 0018:ffffb75488adbb18 EFLAGS: 00010286
2023-12-17T23:12:18.962015+01:00 morrowind kernel: [  156.232989] RAX: 000000001bd83000 RBX: ffff891742bbaa48 RCX: 0000000000000000
2023-12-17T23:12:18.962015+01:00 morrowind kernel: [  156.232990] RDX: ffffffffffffffd8 RSI: ffff891b7ccbd430 RDI: ffff891bbc643030
2023-12-17T23:12:18.962016+01:00 morrowind kernel: [  156.232991] RBP: ffff891742bba9f0 R08: ffffffffffffffd8 R09: ffff891742bba940
2023-12-17T23:12:18.962016+01:00 morrowind kernel: [  156.232992] R10: 00000000000380a0 R11: ffff8915814e2008 R12: 0000000000000000
2023-12-17T23:12:18.962016+01:00 morrowind kernel: [  156.232993] R13: ffff891bbc643030 R14: ffff8915b0900008 R15: ffff891bc6664808
2023-12-17T23:12:18.962016+01:00 morrowind kernel: [  156.232994] FS:  00000000003e2000(0063) GS:ffff891c9dec0000(006b) knlGS:00000000f7f34700
2023-12-17T23:12:18.962016+01:00 morrowind kernel: [  156.232995] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
2023-12-17T23:12:18.962028+01:00 morrowind kernel: [  156.232996] CR2: 000000001bd83018 CR3: 000000073ce22000 CR4: 0000000000750ee0
2023-12-17T23:12:18.962029+01:00 morrowind kernel: [  156.232997] PKRU: 55555554
2023-12-17T23:12:18.962029+01:00 morrowind kernel: [  156.232998] Call Trace:
2023-12-17T23:12:18.962029+01:00 morrowind kernel: [  156.233002]  <TASK>
2023-12-17T23:12:18.962030+01:00 morrowind kernel: [  156.233004]  ? __die+0x23/0x70
2023-12-17T23:12:18.962030+01:00 morrowind kernel: [  156.233008]  ? page_fault_oops+0x171/0x4e0
2023-12-17T23:12:18.962030+01:00 morrowind kernel: [  156.233012]  ? exc_page_fault+0x7f/0x180
2023-12-17T23:12:18.962030+01:00 morrowind kernel: [  156.233015]  ? asm_exc_page_fault+0x26/0x30
2023-12-17T23:12:18.962030+01:00 morrowind kernel: [  156.233021]  ? _nv043176rm+0x1d/0x40 [nvidia]
2023-12-17T23:12:18.962031+01:00 morrowind kernel: [  156.233184]  _nv016187rm+0xd0/0x120 [nvidia]
2023-12-17T23:12:18.962031+01:00 morrowind kernel: [  156.233346]  _nv047230rm+0xa4/0x110 [nvidia]
2023-12-17T23:12:18.962031+01:00 morrowind kernel: [  156.233563]  _nv010782rm+0x51/0x1a0 [nvidia]
2023-12-17T23:12:18.962031+01:00 morrowind kernel: [  156.233771]  _nv018451rm+0x49/0x3d0 [nvidia]
2023-12-17T23:12:18.962032+01:00 morrowind kernel: [  156.233975]  _nv002410rm+0xd/0x20 [nvidia]
2023-12-17T23:12:18.962032+01:00 morrowind kernel: [  156.234149]  _nv004110rm+0x16/0xb0 [nvidia]
2023-12-17T23:12:18.962032+01:00 morrowind kernel: [  156.234314]  _nv016162rm+0x52c/0x690 [nvidia]
2023-12-17T23:12:18.962032+01:00 morrowind kernel: [  156.234490]  _nv043516rm+0xab/0xe0 [nvidia]
2023-12-17T23:12:18.962032+01:00 morrowind kernel: [  156.234623]  _nv045238rm+0xa9/0x130 [nvidia]
2023-12-17T23:12:18.962033+01:00 morrowind kernel: [  156.234787]  _nv045237rm+0x3e5/0x690 [nvidia]
2023-12-17T23:12:18.962033+01:00 morrowind kernel: [  156.234948]  _nv043418rm+0xd5/0x160 [nvidia]
2023-12-17T23:12:18.962033+01:00 morrowind kernel: [  156.235080]  _nv043419rm+0x41/0x70 [nvidia]
2023-12-17T23:12:18.962033+01:00 morrowind kernel: [  156.235210]  _nv000567rm+0x4a/0x60 [nvidia]
2023-12-17T23:12:18.962033+01:00 morrowind kernel: [  156.235341]  _nv000715rm+0x1b7/0xe70 [nvidia]
2023-12-17T23:12:18.962034+01:00 morrowind kernel: [  156.235492]  rm_ioctl+0x58/0xb0 [nvidia]
2023-12-17T23:12:18.962034+01:00 morrowind kernel: [  156.235641]  nvidia_ioctl+0x5d8/0x880 [nvidia]
2023-12-17T23:12:18.962034+01:00 morrowind kernel: [  156.235745]  nvidia_frontend_compat_ioctl+0x3c/0x60 [nvidia]
2023-12-17T23:12:18.962034+01:00 morrowind kernel: [  156.235851]  __do_compat_sys_ioctl+0xc3/0x1a0
2023-12-17T23:12:18.962034+01:00 morrowind kernel: [  156.235855]  __do_fast_syscall_32+0x86/0xe0
2023-12-17T23:12:18.962034+01:00 morrowind kernel: [  156.235858]  ? srso_alias_return_thunk+0x5/0x7f
2023-12-17T23:12:18.962035+01:00 morrowind kernel: [  156.235860]  ? syscall_exit_to_user_mode+0x2b/0x40
2023-12-17T23:12:18.962035+01:00 morrowind kernel: [  156.235862]  ? srso_alias_return_thunk+0x5/0x7f
2023-12-17T23:12:18.962035+01:00 morrowind kernel: [  156.235864]  ? __do_fast_syscall_32+0x95/0xe0
2023-12-17T23:12:18.962035+01:00 morrowind kernel: [  156.235866]  do_fast_syscall_32+0x33/0x80
2023-12-17T23:12:18.962035+01:00 morrowind kernel: [  156.235867]  entry_SYSCALL_compat_after_hwframe+0x6d/0x75
2023-12-17T23:12:18.962035+01:00 morrowind kernel: [  156.235870] RIP: 0023:0xf7f9d579
2023-12-17T23:12:18.962036+01:00 morrowind kernel: [  156.235871] Code: c4 01 10 03 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 0>
2023-12-17T23:12:18.962036+01:00 morrowind kernel: [  156.235873] RSP: 002b:00000000007ff674 EFLAGS: 00000292 ORIG_RAX: 0000000000000036
2023-12-17T23:12:18.962036+01:00 morrowind kernel: [  156.235874] RAX: ffffffffffffffda RBX: 000000000000001e RCX: 00000000c0104629
2023-12-17T23:12:18.962036+01:00 morrowind kernel: [  156.235875] RDX: 00000000007ff750 RSI: 00000000f7e1dff4 RDI: 00000000007ff750
2023-12-17T23:12:18.962037+01:00 morrowind kernel: [  156.235876] RBP: 0000000000000000 R08: 00000000007ff674 R09: 0000000000000000
2023-12-17T23:12:18.962037+01:00 morrowind kernel: [  156.235877] R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000000000
2023-12-17T23:12:18.962037+01:00 morrowind kernel: [  156.235878] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
2023-12-17T23:12:18.962037+01:00 morrowind kernel: [  156.235881]  </TASK>
2023-12-17T23:12:18.962037+01:00 morrowind kernel: [  156.235882] Modules linked in: nvidia_uvm(POE) xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_connt>
2023-12-17T23:12:18.962038+01:00 morrowind kernel: [  156.235929]  efi_pstore configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_gener>
2023-12-17T23:12:18.962038+01:00 morrowind kernel: [  156.235956] CR2: 000000001bd83018
2023-12-17T23:12:18.962039+01:00 morrowind kernel: [  156.235958] ---[ end trace 0000000000000000 ]---
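
One detail that may help triage: in both this oops and the 2023-12-03 one, the faulting address is exactly RAX + 0x18, and neither RAX value looks like a kernel pointer. The Python sketch below only does that arithmetic on lines copied from the two logs; `fault_offset` is a hypothetical helper name, and what the offset means inside the driver is something only NVIDIA can confirm with symbols.

```python
import re

def fault_offset(oops_text):
    """Return (fault_address, rax, fault_address - rax) parsed from oops text.

    The fault address is taken from the "page fault for address" line or,
    failing that, from CR2. Returns None if either value is missing.
    """
    addr = re.search(r"(?:page fault for address: |CR2: )([0-9a-f]{16})", oops_text)
    rax = re.search(r"\bRAX: ([0-9a-f]{16})", oops_text)
    if not (addr and rax):
        return None
    fault, reg = int(addr.group(1), 16), int(rax.group(1), 16)
    return fault, reg, fault - reg

# Lines copied from the two oopses in this thread (timestamps shortened).
OOPS_DEC03 = """\
[50390.808853] BUG: unable to handle page fault for address: 0000000100000018
[50390.809061] RAX: 0000000100000000 RBX: ffff914b7ac85a48 RCX: ffff914cde5c3808
"""
OOPS_DEC17 = """\
[  156.232989] RAX: 000000001bd83000 RBX: ffff891742bbaa48 RCX: 0000000000000000
[  156.232996] CR2: 000000001bd83018 CR3: 000000073ce22000 CR4: 0000000000750ee0
"""

offsets = [fault_offset(text)[2] for text in (OOPS_DEC03, OOPS_DEC17)]
```

Both entries in `offsets` come out as 0x18, so the two crashes look like the same read at the same offset from a bad pointer, even though the faulting symbol name differs between driver builds.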

It seems related to VRAM usage. With textures set to “Ultra”, the crash occurs. With “Low”, the issue does not seem to occur.
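
To check the VRAM theory, one option is to log memory usage while the game runs, so that the last sample before a freeze survives on disk. The Python sketch below polls `nvidia-smi` using its `--query-gpu` and `--format` options; the helper names, the log path, and the one-second interval are assumptions made for illustration.

```python
import os
import subprocess
import time

# With "nounits", nvidia-smi prints memory values in MiB as bare numbers.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]

def parse_vram(csv_line):
    """Parse one "used, total" CSV line into two integers (MiB)."""
    used, total = (int(field) for field in csv_line.split(","))
    return used, total

def read_vram():
    """Query used and total memory of the first GPU through nvidia-smi."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_vram(result.stdout.splitlines()[0])

def log_vram(path, interval_s=1.0):
    """Append timestamped samples until interrupted.

    Each sample is flushed and fsynced so that a hard freeze loses at most
    the sample being written.
    """
    with open(path, "a") as log:
        while True:
            used, total = read_vram()
            stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
            log.write(f"{stamp} {used}/{total} MiB\n")
            log.flush()
            os.fsync(log.fileno())
            time.sleep(interval_s)
```

Running `log_vram("vram.log")` in a terminal before launching the game, then reading the tail of the file after the reboot, would show how close to the card's limit the usage was when the system froze.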

Hi @poupouille
Unfortunately, I am not able to reproduce the issue locally after trying the steps from your earlier comments.
I will spend a few more cycles on a few other systems and update you.
I have also filed bug 4464466 internally for tracking purposes.

Hi @poupouille
I just wanted to know whether you have any other steps that reproduce the same issue,
because I am still not able to reproduce it with the steps you shared earlier.

Hi @amrits,

I worked around the issue in Diablo 4 by playing at lower resolutions, which use less VRAM. With textures at the maximum setting, the crash was occurring. I did not retest.
Since I gave you the call stack, you should be able to find the issue ;)
Or maybe the issue is with my setup, but no other game crashes the system like that.