[535.161.07] Nvidia driver crash

I was running Blender on a server when it suddenly became unresponsive, so I had to ask the manager to restart it. Below is the log I found from that time. The issue now seems to have happened again: the server responds to ping, but it won’t let me open an SSH session. (Interestingly, it still checks my password and rejects it if it’s wrong.) What should I investigate further? Do you have any advice? Thanks!

sudo journalctl -o short-precise -k -b -1 > jctl_20240720.log

Jul 19 02:11:55.268428 gen06 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth02bff90: link becomes ready
Jul 20 06:28:57.935938 gen06 kernel: general protection fault, probably for non-canonical address 0x65b00c3dac7abbab: 0000 [#1] SMP NOPTI
Jul 20 06:28:57.936183 gen06 kernel: CPU: 187 PID: 1028105 Comm: blender Tainted: P           OE     5.15.0-101-generic #111-Ubuntu
Jul 20 06:28:57.936219 gen06 kernel: Hardware name: Supermicro AS -4124GO-NART/H12DGO-6, BIOS 2.5a 04/18/2023
Jul 20 06:28:57.936266 gen06 kernel: RIP: 0010:__kmalloc+0x111/0x330
Jul 20 06:28:57.936297 gen06 kernel: Code: 8b 50 08 49 8b 00 49 83 78 10 00 48 89 45 c8 0f 84 c5 01 00 00 48 85 c0 0f 84 bc 01 00 00 41 8b 4c 24 28 49 8b 3c 24 48 01 c1 <48> 8b 19 48 89 ce 49 33 9c 24 b8 00 00 00 48 8d 4a 01 48 0f ce 48
Jul 20 06:28:57.936328 gen06 kernel: RSP: 0018:ffffaedc5e933a80 EFLAGS: 00010202
Jul 20 06:28:57.936359 gen06 kernel: RAX: 65b00c3dac7abb8b RBX: 0000000000006cc0 RCX: 65b00c3dac7abbab
Jul 20 06:28:57.936432 gen06 kernel: RDX: 0000000003de5e18 RSI: 0000000000006cc0 RDI: 00000000000360a0
Jul 20 06:28:57.936459 gen06 kernel: RBP: ffffaedc5e933ac0 R08: ffff95944ecf60a0 R09: ffff951781bf3f08
Jul 20 06:28:57.936482 gen06 kernel: R10: 0000000000000246 R11: 00000000ffffffff R12: ffff951700044500
Jul 20 06:28:57.936505 gen06 kernel: R13: ffffffffc16b8aee R14: 0000000000006cc0 R15: 0000000000000000
Jul 20 06:28:57.936544 gen06 kernel: FS:  00007f255946f580(0000) GS:ffff95944ecc0000(0000) knlGS:0000000000000000
Jul 20 06:28:57.936572 gen06 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 20 06:28:57.936597 gen06 kernel: CR2: 00007f6f116ae000 CR3: 0000000592bea000 CR4: 0000000000350ee0
Jul 20 06:28:57.936620 gen06 kernel: Call Trace:
Jul 20 06:28:57.936647 gen06 kernel:  <TASK>
Jul 20 06:28:57.936669 gen06 kernel:  ? show_trace_log_lvl+0x1d6/0x2ea
Jul 20 06:28:57.936691 gen06 kernel:  ? show_trace_log_lvl+0x1d6/0x2ea
Jul 20 06:28:57.936712 gen06 kernel:  ? os_alloc_mem+0xce/0xe0 [nvidia]
Jul 20 06:28:57.936732 gen06 kernel:  ? show_regs.part.0+0x23/0x29
Jul 20 06:28:57.936757 gen06 kernel:  ? __die_body.cold+0x8/0xd
Jul 20 06:28:57.936784 gen06 kernel:  ? die_addr+0x3e/0x60
Jul 20 06:28:57.936810 gen06 kernel:  ? exc_general_protection+0x1c5/0x410
Jul 20 06:28:57.936831 gen06 kernel:  ? asm_exc_general_protection+0x27/0x30
Jul 20 06:28:57.936857 gen06 kernel:  ? os_alloc_mem+0xce/0xe0 [nvidia]
Jul 20 06:28:57.936877 gen06 kernel:  ? __kmalloc+0x111/0x330
Jul 20 06:28:57.936903 gen06 kernel:  os_alloc_mem+0xce/0xe0 [nvidia]
Jul 20 06:28:57.936923 gen06 kernel:  _nv012729rm+0x34/0x50 [nvidia]
Jul 20 06:28:57.936949 gen06 kernel: WARNING: kernel stack frame pointer at 00000000f249b0c0 in blender:1028105 has bad value 0000000005c29af2
Jul 20 06:28:57.936971 gen06 kernel: unwind stack type:0 next_sp:0000000000000000 mask:0x2 graph_idx:0
Jul 20 06:28:57.936992 gen06 kernel: 00000000505ddae1: ffffaedc5e933ae0 (0xffffaedc5e933ae0)
Jul 20 06:28:57.937017 gen06 kernel: 00000000d7f2c19c: ffffffffc16b8aee (os_alloc_mem+0xce/0xe0 [nvidia])
Jul 20 06:28:57.937045 gen06 kernel: 000000002e323510: ffff95175c38d388 (0xffff95175c38d388)
Jul 20 06:28:57.937067 gen06 kernel: 000000009ea7209f: 0000000000000028 (0x28)
Jul 20 06:28:57.937089 gen06 kernel: 00000000f249b0c0: ffff951bbce4ad30 (0xffff951bbce4ad30)
Jul 20 06:28:57.937114 gen06 kernel: 00000000b8d01d44: ffffffffc1f88634 (_nv012729rm+0x34/0x50 [nvidia])
Jul 20 06:28:57.937137 gen06 kernel: 000000004e309926: ffffffffc1f87b50 (_nv042304rm+0x40/0x40 [nvidia])
Jul 20 06:28:57.937162 gen06 kernel: 000000008a0d7867: ffffffffc1f8847b (_nv012731rm+0x2b/0xd0 [nvidia])
Jul 20 06:28:57.937184 gen06 kernel: 000000002a6767d3: 00000000000fb009 (0xfb009)
Jul 20 06:28:57.937210 gen06 kernel: 000000004a612717: ffff951bbce4ad80 (0xffff951bbce4ad80)
Jul 20 06:28:57.937243 gen06 kernel: 00000000023398e4: ffff951bbce4ae48 (0xffff951bbce4ae48)
Jul 20 06:28:57.937275 gen06 kernel: 00000000841692d0: ffffffffc1746de8 (_nv045226rm+0x38/0x1b0 [nvidia])
Jul 20 06:28:57.937297 gen06 kernel: 00000000714f08a3: ffff951781bf3f08 (0xffff951781bf3f08)
Jul 20 06:28:57.937322 gen06 kernel: 00000000192df73c: ffff951bbce4ae48 (0xffff951bbce4ae48)
Jul 20 06:28:57.937344 gen06 kernel: 000000009aed3eeb: 0000000000000055 (0x55)
Jul 20 06:28:57.937364 gen06 kernel: 0000000050fea753: 0000000000000079 (0x79)
Jul 20 06:28:57.937383 gen06 kernel: 000000006972563e: ffffffffc4a48800 (_nv000453rm+0xad4/0xfffffffffd7fd2d4 [nvidia])
Jul 20 06:28:57.937399 gen06 kernel: 000000007743ca87: ffffffffc1f8a507 (_nv045231rm+0x67/0x390 [nvidia])
Jul 20 06:28:57.937424 gen06 kernel: 000000007d804514: ffff951781bf3908 (0xffff951781bf3908)
Jul 20 06:28:57.937446 gen06 kernel: 00000000adbe8836: ffffffffc4a48280 (_nv000453rm+0x554/0xfffffffffd7fd2d4 [nvidia])
Jul 20 06:28:57.937463 gen06 kernel: 00000000348228e6: ffff951781bf3f08 (0xffff951781bf3f08)
Jul 20 06:28:57.937483 gen06 kernel: 000000009b6101e3: 0000000000000051 (0x51)
Jul 20 06:28:57.937508 gen06 kernel: 0000000081a0b24a: ffff951bbce4af70 (0xffff951bbce4af70)
Jul 20 06:28:57.937530 gen06 kernel: 000000002c5a2e6e: ffffffffc17486d4 (_nv043411rm+0x164/0x2c0 [nvidia])
Jul 20 06:28:57.937550 gen06 kernel: 000000001c42f7c0: 0000000000000079 (0x79)
Jul 20 06:28:57.937568 gen06 kernel: 00000000ad0b9864: ffffffffc4a48280 (_nv000453rm+0x554/0xfffffffffd7fd2d4 [nvidia])
Jul 20 06:28:57.937586 gen06 kernel: 00000000875d5fb4: 00000000c1d21ca4 (0xc1d21ca4)
Jul 20 06:28:57.937605 gen06 kernel: 00000000ee2d6c74: 000000005c000015 (0x5c000015)
Jul 20 06:28:57.937874 gen06 kernel: 0000000064e97f2d: ffff951781bf3908 (0xffff951781bf3908)
Jul 20 06:28:57.937910 gen06 kernel: 0000000089de9f93: ffffffffc1748a6c (_nv043412rm+0x5c/0x90 [nvidia])
Jul 20 06:28:57.937933 gen06 kernel: 0000000075e52520: 0000000000000000 ...
Jul 20 06:28:57.937955 gen06 kernel: 000000004b9b9f0e: ffff951bbce4af70 (0xffff951bbce4af70)
Jul 20 06:28:57.937975 gen06 kernel: 000000007eaf8a25: ffff951781bf3900 (0xffff951781bf3900)
Jul 20 06:28:57.937994 gen06 kernel: 0000000021ea00c8: 0000000000000000 ...
Jul 20 06:28:57.938014 gen06 kernel: 00000000e709a31d: ffff951781bf3900 (0xffff951781bf3900)
Jul 20 06:28:57.938029 gen06 kernel: 000000005c16f643: ffff951784454000 (0xffff951784454000)
Jul 20 06:28:57.938047 gen06 kernel: 00000000ec2915f6: 0000000000000030 (0x30)
Jul 20 06:28:57.938069 gen06 kernel: 00000000977d834c: ffffffffc175762b (_nv000573rm+0x6b/0x80 [nvidia])
Jul 20 06:28:57.938089 gen06 kernel: 0000000046fda1ed: 0000000000000000 ...
Jul 20 06:28:57.938112 gen06 kernel: 00000000c6106d47: ffff951bbce4af70 (0xffff951bbce4af70)
Jul 20 06:28:57.938129 gen06 kernel: 00000000cc893326: ffff9628c2856000 (0xffff9628c2856000)
Jul 20 06:28:57.938154 gen06 kernel: 00000000091c1909: ffffffffc2130e40 (_nv000716rm+0xa40/0xe70 [nvidia])
Jul 20 06:28:57.938176 gen06 kernel: 00000000a8ebdc7b: 0000000000000000 ...
Jul 20 06:28:57.938198 gen06 kernel: 000000008ea76801: ffff962a190f9000 (0xffff962a190f9000)
Jul 20 06:28:57.938215 gen06 kernel: 00000000283e3b54: 0000000000000000 ...
Jul 20 06:28:57.938236 gen06 kernel: 00000000b48f07fc: ffff951bbce48000 (0xffff951bbce48000)
Jul 20 06:28:57.938254 gen06 kernel: 00000000896cca62: ffff951784454000 (0xffff951784454000)
Jul 20 06:28:57.938274 gen06 kernel: 00000000dbb17c29: ffffaedc5e933db8 (0xffffaedc5e933db8)
Jul 20 06:28:57.938288 gen06 kernel: 0000000042532978: ffff9628c2856000 (0xffff9628c2856000)
Jul 20 06:28:57.938309 gen06 kernel: 0000000037189268: 000000000000002b (0x2b)
Jul 20 06:28:57.938326 gen06 kernel: 0000000061bf0c83: ffffffffc21379e8 (rm_ioctl+0x58/0xb0 [nvidia])
Jul 20 06:28:57.938348 gen06 kernel: 0000000031d19d7b: 000000305e933dd0 (0x305e933dd0)
Jul 20 06:28:57.938368 gen06 kernel: 000000006ffd73b6: ffff951781bf3900 (0xffff951781bf3900)
Jul 20 06:28:57.938386 gen06 kernel: 0000000085fb1cde: 00000000000fb009 (0xfb009)
Jul 20 06:28:57.938403 gen06 kernel: 000000002e77fb70: 000000017f106f64 (0x17f106f64)
Jul 20 06:28:57.938421 gen06 kernel: 00000000eba7e549: 0059848e98618a00 (0x59848e98618a00)
Jul 20 06:28:57.938438 gen06 kernel: 00000000d4d2802f: 0059848f86ccb200 (0x59848f86ccb200)
Jul 20 06:28:57.938454 gen06 kernel: 00000000d56ea050: 0059849594853600 (0x59849594853600)
Jul 20 06:28:57.938473 gen06 kernel: 00000000cafda73d: 0059848f0f971e00 (0x59848f0f971e00)
Jul 20 06:28:57.938490 gen06 kernel: 000000009d26a412: 0000000000000000 ...
Jul 20 06:28:57.938508 gen06 kernel: 000000002899a3ab: 00000120000000bb (0x120000000bb)
Jul 20 06:28:57.938525 gen06 kernel: 000000002bd5e154: 00000000000fb009 (0xfb009)
Jul 20 06:28:57.938541 gen06 kernel: 00000000eca369c2: ffffaedc4d573d58 (0xffffaedc4d573d58)
Jul 20 06:28:57.938558 gen06 kernel: 00000000f9164fc0: 0000000000000000 ...
Jul 20 06:28:57.938575 gen06 kernel: 00000000b3eecb46: 0000000000000001 (0x1)
Jul 20 06:28:57.938593 gen06 kernel: 0000000036c306f5: 0000000000000000 ...
Jul 20 06:28:57.938608 gen06 kernel: 00000000f1c55eef: fffffff000000000 (0xfffffff000000000)
Jul 20 06:28:57.938626 gen06 kernel: 000000000e77af2d: ffffffffc4a7c230 (_nv042300rm+0x90/0xfffffffffd7c8e60 [nvidia])
Jul 20 06:28:57.938647 gen06 kernel: 0000000026b8d20f: 0000000000000010 (0x10)
Jul 20 06:28:57.938665 gen06 kernel: 00000000100bcd89: 0000000000000000 ...
Jul 20 06:28:57.938687 gen06 kernel: 000000004eecdc3d: 0000000000000030 (0x30)
Jul 20 06:28:57.938709 gen06 kernel: 00000000a4f40558: ffff951781bf3900 (0xffff951781bf3900)
Jul 20 06:28:57.938735 gen06 kernel: 0000000086f1483a: ffff9628c2856000 (0xffff9628c2856000)
Jul 20 06:28:57.938755 gen06 kernel: 000000009b1e0e0c: ffff951784454000 (0xffff951784454000)
Jul 20 06:28:57.938773 gen06 kernel: 000000009f3837da: ffffffffc16abcbd (nvidia_ioctl+0x61d/0x840 [nvidia])
Jul 20 06:28:57.938791 gen06 kernel: 00000000f624c254: ffff951bbce48000 (0xffff951bbce48000)
Jul 20 06:28:57.938812 gen06 kernel: 00000000211b6b29: 00007ffee888dde0 (0x7ffee888dde0)
Jul 20 06:28:57.938834 gen06 kernel: 00000000c56d9221: 00007ffe0000002b (0x7ffe0000002b)
Jul 20 06:28:57.938871 gen06 kernel: 00000000be3afe33: 00000000c030462b (0xc030462b)
Jul 20 06:28:57.938891 gen06 kernel: 00000000b25db7dd: 00007ffee888dde0 (0x7ffee888dde0)
Jul 20 06:28:57.938917 gen06 kernel: 000000001cc90301: 45cb356bc3b91600 (0x45cb356bc3b91600)
Jul 20 06:28:57.938937 gen06 kernel: 0000000019c51c02: 0000000000000003 (0x3)
Jul 20 06:28:57.938957 gen06 kernel: 00000000bd6d04b5: ffff95244c2f8e00 (0xffff95244c2f8e00)
Jul 20 06:28:57.938972 gen06 kernel: 000000007776fc33: 00000000c030462b (0xc030462b)
Jul 20 06:28:57.938986 gen06 kernel: 0000000026867c2a: ffff9517792ba120 (0xffff9517792ba120)
Jul 20 06:28:57.939005 gen06 kernel: 000000007cfc0a6d: ffff95244c2f8e00 (0xffff95244c2f8e00)
Jul 20 06:28:57.939023 gen06 kernel: 000000003244ad44: ffffaedc5e933df0 (0xffffaedc5e933df0)
Jul 20 06:28:57.939042 gen06 kernel: 000000002701f3c6: ffffffffc16be638 (nvidia_frontend_unlocked_ioctl+0x58/0x90 [nvidia])
Jul 20 06:28:57.939070 gen06 kernel: 00000000486c2eef: ffff95244c2f8e01 (0xffff95244c2f8e01)
Jul 20 06:28:57.939097 gen06 kernel: 000000006f7c3d79: ffff95244c2f8e01 (0xffff95244c2f8e01)
Jul 20 06:28:57.939119 gen06 kernel: 00000000ae13ab61: 0000000000000028 (0x28)
Jul 20 06:28:57.939139 gen06 kernel: 000000004a226ae9: 00000000c030462b (0xc030462b)
Jul 20 06:28:57.939157 gen06 kernel: 000000002094b94f: 00007ffee888dde0 (0x7ffee888dde0)
Jul 20 06:28:57.939178 gen06 kernel: 000000009ffc8192: ffffaedc5e933e28 (0xffffaedc5e933e28)
Jul 20 06:28:57.939198 gen06 kernel: 00000000eb666bc0: ffffffff903b0eb5 (__x64_sys_ioctl+0x95/0xd0)
Jul 20 06:28:57.939212 gen06 kernel: 0000000015cc9552: 0000000000000000 ...
Jul 20 06:28:57.939233 gen06 kernel: 00000000bce05eee: ffffaedc5e933f58 (0xffffaedc5e933f58)
Jul 20 06:28:57.939248 gen06 kernel: 00000000efe602eb: 0000000000000000 ...
Jul 20 06:28:57.939269 gen06 kernel: 000000001663e6c1: ffffaedc5e933f48 (0xffffaedc5e933f48)
Jul 20 06:28:57.939287 gen06 kernel: 00000000f76598ac: ffffffff90dbaa9c (do_syscall_64+0x5c/0xc0)
Jul 20 06:28:57.939311 gen06 kernel: 00000000fa0900fd: 0000000000000000 ...
Jul 20 06:28:57.939330 gen06 kernel: 000000006e996232: ffffaedc5e933e58 (0xffffaedc5e933e58)
Jul 20 06:28:57.939347 gen06 kernel: 0000000059b3243b: ffffffff9016eca7 (exit_to_user_mode_prepare+0x37/0xb0)
Jul 20 06:28:57.939363 gen06 kernel: 0000000042120a75: ffffaedc5e933f58 (0xffffaedc5e933f58)
Jul 20 06:28:57.939380 gen06 kernel: 00000000dfb22934: ffffaedc5e933e70 (0xffffaedc5e933e70)
Jul 20 06:28:57.939399 gen06 kernel: 000000005a7f8205: ffffffff90dbef45 (syscall_exit_to_user_mode+0x35/0x50)
Jul 20 06:28:57.939419 gen06 kernel: 00000000e803556e: ffffaedc5e933f58 (0xffffaedc5e933f58)
Jul 20 06:28:57.939462 gen06 kernel: 00000000c33541b9: ffffaedc5e933f48 (0xffffaedc5e933f48)
Jul 20 06:28:57.939485 gen06 kernel: 00000000c18344c1: ffffffff90dbaaa9 (do_syscall_64+0x69/0xc0)
Jul 20 06:28:57.939507 gen06 kernel: 0000000062e04802: ffffaedc5e933e98 (0xffffaedc5e933e98)
Jul 20 06:28:57.939525 gen06 kernel: 000000003299c78c: ffffffff9016eca7 (exit_to_user_mode_prepare+0x37/0xb0)
Jul 20 06:28:57.939544 gen06 kernel: 00000000d6fc8b12: ffffaedc5e933f58 (0xffffaedc5e933f58)
Jul 20 06:28:57.939568 gen06 kernel: 0000000045cbd1ce: ffffaedc5e933eb0 (0xffffaedc5e933eb0)
Jul 20 06:28:57.939584 gen06 kernel: 000000008b268259: ffffffff90dbef45 (syscall_exit_to_user_mode+0x35/0x50)
Jul 20 06:28:57.939605 gen06 kernel: 000000004a6c86de: 0000000000000000 ...
Jul 20 06:28:57.939622 gen06 kernel: 000000000c44919e: ffffaedc5e933f48 (0xffffaedc5e933f48)
Jul 20 06:28:57.939661 gen06 kernel: 0000000038bff850: ffffffff90dbaaa9 (do_syscall_64+0x69/0xc0)
Jul 20 06:28:57.939682 gen06 kernel: 0000000051b199c2: ffffffff90dbaaa9 (do_syscall_64+0x69/0xc0)
Jul 20 06:28:57.939703 gen06 kernel: 00000000274adfeb: ffffaedc5e933f48 (0xffffaedc5e933f48)
Jul 20 06:28:57.939718 gen06 kernel: 00000000f225456b: ffffffff90dbaaa9 (do_syscall_64+0x69/0xc0)
Jul 20 06:28:57.939742 gen06 kernel: 00000000712c6f38: ffffaedc5e933f58 (0xffffaedc5e933f58)
Jul 20 06:28:57.939768 gen06 kernel: 000000005786f2b4: ffffaedc5e933ef8 (0xffffaedc5e933ef8)
Jul 20 06:28:57.939791 gen06 kernel: 00000000d259b57c: ffffffff90dbef45 (syscall_exit_to_user_mode+0x35/0x50)
Jul 20 06:28:57.939811 gen06 kernel: 00000000d6ba4e23: 0000000000000000 ...
Jul 20 06:28:57.939827 gen06 kernel: 000000005903be38: ffffaedc5e933f48 (0xffffaedc5e933f48)
Jul 20 06:28:57.939841 gen06 kernel: 00000000ac964425: ffffffff90dbaaa9 (do_syscall_64+0x69/0xc0)
Jul 20 06:28:57.939860 gen06 kernel: 00000000495aeab8: ffffffff90dbef45 (syscall_exit_to_user_mode+0x35/0x50)
Jul 20 06:28:57.939875 gen06 kernel: 000000007803f839: 0000000000000000 ...
Jul 20 06:28:57.939891 gen06 kernel: 0000000050a833fd: ffffaedc5e933f48 (0xffffaedc5e933f48)
Jul 20 06:28:57.939908 gen06 kernel: 000000007441a634: ffffffff90dbaaa9 (do_syscall_64+0x69/0xc0)
Jul 20 06:28:57.939919 gen06 kernel: 00000000bc262b6b: 0000000000000000 ...
Jul 20 06:28:57.939934 gen06 kernel: 00000000f616fed9: ffffffff90e000da (entry_SYSCALL_64_after_hwframe+0x62/0xcc)
Jul 20 06:28:57.939947 gen06 kernel: 0000000010b8f5fa: 00007ffee888dd60 (0x7ffee888dd60)
Jul 20 06:28:57.940084 gen06 kernel: 000000000d57bbeb: 00007ffee888de08 (0x7ffee888de08)
Jul 20 06:28:57.940105 gen06 kernel: 000000008c9f92c6: 0000000000000028 (0x28)
Jul 20 06:28:57.940122 gen06 kernel: 000000007f2e69f2: 00000000c030462b (0xc030462b)
Jul 20 06:28:57.940136 gen06 kernel: 000000009e02a270: 00007ffee888dda0 (0x7ffee888dda0)
Jul 20 06:28:57.940151 gen06 kernel: 0000000014a3d18e: 00007ffee888dde0 (0x7ffee888dde0)
Jul 20 06:28:57.940165 gen06 kernel: 00000000252e7683: 0000000000000246 (0x246)
Jul 20 06:28:57.940178 gen06 kernel: 000000006ffe0db2: 00007f24bb352c00 (0x7f24bb352c00)
Jul 20 06:28:57.940192 gen06 kernel: 000000008c6f5b1a: 00007ffee888de08 (0x7ffee888de08)
Jul 20 06:28:57.940207 gen06 kernel: 00000000ed46db08: 00007ffee888dde0 (0x7ffee888dde0)
Jul 20 06:28:57.940227 gen06 kernel: 00000000d8697319: ffffffffffffffda (0xffffffffffffffda)
Jul 20 06:28:57.940243 gen06 kernel: 00000000405b4639: 00007f255ab4cc5b (0x7f255ab4cc5b)
Jul 20 06:28:57.940260 gen06 kernel: 00000000cf0dbf4b: 00007ffee888dde0 (0x7ffee888dde0)
Jul 20 06:28:57.940276 gen06 kernel: 0000000078a40210: 00000000c030462b (0xc030462b)
Jul 20 06:28:57.940294 gen06 kernel: 000000001ee37f20: 0000000000000028 (0x28)
Jul 20 06:28:57.940508 gen06 kernel: 0000000077d12691: 0000000000000010 (0x10)
Jul 20 06:28:57.940693 gen06 kernel: 00000000f041f826: 00007f255ab4cc5b (0x7f255ab4cc5b)
Jul 20 06:28:57.940732 gen06 kernel: 00000000168e9052: 0000000000000033 (0x33)
Jul 20 06:28:57.940773 gen06 kernel: 00000000a8052c18: 0000000000000246 (0x246)
Jul 20 06:28:57.940797 gen06 kernel: 00000000265c3b22: 00007ffee888dcf0 (0x7ffee888dcf0)
Jul 20 06:28:57.940816 gen06 kernel: 00000000589fc5e1: 000000000000002b (0x2b)
Jul 20 06:28:57.940834 gen06 kernel:  ? _nv042304rm+0x40/0x40 [nvidia]
Jul 20 06:28:57.940861 gen06 kernel:  ? _nv012731rm+0x2b/0xd0 [nvidia]
Jul 20 06:28:57.940886 gen06 kernel:  ? _nv045226rm+0x38/0x1b0 [nvidia]
Jul 20 06:28:57.940919 gen06 kernel:  ? _nv045231rm+0x67/0x390 [nvidia]
Jul 20 06:28:57.940962 gen06 kernel:  ? _nv043411rm+0x164/0x2c0 [nvidia]
Jul 20 06:28:57.941224 gen06 kernel:  ? _nv043412rm+0x5c/0x90 [nvidia]
Jul 20 06:28:57.941251 gen06 kernel:  ? _nv000573rm+0x6b/0x80 [nvidia]
Jul 20 06:28:57.941277 gen06 kernel:  ? _nv000716rm+0xa40/0xe70 [nvidia]
Jul 20 06:28:57.941299 gen06 kernel:  ? rm_ioctl+0x58/0xb0 [nvidia]
Jul 20 06:28:57.941319 gen06 kernel:  ? nvidia_ioctl+0x61d/0x840 [nvidia]
Jul 20 06:28:57.941339 gen06 kernel:  ? nvidia_frontend_unlocked_ioctl+0x58/0x90 [nvidia]
Jul 20 06:28:57.941359 gen06 kernel:  ? __x64_sys_ioctl+0x95/0xd0
Jul 20 06:28:57.941381 gen06 kernel:  ? do_syscall_64+0x5c/0xc0
Jul 20 06:28:57.941415 gen06 kernel:  ? exit_to_user_mode_prepare+0x37/0xb0
Jul 20 06:28:57.941453 gen06 kernel:  ? syscall_exit_to_user_mode+0x35/0x50
Jul 20 06:28:57.941487 gen06 kernel:  ? do_syscall_64+0x69/0xc0
Jul 20 06:28:57.941732 gen06 kernel:  ? exit_to_user_mode_prepare+0x37/0xb0
Jul 20 06:28:57.941781 gen06 kernel:  ? syscall_exit_to_user_mode+0x35/0x50
Jul 20 06:28:57.941817 gen06 kernel:  ? do_syscall_64+0x69/0xc0
Jul 20 06:28:57.941848 gen06 kernel:  ? do_syscall_64+0x69/0xc0
Jul 20 06:28:57.941880 gen06 kernel:  ? do_syscall_64+0x69/0xc0
Jul 20 06:28:57.941915 gen06 kernel:  ? syscall_exit_to_user_mode+0x35/0x50
Jul 20 06:28:57.941947 gen06 kernel:  ? do_syscall_64+0x69/0xc0
Jul 20 06:28:57.941983 gen06 kernel:  ? syscall_exit_to_user_mode+0x35/0x50
Jul 20 06:28:57.942014 gen06 kernel:  ? do_syscall_64+0x69/0xc0
Jul 20 06:28:57.942045 gen06 kernel:  ? entry_SYSCALL_64_after_hwframe+0x62/0xcc
Jul 20 06:28:57.942088 gen06 kernel:  </TASK>
Jul 20 06:28:57.942139 gen06 kernel: Modules linked in: xt_nat xt_tcpudp cpuid tls veth xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc rpcsec_gss_krb5 nfsv4 overlay nfs fscache netfs intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm rapl binfmt_misc ipmi_ssif nls_iso8859_1 joydev input_leds ccp k10temp ptdma acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid nvidia_uvm(POE) sch_fq_codel dm_multipath nfsd scsi_dh_rdac scsi_dh_emc scsi_dh_alua auth_rpcgss nfs_acl lockd grace pstore_blk ramoops msr reed_solomon pstore_zone efi_pstore sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) hid_generic rndis_host cdc_ether usbhid usbnet hid mii
Jul 20 06:28:57.942258 gen06 kernel:  ast drm_vram_helper ses drm_ttm_helper ttm enclosure crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd drm_kms_helper syscopyarea sysfillrect sysimgblt megaraid_sas fb_sys_fops igb cec ahci mpt3sas dca xhci_pci rc_core libahci i2c_algo_bit raid_class xhci_pci_renesas i2c_piix4 drm scsi_transport_sas
Jul 20 06:28:57.942295 gen06 kernel: ---[ end trace 9e5323febb98cfd0 ]---
Jul 20 06:28:58.084455 gen06 kernel: RIP: 0010:__kmalloc+0x111/0x330
Jul 20 06:28:58.084623 gen06 kernel: Code: 8b 50 08 49 8b 00 49 83 78 10 00 48 89 45 c8 0f 84 c5 01 00 00 48 85 c0 0f 84 bc 01 00 00 41 8b 4c 24 28 49 8b 3c 24 48 01 c1 <48> 8b 19 48 89 ce 49 33 9c 24 b8 00 00 00 48 8d 4a 01 48 0f ce 48
Jul 20 06:28:58.084641 gen06 kernel: RSP: 0018:ffffaedc5e933a80 EFLAGS: 00010202
Jul 20 06:28:58.084666 gen06 kernel: RAX: 65b00c3dac7abb8b RBX: 0000000000006cc0 RCX: 65b00c3dac7abbab
Jul 20 06:28:58.088404 gen06 kernel: RDX: 0000000003de5e18 RSI: 0000000000006cc0 RDI: 00000000000360a0
Jul 20 06:28:58.089237 gen06 kernel: RBP: ffffaedc5e933ac0 R08: ffff95944ecf60a0 R09: ffff951781bf3f08
Jul 20 06:28:58.089770 gen06 kernel: R10: 0000000000000246 R11: 00000000ffffffff R12: ffff951700044500
Jul 20 06:28:58.090035 gen06 kernel: R13: ffffffffc16b8aee R14: 0000000000006cc0 R15: 0000000000000000
Jul 20 06:28:58.090257 gen06 kernel: FS:  00007f255946f580(0000) GS:ffff95944ecc0000(0000) knlGS:0000000000000000
Jul 20 06:28:58.090548 gen06 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 20 06:28:58.090786 gen06 kernel: CR2: 00007f6f116ae000 CR3: 0000000592bea000 CR4: 0000000000350ee0
Jul 20 06:28:58.294644 gen06 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000038
Jul 20 06:28:58.314229 gen06 kernel: #PF: supervisor read access in kernel mode
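
For anyone comparing their own log against this one, here is a minimal Python sketch that pulls out the headline of each fault, the process named on the "Comm:" line, and how many symbolized frames belong to each module. It is only a triage aid under assumptions: the regular expressions are guesses fitted to the `journalctl -o short-precise -k` lines pasted above, the function name `summarize_kernel_log` and the "(core kernel)" label are made up for illustration, and the counts do not by themselves say which component is at fault.

```python
import re
from collections import Counter

# First line of common kernel fault reports, as they appear after "kernel: ".
FAULT_RE = re.compile(r"kernel: (general protection fault.*|BUG: .*|Oops: .*)")
# The "CPU: ... PID: ... Comm: ..." line names the task that was running.
COMM_RE = re.compile(r"PID: (\d+) Comm: (\S+)")
# Symbolized frames such as "os_alloc_mem+0xce/0xe0 [nvidia]";
# the trailing "[module]" is absent for symbols in the core kernel.
FRAME_RE = re.compile(r"(\w[\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?")


def summarize_kernel_log(lines):
    """Collect fault headlines, faulting tasks, and per-module frame counts."""
    faults = []
    processes = []
    frames_by_module = Counter()
    for line in lines:
        fault = FAULT_RE.search(line)
        if fault:
            faults.append(fault.group(1))
        comm = COMM_RE.search(line)
        if comm:
            processes.append((int(comm.group(1)), comm.group(2)))
        for _symbol, module in FRAME_RE.findall(line):
            # Hypothetical label for frames that carry no module tag.
            frames_by_module[module or "(core kernel)"] += 1
    return {
        "faults": faults,
        "processes": processes,
        "frames_by_module": frames_by_module,
    }
```

To try it on the file saved by the command above, pass the open file object, for example `summarize_kernel_log(open("jctl_20240720.log"))`, and print the returned dictionary.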

@multiwatts May I ask whether you found a solution to your problem? I have run into a similar one.

I still have no idea.

Which OS and GPU did you use? I have a similar problem; my driver version is 535.161.08.

Hi, have you made any progress? I also encountered a similar problem with an RTX 4090 on driver 525.89.02.

Hi, have you solved it? Could you also tell me what you did to trigger this bug? I can’t reproduce it now. I’d appreciate it.