RTX 2080 Ti: "soft lockup - CPU stuck" on CentOS 8.5

Hi,

The shell constantly prints "kernel:watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [irq/150-nvidia:2122]". This message started appearing only recently, although I have been using the 2080 Ti for more than a year.

Please let me know how to fix this issue. Thanks. I have attached part of the kernel log below:

[92524.797307] Hardware name: Gigabyte Technology Co., Ltd. Z490 GAMING X/Z490 GAMING X, BIOS F5 08/28/2020
[92524.797470] RIP: 0010:_nv018457rm+0x24f/0x270 [nvidia]
[92524.797471] Code: 00 00 4c 89 f6 48 89 df 49 8b 86 08 05 00 00 e8 37 c6 0e e2 be 00 00 81 02 bf 95 df 5e 0e 31 c0 e8 46 04 c9 ff e8 a1 1d 3b 00 fe 48 8b 04 25 a8 01 00 00 0f 0b be 00 00 77 02 bf 95 df 5e 0e
[92524.797471] RSP: 0018:ffffad8508277d88 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
[92524.797472] RAX: 0000000000000000 RBX: ffff9b0e42f58008 RCX: 0000000000000020
[92524.797472] RDX: 0000000000000001 RSI: ffff9b0c06ca5d1c RDI: 0000000000000001
[92524.797473] RBP: ffff9b0c06ca5d28 R08: 0000000000000020 R09: ffff9b0c06ca5d10
[92524.797473] R10: ffff9b0e42f58008 R11: ffff9b0e42f59098 R12: ffff9b0c06c8c008
[92524.797473] R13: 0000000000000010 R14: ffff9b0c06f1e008 R15: 000000000001ffdf
[92524.797474] FS: 0000000000000000(0000) GS:ffff9b0ebd840000(0000) knlGS:0000000000000000
[92524.797474] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[92524.797475] CR2: 000056489395c9f8 CR3: 0000000c34e0a001 CR4: 00000000007606e0
[92524.797475] PKRU: 55555554
[92524.797475] Call Trace:
[92524.797614] ? _nv030099rm+0x14c/0x190 [nvidia]
[92524.797741] ? _nv028705rm+0x9f9/0xdc0 [nvidia]
[92524.797868] ? _nv028713rm+0x15d/0x400 [nvidia]
[92524.797952] ? _nv000709rm+0xa9/0x240 [nvidia]
[92524.797955] ? irq_finalize_oneshot.part.46+0xf0/0xf0
[92524.798039] ? rm_isr_bh+0x1c/0x60 [nvidia]
[92524.798088] ? nvidia_isr_kthread_bh+0x1b/0x40 [nvidia]
[92524.798089] ? irq_thread_fn+0x1f/0x50
[92524.798090] ? irq_thread+0xe7/0x170
[92524.798091] ? irq_forced_thread_fn+0x70/0x70
[92524.798092] ? irq_thread_check_affinity+0xe0/0xe0
[92524.798093] ? kthread+0x112/0x130
[92524.798093] ? kthread_flush_work_fn+0x10/0x10
[92524.798095] ? ret_from_fork+0x1f/0x40



@icefrog1950
Could you please try once with the latest release driver and share a bug report if the issue still persists?
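
If it helps when putting the report together, here is a minimal sketch (not an official NVIDIA tool) that counts how often each CPU/task pair shows up in soft-lockup messages. It assumes the kernel log has already been saved as plain text, for example from `dmesg`, and that the lines follow the same format as the message quoted above. The function name `summarize_soft_lockups` is made up for illustration.

```python
import re
from collections import Counter

# Matches messages in the format quoted in this thread, e.g.:
#   kernel:watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [irq/150-nvidia:2122]
# The task name is matched greedily so names that contain ':' still parse.
SOFT_LOCKUP = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>.+):(?P<pid>\d+)\]"
)


def summarize_soft_lockups(log_text):
    """Count soft-lockup reports per (cpu, task) pair in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SOFT_LOCKUP.search(line)
        if match:
            counts[(int(match.group("cpu")), match.group("task"))] += 1
    return counts


if __name__ == "__main__":
    # Sample line copied from the original post; replace with real log text.
    sample = (
        "kernel:watchdog: BUG: soft lockup - CPU#9 stuck for 22s! "
        "[irq/150-nvidia:2122]"
    )
    for (cpu, task), n in summarize_soft_lockups(sample).items():
        print(f"CPU {cpu}, task {task}: {n} report(s)")
```

If the same task (here the `irq/150-nvidia` interrupt thread) dominates the counts, that is worth mentioning alongside the bug report.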