Soft lockup (CPU stuck) when trying to switch to VT consoles with NVIDIA driver installed

With the NVIDIA driver installed I cannot switch to a VT console, whichever VT number I pick. The screens just go blank.

I can switch back to the graphical VT after a period. The desktop (Gnome in this case) is usable, but there is screen corruption.

I notice the following in the log, followed by a call trace:

NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [Xorg:1296]

I believe that the same issue occurs when trying to logout/shutdown the system. Switching to a VT is just an easier way of recreating the problem.
nvidia-bug-report.log.gz (87.6 KB)

Just to confirm that this is occurring at login, logout, shutdown and VT switch.

Apart from having no VT consoles, the system is usable and has been stable so far. Login, logout and shutdown just take 22s longer than they should.

[   43.638689] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [Xorg:1296]
[   43.638704] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge ebtable_nat ip6table_security ip6table_raw ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 iptable_security iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel nvidia_drm(POE) nvidia_modeset(POE) kvm snd_hda_codec_realtek snd_hda_codec_hdmi mei_wdt snd_hda_codec_generic iTCO_wdt iTCO_vendor_support ppdev irqbypass crct10dif_pclmul crc32_pclmul nvidia(POE) snd_hda_intel snd_hda_codec snd_hda_core ghash_clmulni_intel intel_cstate
[   43.638714]  intel_uncore snd_hwdep snd_seq intel_rapl_perf snd_seq_device snd_pcm snd_timer mei_me snd joydev i2c_i801 mei soundcore shpchp lpc_ich parport_pc parport tpm_infineon tpm_tis soc_button_array tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc hid_logitech_hidpp i915 i2c_algo_bit drm_kms_helper crc32c_intel 8021q drm garp stp serio_raw llc mrp r8169 mii video fjes hid_logitech_dj
[   43.638716] CPU: 0 PID: 1296 Comm: Xorg Tainted: P           OE   4.7.9-200.fc24.x86_64 #1
[   43.638716] Hardware name: Gigabyte Technology Co., Ltd. H87M-D3H/H87M-D3H, BIOS F6 08/03/2013
[   43.638717] task: ffff8800b2fe1e80 ti: ffff8800b8064000 task.ti: ffff8800b8064000
[   43.638843] RIP: 0010:[<ffffffffc07f70dc>]  [<ffffffffc07f70dc>] os_io_read_dword+0xc/0x10 [nvidia]
[   43.638844] RSP: 0018:ffff8800b8067a58  EFLAGS: 00000202
[   43.638844] RAX: 00000000eccfa0c0 RBX: ffff88022131de78 RCX: 0000000000000001
[   43.638845] RDX: 000000000000e00c RSI: 00000000000a0000 RDI: 000000000000e00c
[   43.638845] RBP: ffff8800b8067a58 R08: 00000000000c4817 R09: 00000000000c4817
[   43.638845] R10: 0000000000000000 R11: ffffffffc0d74b90 R12: 000000000000c000
[   43.638846] R13: 0000000000001a5e R14: ffff88022131de7c R15: ffff88022131de80
[   43.638846] FS:  00007fbf814cea40(0000) GS:ffff88023e200000(0000) knlGS:0000000000000000
[   43.638847] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   43.638847] CR2: 0000000000feda04 CR3: 00000000b2e48000 CR4: 00000000001406f0
[   43.638848] Stack:
[   43.638849]  ffff88022131de40 ffffffffc0d7e4da 000000000000c000 ffffffffc0d752a5
[   43.638849]  ffff88022131de78 ffffffffc0d75084 ffff8800b3c98008 ffff8800b3c98008
[   43.638850]  0000000000004f02 ffff88022131de78 0000000000000118 ffffffffc0d5e29d
[   43.638850] Call Trace:
[   43.638913]  [<ffffffffc0d7e4da>] _nv021643rm+0x84da/0xbd60 [nvidia]
[   43.638975]  [<ffffffffc0d752a5>] ? _nv000923rm+0x85/0xb0 [nvidia]
[   43.639037]  [<ffffffffc0d75084>] ? _nv016020rm+0x164/0x220 [nvidia]
[   43.639099]  [<ffffffffc0d5e29d>] ? _nv016724rm+0x4d/0x140 [nvidia]
[   43.639161]  [<ffffffffc0d62784>] ? _nv000863rm+0x294/0x390 [nvidia]
[   43.639223]  [<ffffffffc0d62aa0>] ? _nv000782rm+0x220/0x3c0 [nvidia]
[   43.639285]  [<ffffffffc0d5f23d>] ? _nv015419rm+0x3d/0x60 [nvidia]
[   43.639347]  [<ffffffffc0d0a5c3>] ? _nv018828rm+0xc3/0x390 [nvidia]
[   43.639409]  [<ffffffffc0d092e3>] ? _nv018834rm+0x53/0xa0 [nvidia]
[   43.639471]  [<ffffffffc0d08ffa>] ? _nv018867rm+0x62a/0x650 [nvidia]
[   43.639532]  [<ffffffffc0d09032>] ? _nv000794rm+0x12/0x20 [nvidia]
[   43.639594]  [<ffffffffc0cee464>] ? _nv003179rm+0x1e54/0x3280 [nvidia]
[   43.639656]  [<ffffffffc0d6d00e>] ? rm_kernel_rmapi_op+0x10e/0x1f0 [nvidia]
[   43.639669]  [<ffffffffc1493769>] ? nvkms_call_rm+0x59/0x70 [nvidia_modeset]
[   43.639678]  [<ffffffffc14f8e9d>] ? _nv001958kms+0x6d/0xa0 [nvidia_modeset]
[   43.639685]  [<ffffffffc14c7832>] ? _nv001983kms+0x32/0x40 [nvidia_modeset]
[   43.639689]  [<ffffffffc14a83b9>] ? _nv001877kms+0x209/0x230 [nvidia_modeset]
[   43.639692]  [<ffffffff8b220cc5>] ? __kmalloc+0x1a5/0x220
[   43.639696]  [<ffffffffc1495e80>] ? _nv000311kms+0xd0/0xd0 [nvidia_modeset]
[   43.639699]  [<ffffffffc1495ef7>] ? _nv000314kms+0x77/0x130 [nvidia_modeset]
[   43.639703]  [<ffffffffc1493391>] ? nvkms_alloc+0x41/0x60 [nvidia_modeset]
[   43.639707]  [<ffffffffc1495e80>] ? _nv000311kms+0xd0/0xd0 [nvidia_modeset]
[   43.639711]  [<ffffffffc14950c1>] ? nvKmsIoctl+0x161/0x1e0 [nvidia_modeset]
[   43.639714]  [<ffffffffc1493d65>] ? nvkms_ioctl_common+0x45/0x80 [nvidia_modeset]
[   43.639718]  [<ffffffffc1493e11>] ? nvkms_ioctl+0x71/0xa0 [nvidia_modeset]
[   43.639749]  [<ffffffffc07eb080>] ? nvidia_frontend_compat_ioctl+0x40/0x50 [nvidia]
[   43.639780]  [<ffffffffc07eb09e>] ? nvidia_frontend_unlocked_ioctl+0xe/0x10 [nvidia]
[   43.639782]  [<ffffffff8b25b692>] ? do_vfs_ioctl+0xa2/0x5d0
[   43.639783]  [<ffffffff8b25bc39>] ? SyS_ioctl+0x79/0x90
[   43.639786]  [<ffffffff8b7ec5b2>] ? entry_SYSCALL_64_fastpath+0x1a/0xa4
[   43.639794] Code: 1f 44 00 00 55 89 fa 48 89 e5 ec 5d c3 66 90 0f 1f 44 00 00 55 89 fa 48 89 e5 66 ed 5d c3 90 0f 1f 44 00 00 55 89 fa 48 89 e5 ed <5d> c3 66 90 0f 1f 44 00 00 55 48 85 ff 48 89 e5 75 10 85 d2 48
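
For anyone checking whether they are hitting the same thing, here is a minimal sketch that pulls the soft lockup lines out of a kernel log and prints the CPU, duration and task. The function name soft_lockups is made up for illustration, and it assumes the message format shown in the log above.

```shell
# Hypothetical helper (not part of any package): read kernel log text on
# stdin and print one summary line per "soft lockup" report.
# Assumes the message format seen above:
#   BUG: soft lockup - CPU#0 stuck for 22s! [Xorg:1296]
soft_lockups() {
    sed -nE 's/.*soft lockup - CPU#([0-9]+) stuck for ([0-9]+)s! \[([^]]+):([0-9]+)\].*/cpu=\1 secs=\2 comm=\3 pid=\4/p'
}
```

Typical use would be something like "dmesg | soft_lockups" or "journalctl -k | soft_lockups". Lines that do not match the pattern are silently skipped.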

Same issue with 370.28 driver. First report was with 367.57.
nvidia-bug-report.log.gz (86.5 KB)

Please try 375.10. It has an improved VT switching console restore code path that should apply in your configuration.

@mjtbrady

These commands will update you to 375.10 (the glvnd package fixes some EGL issues).

sudo dnf update https://kojipkgs.fedoraproject.org//work/tasks/631/16200631/libglvnd-0.2.999-6.git28867bb.fc24.x86_64.rpm
sudo dnf --enablerepo=rpmfusion-nonfree-rawhide update \*nvidia\*
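
After the update and a reboot, it may be worth confirming that the loaded kernel module really is 375.10 or newer. A rough sketch, assuming GNU sort (for -V) and that the module is named nvidia as in the log above; version_at_least is a made-up helper name.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Relies on GNU sort's -V (version sort): the smaller of the two versions
# sorts first, so $1 >= $2 exactly when the first sorted line equals $2.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}
```

For example: version_at_least "$(modinfo -F version nvidia)" 375.10 && echo "driver is new enough"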

@aplattner

Upgrading to 375.10 has resolved all my issues with login/logout/shutdown and switching to a VT.

@leigh123linux

Thanks for the hint. Saved me time messing around. I was already running that version of libglvnd.