[BUG] Nvidia 440.64 + kernel 5.5.6/stable -- boot trace; WAS Nvidia 440.59 + kernel 5.5.1/stable -- boot trace

running a local build of 440.59, UNPATCHED

inxi -Gx
	Graphics:  Card: NVIDIA GP108 [GeForce GT 1030] bus-ID: 09:00.0
	           Display Server: x11 (X.Org 1.20.3 ) drivers: nvidia (unloaded: modesetting,fbdev,vesa)
	           Resolution: 2560x1440@59.95hz
	           OpenGL: renderer: GeForce GT 1030/PCIe/SSE2 version: 4.5.0 NVIDIA 440.59 Direct Render: Yes

on stable-release Kernel,

uname -rm
		5.5.1-27.g267811a-default x86_64

Boot and normal operation are OK.

There IS one nvidia trace in the boot log, apparently NON-fatal:

...
[   30.436377] ------------[ cut here ]------------
[   30.436377] refcount_t: underflow; use-after-free.
[   30.436385] WARNING: CPU: 10 PID: 3179 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
[   30.436386] Modules linked in: nvidia_uvm(OE) cachefiles fscache iscsi_ibft iscsi_boot_sysfs vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) dmi_sysfs bluetooth ecdh_generic ecc squashfs loop sch_fq_codel edac_mce_amd kvm_amd kvm irqbypass i2c_dev uvcvideo nvidia_drm(POE) nvidia_modeset(POE) videobuf2_vmalloc crct10dif_pclmul videobuf2_memops crc32_pclmul videobuf2_v4l2 nvidia(POE) videobuf2_common videodev ghash_clmulni_intel aesni_intel eeepc_wmi asus_wmi battery crypto_simd drm_kms_helper cryptd sparse_keymap drm sp5100_tco rfkill glue_helper pcspkr video wmi_bmof mxm_wmi i2c_piix4 ipmi_devintf ipmi_msghandler fb_sys_fops syscopyarea sysfillrect snd_hda_codec_realtek ccp sysimgblt button acpi_cpufreq hid_generic usbhid raid10 md_mod tcp_bbr igb r8169 realtek xhci_pci i2c_algo_bit mpt3sas sr_mod raid_class crc32c_intel dca libphy pinctrl_amd scsi_transport_sas cdrom xhci_hcd wmi snd_usb_audio snd_usbmidi_lib mc snd_rawmidi snd_seq_device usbcore snd_hda_intel snd_intel_dspcfg snd_hda_codec_via
[   30.436408]  snd_hda_codec_hdmi snd_hda_codec_generic ledtrig_audio snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore sg nct6775 hwmon_vid k10temp msr sunrpc nbd dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua efivarfs
[   30.436414] CPU: 10 PID: 3179 Comm: X Tainted: P           OE     5.5.1-27.g267811a-default #1
[   30.436415] Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 1405 11/19/2019
[   30.436416] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
[   30.436417] Code: 05 42 86 05 01 01 e8 cb 13 bb ff 0f 0b c3 80 3d 30 86 05 01 00 75 95 48 c7 c7 40 cd 92 b4 c6 05 20 86 05 01 01 e8 ac 13 bb ff <0f> 0b c3 80 3d 0f 86 05 01 00 0f 85 72 ff ff ff 48 c7 c7 98 cd 92
[   30.436417] RSP: 0018:ffffad94c295fd88 EFLAGS: 00010282
[   30.436418] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000007
[   30.436418] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff9029dcc9bdd0
[   30.436418] RBP: ffff9029b3009ce8 R08: 00000000000006d2 R09: 0000000000000003
[   30.436419] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9029cb2ddae8
[   30.436419] R13: ffff9029cb2dd800 R14: 0000000000000004 R15: 0000000000000000
[   30.436420] FS:  00007f7d6abc8ec0(0000) GS:ffff9029dcc80000(0000) knlGS:0000000000000000
[   30.436420] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   30.436420] CR2: 00007f7d6a7f80b0 CR3: 0000000f97220000 CR4: 0000000000340ee0
[   30.436421] Call Trace:
[   30.436425]  nv_drm_atomic_helper_disable_all+0xec/0x290 [nvidia_drm]
[   30.436427]  nv_drm_master_drop+0x22/0x60 [nvidia_drm]
[   30.436435]  drm_drop_master+0x1e/0x30 [drm]
[   30.436440]  drm_master_release+0x9f/0xb0 [drm]
[   30.436446]  drm_file_free.part.0+0x1fe/0x260 [drm]
[   30.436451]  drm_release+0x9a/0xd0 [drm]
[   30.436453]  __fput+0xc1/0x250
[   30.436454]  task_work_run+0xa1/0xc0
[   30.436456]  exit_to_usermode_loop+0x10c/0x130
[   30.436457]  do_syscall_64+0x1fa/0x240
[   30.436459]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   30.436461] RIP: 0033:0x7f7d68838f24
[   30.436461] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 ca c4 20 00 48 63 ff 85 c0 75 13 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 44 f3 c3 66 90 48 83 ec 18 48 89 7c 24 08 e8
[   30.436461] RSP: 002b:00007ffe12f28f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[   30.436462] RAX: 0000000000000000 RBX: 000000000000000a RCX: 00007f7d68838f24
[   30.436462] RDX: 0000564437d10020 RSI: 0000000000000002 RDI: 000000000000000a
[   30.436463] RBP: 0000564437d55f20 R08: 0000564437d5f830 R09: 00007f7d685a8470
[   30.436463] R10: 0000000000000079 R11: 0000000000000246 R12: 0000564437d56090
[   30.436463] R13: 0000564437d55f60 R14: 0000000000000000 R15: 0000000000000000
[   30.436464] ---[ end trace 3ba8bcec794abbb6 ]---
...
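
For anyone comparing reports, a warning block like the one above can be pulled out of a saved dmesg dump mechanically. This is only a minimal sketch in Python, assuming the log was saved as plain text; `extract_warnings` is a hypothetical helper name, not part of any NVIDIA or kernel tooling.

```python
import re

CUT_HERE = "------------[ cut here ]------------"
END_TRACE = re.compile(r"---\[ end trace [0-9a-f]+ \]---")
# Matches e.g. "CPU: 10 PID: 3179 Comm: X Tainted: P    OE    5.5.1-27.g267811a-default #1"
CPU_LINE = re.compile(r"CPU: \d+ PID: (\d+) Comm: (\S+) Tainted: .*?\s(\S+) #\d+")


def extract_warnings(dmesg_text):
    """Return one dict per 'cut here' ... 'end trace' block found in dmesg_text."""
    blocks, current = [], None
    for line in dmesg_text.splitlines():
        if CUT_HERE in line:
            current = []  # start collecting a new block
        if current is not None:
            current.append(line)
            if END_TRACE.search(line):
                blocks.append(current)
                current = None
    results = []
    for block in blocks:
        info = {"lines": block, "pid": None, "comm": None, "kernel": None}
        for line in block:
            m = CPU_LINE.search(line)
            if m:
                info["pid"] = int(m.group(1))
                info["comm"] = m.group(2)
                info["kernel"] = m.group(3)
                break
        results.append(info)
    return results
```

Feeding it the text of `dmesg` (for example the contents of a saved file) gives the process name, PID and kernel release for each warning, which makes it easier to line up reports from different machines.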

I’m seeing the same symptoms with 440.59 as currently packaged in Arch Linux. The error is added to dmesg when the X server is first started. If I quit the X server and restart it without rebooting, another such error is not added.

$ inxi -Gx
Graphics:  Device-1: Intel Xeon E3-1200 v2/3rd Gen Core processor Graphics 
           vendor: ASUSTeK P8 series driver: i915 v: kernel bus ID: 00:02.0 
           Device-2: NVIDIA GM204 [GeForce GTX 970] vendor: Micro-Star MSI driver: nvidia 
           v: 440.59 bus ID: 01:00.0 
           Display: server: X.Org 1.20.7 driver: nvidia resolution: 1920x1080~60Hz 
           OpenGL: renderer: GeForce GTX 970/PCIe/SSE2 v: 4.6.0 NVIDIA 440.59 
           direct render: Yes
$ uname -rm
5.5.2-arch1-1 x86_64
[   16.375051] ------------[ cut here ]------------
[   16.375052] refcount_t: underflow; use-after-free.
[   16.375062] WARNING: CPU: 1 PID: 682 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
[   16.375062] Modules linked in: nct6775 hwmon_vid ir_rc5_decoder rc_dvbsky intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_hdmi fuse kvm_intel kvm irqbypass ts2020 hid_logitech_hidpp m88ds3103 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel si2157 si2168 i2c_mux input_leds aesni_intel crypto_simd eeepc_wmi iTCO_wdt mei_hdcp asus_wmi iTCO_vendor_support cryptd snd_emu10k1 glue_helper battery intel_cstate sparse_keymap smipcie dvb_core rfkill snd_hda_intel intel_uncore snd_util_mem wmi_bmof mxm_wmi mousedev snd_intel_dspcfg intel_rapl_perf videobuf2_vmalloc pcspkr i2c_i801 snd_ac97_codec videobuf2_memops snd_hda_codec i915 videobuf2_common snd_rawmidi joydev snd_seq_device lpc_ich videodev snd_hda_core mei_me ac97_bus mc snd_hwdep snd_pcm snd_timer emu10k1_gp e1000e gameport mei snd i2c_algo_bit rc_core intel_gtt soundcore ie31200_edac evdev mac_hid wmi hid_logitech_dj vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) loop sg crypto_user ip_tables
[   16.375084]  x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_generic usbhid hid sr_mod cdrom sd_mod ahci libahci libata ehci_pci firewire_ohci firewire_core crc_itu_t crc32c_intel xhci_pci ehci_hcd scsi_mod xhci_hcd nvidia_drm(POE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm agpgart nvidia_uvm(OE) nvidia_modeset(POE) nvidia(POE) ipmi_devintf ipmi_msghandler
[   16.375094] CPU: 1 PID: 682 Comm: Xorg.wrap Tainted: P           OE     5.5.2-arch1-1 #1
[   16.375094] Hardware name: System manufacturer System Product Name/P8Z77-V, BIOS 1606 08/16/2012
[   16.375095] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
[   16.375096] Code: 05 19 1c 0a 01 01 e8 7b 80 c1 ff 0f 0b c3 80 3d 07 1c 0a 01 00 75 95 48 c7 c7 b8 f2 73 a5 c6 05 f7 1b 0a 01 01 e8 5c 80 c1 ff <0f> 0b c3 80 3d e6 1b 0a 01 00 0f 85 72 ff ff ff 48 c7 c7 10 f3 73
[   16.375097] RSP: 0018:ffffb5614269fd90 EFLAGS: 00010286
[   16.375098] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   16.375098] RDX: 0000000000000001 RSI: 0000000000000096 RDI: 00000000ffffffff
[   16.375099] RBP: ffff9a5e11b6f4e8 R08: 0000000000000486 R09: 0000000000000004
[   16.375099] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9a5e21d07ae8
[   16.375099] R13: ffff9a5e21d07800 R14: 0000000000000008 R15: 0000000000000000
[   16.375100] FS:  00007f70eaf9e540(0000) GS:ffff9a5e27e80000(0000) knlGS:0000000000000000
[   16.375101] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.375101] CR2: 00007f70eae27490 CR3: 0000000224a2a003 CR4: 00000000001606e0
[   16.375102] Call Trace:
[   16.375106]  nv_drm_atomic_helper_disable_all+0xec/0x290 [nvidia_drm]
[   16.375109]  nv_drm_master_drop+0x22/0x60 [nvidia_drm]
[   16.375124]  drm_drop_master+0x1e/0x30 [drm]
[   16.375131]  drm_master_release+0x9f/0xb0 [drm]
[   16.375137]  drm_file_free.part.0+0x1fe/0x260 [drm]
[   16.375144]  drm_release+0x9a/0xd0 [drm]
[   16.375146]  __fput+0xae/0x230
[   16.375148]  task_work_run+0x93/0xb0
[   16.375150]  exit_to_usermode_loop+0xda/0x100
[   16.375152]  do_syscall_64+0x11f/0x150
[   16.375153]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   16.375155] RIP: 0033:0x7f70eaec6c37
[   16.375156] Code: ff ff e8 dc e4 01 00 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 d3 4e f9 ff
[   16.375156] RSP: 002b:00007fff53e8ed28 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[   16.375157] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00007f70eaec6c37
[   16.375157] RDX: 00007fff53e8ed40 RSI: 00000000c04064a0 RDI: 0000000000000003
[   16.375158] RBP: 00007fff53e8ed90 R08: 0000000000000000 R09: 00007fff53e8ebb0
[   16.375158] R10: 000055ddf8c0364b R11: 0000000000000246 R12: 0000000000000003
[   16.375158] R13: 0000000000000001 R14: 0000000000000000 R15: 00007fff53e8ed40
[   16.375160] ---[ end trace b7b46af8408dac9a ]---

Also, I noticed that switching video mode is now slower. There’s a delay of a couple of seconds after hitting enter on an xrandr command, during which the screen freezes and I hear my CPU fan briefly accelerate, before my screen blanks as the monitor actually begins its mode switch. Nothing new appears in the kernel log during these.

$ time xrandr --output HDMI-0 --mode 1024x768

real    0m2.178s
user    0m0.000s
sys     0m0.003s

$ time xrandr --output HDMI-0 --mode 1920x1080

real    0m2.200s
user    0m0.003s
sys     0m0.001s

With kernel 5.4.6 / driver 440.44, the screen blanks immediately when I hit enter. In fact, this also happens with the newer versions if I change the mode again very quickly after a previous change, within a couple of seconds. This is me (with 440.59) switching back and forth by hitting up-up-enter as soon as the screen comes back on; only the first switch is slow:

$ time xrandr --output HDMI-0 --mode 1024x768

real    0m2.190s
user    0m0.002s
sys     0m0.002s

$ time xrandr --output HDMI-0 --mode 1920x1080

real    0m0.088s
user    0m0.003s
sys     0m0.000s

$ time xrandr --output HDMI-0 --mode 1024x768

real    0m0.082s
user    0m0.003s
sys     0m0.000s

$ time xrandr --output HDMI-0 --mode 1920x1080

real    0m0.090s
user    0m0.003s
sys     0m0.000s

This behaviour (i.e. the slowness) continues after a restart of the X server.
nvidia-bug-report.log.gz (423 KB)
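
The timing runs above can be repeated in a more controlled way with a small script. This is a rough sketch in Python, assuming `xrandr` is on the PATH; `time_command` and `alternate_modes` are hypothetical helper names. The `pause` argument is there to probe the observation that only a switch made after a couple of idle seconds is slow.

```python
import subprocess
import time


def time_command(argv):
    """Run argv once and return (wall-clock seconds, exit code)."""
    start = time.perf_counter()
    proc = subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start, proc.returncode


def alternate_modes(output, modes, rounds, pause=0.0):
    """Cycle `output` through `modes` for `rounds` passes, timing each xrandr call."""
    timings = []
    for _ in range(rounds):
        for mode in modes:
            elapsed, code = time_command(
                ["xrandr", "--output", output, "--mode", mode]
            )
            timings.append((mode, elapsed, code))
            time.sleep(pause)  # idle time before the next switch
    return timings
```

For example, `alternate_modes("HDMI-0", ["1024x768", "1920x1080"], rounds=2, pause=5.0)` uses the output name and modes from this post; comparing the result against the same call with `pause=0.0` should show whether the delay only returns after the display has been idle.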

The same (440.44):

[    4.056833] ------------[ cut here ]------------
[    4.058060] refcount_t: underflow; use-after-free.
[    4.059207] WARNING: CPU: 2 PID: 280 at lib/refcount.c:190 refcount_sub_and_test_checked+0x55/0x60
[    4.060329] Modules linked in: acpi_cpufreq(E-) intel_rapl_msr(E) intel_rapl_common(E) binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) efi_pstore(E) intel_cstate(E) intel_uncore(E) 88XXau(OE) intel_rapl_perf(E) efivars(E) pcspkr(E) cfg80211(E) iTCO_wdt(E) iTCO_vendor_support(E) snd_hda_codec_realtek(E) joydev(E) rfkill(E) mei_me(E) snd_hda_codec_generic(E) ledtrig_audio(E) sg(E) snd_hda_codec_hdmi(E) watchdog(E) snd_hda_intel(E) mei(E) nvidia_drm(POE) snd_intel_nhlt(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) nvidia_modeset(POE) snd_soc_rt5640(E) snd_soc_rl6231(E) snd_soc_core(E) snd_compress(E) snd_pcm(E) snd_timer(E) snd(E) soundcore(E) acpi_pad(E) evdev(E) nvidia(POE) ipmi_devintf(E) ipmi_msghandler(E) parport_pc(E) ppdev(E) lp(E) parport(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) crc32c_generic(E) uas(E) usb_storage(E) hid_generic(E)
[    4.060343]  usbhid(E) hid(E) sd_mod(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) ahci(E) libahci(E) aesni_intel(E) crypto_simd(E) cryptd(E) libata(E) glue_helper(E) nvme(E) scsi_mod(E) lpc_ich(E) i915(E) mfd_core(E) i2c_i801(E) nvme_core(E) ehci_pci(E) xhci_pci(E) i2c_algo_bit(E) xhci_hcd(E) ehci_hcd(E) r8169(E) drm_kms_helper(E) realtek(E) libphy(E) usbcore(E) usb_common(E) drm(E) fan(E) video(E) button(E)
[    4.067181] CPU: 2 PID: 280 Comm: plymouthd Tainted: P           OE     5.4.0-kali3-amd64 #1 Debian 5.4.13-1kali1
[    4.068690] Hardware name: Gigabyte Technology Co., Ltd. Z97-HD3/Z97-HD3, BIOS F10c 03/03/2016
[    4.070205] RIP: 0010:refcount_sub_and_test_checked+0x55/0x60
[    4.071721] Code: e0 41 5c c3 44 89 e0 41 5c c3 44 0f b6 25 32 f7 cf 00 45 84 e4 75 e4 48 c7 c7 50 5b ee ad c6 05 1f f7 cf 00 01 e8 89 1b c8 ff <0f> 0b eb d0 0f 1f 80 00 00 00 00 48 89 fe bf 01 00 00 00 eb 96 66
[    4.073290] RSP: 0018:ffffadcd0044bcb0 EFLAGS: 00010282
[    4.074854] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    4.076424] RDX: ffff917b5fb27380 RSI: ffff917b5fb17688 RDI: ffff917b5fb17688
[    4.077996] RBP: ffff917b5b10dce8 R08: ffff917b5fb17688 R09: ffffadcd0367c01c
[    4.079560] R10: 0000000000aaaaaa R11: 0000000000000000 R12: 0000000000000000
[    4.081118] R13: 0000000000000000 R14: ffff917b505bd800 R15: 0000000000000008
[    4.082665] FS:  00007f24a03c4740(0000) GS:ffff917b5fb00000(0000) knlGS:0000000000000000
[    4.084219] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.085774] CR2: 0000562311460af8 CR3: 000000040f8b8005 CR4: 00000000001606e0
[    4.087337] Call Trace:
[    4.088887]  nv_drm_atomic_helper_disable_all+0xae/0x260 [nvidia_drm]
[    4.090438]  nv_drm_master_drop+0x22/0x60 [nvidia_drm]
[    4.091991]  drm_drop_master+0x1e/0x30 [drm]
[    4.093861]  drm_dropmaster_ioctl+0x4c/0x90 [drm]
[    4.095717]  ? drm_setmaster_ioctl+0xb0/0xb0 [drm]
[    4.097562]  drm_ioctl_kernel+0xaa/0xf0 [drm]
[    4.099402]  drm_ioctl+0x208/0x390 [drm]
[    4.101220]  ? drm_setmaster_ioctl+0xb0/0xb0 [drm]
[    4.103021]  do_vfs_ioctl+0x40e/0x670
[    4.104812]  ksys_ioctl+0x5e/0x90
[    4.106593]  __x64_sys_ioctl+0x16/0x20
[    4.108378]  do_syscall_64+0x52/0x160
[    4.110157]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    4.111936] RIP: 0033:0x7f24a065b5b7
[    4.113713] Code: 00 00 90 48 8b 05 d9 78 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 78 0c 00 f7 d8 64 89 01 48
[    4.115550] RSP: 002b:00007ffd599f75b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[    4.117399] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f24a065b5b7
[    4.119239] RDX: 0000000000000000 RSI: 000000000000641f RDI: 000000000000000b
[    4.121069] RBP: 000000000000641f R08: 0000562311451550 R09: 000056231144d010
[    4.122885] R10: 0000000000000000 R11: 0000000000000246 R12: 0000562311451f40
[    4.124707] R13: 000000000000000b R14: 00007f24a073e380 R15: 0000000000000000
[    4.126525] ---[ end trace 28781ed746db482c ]---

I’ll admit I’m not exactly great at reading traces, or debugging these kinds of bugs in general, but the last function it points to is nv_drm_atomic_helper_disable_all. This is common throughout the logs I’ve seen posted so far, so I figured I’d take a look at that one.
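
The claim that the same frames head every trace can be checked mechanically rather than by eye. This is a minimal sketch in Python, assuming each trace is saved as plain dmesg text; `parse_call_trace` and `common_frames` are hypothetical helper names, and frames marked with "?" (which the kernel prints for unreliable entries) are skipped.

```python
import re

# Matches frames such as " nv_drm_master_drop+0x22/0x60 [nvidia_drm]" or
# " __fput+0xc1/0x250"; a leading "? " marks an unreliable frame.
FRAME = re.compile(
    r"\]\s+(\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?\s*$"
)


def parse_call_trace(dmesg_text):
    """Return [(function, module_or_None), ...] for the first call trace in the text."""
    frames, in_trace = [], False
    for line in dmesg_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        m = FRAME.search(line)
        if m is None:
            break  # first non-frame line (e.g. the user-space "RIP:") ends the trace
        if m.group(1) is None:  # keep only reliable frames
            frames.append((m.group(2), m.group(3)))
    return frames


def common_frames(traces):
    """Frames of the first trace that also appear in every other trace, in order."""
    if not traces:
        return []
    others = [set(t) for t in traces[1:]]
    return [f for f in traces[0] if all(f in o for o in others)]
```

Run over the traces posted so far, this should report nv_drm_atomic_helper_disable_all, nv_drm_master_drop and drm_drop_master as shared; the callers below them differ (drm_master_release when the device file is closed, drm_dropmaster_ioctl in the plymouthd trace).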

Doing a quick check on the function, the only thing that immediately stands out is that the comment says that “drm_atomic_helper_disable_all() is copied from linux/drivers/gpu/drm/drm_atomic_helper.c” but in commit e00fb8564ee9 [1], the code that handles the “plane_mask” variable is removed from the kernel. And there’s a switch to using nv_drm_for_each_new_{connector,plane}_in_state instead of the old macros.

If I had to take a guess, then it’d be that this change in the newer kernels causes it. Possibly the old_fb variable gets freed somewhere else, and then, when the driver accesses it after the fact, the kernel spits out the underflow warning… Though it seems strange that only some people would have this problem in that case, so I’m not so sure about that.
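
To make the guess above concrete: a warning of this kind fires when a reference is dropped on an object whose count has already reached zero. Below is a toy model in Python, not kernel code; the `RefCount` class and the get/put sequence in `demo` are invented for illustration, and whether the driver really performs an extra put through old_fb is unverified.

```python
import warnings


class RefCount:
    """Toy stand-in for a kernel refcount_t; not the kernel implementation."""

    def __init__(self, initial=1):
        self.count = initial

    def get(self):
        self.count += 1

    def put(self):
        """Drop one reference; return True when the last reference is released."""
        if self.count <= 0:
            # The object was already released. This is the situation the kernel
            # reports as "refcount_t: underflow; use-after-free."
            warnings.warn("refcount underflow; use-after-free", RuntimeWarning)
            return False
        self.count -= 1
        return self.count == 0


def demo():
    """Return the number of underflow warnings raised by one extra put."""
    fb = RefCount()            # framebuffer created holding one reference
    fb.get()                   # hypothetical: a plane takes a second reference
    assert fb.put() is False   # the plane drops its reference
    assert fb.put() is True    # the creator drops the last one; object is freed
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        fb.put()               # hypothetical extra put via a stale pointer
    return len(caught)
```

The point of the sketch is only that the count itself is what trips the warning: the extra put is detected even if the stale pointer still happens to point at valid-looking memory.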

Still, consider applying this patch and see if there’s any change in this behaviour. I’ve not tested it myself yet, though, so do take note that Bad Things® might happen. They shouldn’t, but still, use at your own risk.

diff --git a/kernel/nvidia-drm/nvidia-drm-helper.c b/kernel/nvidia-drm/nvidia-drm-helper.c
index 601a9c3..245b13d 100644
--- a/kernel/nvidia-drm/nvidia-drm-helper.c
+++ b/kernel/nvidia-drm/nvidia-drm-helper.c
@@ -102,7 +102,6 @@ int nv_drm_atomic_helper_disable_all(struct drm_device *dev,
     struct drm_plane *plane;
     struct drm_crtc_state *crtc_state;
     struct drm_crtc *crtc;
-    unsigned plane_mask = 0;
     int ret, i;
 
     state = drm_atomic_state_alloc(dev);
@@ -133,40 +132,22 @@ int nv_drm_atomic_helper_disable_all(struct drm_device *dev,
             goto free;
     }
 
-    nv_drm_for_each_connector_in_state(state, conn, conn_state, i) {
+    nv_drm_for_each_new_connector_in_state(state, conn, conn_state, i) {
         ret = drm_atomic_set_crtc_for_connector(conn_state, NULL);
         if (ret < 0)
             goto free;
     }
 
-    nv_drm_for_each_plane_in_state(state, plane, plane_state, i) {
+    nv_drm_for_each_new_plane_in_state(state, plane, plane_state, i) {
         ret = drm_atomic_set_crtc_for_plane(plane_state, NULL);
         if (ret < 0)
             goto free;
 
         drm_atomic_set_fb_for_plane(plane_state, NULL);
-        plane_mask |= BIT(drm_plane_index(plane));
-        plane->old_fb = plane->fb;
     }
 
     ret = drm_atomic_commit(state);
 free:
-    if (plane_mask) {
-        drm_for_each_plane_mask(plane, dev, plane_mask) {
-            if (ret == 0) {
-                plane->fb = NULL;
-                plane->crtc = NULL;
-
-                WARN_ON(plane->state->fb);
-                WARN_ON(plane->state->crtc);
-
-                if (plane->old_fb)
-                    __nv_drm_framebuffer_put(plane->old_fb);
-           }
-           plane->old_fb = NULL;
-       }
-    }
-
 #if defined(NV_DRM_ATOMIC_STATE_REF_COUNTING_PRESENT)
     drm_atomic_state_put(state);
 #else
diff --git a/kernel/nvidia-drm/nvidia-drm-helper.h b/kernel/nvidia-drm/nvidia-drm-helper.h
index 2489924..6ae2174 100644
--- a/kernel/nvidia-drm/nvidia-drm-helper.h
+++ b/kernel/nvidia-drm/nvidia-drm-helper.h
@@ -211,6 +211,20 @@ int nv_drm_atomic_helper_disable_all(struct drm_device *dev,
     for_each_connector_in_state(__state, connector, connector_state, __i)
 #endif
 
+#if !defined(for_each_new_connector_in_state)
+#define nv_drm_for_each_new_connector_in_state(__state,                             \
+                                           connector, new_connector_state, __i)     \
+    for ((__i) = 0;                                                                 \
+         (__i) < (__state)->num_connector;                                          \
+         (__i)++)                                                                   \
+         for_each_if ((__state)->connectors[__i].ptr &&                             \
+             ((connector) = (__state)->connectors[__i].ptr,                         \
+             (new_connector_state) = (__state)->connectors[__i].new_state, 1))
+#else
+#define nv_drm_for_each_new_connector_in_state(__state,                             \
+                                               connector, new_connector_state, __i) \
+    for_each_new_connector_in_state(__state, connector, new_connector_state, __i)
+#endif
 
 /**
  * nv_drm_for_each_crtc_in_state - iterate over all CRTCs in an atomic update
@@ -262,6 +276,19 @@ int nv_drm_atomic_helper_disable_all(struct drm_device *dev,
     for_each_plane_in_state(__state, plane, plane_state, __i)
 #endif
 
+#if !defined(for_each_new_plane_in_state)
+#define nv_drm_for_each_new_plane_in_state(__state, plane, new_plane_state, __i) \
+    for ((__i) = 0;                                                              \
+         (__i) < (__state)->dev->mode_config.num_total_plane;                    \
+         (__i)++)                                                                \
+         for_each_if ((__state)->planes[__i].ptr &&                              \
+             ((plane) = (__state)->planes[__i].ptr,                              \
+             (new_plane_state) = (__state)->planes[__i].new_state, 1))
+#else
+#define nv_drm_for_each_new_plane_in_state(__state, plane, new_plane_state, __i) \
+    for_each_new_plane_in_state(__state, plane, new_plane_state, __i)
+#endif
+
 static inline struct drm_crtc *nv_drm_crtc_find(struct drm_device *dev,
     uint32_t id)
 {
[1] Kernel commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e00fb8564ee98c5c3a72c74b1a27e441abff6cca

I do not think so. If you look at the commit on GitHub, https://github.com/torvalds/linux/commit/e00fb8564ee98c5c3a72c74b1a27e441abff6cca, you’ll see that it was added back in the v4.19 kernel. So this problem is more recent. But right, the file is correct. See how many changes there have been since then: https://github.com/torvalds/linux/commits/master/drivers/gpu/drm/drm_atomic_helper.c (and look in other files as well).

You’re absolutely correct. Guess I really should’ve noticed that the changes are quite ancient… Guess I’ll have to blame it on being lazy and somewhat tired at the moment. Sorry about that.

Note, I did not test the patch. The warning could just as well be caused by some other change made after that commit.

Feels unlikely at this point, looking back on it with a clear head; but I guess removing it from the post was unnecessary. Figured I’d not waste anyone’s time trying. Still, I re-added it.

Did you use https://github.com/torvalds/linux/commits/master/drivers/gpu/drm/drm_atomic_helper.c ?
There is a fix, https://github.com/torvalds/linux/commit/8de679abc8ae81463d3fec495a21a6ca0a65bade#diff-4b72d79074d15bfc203b0655cccd9a6b, for what was a bug in the kernel; it was fixed in 5.6-rc1.

I worked off my local source repository when I copied the relevant parts. Can’t honestly say which version I had checked out when I did, but likely v5.5.3.

However, the only things the snippet above changes are the nv_drm_atomic_helper_disable_all function and the two new macros added to “kernel/nvidia-drm/nvidia-drm-helper.h”. The macros come from “include/drm/drm_atomic.h” in the kernel source. The local copies are only used if the kernel does not already define them; otherwise the nv_drm_ wrappers simply expand to the kernel versions.

I am transcoding signals that come in as 1080i. The output is 1080p, but I need it to be 1080i. I am on version 418.87.00 and it does not work. URGENT, I need a solution. Thank you very much.

Please open a new issue instead of hijacking this one.

inxi -Gx
Graphics:  Device-1: Intel UHD Graphics 630 vendor: Lenovo driver: i915 v: kernel bus ID: 00:02.0 
           Device-2: NVIDIA GP107M [GeForce GTX 1050 3 GB Max-Q] vendor: Lenovo driver: nvidia v: 440.59 bus ID: 01:00.0 
           Display: server: Fedora Project X.org 1.20.6 driver: modesetting,nvidia unloaded: fbdev,nouveau,vesa 
           resolution: 1920x1080~60Hz 
           OpenGL: renderer: Mesa DRI Intel UHD Graphics 630 (Coffeelake 3x8 GT2) v: 4.5 Mesa 19.2.8 direct render: Yes
uname -rm
5.5.5-200.fc31.x86_64 x86_64
[   15.486705] ------------[ cut here ]------------
[   15.486707] refcount_t: underflow; use-after-free.
[   15.486724] WARNING: CPU: 7 PID: 1935 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
[   15.486725] Modules linked in: cmac bnep sunrpc vfat fat snd_sof_pci snd_sof_intel_byt snd_sof_intel_ipc snd_sof_xtensa_dsp snd_sof_intel_hda_common x86_pkg_temp_thermal snd_soc_hdac_hda intel_powerclamp snd_sof_intel_hda snd_sof snd_soc_skl snd_soc_sst_ipc coretemp snd_soc_sst_dsp snd_hda_ext_core kvm_intel nvidia_drm(POE) snd_soc_acpi_intel_match nvidia_modeset(POE) snd_soc_acpi snd_hda_codec_realtek snd_hda_codec_hdmi snd_soc_core nvidia_uvm(OE) kvm snd_compress snd_hda_codec_generic ledtrig_audio ac97_bus irqbypass snd_pcm_dmaengine snd_hda_intel btusb uvcvideo snd_intel_dspcfg mei_hdcp iTCO_wdt 8821ce(OE) iTCO_vendor_support crct10dif_pclmul crc32_pclmul btrtl videobuf2_vmalloc intel_rapl_msr snd_hda_codec btbcm videobuf2_memops ghash_clmulni_intel intel_cstate snd_hda_core btintel videobuf2_v4l2 videobuf2_common snd_hwdep intel_uncore bluetooth intel_rapl_perf snd_seq videodev nvidia(POE) snd_seq_device pcspkr intel_wmi_thunderbolt wmi_bmof ecdh_generic cfg80211 snd_pcm i2c_i801 mc
[   15.486759]  ecc snd_timer mei_me snd processor_thermal_device mei joydev intel_rapl_common idma64 ipmi_msghandler soundcore intel_soc_dts_iosf intel_pch_thermal ideapad_laptop int3403_thermal sparse_keymap int340x_thermal_zone int3400_thermal acpi_thermal_rel acpi_pad acpi_tad vboxnetadp(OE) vboxnetflt(OE) binfmt_misc vboxdrv(OE) ip_tables rfkill i915 i2c_algo_bit hid_rmi drm_kms_helper rmi_core drm crc32c_intel nvme serio_raw nvme_core r8169 i2c_hid pinctrl_cannonlake video wmi pinctrl_intel fuse [last unloaded: ipmi_devintf]
[   15.486780] CPU: 7 PID: 1935 Comm: Xorg Tainted: P           OE     5.5.5-200.fc31.x86_64 #1
[   15.486781] Hardware name: LENOVO 81LK/LNVNB161216, BIOS BGCN24WW 08/19/2019
[   15.486783] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
[   15.486785] Code: 05 ee 0e 2e 01 01 e8 ab 95 bc ff 0f 0b c3 80 3d dc 0e 2e 01 00 75 95 48 c7 c7 70 8f 3c 85 c6 05 cc 0e 2e 01 01 e8 8c 95 bc ff <0f> 0b c3 80 3d bb 0e 2e 01 00 0f 85 72 ff ff ff 48 c7 c7 c8 8f 3c
[   15.486786] RSP: 0018:ffffb931c3f0bd80 EFLAGS: 00010282
[   15.486787] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000007
[   15.486789] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff8fac665d9cc0
[   15.486790] RBP: ffff8fac3af80ce8 R08: 0000000000000446 R09: 0000000000000003
[   15.486791] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8fac5dba2ae8
[   15.486792] R13: ffff8fac5dba2800 R14: 0000000000000008 R15: 0000000000000000
[   15.486793] FS:  00007f5d68ae0f00(0000) GS:ffff8fac665c0000(0000) knlGS:0000000000000000
[   15.486795] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   15.486796] CR2: 000055590f53d6f8 CR3: 0000000237d6a004 CR4: 00000000003606e0
[   15.486797] Call Trace:
[   15.486804]  nv_drm_atomic_helper_disable_all+0xec/0x290 [nvidia_drm]
[   15.486809]  nv_drm_master_drop+0x22/0x60 [nvidia_drm]
[   15.486829]  drm_drop_master+0x1e/0x30 [drm]
[   15.486846]  drm_master_release+0x9f/0xb0 [drm]
[   15.486863]  drm_file_free.part.0+0x21d/0x270 [drm]
[   15.486879]  drm_release+0xa7/0xe0 [drm]
[   15.486883]  __fput+0xc1/0x250
[   15.486887]  task_work_run+0x8a/0xb0
[   15.486891]  exit_to_usermode_loop+0x102/0x130
[   15.486894]  do_syscall_64+0x1a4/0x1c0
[   15.486897]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   15.486899] RIP: 0033:0x7f5d6903f8e7
[   15.486901] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 e3 fb ff ff
[   15.486902] RSP: 002b:00007ffd8f53b658 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[   15.486904] RAX: 0000000000000000 RBX: 000055590f517340 RCX: 00007f5d6903f8e7
[   15.486905] RDX: 0000000000000000 RSI: 000055590f5174b0 RDI: 000000000000000c
[   15.486906] RBP: 000000000000000c R08: 0000000000000000 R09: 000055590f519df0
[   15.486907] R10: fffffffffffff206 R11: 0000000000000246 R12: 000055590f5174b0
[   15.486908] R13: 000055590f517380 R14: 0000000000000000 R15: 0000000000000000
[   15.486922] ---[ end trace 83c6332780b02ed3 ]---
uname -rm
	5.5.6-27.geca1eba-default x86_64

nvidia-settings -v
	nvidia-settings:  version 440.64
	  The NVIDIA X Server Settings tool.

dmesg
	...
	[   26.643102] ------------[ cut here ]------------
	[   26.643103] refcount_t: underflow; use-after-free.
	[   26.643116] WARNING: CPU: 3 PID: 3043 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
	[   26.643117] Modules linked in: nvidia_uvm(OE) ipmi_devintf iscsi_ibft iscsi_boot_sysfs vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) dmi_sysfs bluetooth ecdh_generic ecc cachefiles fscache squashfs loop sch_fq_codel edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel eeepc_wmi asus_wmi crypto_simd battery sparse_keymap cryptd rfkill glue_helper video wmi_bmof sp5100_tco pcspkr i2c_piix4 snd_hda_codec_realtek ccp i2c_dev acpi_cpufreq raid10 uas usb_storage md_mod hid_generic usbhid nvidia_drm(POE) nvidia_modeset(POE) tcp_bbr snd_usb_audio nvidia(POE) snd_usbmidi_lib mc snd_rawmidi snd_seq_device snd_hda_codec_via snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel ipmi_msghandler ledtrig_audio snd_intel_dspcfg mxm_wmi sg snd_hda_codec drm_kms_helper nct6775 snd_hwdep hwmon_vid syscopyarea sysfillrect snd_hda_core sysimgblt msr fb_sys_fops crc32c_intel xhci_pci snd_pcm mpt3sas drm xhci_hcd snd_timer k10temp snd raid_class igb scsi_transport_sas
	[   26.643150]  soundcore r8169 sr_mod usbcore cdrom realtek dca libphy i2c_algo_bit wmi pinctrl_amd button sunrpc nbd dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua efivarfs
	[   26.643160] CPU: 3 PID: 3043 Comm: X Tainted: P           OE     5.5.6-27.geca1eba-default #1 openSUSE Tumbleweed (unreleased)
	[   26.643161] Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 1405 11/19/2019
	[   26.643163] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
	[   26.643165] Code: 05 b2 55 05 01 01 e8 7b 04 bb ff 0f 0b c3 80 3d a0 55 05 01 00 75 95 48 c7 c7 48 95 12 b1 c6 05 90 55 05 01 01 e8 5c 04 bb ff <0f> 0b c3 80 3d 7f 55 05 01 00 0f 85 72 ff ff ff 48 c7 c7 a0 95 12
	[   26.643165] RSP: 0018:ffffa271c3b63d88 EFLAGS: 00010282
	[   26.643167] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000007
	[   26.643167] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff92911eadbdd0
	[   26.643168] RBP: ffff929116ffcce8 R08: 00000000000006b5 R09: 0000000000000003
	[   26.643169] R10: 0000000000000000 R11: 0000000000000001 R12: ffff928247071ae8
	[   26.643169] R13: ffff928247071800 R14: 0000000000000004 R15: 0000000000000000
	[   26.643171] FS:  00007f808dbebec0(0000) GS:ffff92911eac0000(0000) knlGS:0000000000000000
	[   26.643172] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	[   26.643172] CR2: 00007f8817ffd488 CR3: 0000000f98ae4000 CR4: 0000000000340ee0
	[   26.643173] Call Trace:
	[   26.643180]  nv_drm_atomic_helper_disable_all+0xec/0x290 [nvidia_drm]
	[   26.643183]  nv_drm_master_drop+0x22/0x60 [nvidia_drm]
	[   26.643195]  drm_drop_master+0x1e/0x30 [drm]
	[   26.643206]  drm_master_release+0x9f/0xb0 [drm]
	[   26.643217]  drm_file_free.part.0+0x1fe/0x260 [drm]
	[   26.643227]  drm_release+0x9a/0xd0 [drm]
	[   26.643230]  __fput+0xc1/0x250
	[   26.643232]  task_work_run+0xa1/0xc0
	[   26.643235]  exit_to_usermode_loop+0x10c/0x130
	[   26.643238]  do_syscall_64+0x1fa/0x240
	[   26.643240]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
	[   26.643242] RIP: 0033:0x7f808b85bf24
	[   26.643243] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 ca c4 20 00 48 63 ff 85 c0 75 13 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 44 f3 c3 66 90 48 83 ec 18 48 89 7c 24 08 e8
	[   26.643243] RSP: 002b:00007fff56bba8a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
	[   26.643245] RAX: 0000000000000000 RBX: 000000000000000a RCX: 00007f808b85bf24
	[   26.643245] RDX: 0000555717805020 RSI: 0000000000000002 RDI: 000000000000000a
	[   26.643246] RBP: 000055571784a600 R08: 0000555717862650 R09: 00007f808b5cb470
	[   26.643246] R10: 0000000000000079 R11: 0000000000000246 R12: 000055571784a4f0
	[   26.643247] R13: 000055571784a3c0 R14: 0000000000000000 R15: 0000000000000000
	[   26.643249] ---[ end trace 51017c11887d48a8 ]---

makes you wonder: has anyone on the Nvidia dev side even looked at these reports? afaict, they’ve not commented, and it’s clearly not been addressed.

Happens for me too

[   14.718249] ------------[ cut here ]------------
[   14.718946] refcount_t: underflow; use-after-free.
[   14.719634] WARNING: CPU: 2 PID: 1184 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
[   14.720326] Modules linked in: ip6t_REJECT nf_reject_ipv6 ip6t_rpfilter ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bnep sunrpc vfat fat bcache crc64 intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm nvidia_drm(POE) nvidia_modeset(POE) irqbypass nvidia_uvm(OE) iTCO_wdt iTCO_vendor_support mei_hdcp mei_wdt crct10dif_pclmul crc32_pclmul ppdev raid1 ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf pcspkr snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi btusb btrtl snd_hda_intel btbcm btintel snd_intel_dspcfg snd_hda_codec bluetooth lpc_ich nvidia(POE) i2c_i801 snd_hda_core ses enclosure snd_hwdep scsi_transport_sas snd_seq
[   14.720347]  ecdh_generic joydev snd_seq_device usblp ecc snd_pcm ipmi_devintf mei_me ipmi_msghandler mei snd_timer snd parport_pc parport soundcore tpm_infineon ip_tables rfkill hid_logitech_hidpp uas usb_storage i915 mxm_wmi crc32c_intel i2c_algo_bit drm_kms_helper drm e1000e wmi video hid_logitech_dj fuse
[   14.726334] CPU: 2 PID: 1184 Comm: Xorg Tainted: P           OE     5.5.6-201.fc31.x86_64 #1
[   14.727290] Hardware name: MSI MS-7830/CSM-Q87M-E43 (MS-7830), BIOS V10.3 05/30/2014
[   14.728249] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
[   14.729200] Code: 05 fe 09 2e 01 01 e8 bb 92 bc ff 0f 0b c3 80 3d ec 09 2e 01 00 75 95 48 c7 c7 08 95 3c b2 c6 05 dc 09 2e 01 01 e8 9c 92 bc ff <0f> 0b c3 80 3d cb 09 2e 01 00 0f 85 72 ff ff ff 48 c7 c7 60 95 3c
[   14.730215] RSP: 0018:ffffa90a83e7fd80 EFLAGS: 00010282
[   14.731230] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000007
[   14.732253] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff96b30eb19cc0
[   14.733264] RBP: ffff96b2ef56ace8 R08: 00000000000004c6 R09: ffffa90a9091521c
[   14.734272] R10: 0000000000aaaaaa R11: 0000000000000000 R12: ffff96b3077112e8
[   14.735281] R13: ffff96b307711000 R14: 0000000000000008 R15: 0000000000000000
[   14.736295] FS:  00007f122d9f0f00(0000) GS:ffff96b30eb00000(0000) knlGS:0000000000000000
[   14.737322] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   14.738352] CR2: 00005622cacfe138 CR3: 00000008062ee002 CR4: 00000000001606e0
[   14.739389] Call Trace:
[   14.740432]  nv_drm_atomic_helper_disable_all+0xec/0x290 [nvidia_drm]
[   14.741488]  nv_drm_master_drop+0x22/0x60 [nvidia_drm]
[   14.742553]  drm_drop_master+0x1e/0x30 [drm]
[   14.743610]  drm_master_release+0x9f/0xb0 [drm]
[   14.744670]  drm_file_free.part.0+0x21d/0x270 [drm]
[   14.745730]  drm_release+0xa7/0xe0 [drm]
[   14.746783]  __fput+0xc1/0x250
[   14.747835]  task_work_run+0x8a/0xb0
[   14.748885]  exit_to_usermode_loop+0x102/0x130
[   14.749931]  do_syscall_64+0x1a4/0x1c0
[   14.750973]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   14.752020] RIP: 0033:0x7f122df4f8e7
[   14.753062] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 e3 fb ff ff
[   14.754177] RSP: 002b:00007fff1bdb9a08 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[   14.755301] RAX: 0000000000000000 RBX: 00005622cacfc5d0 RCX: 00007f122df4f8e7
[   14.756423] RDX: 0000000000000000 RSI: 00005622cacfc740 RDI: 000000000000000c
[   14.757541] RBP: 000000000000000c R08: 0000000000000000 R09: 00005622cacfd110
[   14.758666] R10: fffffffffffff206 R11: 0000000000000246 R12: 00005622cacfc740
[   14.759791] R13: 00005622cacfc610 R14: 0000000000000000 R15: 0000000000000000
[   14.760917] ---[ end trace d9e61cc6624f267d ]---

EDIT: driver version 440.59

@aplattner

can we get some sort of official comment from Nvidia on a fix/support for kernel 5.5.x?
are we just supposed to write off the entire release cycle, and hope/wait for 5.6.x?

We have the same crash on Fedora 31 with the latest drivers from the RPM Fusion repo
https://ask.fedoraproject.org/t/kernel-tainted-after-running-updates/5487/7

The crash was introduced with the 5.5.5 upgrade and it’s still there on 5.5.6

[ 7.503213] CPU: 1 PID: 414 Comm: plymouthd Tainted: P OE 5.5.6-201.fc31.x86_64 #1
[ 7.503218] Hardware name: Micro-Star International Co., Ltd. PS42 Modern 8RC/MS-14B2, BIOS E14B2IMS.106 12/06/2018
[ 7.503230] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
[ 7.503239] Code: 05 fe 09 2e 01 01 e8 bb 92 bc ff 0f 0b c3 80 3d ec 09 2e 01 00 75 95 48 c7 c7 08 95 3c bb c6 05 dc 09 2e 01 01 e8 9c 92 bc ff <0f> 0b c3 80 3d cb 09 2e 01 00 0f 85 72 ff ff ff 48 c7 c7 60 95 3c
[ 7.503244] RSP: 0018:ffffb290407cbcb8 EFLAGS: 00010286
[ 7.503250] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000007
[ 7.503254] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff97732ec59cc0
[ 7.503258] RBP: ffff9773150874e8 R08: 0000000000000382 R09: 0000000000000003
[ 7.503262] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9773257282e8
[ 7.503265] R13: ffff977325728000 R14: 0000000000000000 R15: ffff977325750a00
[ 7.503272] FS: 00007fe4f52e9f00(0000) GS:ffff97732ec40000(0000) knlGS:0000000000000000
[ 7.503276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7.503280] CR2: 00007f7ed9315a30 CR3: 0000000464a1e001 CR4: 00000000003606e0
[ 7.503284] Call Trace:
[ 7.503313] nv_drm_atomic_helper_disable_all+0xec/0x290 [nvidia_drm]
[ 7.503333] nv_drm_master_drop+0x22/0x60 [nvidia_drm]
[ 7.503396] drm_drop_master+0x1e/0x30 [drm]
[ 7.503452] drm_dropmaster_ioctl+0x4c/0x90 [drm]
[ 7.503506] ? drm_setmaster_ioctl+0xb0/0xb0 [drm]
[ 7.503565] drm_ioctl_kernel+0xaa/0xf0 [drm]
[ 7.503631] drm_ioctl+0x208/0x390 [drm]
[ 7.503686] ? drm_setmaster_ioctl+0xb0/0xb0 [drm]
[ 7.503701] ? do_filp_open+0xa5/0x100
[ 7.503718] do_vfs_ioctl+0x461/0x6d0
[ 7.503743] ksys_ioctl+0x5e/0x90
[ 7.503756] __x64_sys_ioctl+0x16/0x20
[ 7.503769] do_syscall_64+0x5b/0x1c0
[ 7.503785] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 7.503794] RIP: 0033:0x7fe4f55a738b
[ 7.503802] Code: 0f 1e fa 48 8b 05 fd 9a 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d cd 9a 0c 00 f7 d8 64 89 01 48
[ 7.503806] RSP: 002b:00007ffc4d2ede78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 7.503813] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe4f55a738b
[ 7.503817] RDX: 0000000000000000 RSI: 000000000000641f RDI: 000000000000000b
[ 7.503821] RBP: 000000000000641f R08: 0000555981b9bd50 R09: 00007fe4f56ba380
[ 7.503824] R10: 0000000000000000 R11: 0000000000000246 R12: 0000555981b9bd80
[ 7.503828] R13: 000000000000000b R14: 0000000000000000 R15: 0000000000000000
[ 7.503839] ---[ end trace fe605e9abea0643f ]---

We can’t report bugs when the kernel is tainted, so let’s hope the Nvidia guys fix it in the future.
The good news is that this crash happens only once, during the initial boot. Everything works fine afterwards.
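
For anyone who wants to check the "only once per boot" observation on their own machine, here is a minimal sketch that counts the refcount underflow warnings in a saved kernel log and reports which process triggered each one. The function name `refcount_warnings` and the sample text are just illustrations; the sample lines are copied from a trace posted earlier in this thread. One caveat: if I recall correctly, this message comes from a warn-once path in lib/refcount.c, so seeing it a single time per boot may only mean the kernel stopped printing it, not that the underflow happened once.

```python
import re

WARNING_RE = re.compile(r"refcount_t: underflow; use-after-free")
COMM_RE = re.compile(r"\bComm:\s+(\S+)")


def refcount_warnings(log_text):
    """Return the process name (Comm) reported for each refcount underflow warning."""
    comms = []
    pending = False  # True between a warning line and its "Comm:" line
    for line in log_text.splitlines():
        if WARNING_RE.search(line):
            pending = True
            comms.append(None)  # filled in when the matching Comm line is seen
            continue
        match = COMM_RE.search(line)
        if pending and match:
            comms[-1] = match.group(1)
            pending = False
    return comms


# Lines copied from a trace posted earlier in this thread.
SAMPLE = """\
[   14.718946] refcount_t: underflow; use-after-free.
[   14.719634] WARNING: CPU: 2 PID: 1184 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
[   14.726334] CPU: 2 PID: 1184 Comm: Xorg Tainted: P           OE     5.5.6-201.fc31.x86_64 #1
"""

if __name__ == "__main__":
    hits = refcount_warnings(SAMPLE)
    print(len(hits), "warning(s), triggered by:", hits)
```

To run it against a real boot, save the kernel log first (for example with `dmesg > boot.log`) and pass the file contents to `refcount_warnings` instead of `SAMPLE`.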

This particular crash is related to plymouthd

~/# dnf list installed akmod-nvidia
Installed Packages
akmod-nvidia.x86_64 3:440.59-1.fc31 @rpmfusion-nonfree-updates

And we got the crash in the nv_drm_atomic_helper_disable_all function during plymouth init.

To be clear … do you see this^ (or similar) crash if plymouth is DISabled?

Sorry for the confusing message; it was copy-pasted from another forum where we were discussing slightly different things. Plymouth is certainly not the cause: without plymouth, it crashes with Xorg instead.
The call trace is pretty much the same:

[    9.571753] CPU: 4 PID: 1183 Comm: Xorg Tainted: P           OE     5.5.6-201.fc31.x86_64 #1
[    9.571754] Hardware name: Micro-Star International Co., Ltd. PS42 Modern 8RC/MS-14B2, BIOS E14B2IMS.106 12/06/2018
[    9.571756] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
[    9.571758] Code: 05 fe 09 2e 01 01 e8 bb 92 bc ff 0f 0b c3 80 3d ec 09 2e 01 00 75 95 48 c7 c7 08 95 3c 8f c6 05 dc 09 2e 01 01 e8 9c 92 bc ff <0f> 0b c3 80 3d cb 09 2e 01 00 0f 85 72 ff ff ff 48 c7 c7 60 95 3c
[    9.571759] RSP: 0018:ffffaa120174bd80 EFLAGS: 00010282
[    9.571760] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000007
[    9.571761] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff8d706ed19cc0
[    9.571762] RBP: ffff8d7066050ce8 R08: 00000000000003a8 R09: 0000000000000003
[    9.571763] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8d7066086ae8
[    9.571764] R13: ffff8d7066086800 R14: 0000000000000000 R15: dead000000000100
[    9.571765] FS:  00007fdea078bf00(0000) GS:ffff8d706ed00000(0000) knlGS:0000000000000000
[    9.571766] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    9.571767] CR2: 00005561797269e8 CR3: 0000000413ce6003 CR4: 00000000003606e0
[    9.571768] Call Trace:
[    9.571776]  nv_drm_atomic_helper_disable_all+0xec/0x290 [nvidia_drm]
[    9.571781]  nv_drm_master_drop+0x22/0x60 [nvidia_drm]
[    9.571801]  drm_drop_master+0x1e/0x30 [drm]
[    9.571816]  drm_master_release+0x9f/0xb0 [drm]
[    9.571831]  drm_file_free.part.0+0x21d/0x270 [drm]
[    9.571847]  drm_release+0xa7/0xe0 [drm]
[    9.571851]  __fput+0xc1/0x250
[    9.571854]  task_work_run+0x8a/0xb0
[    9.571857]  exit_to_usermode_loop+0x102/0x130
[    9.571860]  do_syscall_64+0x1a4/0x1c0
[    9.571864]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    9.571866] RIP: 0033:0x7fdea0cea8e7
[    9.571868] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 e3 fb ff ff
[    9.571869] RSP: 002b:00007ffeced06898 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[    9.571870] RAX: 0000000000000000 RBX: 00005561797134f0 RCX: 00007fdea0cea8e7
[    9.571871] RDX: 0000556179712ce0 RSI: 0000556179713660 RDI: 000000000000000c
[    9.571872] RBP: 000000000000000c R08: 0000000000000006 R09: 0000556179713ea0
[    9.571873] R10: 0000000000000000 R11: 0000000000000246 R12: 0000556179713660
[    9.571873] R13: 0000556179713530 R14: 0000000000000000 R15: 0000000000000000
[    9.571876] ---[ end trace 2c19b3c5d8948c7b ]---
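
To make the "pretty much the same" comparison concrete, here is a rough sketch that pulls the kernel frames out of two pasted traces and prints the frames they share from the top. The helper names (`call_trace_symbols`, `common_prefix`) are made up for this example, and the two samples are shortened copies of the Xorg and plymouthd traces above. Frames prefixed with `?` are skipped on the assumption that they are unreliable stack entries.

```python
import re

# Matches entries such as "drm_release+0xa7/0xe0 [drm]"; group 1 captures a leading "? ".
FRAME_RE = re.compile(r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def call_trace_symbols(log_text):
    """Return the frame symbols listed after 'Call Trace:', top of stack first."""
    symbols = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        if "RIP:" in line:  # the user-space register dump follows the kernel frames
            break
        match = FRAME_RE.search(line)
        if match and not match.group(1):  # skip "? symbol" frames
            symbols.append(match.group(2))
    return symbols


def common_prefix(a, b):
    """Return the leading frames that two traces have in common."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


# Shortened copies of two traces from this thread.
XORG_TRACE = """\
[    9.571768] Call Trace:
[    9.571776]  nv_drm_atomic_helper_disable_all+0xec/0x290 [nvidia_drm]
[    9.571781]  nv_drm_master_drop+0x22/0x60 [nvidia_drm]
[    9.571801]  drm_drop_master+0x1e/0x30 [drm]
[    9.571816]  drm_master_release+0x9f/0xb0 [drm]
[    9.571831]  drm_file_free.part.0+0x21d/0x270 [drm]
[    9.571866] RIP: 0033:0x7fdea0cea8e7
"""

PLYMOUTH_TRACE = """\
[ 7.503284] Call Trace:
[ 7.503313] nv_drm_atomic_helper_disable_all+0xec/0x290 [nvidia_drm]
[ 7.503333] nv_drm_master_drop+0x22/0x60 [nvidia_drm]
[ 7.503396] drm_drop_master+0x1e/0x30 [drm]
[ 7.503452] drm_dropmaster_ioctl+0x4c/0x90 [drm]
[ 7.503506] ? drm_setmaster_ioctl+0xb0/0xb0 [drm]
[ 7.503794] RIP: 0033:0x7fe4f55a738b
"""

if __name__ == "__main__":
    xorg = call_trace_symbols(XORG_TRACE)
    plymouth = call_trace_symbols(PLYMOUTH_TRACE)
    print("shared top frames:", common_prefix(xorg, plymouth))
```

Both paths end up in the same nvidia_drm master-drop code; they only differ in how they get there (file close for Xorg, a drop-master ioctl for plymouthd).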

Got a new kernel and lots of new nvidia stuff from rpmfusion this morning, but it apparently made no difference: I still get the same call trace when X first starts. Here’s the latest stuff I’m now running:

Linux tomh 5.5.7-200.fc31.x86_64 #1 SMP Fri Feb 28 17:18:37 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

xorg-x11-drv-nvidia-libs-440.64-1.fc31.x86_64
xorg-x11-drv-nvidia-cuda-libs-440.64-1.fc31.i686
kmod-nvidia-5.5.5-200.fc31.x86_64-440.59-1.fc31.x86_64
kmod-nvidia-5.5.6-201.fc31.x86_64-440.64-1.fc31.x86_64
akmod-nvidia-440.64-1.fc31.x86_64
nvidia-settings-440.64-1.fc31.x86_64
kmod-nvidia-5.5.7-200.fc31.x86_64-440.64-1.fc31.x86_64
xorg-x11-drv-nvidia-440.64-1.fc31.x86_64
xorg-x11-drv-nvidia-kmodsrc-440.64-1.fc31.x86_64
xorg-x11-drv-nvidia-cuda-440.64-1.fc31.x86_64
xorg-x11-drv-nvidia-cuda-libs-440.64-1.fc31.x86_64
nvidia-persistenced-440.64-1.fc31.x86_64
xorg-x11-drv-nvidia-libs-440.64-1.fc31.i686