[BUG] Nvidia 440.64 + kernel 5.5.6/stable -- boot trace; WAS Nvidia 440.59 + kernel 5.5.1/stable -- boot trace

Edit: Right, the one user I had testing a couple of changes first claimed they had fixed it, and then later said the problem was back. Now another user reports that they did nothing, so I’m removing the changeset, as it’s pretty conclusive that it doesn’t change anything. Sorry for the noise.

Bug in the Red Hat tracker:
https://bugzilla.redhat.com/show_bug.cgi?id=1806257

Same here, but with the Vulkan driver 440.66.03 and kernel 5.5.10.

$ modinfo nvidia
filename:       /lib/modules/5.5.10-arch1-1/kernel/drivers/video/nvidia.ko.xz
alias:          char-major-195-*
version:        440.66.03
supported:      external
license:        NVIDIA
srcversion:     DC0048D50541FC60098682F
alias:          pci:v000010DEd*sv*sd*bc03sc02i00*
alias:          pci:v000010DEd*sv*sd*bc03sc00i00*
depends:        ipmi_msghandler
retpoline:      Y
name:           nvidia
vermagic:       5.5.10-arch1-1 SMP preempt mod_unload 

[   41.924407] ------------[ cut here ]------------
[   41.924409] refcount_t: underflow; use-after-free.
[   41.924426] WARNING: CPU: 25 PID: 1324 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
[   41.924427] Modules linked in: tun mousedev input_leds pktcdvd hid_generic intel_rapl_msr ucsi_ccg typec_ucsi nvidia_drm(POE) iTCO_wdt nvidia_modeset(POE) typec iTCO_vendor_support cmac algif_hash xt_MASQUERADE algif_skcipher eeepc_wmi iptable_nat usbhid asus_wmi af_alg nf_nat battery hid bnep wmi_bmof sparse_keymap mxm_wmi nvidia(POE) nf_conntrack nct7904 nf_defrag_ipv6 nf_defrag_ipv4 nct6775 libcrc32c msr hwmon_vid iptable_filter intel_rapl_common sb_edac x86_pkg_temp_thermal coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg crct10dif_pclmul crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hda_core nls_iso8859_1 snd_hwdep nls_cp437 snd_pcm aesni_intel vfat crypto_simd drm_kms_helper cryptd fat igb glue_helper snd_timer ipmi_devintf mei_me intel_cstate ipmi_msghandler snd intel_uncore syscopyarea sr_mod sysfillrect i2c_algo_bit sysimgblt intel_rapl_perf pcspkr cdrom i2c_i801 sd_mod mei lpc_ich dca soundcore i2c_nvidia_gpu
[   41.924466]  fb_sys_fops btusb btrtl btbcm btintel bluetooth wmi acpi_power_meter ecdh_generic rfkill ecc evdev mac_hid vhba(OE) uinput fuse eeprom sg br_netfilter bridge drm stp llc crypto_user agpgart ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 ahci crc32c_intel libahci libata xhci_pci ehci_pci scsi_mod xhci_hcd ehci_hcd vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio
[   41.924485] CPU: 25 PID: 1324 Comm: Xorg.wrap Tainted: P           OE     5.5.10-arch1-1 #1
[   41.924486] Hardware name: ASUSTeK COMPUTER INC. Z10PE-D8 WS/Z10PE-D8 WS, BIOS 3703 04/13/2018
[   41.924488] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
[   41.924490] Code: 05 79 ec 09 01 01 e8 8b 65 c1 ff 0f 0b c3 80 3d 67 ec 09 01 00 75 95 48 c7 c7 78 3b 34 b6 c6 05 57 ec 09 01 01 e8 6c 65 c1 ff <0f> 0b c3 80 3d 46 ec 09 01 00 0f 85 72 ff ff ff 48 c7 c7 d0 3b 34
[   41.924491] RSP: 0018:ffffa66e0a44fd90 EFLAGS: 00010286
[   41.924492] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   41.924493] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 00000000ffffffff
[   41.924493] RBP: ffff8d3cdb4ddce8 R08: 00000009c2e3b7db R09: ffff8d44fff6f680
[   41.924494] R10: 00000000000006e2 R11: 00000000000593f4 R12: ffff8d44db8fc2e8
[   41.924495] R13: ffff8d44db8fc000 R14: 0000000000000008 R15: 0000000000000000
[   41.924496] FS:  00007f2db8523540(0000) GS:ffff8d3cdfb40000(0000) knlGS:0000000000000000
[   41.924497] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   41.924498] CR2: 00007f2db83ac490 CR3: 00000006f8732003 CR4: 00000000003606e0
[   41.924499] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   41.924499] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   41.924500] Call Trace:
[   41.924508]  nv_drm_atomic_helper_disable_all+0xec/0x290 [nvidia_drm]
[   41.924515]  nv_drm_master_drop+0x22/0x60 [nvidia_drm]
[   41.924547]  drm_drop_master+0x1e/0x30 [drm]
[   41.924557]  drm_master_release+0x9f/0xb0 [drm]
[   41.924567]  drm_file_free.part.0+0x1fe/0x260 [drm]
[   41.924577]  drm_release+0x9a/0xd0 [drm]
[   41.924581]  __fput+0xae/0x230
[   41.924585]  task_work_run+0x93/0xb0
[   41.924589]  exit_to_usermode_loop+0xda/0x100
[   41.924592]  do_syscall_64+0x11f/0x150
[   41.924595]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   41.924597] RIP: 0033:0x7f2db844bc37
[   41.924599] Code: ff ff e8 dc e4 01 00 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 d3 4e f9 ff
[   41.924600] RSP: 002b:00007ffd3142dc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[   41.924601] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00007f2db844bc37
[   41.924602] RDX: 00007ffd3142dc90 RSI: 00000000c04064a0 RDI: 0000000000000003
[   41.924602] RBP: 00007ffd3142dce0 R08: 0000000000000000 R09: 00007ffd3142db00
[   41.924603] R10: 000055e2c659d64b R11: 0000000000000246 R12: 0000000000000003
[   41.924603] R13: 0000000000000001 R14: 0000000000000000 R15: 00007ffd3142dc90
[   41.924606] ---[ end trace d7c78baae500c2f3 ]---

greetings

Same here.
System: Zen2 TRX40, RTX 2080, NVIDIA 440.64, Ubuntu 20.04, Kernel 5.6.0.

Because of this useless bug, DRM is NOT working…
Are NVIDIA developers on holiday?

[    8.803373] Call Trace:
[    8.803379]  nv_drm_atomic_helper_disable_all+0xed/0x290 [nvidia_drm]
[    8.803380]  nv_drm_master_drop+0x28/0x60 [nvidia_drm]
[    8.803388]  drm_drop_master+0x22/0x30 [drm]
[    8.803393]  drm_dropmaster_ioctl+0x51/0x90 [drm]
[    8.803399]  ? drm_setmaster_ioctl+0xb0/0xb0 [drm]
[    8.803405]  drm_ioctl_kernel+0xae/0xf0 [drm]
[    8.803411]  drm_ioctl+0x234/0x3d0 [drm]
[    8.803417]  ? drm_setmaster_ioctl+0xb0/0xb0 [drm]
[    8.803419]  ? putname+0x4a/0x50
[    8.803421]  ? do_sys_openat2+0x1a9/0x2a0
[    8.803422]  ksys_ioctl+0x9d/0xd0
[    8.803423]  __x64_sys_ioctl+0x1a/0x20
[    8.803425]  do_syscall_64+0x57/0x1b0
[    8.803427]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

That particular warning (the underflow one) shouldn’t, to my knowledge, have any effect on the DRM subsystem.
What’s the problem you’re facing, exactly? Missing HW acceleration, broken decoding, general crashes, or the like? If that’s the case, I would recommend opening a new topic for it.
You can, at least, test direct rendering by installing whichever package on your distribution provides glxinfo, and running glxinfo | grep 'direct rendering'
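
If you would rather script that check (for example, to run it after each driver or kernel update), here is a minimal Python sketch. It assumes glxinfo is on your PATH and prints its usual "direct rendering: Yes" / "direct rendering: No" line; the function names direct_rendering_enabled and run_glxinfo are made up for this example and are not part of any package.

```python
import re
import shutil
import subprocess


def direct_rendering_enabled(glxinfo_output):
    """Return True or False based on the 'direct rendering:' line,
    or None if no such line is present in the given text."""
    match = re.search(
        r"^direct rendering:\s*(\w+)",
        glxinfo_output,
        re.MULTILINE | re.IGNORECASE,
    )
    if match is None:
        return None
    return match.group(1).lower() == "yes"


def run_glxinfo():
    """Run glxinfo and return its stdout, or None if it is not installed."""
    if shutil.which("glxinfo") is None:
        return None
    result = subprocess.run(
        ["glxinfo"], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    output = run_glxinfo()
    if output is None:
        print("glxinfo not found; install the package that provides it")
    else:
        state = direct_rendering_enabled(output)
        if state is None:
            print("no 'direct rendering' line found (is an X display available?)")
        else:
            print("direct rendering:", "yes" if state else "no")
```

A "no" here would point at a real acceleration problem worth its own topic; a "yes" suggests the refcount warning is only cosmetic on your system.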

Lastly, any “holiday” they might be having is probably just due to that pesky “flu” that has everyone all riled up at the moment.