Nvidia drivers malfunction on TUXEDO InfinityBook Pro14-Gen9-AMD (several distros/versions)

I have a Gigabyte RTX3090 + ADT-Link UT4G/UT3G eGPU that I’ve been using without any major problems with several Intel-based laptops (Dell XPS13-9360, Dell Latitude7430 [egpu.io post], some Lenovo Thinkpad and probably a few others that I don’t currently remember) under various OSes (at least Debian trixie and Windows, and probably some other distros that I also don’t remember). Recently I tried to use it with an AMD-based laptop, the TUXEDO InfinityBook Pro14-Gen9-AMD, and it was a total disaster. I tried several versions of Ubuntu as well as Debian trixie, and in all cases lspci could see the RTX3090:

$ lspci |grep VGA
07:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090] (rev a1)
65:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Phoenix3 (rev c5)

…but nvidia-smi reported things like “No devices were found” or some message ending in “Unknown Error”, and the system logs were full of kernel stack traces. They often included entries like RmInitAdapter failed! (0x26:0x56:1489). Do these numbers have some human-readable meaning?
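A quick sanity check when lspci sees the card but nvidia-smi does not is to look at which kernel driver is actually bound to it (for example nvidia vs. nouveau). Below is a minimal sketch, assuming the standard sysfs layout under /sys/bus/pci/devices. The function name bound_driver is hypothetical, and the address 0000:07:00.0 is taken from the lspci output above:

```python
import os
from pathlib import Path
from typing import Optional

# Standard location of PCI devices in sysfs (assumed to be mounted at /sys).
SYSFS_PCI = Path("/sys/bus/pci/devices")


def bound_driver(pci_addr: str, sysfs_root: Path = SYSFS_PCI) -> Optional[str]:
    """Return the name of the kernel driver bound to pci_addr, or None.

    bound_driver is a hypothetical helper name, not part of any NVIDIA tool.
    """
    link = sysfs_root / pci_addr / "driver"
    if not link.is_symlink():
        # Either the device does not exist or no driver is bound to it.
        return None
    # The last component of the symlink target is the driver name.
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Address of the RTX 3090 as shown by lspci (with the 0000: domain prefix).
    print(bound_driver("0000:07:00.0"))
```

If this prints nouveau rather than nvidia, the nvidia-smi errors say nothing about the proprietary or open NVIDIA driver, and logs from such a boot are not useful for diagnosing it.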

Attached are nvidia-bug-report.log.gz files from a few attempts:
trixie-6.12.9-535.216.03-nvidia-bug-report.log.gz (123.5 KB)
noble-6.8.0-550.120open-nvidia-bug-report.log.gz (135.7 KB)
noble-6.8.0-550.120proprietary-nvidia-bug-report.log.gz (120.1 KB)
oracular-6.11.0-560.35.03-nvidia-bug-report.log.gz (127.7 KB)
oracular-6.11.0-560.35.03-nvidia-bug-report-2.log.gz (292.5 KB) (one of the oracular logs is with the open driver and the other with the proprietary one, but I lost track of which is which: sorry…)

I’ve also tried connecting this RTX3090 via an OCuLink adapter (Minisforum DEG1 and EXP-GDC M.2 OCuLink module) to take the tunneling of PCIe over USB4 out of the picture, but the result was similar:
oculink-trixie-6.12.9-535.216.03-nvidia-bug-report.log.gz (147.4 KB)
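To compare the USB4 and OCuLink setups, it may help to record the PCIe link state the kernel reports for the GPU in each case. This is only a sketch, and whether the link state matters for this failure is my assumption. It reads the current_link_speed, current_link_width, max_link_speed and max_link_width attributes that Linux exposes in sysfs for PCI devices; the function name link_status is hypothetical:

```python
from pathlib import Path
from typing import Dict, Optional

# Link attributes exposed by the Linux PCI core for each device in sysfs.
LINK_ATTRS = (
    "current_link_speed",
    "current_link_width",
    "max_link_speed",
    "max_link_width",
)


def link_status(
    pci_addr: str, sysfs_root: Path = Path("/sys/bus/pci/devices")
) -> Dict[str, Optional[str]]:
    """Return the PCIe link attributes of pci_addr (None where unreadable).

    link_status is a hypothetical helper name used for illustration only.
    """
    dev = sysfs_root / pci_addr
    status: Dict[str, Optional[str]] = {}
    for attr in LINK_ATTRS:
        try:
            status[attr] = (dev / attr).read_text().strip()
        except OSError:
            # Attribute missing, device absent, or read failed.
            status[attr] = None
    return status


if __name__ == "__main__":
    # 0000:07:00.0 over USB4 and 0000:01:00.0 over OCuLink, as in the logs.
    for addr in ("0000:07:00.0", "0000:01:00.0"):
        print(addr, link_status(addr))
```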

Is anyone able to say whether this is a software problem (like a kernel or driver bug that has some chance of being resolved in the future) or some inherent hardware conflict between this Nvidia GPU and some other component of the laptop, in which case not much can be done with this particular laptop+GPU combination? We were planning to purchase a few of these laptops at my organization, but because of this problem we put that on hold until it’s clear whether it can be fixed.
I’ve seen some successful reports of connecting Nvidia eGPUs to AMD-based laptops under Linux (most recently here and here), so I’m concerned it may be a hardware conflict :/ I should also clarify that both the laptop and the eGPU still work perfectly fine as long as they are not connected to each other, so it’s unlikely that either of them is faulty.

I’ll be grateful for any help with this!

cross-linking egpu.io post: 2024 14″ TUXEDO InfinityBook Pro14-Gen9-AMD [R7K,8C,HS] + RTX 3090 @ 64Gbps-USB4v1 (ADT-Link UT4G) + Linux Debian trixie // failure | External GPU Builds

During some investigation with the egpu.io community, we discovered that Canonical is unable to even properly copy Debian’s packaging, and during most of my attempts on Ubuntu the nouveau driver was in control. Therefore currently only the logs below are usable:

may be related: https://forums.developer.nvidia.com/t/nvidia-driver-installed-but-nvidia-smi-shows-no-devices-found-gpu-nvidia-geforce-gtx-1650-mobile-max-q-ubuntu-24-04-kernel-6-8-0-52-generic

bumping up, so maybe some good Nvidia dev will have mercy on my errors ;-)

Also here is the summary of what seems to be most important in these logs:

Jan 17 21:13:25 tcl-tuxedo kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x26:0x56:1489)

2025-01-17T22:33:46.172698+01:00 morgwai-xps13 kernel: NVRM: GPU at PCI:0000:07:00: GPU-8aeb4eea-1497-5109-edbd-045d9bd60b8e
2025-01-17T22:33:46.172707+01:00 morgwai-xps13 kernel: NVRM: Xid (PCI:0000:07:00): 79, pid=‘’, name=, GPU has fallen off the bus.
2025-01-17T22:33:46.172708+01:00 morgwai-xps13 kernel: NVRM: GPU 0000:07:00.0: GPU has fallen off the bus.
2025-01-17T22:33:46.172708+01:00 morgwai-xps13 kernel: NVRM: kgspRcAndNotifyAllChannels_IMPL: RC all channels for critical error 79.
2025-01-17T22:33:46.172709+01:00 morgwai-xps13 kernel: NVRM: _threadNodeCheckTimeout: API_GPU_ATTACHED_SANITY_CHECK failed!
2025-01-17T22:33:46.172713+01:00 morgwai-xps13 kernel: message repeated 5 times: [ NVRM: _threadNodeCheckTimeout: API_GPU_ATTACHED_SANITY_CHECK failed!]
2025-01-17T22:33:46.172714+01:00 morgwai-xps13 kernel: NVRM: prbEncStartAlloc: Can’t allocate memory for protocol buffers.
2025-01-17T22:33:46.172717+01:00 morgwai-xps13 kernel: NVRM: A GPU crash dump has been created. If possible, please run
2025-01-17T22:33:46.172717+01:00 morgwai-xps13 kernel: NVRM: nvidia-bug-report.sh as root to collect this data before
2025-01-17T22:33:46.172718+01:00 morgwai-xps13 kernel: NVRM: the NVIDIA kernel module is unloaded.
2025-01-17T22:33:46.172718+01:00 morgwai-xps13 kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 78!
2025-01-17T22:33:46.172719+01:00 morgwai-xps13 kernel: NVRM: nvCheckOkFailedNoLog: Check failed: GPU lost from the bus [NV_ERR_GPU_IS_LOST] (0x0000000F) returned from nvdEngineDumpCallbackHelper(pGpu, pPrbEnc, pNvDumpState, pEngineCallback) @ nv_debug_dump.c:274
2025-01-17T22:33:46.172719+01:00 morgwai-xps13 kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 78!
2025-01-17T22:33:46.172719+01:00 morgwai-xps13 kernel: NVRM: nvCheckOkFailedNoLog: Check failed: GPU lost from the bus [NV_ERR_GPU_IS_LOST] (0x0000000F) returned from nvdEngineDumpCallbackHelper(pGpu, pPrbEnc, pNvDumpState, pEngineCallback) @ nv_debug_dump.c:274
2025-01-17T22:33:46.172720+01:00 morgwai-xps13 kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 78!
2025-01-17T22:33:46.172720+01:00 morgwai-xps13 kernel: NVRM: nvCheckOkFailedNoLog: Check failed: GPU lost from the bus [NV_ERR_GPU_IS_LOST] (0x0000000F) returned from nvdEngineDumpCallbackHelper(pGpu, pPrbEnc, pNvDumpState, pEngineCallback) @ nv_debug_dump.c:274
2025-01-17T22:33:46.172720+01:00 morgwai-xps13 kernel: NVRM: nvAssertFailedNoLog: Assertion failed: status == NV_OK @ journal.c:2143
2025-01-17T22:33:46.638715+01:00 morgwai-xps13 kernel: NVRM: Attempting to remove device 0000:07:00.0 with non-zero usage count!
2025-01-17T22:35:01.847722+01:00 morgwai-xps13 kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 78!
2025-01-17T22:35:01.847741+01:00 morgwai-xps13 kernel: NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x0000000f for fn 78!
2025-01-17T23:10:35.419775+01:00 morgwai-xps13 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 509
2025-01-17T23:10:35.463816+01:00 morgwai-xps13 kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 550.120 Fri Sep 13 10:10:01 UTC 2024
2025-01-17T23:10:35.580773+01:00 morgwai-xps13 kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 550.120 Fri Sep 13 10:01:25 UTC 2024
2025-01-17T23:10:35.744778+01:00 morgwai-xps13 kernel: [drm] [nvidia-drm] [GPU ID 0x00000700] Loading driver
2025-01-17T23:10:37.022767+01:00 morgwai-xps13 kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x26:0x56:1610)
2025-01-17T23:10:37.022794+01:00 morgwai-xps13 kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0
2025-01-17T23:10:37.022796+01:00 morgwai-xps13 kernel: [drm:nv_drm_load [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000700] Failed to allocate NvKmsKapiDevice
2025-01-17T23:10:37.022805+01:00 morgwai-xps13 kernel: [drm:nv_drm_register_drm_device [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000700] Failed to register device
2025-01-17T23:10:38.519757+01:00 morgwai-xps13 kernel: nvidia-uvm: Loaded the UVM driver, major device number 507.
2025-01-17T23:10:50.818798+01:00 morgwai-xps13 kernel: NVRM: GPU 0000:07:00.0: RmInitAdapter failed! (0x26:0x56:1610)
2025-01-17T23:10:50.818815+01:00 morgwai-xps13 kernel: NVRM: GPU 0000:07:00.0: rm_init_adapter failed, device minor number 0

(so RmInitAdapter fails here as well, but with a different code: 0x26:0x56:1610)

[ 8.322455] ------------[ cut here ]------------
[ 8.322461] WARNING: CPU: 15 PID: 1087 at drivers/gpu/drm/drm_file.c:312 drm_open_helper+0x132/0x150 [drm]
[ 8.322516] Modules linked in: qrtr rfcomm cmac algif_hash algif_skcipher af_alg bnep nvidia_drm(POE) nvidia_modeset(POE) binfmt_misc nls_ascii nls_cp437 vfat fat nvidia(POE) amd_atl intel_rapl_msr intel_rapl_common iwlmvm edac_mce_amd snd_sof_amd_rembrandt snd_sof_amd_acp kvm_amd snd_sof_pci snd_sof_xtensa_dsp mac80211 snd_hda_codec_conexant snd_sof snd_hda_codec_generic kvm libarc4 snd_hda_codec_hdmi snd_sof_utils iwlwifi snd_hda_intel snd_soc_core snd_intel_dspcfg snd_intel_sdw_acpi btusb uvcvideo btrtl snd_hda_codec crct10dif_pclmul snd_compress btintel videobuf2_vmalloc ghash_clmulni_intel snd_pcm_dmaengine uvc sha512_ssse3 btbcm videobuf2_memops videobuf2_v4l2 btmtk sha256_ssse3 snd_pci_ps videodev snd_hda_core cfg80211 sha1_ssse3 snd_rpl_pci_acp6x bluetooth snd_pci_acp6x snd_hwdep videobuf2_common aesni_intel snd_pcm gf128mul mc crypto_simd asus_wmi cryptd sparse_keymap rapl platform_profile snd_pci_acp5x spd5118 wmi_bmof snd_timer snd_rn_pci_acp3x pcspkr ucsi_acpi snd_acp_config k10temp snd snd_soc_acpi
[ 8.322640] typec_ucsi sp5100_tco ccp rfkill snd_pci_acp3x watchdog typec soundcore roles amd_pmc joydev ac evdev serio_raw parport_pc ppdev lp parport configfs efi_pstore nfnetlink efivarfs ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic usbhid amdgpu amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit drm_suballoc_helper drm_display_helper cec rc_core hid_multitouch drm_ttm_helper hid_generic xhci_pci xhci_hcd sdhci_pci ttm i2c_hid_acpi i2c_hid cqhci hid drm_kms_helper nvme thunderbolt sdhci usbcore nvme_core drm i2c_piix4 crc32_pclmul mmc_core crc32c_intel i2c_smbus nvme_auth crc16 usb_common video battery button wmi
[ 8.322750] CPU: 15 UID: 0 PID: 1087 Comm: Xorg.wrap Tainted: P OE 6.12.9-amd64 #1 Debian 6.12.9-1
[ 8.322758] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 8.322760] Hardware name: TUXEDO TUXEDO InfinityBook Pro AMD Gen9/GXxHRXx, BIOS N.1.14A12 11/12/2024
[ 8.322763] RIP: 0010:drm_open_helper+0x132/0x150 [drm]
[ 8.322794] Code: 7f a4 eb 31 c0 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f e9 db b2 c5 eb 48 89 df 89 44 24 04 e8 64 fa ff ff 8b 44 24 04 eb db <0f> 0b b8 ea ff ff ff eb d2 b8 ea ff ff ff eb cb b8 f0 ff ff ff eb
[ 8.322798] RSP: 0018:ffffadf042473a50 EFLAGS: 00010246
[ 8.322803] RAX: ffffffffc2026220 RBX: ffff9a630f26d1a8 RCX: 0000000000000000
[ 8.322806] RDX: ffff9a63188cc000 RSI: ffff9a630f26d1a8 RDI: ffff9a64699ed8c0
[ 8.322808] RBP: ffff9a64699ed8c0 R08: ffff9a6322c6f678 R09: ffff9a63001c51d0
[ 8.322810] R10: 00000000000000e2 R11: 0000000000000002 R12: ffff9a63094bc000
[ 8.322813] R13: ffffffffc2026220 R14: 00000000ffffffed R15: ffff9a630e7267d0
[ 8.322816] FS: 00007fc076404740(0000) GS:ffff9a71a0180000(0000) knlGS:0000000000000000
[ 8.322819] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8.322822] CR2: 00005568536c7fc0 CR3: 0000000113db6000 CR4: 0000000000f50ef0
[ 8.322825] PKRU: 55555554
[ 8.322827] Call Trace:
[ 8.322834]
[ 8.322838] ? drm_open_helper+0x132/0x150 [drm]
[ 8.322864] ? __warn.cold+0x93/0xf6
[ 8.322872] ? drm_open_helper+0x132/0x150 [drm]
[ 8.322903] ? report_bug+0xff/0x140
[ 8.322910] ? handle_bug+0x58/0x90
[ 8.322914] ? exc_invalid_op+0x17/0x70
[ 8.322918] ? asm_exc_invalid_op+0x1a/0x20
[ 8.322930] ? drm_open_helper+0x132/0x150 [drm]
[ 8.322958] drm_open+0x73/0x110 [drm]
[ 8.322986] drm_stub_open+0x9b/0xd0 [drm]
[ 8.323020] chrdev_open+0xb2/0x230
[ 8.323027] ? __pfx_chrdev_open+0x10/0x10
[ 8.323032] do_dentry_open+0x14c/0x440
[ 8.323039] vfs_open+0x2e/0xe0
[ 8.323045] path_openat+0x82e/0x12d0
[ 8.323051] ? srso_alias_return_thunk+0x5/0xfbef5
[ 8.323055] ? kfree+0x2ec/0x360
[ 8.323062] do_filp_open+0xc4/0x170
[ 8.323075] do_sys_openat2+0xae/0xe0
[ 8.323082] __x64_sys_openat+0x55/0xa0
[ 8.323086] do_syscall_64+0x82/0x190
[ 8.323092] ? srso_alias_return_thunk+0x5/0xfbef5
[ 8.323096] ? do_syscall_64+0x8e/0x190
[ 8.323101] ? srso_alias_return_thunk+0x5/0xfbef5
[ 8.323104] ? __pm_runtime_suspend+0x69/0xc0
[ 8.323111] ? srso_alias_return_thunk+0x5/0xfbef5
[ 8.323115] ? amdgpu_drm_ioctl+0x6e/0x80 [amdgpu]
[ 8.323422] ? srso_alias_return_thunk+0x5/0xfbef5
[ 8.323426] ? syscall_exit_to_user_mode+0x4d/0x210
[ 8.323432] ? srso_alias_return_thunk+0x5/0xfbef5
[ 8.323435] ? do_syscall_64+0x8e/0x190
[ 8.323440] ? srso_alias_return_thunk+0x5/0xfbef5
[ 8.323443] ? count_memcg_events.constprop.0+0x1a/0x30
[ 8.323447] ? srso_alias_return_thunk+0x5/0xfbef5
[ 8.323450] ? handle_mm_fault+0x1bb/0x2c0
[ 8.323456] ? srso_alias_return_thunk+0x5/0xfbef5
[ 8.323459] ? do_user_addr_fault+0x36c/0x620
[ 8.323467] ? srso_alias_return_thunk+0x5/0xfbef5
[ 8.323471] ? srso_alias_return_thunk+0x5/0xfbef5
[ 8.323475] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 8.323480] RIP: 0033:0x7fc076509c3c
[ 8.323516] Code: 83 e2 40 75 51 89 f0 f7 d0 a9 00 00 41 00 74 46 80 3d 37 c4 0e 00 00 74 6a 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 90 00 00 00 48 8b 54 24 28 64 48 2b 14 25
[ 8.323519] RSP: 002b:00007fff50de6720 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
[ 8.323523] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fc076509c3c
[ 8.323526] RDX: 0000000000000002 RSI: 00007fff50de68a0 RDI: 00000000ffffff9c
[ 8.323528] RBP: 00007fff50de68a0 R08: 0000000000000064 R09: 00000000ffffffff
[ 8.323530] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff50de68a0
[ 8.323533] R13: 0000000000000001 R14: 00000000c04064a0 R15: 0000000000000001
[ 8.323541]
[ 8.323543] ---[ end trace 0000000000000000 ]---
[ 8.550260] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x56:1489)
[ 8.550419] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 9.298094] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x56:1489)
[ 9.298174] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 11.413079] wlp3s0: authenticate with 60:38:e0:a3:47:4c (local address=08:b4:d2:83:47:94)
[ 11.414330] wlp3s0: send auth to 60:38:e0:a3:47:4c (try 1/3)
[ 11.455542] wlp3s0: authenticated
[ 11.456380] wlp3s0: associate with 60:38:e0:a3:47:4c (try 1/3)
[ 11.459166] wlp3s0: RX AssocResp from 60:38:e0:a3:47:4c (capab=0x411 status=0 aid=3)
[ 11.470978] wlp3s0: associated
[ 19.727896] amdgpu 0000:66:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:141
[ 19.727963] ------------[ cut here ]------------
[ 19.727964] WARNING: CPU: 3 PID: 1087 at drivers/gpu/drm/amd/amdgpu/…/display/dc/hubbub/dcn31/dcn31_hubbub.c:151 dcn31_program_compbuf_size+0xd1/0x230 [amdgpu]
[ 19.728216] Modules linked in: ccm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device qrtr rfcomm cmac algif_hash algif_skcipher af_alg bnep nvidia_drm(POE) nvidia_modeset(POE) binfmt_misc nls_ascii nls_cp437 vfat fat nvidia(POE) amd_atl intel_rapl_msr intel_rapl_common iwlmvm edac_mce_amd snd_sof_amd_rembrandt snd_sof_amd_acp kvm_amd snd_sof_pci snd_sof_xtensa_dsp mac80211 snd_hda_codec_conexant snd_sof snd_hda_codec_generic kvm libarc4 snd_hda_codec_hdmi snd_sof_utils iwlwifi snd_hda_intel snd_soc_core snd_intel_dspcfg snd_intel_sdw_acpi btusb uvcvideo btrtl snd_hda_codec crct10dif_pclmul snd_compress btintel videobuf2_vmalloc ghash_clmulni_intel snd_pcm_dmaengine uvc sha512_ssse3 btbcm videobuf2_memops videobuf2_v4l2 btmtk sha256_ssse3 snd_pci_ps videodev snd_hda_core cfg80211 sha1_ssse3 snd_rpl_pci_acp6x bluetooth snd_pci_acp6x snd_hwdep videobuf2_common aesni_intel snd_pcm gf128mul mc crypto_simd asus_wmi cryptd sparse_keymap rapl platform_profile snd_pci_acp5x spd5118 wmi_bmof snd_timer snd_rn_pci_acp3x pcspkr
[ 19.728273] ucsi_acpi snd_acp_config k10temp snd snd_soc_acpi typec_ucsi sp5100_tco ccp rfkill snd_pci_acp3x watchdog typec soundcore roles amd_pmc joydev ac evdev serio_raw parport_pc ppdev lp parport configfs efi_pstore nfnetlink efivarfs ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic usbhid amdgpu amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit drm_suballoc_helper drm_display_helper cec rc_core hid_multitouch drm_ttm_helper hid_generic xhci_pci xhci_hcd sdhci_pci ttm i2c_hid_acpi i2c_hid cqhci hid drm_kms_helper nvme thunderbolt sdhci usbcore nvme_core drm i2c_piix4 crc32_pclmul mmc_core crc32c_intel i2c_smbus nvme_auth crc16 usb_common video battery button wmi
[ 19.728336] CPU: 3 UID: 0 PID: 1087 Comm: Xorg Tainted: P W OE 6.12.9-amd64 #1 Debian 6.12.9-1
[ 19.728340] Tainted: [P]=PROPRIETARY_MODULE, [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 19.728342] Hardware name: TUXEDO TUXEDO InfinityBook Pro AMD Gen9/GXxHRXx, BIOS N.1.14A12 11/12/2024
[ 19.728344] RIP: 0010:dcn31_program_compbuf_size+0xd1/0x230 [amdgpu]
[ 19.728498] Code: 00 48 8b 43 28 8b 88 b0 01 00 00 48 8b 43 20 0f b6 50 6c 48 8b 43 18 8b b0 14 01 00 00 e8 a7 44 0e 00 85 c0 0f 85 33 01 00 00 <0f> 0b 48 8b 44 24 08 65 48 2b 04 25 28 00 00 00 0f 85 35 01 00 00
[ 19.728500] RSP: 0018:ffffadf0424733c8 EFLAGS: 00010202
[ 19.728502] RAX: 0000000000000001 RBX: ffff9a6305d87400 RCX: 000000000000001f
[ 19.728504] RDX: 0000000000000000 RSI: 000000000000397a RDI: ffff9a631d580000
[ 19.728505] RBP: 0000000000000004 R08: ffffadf0424733cc R09: 0000000000000019
[ 19.728506] R10: ffffffffaceb4348 R11: 0000000000000003 R12: ffff9a630e000000
[ 19.728507] R13: ffff9a632a000000 R14: ffff9a6305d87400 R15: 0000000000000001
[ 19.728508] FS: 00007f8fb3eacb00(0000) GS:ffff9a719fb80000(0000) knlGS:0000000000000000
[ 19.728509] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 19.728510] CR2: 00007f9d6a5af000 CR3: 00000001037c8000 CR4: 0000000000f50ef0
[ 19.728512] PKRU: 55555554
[ 19.728513] Call Trace:
[ 19.728516]
[ 19.728516]
[ 19.728517] ? dcn31_program_compbuf_size+0xd1/0x230 [amdgpu]
[ 19.728643] ? __warn.cold+0x93/0xf6
[ 19.728650] ? dcn31_program_compbuf_size+0xd1/0x230 [amdgpu]
[ 19.728747] ? report_bug+0xff/0x140
[ 19.728751] ? handle_bug+0x58/0x90
[ 19.728753] ? exc_invalid_op+0x17/0x70
[ 19.728755] ? asm_exc_invalid_op+0x1a/0x20
[ 19.728760] ? dcn31_program_compbuf_size+0xd1/0x230 [amdgpu]
[ 19.728850] ? dcn31_program_compbuf_size+0xc9/0x230 [amdgpu]
[ 19.728940] dcn20_optimize_bandwidth+0xe4/0x220 [amdgpu]
[ 19.729079] dc_commit_state_no_check+0xc5c/0xeb0 [amdgpu]
[ 19.729206] dc_commit_streams+0x31f/0x420 [amdgpu]
[ 19.729328] amdgpu_dm_atomic_commit_tail+0x738/0x3b40 [amdgpu]
[ 19.729481] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729484] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729485] ? dma_resv_get_fences+0xb2/0x2a0
[ 19.729490] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729491] ? dma_resv_get_singleton+0x44/0x140
[ 19.729494] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729495] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729496] ? wait_for_completion_timeout+0x13b/0x170
[ 19.729499] ? wait_for_completion_interruptible+0x12d/0x1e0
[ 19.729501] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729504] ? ttm_resource_move_to_lru_tail+0x10a/0x130 [ttm]
[ 19.729511] commit_tail+0x91/0x130 [drm_kms_helper]
[ 19.729522] drm_atomic_helper_commit+0x11a/0x140 [drm_kms_helper]
[ 19.729528] drm_atomic_commit+0xa6/0xe0 [drm]
[ 19.729551] ? __pfx___drm_printfn_info+0x10/0x10 [drm]
[ 19.729564] drm_atomic_connector_commit_dpms+0xe9/0x110 [drm]
[ 19.729576] drm_mode_obj_set_property_ioctl+0x1ba/0x3e0 [drm]
[ 19.729593] ? __pfx_drm_connector_property_set_ioctl+0x10/0x10 [drm]
[ 19.729606] drm_connector_property_set_ioctl+0x3d/0x60 [drm]
[ 19.729618] drm_ioctl_kernel+0xad/0x100 [drm]
[ 19.729634] drm_ioctl+0x277/0x4f0 [drm]
[ 19.729645] ? __pfx_drm_connector_property_set_ioctl+0x10/0x10 [drm]
[ 19.729658] amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
[ 19.729756] __x64_sys_ioctl+0x91/0xd0
[ 19.729760] do_syscall_64+0x82/0x190
[ 19.729764] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729765] ? syscall_exit_to_user_mode+0x164/0x210
[ 19.729768] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729769] ? do_syscall_64+0x8e/0x190
[ 19.729772] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729774] ? sock_poll+0x51/0xf0
[ 19.729777] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729778] ? ep_item_poll.isra.0+0x56/0x90
[ 19.729781] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729782] ? do_epoll_ctl+0x1e8/0x1020
[ 19.729786] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729787] ? syscall_exit_to_user_mode+0x164/0x210
[ 19.729789] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729790] ? do_syscall_64+0x8e/0x190
[ 19.729792] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729793] ? do_syscall_64+0x8e/0x190
[ 19.729795] ? ep_item_poll.isra.0+0x56/0x90
[ 19.729797] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729798] ? do_epoll_ctl+0x1e8/0x1020
[ 19.729800] ? do_syscall_64+0x8e/0x190
[ 19.729803] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729804] ? syscall_exit_to_user_mode+0x164/0x210
[ 19.729806] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729807] ? do_syscall_64+0x8e/0x190
[ 19.729810] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729811] ? syscall_exit_to_user_mode+0x172/0x210
[ 19.729813] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729815] ? do_syscall_64+0x8e/0x190
[ 19.729816] ? srso_alias_return_thunk+0x5/0xfbef5
[ 19.729818] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 19.729821] RIP: 0033:0x7f8fb424a33b
[ 19.729850] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[ 19.729851] RSP: 002b:00007ffed39f4bf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 19.729853] RAX: ffffffffffffffda RBX: 0000560eb7bdb940 RCX: 00007f8fb424a33b
[ 19.729855] RDX: 00007ffed39f4c80 RSI: 00000000c01064ab RDI: 000000000000000f
[ 19.729856] RBP: 00007ffed39f4c80 R08: 0000560eb7bdfa00 R09: 0000000000000000
[ 19.729857] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c01064ab
[ 19.729858] R13: 000000000000000f R14: 0000560eb7bdb230 R15: 0000000000000000
[ 19.729861]
[ 19.729861] ---[ end trace 0000000000000000 ]---
[ 29.733993] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x56:1489)
[ 29.734073] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 30.450106] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x56:1489)
[ 30.450185] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 31.369222] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
[ 31.390848] nvidia-uvm: Loaded the UVM driver, major device number 511.
[ 155.084503] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x56:1489)
[ 155.084583] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 155.802413] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x56:1489)
[ 155.802490] NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0

(so there’s a warning stack-trace from drm, then the same RmInitAdapter failed! (0x26:0x56:1489) as in the first log and finally a warning stack-trace from amdgpu).
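To compare the failure codes across all the attached logs without reading them by hand, something like the following could tally them. It is a minimal sketch that assumes the NVRM lines always have the shape seen in the excerpts above; the regular expressions and the tally function are my own and not part of any NVIDIA tooling:

```python
import gzip
import re
import sys
from collections import Counter
from typing import Iterable

# Patterns derived only from the log excerpts quoted in this post.
RM_INIT = re.compile(
    r"NVRM: GPU (\S+): RmInitAdapter failed! "
    r"\((0x[0-9a-fA-F]+:0x[0-9a-fA-F]+:\d+)\)"
)
XID = re.compile(r"NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): (\d+),")


def tally(lines: Iterable[str]) -> Counter:
    """Count RmInitAdapter failure codes and Xid numbers per PCI address."""
    counts: Counter = Counter()
    for line in lines:
        match = RM_INIT.search(line)
        if match:
            counts[("RmInitAdapter", match.group(1), match.group(2))] += 1
            continue
        match = XID.search(line)
        if match:
            counts[("Xid", match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # Usage: python tally.py nvidia-bug-report.log.gz [more logs...]
    for path in sys.argv[1:]:
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", errors="replace") as handle:
            for key, count in sorted(tally(handle).items()):
                print(path, *key, count)
```

Run against the three excerpts above, this would group the 0x26:0x56:1489 failures, the 0x26:0x56:1610 failures and the Xid 79 event separately, which makes it easier to see which driver/kernel combination produces which code.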