Failed to get memory pages for NvKmsKapiMemory

Hello,

I have replaced the NVIDIA card with a competitor's card to get the system running, so I cannot currently generate the logs you ask for with your script. I can provide the system logs and, on request, more information. The issue occurs after the upgrade from Fedora 33 to 34.

I use an NVIDIA GPU: GeForce GTX 1660 (TU116-A) at PCI:45:0:0 (GPU-0)

Tested with these drivers:
NVIDIA-Linux-x86_64-465.24.02
NVIDIA-Linux-x86_64-465.27

[drm:__nv_drm_gem_nvkms_memory_prime_get_sg_table [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00002d00] Failed to get memory pages for NvKmsKapiMemory 0x00000000bcbe86fe
BUG: kernel NULL pointer dereference, address: 000000000000000c
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 6 PID: 2312 Comm: gnome-shell Tainted: P OE 5.11.16-300.fc34.x86_64 #1
Hardware name: Micro-Star International Co., Ltd. MS-7C35/MEG X570 UNIFY (MS-7C35), BIOS A.80 01/22/2021
RIP: 0010:drm_gem_map_dma_buf+0x3f/0xb0 [drm]
Code: 00 00 83 fe 03 74 6d 48 8b 87 40 01 00 00 48 8b 40 38 48 85 c0 74 6f 41 89 f5 e8 5c 8d 8f cc 49 89 c4 48 3d 00 f0 ff ff 77 21 <8b> 50 0c 48 8b 7b 08 41 b8 20 00 00 00 44 89 e9 48 8b 30 e8 d9 70
RSP: 0018:ffffb9b3c70afcf8 EFLAGS: 00010207
RAX: 0000000000000000 RBX: ffff9bd8a423b5a0 RCX: 0000000000000000
RDX: ffff9bdf5eba6ba0 RSI: ffff9bdf5eb98ac0 RDI: ffff9bdf5eb98ac0
RBP: ffff9bd8a423b5a0 R08: 0000000000000000 R09: ffffb9b3c70afa98
R10: ffffb9b3c70afa90 R11: ffffffff8db44f08 R12: 0000000000000000
R13: 0000000000000000 R14: ffff9bd880efc000 R15: ffff9bd840fefdd8
FS: 00007fc5f2873d80(0000) GS:ffff9bdf5eb80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000000c CR3: 00000001961b8000 CR4: 0000000000350ee0
Call Trace:
dma_buf_map_attachment+0x7e/0xf0
drm_gem_prime_import_dev+0x64/0x150 [drm]
drm_gem_prime_fd_to_handle+0x196/0x1d0 [drm]
? drm_prime_destroy_file_private+0x20/0x20 [drm]
drm_ioctl_kernel+0x86/0xd0 [drm]
drm_ioctl+0x20f/0x3c0 [drm]
? drm_prime_destroy_file_private+0x20/0x20 [drm]
radeon_drm_ioctl+0x49/0x80 [radeon]
__x64_sys_ioctl+0x82/0xb0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fc5f6a8c4eb
Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 b9 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffcd26af708 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffcd26af74c RCX: 00007fc5f6a8c4eb
RDX: 00007ffcd26af74c RSI: 00000000c00c642e RDI: 0000000000000012
RBP: 00000000c00c642e R08: 00007ffcd26af7f0 R09: 00007fc5f6b58a00
R10: 0000000000000000 R11: 0000000000000246 R12: 000056553779d410
R13: 0000000000000012 R14: 0000000000100000 R15: 00007ffcd26aff00
Modules linked in: rfcomm snd_seq_dummy snd_hrtimer xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bridge stp llc cmac bnep sunrpc squashfs vfat fat loop intel_rapl_msr intel_rapl_common usblp snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio iwlmvm snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation mac80211 snd_soc_core snd_compress snd_pcm_dmaengine edac_mce_amd soundwire_cadence libarc4 snd_hda_codec kvm_amd snd_hda_core ucsi_ccg ccp ac97_bus snd_seq btusb typec_ucsi typec iwlwifi snd_seq_device btrtl snd_hwdep snd_pcm btbcm kvm btintel snd_timer irqbypass bluetooth rapl snd cfg80211 k10temp i2c_piix4 joydev soundcore pcspkr wmi_bmof ecdh_generic i2c_nvidia_gpu ecc rfkill binfmt_misc zram ip_tables dm_crypt hid_logitech_hidpp nvidia_drm(POE)
nvidia_modeset(POE) nvidia(POE) hid_logitech_dj radeon i2c_algo_bit drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec nvme mxm_wmi drm ghash_clmulni_intel sp5100_tco e1000e r8169 nvme_core wmi pinctrl_amd fuse
CR2: 000000000000000c
---[ end trace 186b69729a9a88bc ]---

5.11.16-300.fc34.x86_64 …ernel01.iad2.fedoraproject.org) (gcc (GCC) 11.0.1 20210324 (Red Hat 11.0.1-0), GNU ld version 2.35.1-41.fc34) #1 SMP Wed Apr 21 13:18:33 UTC 2021

journal_gnome-shell.txt (12.9 KB)
journal_nvidia.txt (3.5 MB)
nvidia_issue_fc34.txt (17.6 KB)

It’s rather a guessing game, but it seems that with the upgrade Fedora switched from Xorg to Wayland and the necessary module parameter was not set:
nvidia-drm.modeset=1

Or disable Wayland in /etc/gdm/custom.conf
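
To check which of the two suggestions is currently in effect, something like the following may help. It is only a minimal sketch: the sysfs path for the module parameter and the `WaylandEnable` key in the `[daemon]` section are assumptions not quoted in this thread, and the parameter file may be readable only by root.

```python
import configparser
from pathlib import Path

# Assumed sysfs location of the nvidia-drm "modeset" parameter (not quoted in the thread).
MODESET_PARAM = "/sys/module/nvidia_drm/parameters/modeset"


def modeset_enabled(param_path=MODESET_PARAM):
    """Return True/False for the modeset parameter, or None if it cannot be read."""
    try:
        value = Path(param_path).read_text().strip()
    except OSError:
        # Module not loaded, or the file is not readable by this user.
        return None
    return value in ("Y", "1")


def wayland_disabled(conf_text):
    """Return True if the GDM config text sets WaylandEnable=false under [daemon]."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    try:
        parser.read_string(conf_text)
    except configparser.Error:
        # Unparseable config: treat as "not disabled" rather than guessing.
        return False
    # Assumed key name; a missing key is treated as Wayland left enabled.
    value = parser.get("daemon", "WaylandEnable", fallback="true")
    return value.strip().lower() == "false"


if __name__ == "__main__":
    print("nvidia-drm modeset:", modeset_enabled())
    try:
        text = Path("/etc/gdm/custom.conf").read_text()
    except OSError:
        text = ""
    print("Wayland disabled in GDM:", wayland_disabled(text))
```

A result of `None` for the modeset check only means the value could not be read, for example because the module is not loaded.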

“nvidia-drm.modeset=1” was already enabled in the configs. I disabled Wayland in /etc/gdm/custom.conf and tried again with the NVIDIA GPU. Works well.
The issue appears when I have two GPUs installed: one from NVIDIA and a second one, from a competitor, taken from an old HP workstation. Without this second GPU everything now works well, even with Wayland enabled. It looks like a GPU driver collision.
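
One way to see which drivers are involved is to list the module names tagged in square brackets in the oops. The sketch below is illustrative only: `modules_in_trace` is a hypothetical helper, and the sample lines are copied from the log above.

```python
import re

# Matches module tags such as "[drm]" or "[radeon]"; skips "[#1]" and "[GPU ID ...]".
MODULE_TAG = re.compile(r"\[([A-Za-z0-9_]+)\]")


def modules_in_trace(log_text):
    """Return module names tagged in the log, in first-seen order, without duplicates."""
    seen = []
    for name in MODULE_TAG.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# Lines copied from the kernel log in this thread.
SAMPLE = """\
[drm:__nv_drm_gem_nvkms_memory_prime_get_sg_table [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00002d00] Failed to get memory pages for NvKmsKapiMemory 0x00000000bcbe86fe
Oops: 0000 [#1] SMP NOPTI
RIP: 0010:drm_gem_map_dma_buf+0x3f/0xb0 [drm]
drm_gem_prime_import_dev+0x64/0x150 [drm]
radeon_drm_ioctl+0x49/0x80 [radeon]
"""

if __name__ == "__main__":
    print(modules_in_trace(SAMPLE))  # ['nvidia_drm', 'drm', 'radeon']
```

Here both nvidia_drm and radeon show up: the error comes from nvidia_drm, while the crashing ioctl enters through radeon, which fits the two-GPU explanation.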

Case closed. Thanks.