550.54.14 - Cannot create sg_table for NvKmsKapiMemory spammed when launching chrome on Wayland

amrits · June 3, 2024, 7:59am

Hi @wassou93
It is still under investigation.
I wanted to check if you know the last passing driver.

does.not.kompute · June 7, 2024, 2:10am

Still happening on 555.52.04-1. I am also using CachyOS with the 6.9.3-3-cachyos-lto kernel. Discord takes forever to load and spams dmesg with this message. Unsure if it’s related or not, but occasionally while playing video with Discord on the other screen, the screen freezes but audio still plays. The only way to get out of it is locking the desktop with a keyboard shortcut and waiting. While trying to debug that I found these dmesg logs.

opposite34 · June 7, 2024, 9:02pm

I also noticed this happening with Discord. Even when following @orxcyd’s solution of adding i915 to the initramfs configuration, the same issue still persists on my system. I have the following specs:

Laptop: Dell G3 3500
CPU: Intel(R) Core™ i7-10750H
GPU: NVIDIA GeForce GTX 1650 Ti Mobile
OS: Arch Linux
Kernel Version: 6.9.3
NVIDIA Driver: 550.78
Window Manager: Hyprland

I have only recently switched to Wayland so I do not know when the last non-issue version is.

UbiquitousPhoton · June 12, 2024, 5:39pm

Hi all, I first reported this issue on September 18, 2023 (with Kernel 6.5.1) to linux-bugs@nvidia.com

I was told at the time that this was an incorrect error message and that it would be downgraded to a warning, and the ticket was closed. I’m not sure which driver version I was running back then, however.

I do find that if you do not early load the nvidia modules, then it goes away. I’m seeing the error line get spammed 100s of times a second, so it’s really not practical to have them loaded for me. Unfortunately I require a few Electron-based apps for work, and run a Chromium-based browser as well.

Given the timespans, I’m not holding out a whole lot of hope.

opposite34 · June 25, 2024, 4:37am

Hi, an update on this: I had forgotten to regenerate the initramfs (which is done with mkinitcpio on Arch Linux). After doing so the problem goes away, so it really seems to have something to do with early loading.

wassou93 · June 25, 2024, 7:10am

I think it’s an early loading issue, and I found a fix: I removed kms from the mkinitcpio HOOKS and added the nvidia and i915 modules to MODULES.

Then I re-ran mkinitcpio -P, rebooted, and made sure DRM is enabled in GRUB, and everything worked.

in my /etc/mkinitcpio.conf:

MODULES=(i915 nvidia nvidia_modeset nvidia_uvm nvidia_drm)

HOOKS=(base udev autodetect microcode modconf block keyboard keymap consolefont plymouth filesystems fsck)

then ran sudo mkinitcpio -P
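
A quick way to double-check an mkinitcpio.conf against the setup described above is sketched below. This is only an illustration, not an official tool: the function names are made up for this example, it assumes each array is written on a single line as in the snippet above, and the i915 entry would be amdgpu on an AMD hybrid laptop.

```python
import re

# Modules named in the post above. i915 is the Intel iGPU driver;
# on an AMD hybrid laptop the assumption is that amdgpu takes its place.
WANTED_MODULES = ["i915", "nvidia", "nvidia_modeset", "nvidia_uvm", "nvidia_drm"]


def parse_array(conf_text, name):
    """Return the entries of the last uncommented single-line NAME=(...) array."""
    entries = []
    for line in conf_text.splitlines():
        # Lines starting with '#' do not match, so commented examples are skipped.
        match = re.match(r"\s*%s=\((.*?)\)" % re.escape(name), line)
        if match:
            entries = match.group(1).split()
    return entries


def check_mkinitcpio(conf_text, wanted=WANTED_MODULES):
    """Report which wanted modules are missing and whether the kms hook is still set."""
    modules = parse_array(conf_text, "MODULES")
    hooks = parse_array(conf_text, "HOOKS")
    missing = [m for m in wanted if m not in modules]
    return {"missing_modules": missing, "kms_hook_present": "kms" in hooks}
```

To use it, read /etc/mkinitcpio.conf into a string and pass it to check_mkinitcpio(); an empty missing_modules list and kms_hook_present set to False would match the configuration shown above.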

These are the packages I have installed on my CachyOS system:

❯ paru -Qs nvidia
local/egl-wayland 2:1.1.13-3
    EGLStream-based Wayland external platform
local/lib32-libvdpau 1.5-2
    Nvidia VDPAU library
local/lib32-nvidia-utils 555.52.04-1
    NVIDIA drivers utilities (32-bit)
local/lib32-opencl-nvidia 555.52.04-1
    OpenCL implemention for NVIDIA (32-bit)
local/libva-nvidia-driver 0.0.12-1.1
    VA-API implementation that uses NVDEC as a backend
local/libvdpau 1.5-2.1
    Nvidia VDPAU library
local/libxnvctrl 555.42.02-2
    NVIDIA NV-CONTROL X extension
local/linux-cachyos-nvidia 6.9.6-2
    nvidia module of 555.52.04 driver for the linux-cachyos kernel
local/nvidia-prime 1.0-4
    NVIDIA Prime Render Offload configuration and utilities
local/nvidia-settings 555.42.02-2
    Tool for configuring the NVIDIA graphics driver
local/nvidia-utils 555.52.04-3
    NVIDIA drivers utilities
local/opencl-nvidia 555.52.04-3
    OpenCL implemention for NVIDIA

I hope this helps.

ast.rix.1357 · August 7, 2024, 4:42pm

Adding amdgpu (amdgpu is for AMD iGPUs and i915 is for Intel ones, for those who don’t know) just makes the browser render on the iGPU rather than on the NVIDIA GPU or the CPU (software rendering), so it doesn’t really fix the problem.

Before adding amdgpu to MODULES, Brave wouldn’t load 3D websites (e.g. bruno-simon.com), but Firefox would (using the iGPU).

Adding amdgpu enabled WebGL with the iGPU as the hardware accelerator. Now Brave can run 3D websites, but it won’t run on the NVIDIA GPU.

By the way, I was using the NVIDIA 535 drivers, because the newer drivers cause a crash while upgrading packages on Arch Linux.

At first the browser would take a long time to launch. Now it launches quickly but doesn’t use the NVIDIA GPU; I checked that with nvtop.

nvtop shows both gpus
NVIDIA GeForce RTX 3050 Laptop GPU
and
AMD Radeon Graphics

prime-run used to work previously (don’t know when)

ioanndev · September 6, 2024, 3:06am

Hello! Exactly the same problems with driver 560.35.03. It looks like the problem is in the kernel configuration.

For example, a stock image works great: cfg_default.txt (270.1 KB)

It also works well on the new kernel version: cfg_custom_desktop.txt (253.6 KB)

But there is a problem with the server implementation: cfg_custom_server_100hz.txt (240.2 KB)

On hybrid systems (NVIDIA + Intel), adjusting BIOS parameters such as Aperture Size and DVMT Pre-allocated can help.

godvvino · September 8, 2024, 7:56am

I am getting the error kernel: [drm:__nv_drm_gem_nvkms_memory_prime_get_sg_table [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Cannot create sg_table for NvKmsKapiMemory 0x000000006527f86e on an AMD laptop with switchable graphics when running gamescope, which causes it to crash. This is on driver version 560.35.03 and it happens with both the proprietary and open source kernel modules.

as.asaw · October 11, 2024, 12:36pm

I encountered this issue after upgrading to KDE Neon 24.04.1 and switching to Wayland. Google Chrome wouldn’t display any windows with --ozone-platform=wayland, and the kernel log would be filled with messages like [ 6767.216373] [drm:__nv_drm_gem_nvkms_memory_prime_get_sg_table [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Cannot create sg_table for NvKmsKapiMemory 0x00000000aa338886. I fixed this by adding i915 to /etc/initramfs-tools/modules, running update-initramfs -c -k all, and rebooting. Thanks to everyone who discovered this workaround!
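
To see how badly a kernel log is being flooded before and after trying a workaround, something like the following sketch could be used. It is only an illustration: the function name is made up, and the pattern simply mirrors the message text quoted in this thread, so it may need adjusting if a driver version words the line differently.

```python
import re
from collections import Counter

# Pattern modelled on the message quoted in this thread; the GPU ID is captured
# so that counts can be reported per GPU.
SG_TABLE_ERROR = re.compile(
    r"\[GPU ID (0x[0-9a-fA-F]+)\] Cannot create sg_table for NvKmsKapiMemory"
)


def count_sg_table_errors(log_text):
    """Count 'Cannot create sg_table' lines per GPU ID in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SG_TABLE_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

The log text can come from saved dmesg or journalctl -k output; comparing the counts from two boots gives a rough idea of whether a change made any difference.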

elman · October 12, 2024, 10:23am

I’m having the same issue with apps that try to use the NVIDIA card: all Electron-based apps, plus some others like Darktable or Wine.

Operating System: EndeavourOS
KDE Plasma Version: 6.2.0
Kernel Version: 6.11.3-zen1-1-zen (64-bit)
Graphics Platform: Wayland
Processors: 16 × AMD Ryzen 7 5800H with Radeon Graphics
Graphics Processor: AMD Radeon Graphics
Manufacturer: ASUSTeK COMPUTER INC.
GPU: NVIDIA GeForce RTX 3060 Laptop GPU

Errors in dmesg are [drm:__nv_drm_gem_nvkms_memory_prime_get_sg_table [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Cannot create sg_table for NvKmsKapiMemory

I’m using NVIDIA drivers 560.35.03-6.

Since I’m using EndeavourOS with systemd and dracut, I had to create file /etc/dracut.conf.d/myflags.conf, put in:

force_drivers+=" amdgpu nvidia nvidia_modeset nvidia_uvm nvidia_drm "

and then run sudo reinstall-kernels.

Now Darktable finds NVIDIA card, Upscayl runs OK, and electron apps start without delay. Thanks for the tip guys, I was getting quite annoyed by that bug.

EDIT: Looks like I spoke too soon. I still have a crash in dmesg:

[ 1023.198191] NVRM: GPU at PCI:0000:01:00: GPU-7312c96f-4f21-eb4c-15a0-dab57f44a76d
[ 1023.198198] NVRM: Xid (PCI:0000:01:00): 119, pid=76164, name=Typora, Timeout after 6s of waiting for RPC response from GPU0 GSP! Expected function 76 (GSP_RM_CONTROL) (0x2080205b 0x4).
[ 1023.198205] NVRM: GPU0 GSP RPC buffer contains function 76 (GSP_RM_CONTROL) and data 0x000000002080205b 0x0000000000000004.
[ 1023.198210] NVRM: GPU0 RPC history (CPU -> GSP):
[ 1023.198212] NVRM:     entry function                   data0              data1              ts_start           ts_end             duration actively_polling
[ 1023.198215] NVRM:      0    76   GSP_RM_CONTROL        0x000000002080205b 0x0000000000000004 0x000624459ebf6219 0x0000000000000000          y
[ 1023.198222] NVRM:     -1    47   UNLOADING_GUEST_DRIVE 0x0000000000000000 0x0000000000000000 0x000624459cb59a97 0x000624459cb8b269 202706us  
[ 1023.198229] NVRM:     -2    10   FREE                  0x00000000c1e00055 0x0000000000000000 0x000624459cb597eb 0x000624459cb59a3f    596us  
[ 1023.198234] NVRM:     -3    10   FREE                  0x000000000000000a 0x0000000000000000 0x000624459cb5948e 0x000624459cb597e8    858us  
[ 1023.198239] NVRM:     -4    10   FREE                  0x000000000000000b 0x0000000000000000 0x000624459cb59128 0x000624459cb5928a    354us  
[ 1023.198244] NVRM:     -5    10   FREE                  0x0000000000000006 0x0000000000000000 0x000624459cb58cc1 0x000624459cb59115   1108us  
[ 1023.198248] NVRM:     -6    10   FREE                  0x0000000000000002 0x0000000000000000 0x000624459cb57f41 0x000624459cb58be2   3233us  
[ 1023.198253] NVRM:     -7    10   FREE                  0x0000000000000005 0x0000000000000000 0x000624459cb57801 0x000624459cb57f35   1844us  
[ 1023.198257] NVRM: GPU0 RPC event history (CPU <- GSP):
[ 1023.198260] NVRM:     entry function                   data0              data1              ts_start           ts_end             duration during_incomplete_rpc
[ 1023.198263] NVRM:      0    4108 UCODE_LIBOS_PRINT     0x0000000000000000 0x0000000000000000 0x000624459cb65537 0x000624459cb65539      2us  
[ 1023.198268] NVRM:     -1    4128 GSP_POST_NOCAT_RECORD 0x0000000000000002 0x0000000000000028 0x000624459cb5f562 0x000624459cb5f566      4us  
[ 1023.198274] NVRM:     -2    4111 PERF_BRIDGELESS_INFO_ 0x0000000000000000 0x0000000000000000 0x000624459cb5f3a0 0x000624459cb5f3a1      1us  
[ 1023.198279] NVRM:     -3    4108 UCODE_LIBOS_PRINT     0x0000000000000000 0x0000000000000000 0x000624459b6f9728 0x000624459b6f9728           
[ 1023.198283] NVRM:     -4    4108 UCODE_LIBOS_PRINT     0x0000000000000000 0x0000000000000000 0x000624459b6f95a7 0x000624459b6f95a9      2us  
[ 1023.198288] NVRM:     -5    4128 GSP_POST_NOCAT_RECORD 0x0000000000000002 0x0000000000000027 0x000624459b6f7cb6 0x000624459b6f7cba      4us  
[ 1023.198293] NVRM:     -6    4098 GSP_RUN_CPU_SEQUENCER 0x000000000000061c 0x0000000000003fe2 0x000624459b6eadf5 0x000624459b6ec004   4623us  
[ 1023.198298] NVRM:     -7    4108 UCODE_LIBOS_PRINT     0x0000000000000000 0x0000000000000000 0x00062445886ae1ef 0x00062445886ae1f0      1us  
[ 1023.198304] CPU: 1 UID: 1000 PID: 76164 Comm: Typora Tainted: P           OE      6.11.3-zen1-1-zen #1 1400000003000000474e5500d4154c511b9cdca1
[ 1023.198312] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 1023.198314] Hardware name: ASUSTeK COMPUTER INC. ProArt StudioBook H5600QM_H5600QM/H5600QM, BIOS H5600QM.321 05/10/2023
[ 1023.198316] Call Trace:
[ 1023.198320]  <TASK>
[ 1023.198323]  dump_stack_lvl+0x5d/0x80
[ 1023.198334]  _nv012948rm+0x4ee/0x590 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.198970]  _nv012865rm+0x77/0x330 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.199440]  _nv048628rm+0x49f/0x7f0 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.199885]  _nv051992rm+0xa4/0x150 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.200446]  _nv047909rm+0x1a1/0x1b0 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.200880]  _nv049933rm+0x3ff/0x500 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.201312]  _nv014741rm+0x42e/0x690 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.201649]  _nv048046rm+0x29/0x30 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.201978]  ? _nv049936rm+0x60/0x60 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.202312]  _nv000762rm+0x58/0x70 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.202677]  _nv000761rm+0x21b/0x220 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.203024]  _nv000713rm+0x1a3/0x300 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.203384]  rm_transition_dynamic_power+0xd7/0x13f [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.203732]  nv_pmops_runtime_resume+0xb9/0xf0 [nvidia 1400000003000000474e550096469435de778d30]
[ 1023.204010]  ? __pfx_pci_pm_runtime_resume+0x10/0x10
[ 1023.204014]  __rpm_callback+0x44/0x170
[ 1023.204019]  ? __pfx_pci_pm_runtime_resume+0x10/0x10
[ 1023.204023]  rpm_resume+0x5bb/0x850
[ 1023.204029]  pm_runtime_barrier+0x86/0x90
[ 1023.204033]  pci_config_pm_runtime_get+0x3a/0x60
[ 1023.204038]  pci_read_config+0x99/0x2f0
[ 1023.204045]  kernfs_fop_read_iter+0xab/0x1b0
[ 1023.204051]  vfs_read+0x347/0x470
[ 1023.204058]  __x64_sys_pread64+0x98/0xd0
[ 1023.204063]  do_syscall_64+0x82/0x190
[ 1023.204069]  ? srso_alias_return_thunk+0x5/0xfbef5
[ 1023.204073]  ? syscall_exit_to_user_mode+0x10/0x1e0
[ 1023.204077]  ? srso_alias_return_thunk+0x5/0xfbef5
[ 1023.204080]  ? do_syscall_64+0x8e/0x190
[ 1023.204083]  ? srso_alias_return_thunk+0x5/0xfbef5
[ 1023.204087]  ? do_syscall_64+0x8e/0x190
[ 1023.204090]  ? srso_alias_return_thunk+0x5/0xfbef5
[ 1023.204093]  ? do_syscall_64+0x8e/0x190
[ 1023.204097]  ? exc_page_fault+0x81/0x190
[ 1023.204101]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 1023.204105] RIP: 0033:0x70a99ec051f7
[ 1023.204126] Code: 00 00 00 0f 05 f7 d8 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 80 3d 75 0e 10 00 00 49 89 ca 74 10 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 55 48 89 e5 48 83 ec 20 48 89 55 e8 48
[ 1023.204129] RSP: 002b:00007ffd23d5b9c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000011
[ 1023.204134] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 000070a99ec051f7
[ 1023.204136] RDX: 0000000000000001 RSI: 00007ffd23d5ba07 RDI: 000000000000000f
[ 1023.204138] RBP: 00007ffd23d5b9f0 R08: 0000000000000073 R09: 0000000000000000
[ 1023.204141] R10: 0000000000000008 R11: 0000000000000202 R12: 0000000000000001
[ 1023.204143] R13: 0000313c00080fc0 R14: 00007ffd23d5ba07 R15: 0000313c00020000
[ 1023.204150]  </TASK>
[ 1029.205013] NVRM: Xid (PCI:0000:01:00): 119, pid=76164, name=Typora, Timeout after 6s of waiting for RPC response from GPU0 GSP! Expected function 76 (GSP_RM_CONTROL) (0x20800a81 0x4).
[ 1035.205851] NVRM: Xid (PCI:0000:01:00): 119, pid=72494, name=kworker/2:0, Timeout after 6s of waiting for RPC response from GPU0 GSP! Expected function 76 (GSP_RM_CONTROL) (0x20802092 0x4).
[ 1041.206690] NVRM: Rate limiting GSP RPC error prints for GPU at PCI:0000:01:00 (printing 1 of every 30).  The GPU likely needs to be reset.
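
For anyone comparing logs, the Xid lines in a dump like the one above can be pulled out with a short script. This is a rough sketch, assuming the lines look like the ones in this log (other driver versions may format the pid or name fields differently); the function name and the dictionary keys are made up for the example.

```python
import re

# Matches lines such as:
#   NVRM: Xid (PCI:0000:01:00): 119, pid=76164, name=Typora, Timeout after ...
XID_LINE = re.compile(
    r"NVRM: Xid \((?P<device>PCI:[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>\d+), name=(?P<name>[^,]+),"
)


def parse_xid_events(log_text):
    """Return one dict per Xid line with the device, Xid number, pid and process name."""
    events = []
    for line in log_text.splitlines():
        match = XID_LINE.search(line)
        if match:
            events.append(
                {
                    "device": match.group("device"),
                    "xid": int(match.group("xid")),
                    "pid": int(match.group("pid")),
                    "name": match.group("name"),
                }
            )
    return events
```

Run against the log above, this would pick up the Xid 119 entries for Typora and kworker/2:0 and skip the RPC history and call trace lines.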

amrits · October 21, 2024, 12:11pm

Hi All,
So far, it does not look like the hang happens in our drivers. There is a Chromium bug filed for the same issue, and it can be tracked further there: Chromium

elman · October 25, 2024, 9:27pm

Thank you for trying to fix this, but for me it’s not related only to Electron applications. The weird part is that it works for a short while after a restart and then starts having problems. For example, when I ran darktable-cltest 5 seconds after booting into KDE, I got:

     0.0220 [opencl_init] opencl disabled via darktable preferences
     0.0221 [opencl_init] opencl library 'libOpenCL' found on your system and loaded, preference 'default path'
     0.8577 [opencl_init] found 1 platform
[opencl_init] found 1 device

[dt_opencl_device_init]
   DEVICE:                   0: 'NVIDIA GeForce RTX 3060 Laptop GPU'
   CONF KEY:                 cldevice_v5_nvidiacudanvidiageforcertx3060laptopgpu
   PLATFORM, VENDOR & ID:    NVIDIA CUDA, NVIDIA Corporation, ID=4318
   CANONICAL NAME:           nvidiacudanvidiageforcertx3060laptopgpu
   DRIVER VERSION:           560.35.03
   DEVICE VERSION:           OpenCL 3.0 CUDA, SM_20 SUPPORT
   DEVICE_TYPE:              GPU, dedicated mem
   GLOBAL MEM SIZE:          5834 MB
   MAX MEM ALLOC:            1459 MB
   MAX IMAGE SIZE:           32768 x 32768
   MAX WORK GROUP SIZE:      1024
   MAX WORK ITEM DIMENSIONS: 3
   MAX WORK ITEM SIZES:      [ 1024 1024 64 ]
   ASYNC PIXELPIPE:          NO
   PINNED MEMORY TRANSFER:   NO
   USE HEADROOM:             400Mb
   AVOID ATOMICS:            NO
   MICRO NAP:                250
   ROUNDUP WIDTH & HEIGHT    16x16
   CHECK EVENT HANDLES:      128
   TILING ADVANTAGE:         0.000
   DEFAULT DEVICE:           NO
   KERNEL BUILD DIRECTORY:   /usr/share/darktable/kernels
   KERNEL DIRECTORY:         /home/elman/.cache/darktable/cached_v3_kernels_for_NVIDIACUDANVIDIAGeForceRTX3060LaptopGPU_5603503
   CL COMPILER OPTION:       -cl-fast-relaxed-math
   CL COMPILER COMMAND:      -w -cl-fast-relaxed-math  -DNVIDIA_SM_20=1 -DNVIDIA=1 -I"/usr/share/darktable/kernels"
   KERNEL LOADING TIME:       0.0753 sec
[opencl_init] OpenCL successfully initialized. internal numbers and names of available devices:
[opencl_init]           0       'NVIDIA CUDA NVIDIA GeForce RTX 3060 Laptop GPU'
     1.0607 [opencl_init] FINALLY: opencl PREFERENCE=OFF is AVAILABLE and NOT ENABLED.

But when I tried 20 seconds later, I got an error after 108 seconds:

     0.0202 [opencl_init] opencl disabled via darktable preferences
     0.0203 [opencl_init] opencl library 'libOpenCL' found on your system and loaded, preference 'default path'
   108.2123 [opencl_init] 0 platforms detected, error: Unknown OpenCL error
   108.2123 [opencl_init] FINALLY: opencl PREFERENCE=OFF is NOT AVAILABLE and NOT ENABLED.

Another issue I found is with System Monitor, where I get the error “This page is missing some sensors and will not display correctly.” when trying to view NVIDIA memory usage, GPU usage and GPU frequency.

Unfortunately, at this point I have so many issues that I was forced to switch from hybrid mode to integrated so that I can at least work.

wassou93 · November 2, 2024, 7:06pm

Hi @amrits, sorry, I didn’t notice your question. I know this might be late, but the last passing driver was 535.171.04. With anything newer, Chrome can’t be used under Wayland and hits the DRM error in the journalctl logs.

elman · November 17, 2024, 3:22am

Hi. I just found out that if I start my laptop with an external screen connected via HDMI, I don’t have this issue. Everything works as expected, CL is detected in Darktable and I have no delay when starting Electron apps. After boot I can disconnect my display and things keep working. Curious…

bas4 · January 7, 2025, 12:24pm

I was experiencing this issue on Ubuntu 22.04 with KDE/Wayland but thanks to the comments here I was able to work out something that fixed it for me:

Edit the file /etc/initramfs-tools/modules and add the following lines at the end:

i915
nvidia
nvidia_modeset
nvidia_uvm
nvidia_drm

Now run sudo update-initramfs -c -k all, and then reboot.

Note, one thing I noticed: a bit later I updated my linux kernel command line with update-grub, which broke it for me again (even when I changed the command line back to what it was before). But then, when I ran update-initramfs again, it was fixed again. I’m not 100% sure why but I guess update-grub overwrites something written by update-initramfs. In any case, I guess make sure update-initramfs is the last command that updates something related to booting.

edit: I should also mention that I have nvidia_drm.modeset=1 on the kernel command line (but I’m not sure if that has any effect on this bug)
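
Since the fix above can silently come undone, a small check of the modules file may help. The sketch below is an illustration only: the function name is made up, and it assumes the usual initramfs-tools layout of one module name per line, optionally followed by module options, with # starting a comment.

```python
# Modules listed in the post above; on an AMD hybrid laptop the assumption is
# that amdgpu would replace i915.
WANTED = ("i915", "nvidia", "nvidia_modeset", "nvidia_uvm", "nvidia_drm")


def missing_modules(modules_text, wanted=WANTED):
    """Return the wanted modules that are not listed in the modules file text."""
    listed = set()
    for line in modules_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            listed.add(line.split()[0])  # first field is the module name
    return [name for name in wanted if name not in listed]
```

Passing in the contents of /etc/initramfs-tools/modules should give an empty list once all five lines are present. Note that this only checks the file, not whether update-initramfs has been re-run since the last change.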

bodescu · February 9, 2025, 11:27am

Hi there,

I was suffering from this bug too, and bas4’s fix solved it for me, for the moment at least. Thanks man :)

Have a nice day
