550.54.14 - Cannot create sg_table for NvKmsKapiMemory spammed when launching chrome on Wayland

When opening a Google Chrome window on Wayland with a reverse-PRIME setup (external monitors on the NVIDIA GPU, laptop panel on the iGPU), Chrome takes about 1:30 to launch. During that time CPU usage is high and the following message is spammed into the kernel log:

[drm:__nv_drm_gem_nvkms_memory_prime_get_sg_table [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Cannot create sg_table for NvKmsKapiMemory 0x00000000df7ad469

Once Chrome has opened, opening new windows works fine. But after closing Chrome completely, relaunching it triggers the same issue.
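
For anyone who wants to quantify the spam, here is a minimal shell sketch that counts matching lines in kernel log text piped into it. The function name count_sg_table_errors is made up for illustration; the match string is taken from the error message above.

```shell
# Hypothetical helper: count kernel log lines containing the sg_table error.
# Reads log text on stdin, so it works with dmesg, journalctl -k, or a saved file.
count_sg_table_errors() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps the function's exit status clean.
    grep -c 'Cannot create sg_table for NvKmsKapiMemory' || true
}

# Usage (reading the kernel log may require root):
#   sudo dmesg | count_sg_table_errors
```

Comparing the count before and after a Chrome launch gives a rough idea of how many failed sg_table exports a single start-up produces.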

System:

System76 Oryp8
CPU: i7-11800H
GPU: RTX 3070 (With Intel UHD on the CPU)
OS: Archlinux
NVIDIA Driver: 550.54.14
Kernel version: 6.7.8
Desktop: Plasma 6.0

Note: I couldn’t attach a bug report. Even when run in safe mode, the script hangs while trying to print the journalctl messages.

I am hitting a similar issue when I launch Chromium in Wayland mode with chromium --ozone-platform-hint=wayland (it starts in XWayland by default, where it seems to work just fine).

It prints a bunch of EGL-related errors and seems to start in failsafe mode with software rendering after several seconds (not 1:30, like OP).

Here’s a log file: http://0x0.st/H7ct.txt

System:
CPU: AMD Ryzen 7 5800H
GPU: NVIDIA GeForce RTX 3070 Mobile (and an AMD integrated card)
OS: Archlinux
NVIDIA Driver: 550.54.14
Kernel version: 6.7.6
Desktop: Plasma 5.27

The bug-report script worked, here it is:
nvidia-bug-report.log.gz (1.3 MB)

I have acknowledged the issue and filed bug 4545633 internally for tracking purposes.
I will first attempt a local repro, which will help us debug the issue, and will get back to you if any additional information is needed.

@amrits, I’ve been able to generate a bug report after waiting for over an hour.

nvidia-bug-report.log.gz (10.9 MB)

@ashcon50

Thanks for sharing the logs.
I did not observe a repro on a Lenovo ThinkPad Gen 3 notebook with a T2000 GPU and driver 550.54.14, using the Wayland protocol in a reverse-PRIME setup.

I tried opening the Chrome browser multiple times and it opened instantly.

I will look for a similar notebook and retry the local repro.
Just checking whether you have any other repro steps that lead to these error messages.
Also, please share the output of “xrandr --listproviders”.

Any update on this? google-chrome, chromium and even thorium have the same issue after changing the Preferred Ozone platform flag to Wayland. Arch Linux [GNOME], i5-12500H, RTX 3050 Ti Mobile [550.54.14]
The issue first appeared on my original Arch install. I then hopped through Debian, Fedora and Debian again before landing on my current Arch setup, and the issue still exists.

You need to force Wayland to hit the issue, because Chrome defaults to XWayland.
Use these options with Chrome: --enable-features=UseOzonePlatform --ozone-platform=wayland
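
A small sketch of how the forced-Wayland launch can be wrapped for repeated repro attempts. The helper name run_on_wayland is hypothetical, and the binary name in the usage comment (google-chrome-stable) is an assumption that depends on how Chrome is packaged on your distro; the two flags are the ones quoted above.

```shell
# Hypothetical helper: run the given browser/Electron binary with the
# flags that force the native Wayland Ozone backend instead of XWayland.
run_on_wayland() {
    "$@" --enable-features=UseOzonePlatform --ozone-platform=wayland
}

# Usage (binary name is an assumption; adjust to your install):
#   run_on_wayland google-chrome-stable
```

Keeping the flags in one place makes it easy to test several Chromium-based binaries with identical options.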

The same thing happens with Discord when trying to run it on Wayland with the Ozone flags; running it in XWayland works. Setting ELECTRON_OZONE_PLATFORM_HINT="auto" when launching Discord 0.0.47 makes it spam the message.

Hi @wassou93
I spent a few more cycles on another notebook but am still not able to repro the issue on a reverse-PRIME setup.
I used these options with Chrome: --enable-features=UseOzonePlatform --ozone-platform=wayland, and it opened immediately on the external monitor using the Wayland protocol.

Acer Nitro + Intel 12th Gen Intel(R) Core™ i5-12500H + Debian GNU/Linux 12 + kernel 6.1.0-17-amd64 + NVIDIA GeForce RTX 3060 Laptop GPU + Driver 550.54.14 + GBT AORUS FI27Q-P connected via HDMI connection

Could you share a repro video for my reference?

Hi @amrits ,
My kernel is 6.8 and I’m using Arch Linux and CachyOS (based on Arch). I always hit the issue; I have even tried completely fresh installs with different desktop environments (KDE and GNOME).
I’m now trying to install Nobara OS (Fedora-based) and will update the status.

Just installed a Fedora-based distro with the setup below and I’m not hitting the issue anymore:

Legion 5i Pro
CPU: 12th Gen Intel(R) Core™ i7-12700H
GPU: NVIDIA GeForce RTX 3060 (With Intel UHD on the CPU)
OS: Nobara OS (Fedora based)
NVIDIA Driver: 550.67
Kernel version: 6.7.6
Desktop: Plasma 6.0.3

I have also tried Ubuntu 23.10 with the NVIDIA 550.67 driver and am not hitting the issue there. So far this seems to happen to me only on Arch or Arch-based distros.

On Arch, setting the preferred Ozone flag triggers it for me. It’s been like this for more than a month at this point. I took one video with my phone to show the TTY as well. In the other video, you can see the difference more clearly, and I tried to show my specs/drivers as well. (The mindless cursor movement in the middle is me trying to find Thorium.)

Hi @spaisen
I was finally able to repro the issue locally, so we will be able to debug it now.
I will keep you updated on the status.
Thanks for all the support.

Update: After a series of updates, Nobara OS is now also hitting the issue.

I’m also having the same problem on Arch Linux, getting the error “[drm:__nv_drm_gem_nvkms_memory_prime_get_sg_table [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Cannot create sg_table for NvKmsKapiMemory”

This issue appears in Chromium/Electron-based applications on Wayland after changing the Preferred Ozone platform flag to Wayland (--ozone-platform=wayland).

Acer Nitro 5
CPU: 8th Gen Intel(R) Core™ i7-8750H
GPU: NVIDIA GeForce GTX 1060 (With Intel UHD on the CPU)
OS: Arch Linux
NVIDIA Driver: 550.76
Kernel version: 6.8.7
Desktop: Plasma 6.0.4

Workarounds:
So far, the solution that works for me is to remove “nvidia nvidia_modeset nvidia_uvm nvidia_drm” from MODULES=() in /etc/mkinitcpio.conf (the entries added per the Arch Wiki “early loading” section).

Update:
Setting up early KMS for both Intel and NVIDIA also works, by adding i915 to the MODULES array: MODULES=(i915 nvidia nvidia_modeset nvidia_uvm nvidia_drm).
This may be a race condition.
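
A sketch of what the second variant looks like on an Arch system using mkinitcpio, assuming the stock /etc/mkinitcpio.conf layout. The MODULES line is the one quoted above; the regeneration step is the standard mkinitcpio -P (rebuild all presets), shown as a comment because it needs root.

```shell
# /etc/mkinitcpio.conf (fragment)
# Load the Intel iGPU driver alongside the NVIDIA modules in the initramfs,
# so that both GPUs are initialised early in boot.
MODULES=(i915 nvidia nvidia_modeset nvidia_uvm nvidia_drm)

# After editing, rebuild the initramfs images and reboot:
#   sudo mkinitcpio -P
```

The first workaround is the opposite change: leave the NVIDIA entries out of MODULES entirely, so that neither GPU driver is loaded from the initramfs.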

No matter what I do, it doesn’t help. I’m stuck with Electron in XWayland, I guess.

Repro’d here as well, in my case induced by Discord, with the --ozone-platform-hint=auto flag.

I’m also on Arch Linux, with a GTX 1660 Ti Mobile, on driver 550.76 and running the Hyprland compositor. Also confirmed that adding the i915 module to my initramfs solves this (probably since I have an Intel iGPU in addition to my discrete NVIDIA card). Thanks for that, @orxcyd!

Okay, since I’m using dracut, “add_drivers+=” wasn’t enough to make the modules load early enough. I had to add both the i915 and nvidia drivers to force_drivers+=, and then it works again.
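
For dracut users, a minimal sketch of the equivalent configuration. The file name early-gpu.conf is hypothetical, and the exact NVIDIA module list is an assumption mirroring the mkinitcpio list earlier in the thread, since the post above only says "nvidia drivers". dracut expects the value to be wrapped in spaces inside the quotes.

```shell
# /etc/dracut.conf.d/early-gpu.conf (hypothetical file name)
# force_drivers both includes the modules in the initramfs and forces them
# to be loaded early; add_drivers only includes them.
# Module list is an assumption based on the mkinitcpio workaround.
force_drivers+=" i915 nvidia nvidia_modeset nvidia_uvm nvidia_drm "

# After editing, regenerate the initramfs for the running kernel and reboot:
#   sudo dracut --force
```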

Can reproduce on NixOS unstable. I tried adding nvidia and i915 to the initrd, to no avail. Driver 550.78.

@amrits Do you have any updates on this bug?