550.54.14 - Cannot create sg_table for NvKmsKapiMemory spammed when launching chrome on Wayland

Hi @wassou93
It is still under investigation.
I wanted to check if you know the last passing driver.

Still happening on 555.52.04-1. I am also using CachyOS with the 6.9.3-3-cachyos-lto kernel. Discord takes forever to load and spams dmesg with this message. I’m unsure whether it’s related, but occasionally, while playing a video with Discord on the other screen, the screen freezes while the audio keeps playing. The only way out is to lock the desktop with a keyboard shortcut and wait. While trying to debug that, I found these dmesg logs.
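
For anyone who wants to put a number on the spam before and after a change, here is a minimal sketch. The function name count_sg_table_errors is made up for illustration; it only counts lines containing the message text from the thread title in whatever log text it is given on stdin.

```shell
# Hypothetical helper: count kernel log lines that contain the error text.
# Reads log text on stdin and prints the number of matching lines.
count_sg_table_errors() {
    # -F matches a fixed string, -c prints the count of matching lines.
    # "|| true" keeps a count of zero from being treated as a failure.
    grep -c -F 'Cannot create sg_table for NvKmsKapiMemory' || true
}
```

Typical use would be `sudo dmesg | count_sg_table_errors` or `journalctl -k -b | count_sg_table_errors`, run once before and once after launching the browser or Discord.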

I also noticed this happening with Discord. Even after following @orxcyd’s solution of adding i915 to the initramfs configuration, the same issue still persists on my system. I have the following specs:

Laptop: Dell G3 3500
CPU: Intel(R) Core™ i7-10750H
GPU: NVIDIA GeForce GTX 1650 Ti Mobile
OS: Arch Linux
Kernel Version: 6.9.3
NVIDIA Driver: 550.78
Window Manager: Hyprland

I have only recently switched to Wayland, so I do not know which version was the last one without the issue.

Hi all, I first reported this issue on September 18, 2023 (with Kernel 6.5.1) to linux-bugs@nvidia.com

I was told at the time that this was an incorrect error message and that it would be downgraded to a warning, and the ticket was closed. I’m not sure which driver version I was running back then, however.

I do find that if you do not early load the nvidia modules, the error goes away. I’m seeing the error line spammed hundreds of times a second, so it’s really not practical for me to have them loaded. Unfortunately, I require a few Electron-based apps for work, and run a Chromium-based browser as well.

Given the timespans, I’m not holding out a whole lot of hope.

Hi, an update on this: I had forgotten to regenerate the initramfs (which is done with mkinitcpio on Arch Linux). After doing so the problem goes away, so it really seems to have something to do with early loading.

I think it’s an early loading issue, and I found a fix: I removed kms from the mkinitcpio HOOKS and added the nvidia and i915 modules to MODULES.

Then I re-ran mkinitcpio -P, rebooted, made sure DRM is enabled in GRUB, and everything worked.
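
The post does not say which setting was checked for DRM. A common one is the nvidia_drm.modeset=1 kernel parameter, so the sketch below assumes that is what is meant. The function name cmdline_has_nvidia_modeset is hypothetical; it only inspects a command-line string passed as its argument.

```shell
# Hypothetical helper: succeed if a kernel command line enables NVIDIA DRM
# modesetting. Assumes the parameter in question is nvidia_drm.modeset=1.
# Both spellings are accepted because the kernel treats '-' and '_' in
# parameter names as equivalent.
cmdline_has_nvidia_modeset() {
    # Pad with spaces so only whole, space-separated words can match.
    case " $1 " in
        *" nvidia_drm.modeset=1 "*|*" nvidia-drm.modeset=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a running system this could be called as `cmdline_has_nvidia_modeset "$(cat /proc/cmdline)" && echo enabled`. The option can also be set through a modprobe configuration file instead of the bootloader, so a miss here does not prove modesetting is off; `cat /sys/module/nvidia_drm/parameters/modeset` reports the value the loaded module is actually using.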

in my /etc/mkinitcpio.conf:

MODULES=(i915 nvidia nvidia_modeset nvidia_uvm nvidia_drm)

HOOKS=(base udev autodetect microcode modconf block keyboard keymap consolefont plymouth filesystems fsck)

then ran sudo mkinitcpio -P
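
To double-check a config against the setup described above before regenerating the initramfs, something like the following sketch may help. The names conf_array and check_mkinitcpio_conf are made up for illustration, the module list is simply the one from this post, and the parsing assumes each array sits on a single uncommented line, which is not guaranteed for every mkinitcpio.conf.

```shell
# Hypothetical helper: print the entries of a one-line array assignment
# such as MODULES=(a b c), one entry per line.
#   $1 = variable name, $2 = config file
conf_array() {
    sed -n "s/^$1=(\(.*\)).*/\1/p" "$2" | tr -s ' ' '\n'
}

# Hypothetical helper: report whether the config has no kms hook and lists
# the modules named in the post above. Prints "ok" and returns 0 on a match.
check_mkinitcpio_conf() {
    conf="${1:-/etc/mkinitcpio.conf}"
    status=0
    if conf_array HOOKS "$conf" | grep -x 'kms' >/dev/null; then
        echo "kms hook still present"
        status=1
    fi
    for wanted in i915 nvidia nvidia_modeset nvidia_uvm nvidia_drm; do
        # -x matches whole lines, so "nvidia" does not match "nvidia_drm".
        if ! conf_array MODULES "$conf" | grep -x "$wanted" >/dev/null; then
            echo "missing module: $wanted"
            status=1
        fi
    done
    [ "$status" -eq 0 ] && echo "ok"
    return "$status"
}
```

Running `check_mkinitcpio_conf` with no argument reads /etc/mkinitcpio.conf. On an AMD iGPU system the i915 entry would be amdgpu instead, so the list in the loop needs adjusting.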

These are the packages I have installed on my CachyOS system:

❯ paru -Qs nvidia
local/egl-wayland 2:1.1.13-3
    EGLStream-based Wayland external platform
local/lib32-libvdpau 1.5-2
    Nvidia VDPAU library
local/lib32-nvidia-utils 555.52.04-1
    NVIDIA drivers utilities (32-bit)
local/lib32-opencl-nvidia 555.52.04-1
    OpenCL implemention for NVIDIA (32-bit)
local/libva-nvidia-driver 0.0.12-1.1
    VA-API implementation that uses NVDEC as a backend
local/libvdpau 1.5-2.1
    Nvidia VDPAU library
local/libxnvctrl 555.42.02-2
    NVIDIA NV-CONTROL X extension
local/linux-cachyos-nvidia 6.9.6-2
    nvidia module of 555.52.04 driver for the linux-cachyos kernel
local/nvidia-prime 1.0-4
    NVIDIA Prime Render Offload configuration and utilities
local/nvidia-settings 555.42.02-2
    Tool for configuring the NVIDIA graphics driver
local/nvidia-utils 555.52.04-3
    NVIDIA drivers utilities
local/opencl-nvidia 555.52.04-3
    OpenCL implemention for NVIDIA

I hope this helps.

Adding amdgpu (amdgpu is for AMD iGPUs and i915 is for Intel, for those who don’t know) just makes rendering happen on the iGPU rather than on NVIDIA or the CPU (software rendering), so it doesn’t really fix the problem.

Before adding amdgpu to MODULES, Brave wouldn’t load 3D websites (e.g. bruno-simon.com), but Firefox would (using the iGPU).

Adding amdgpu enabled WebGL with the iGPU as the hardware accelerator. Now Brave can run 3D websites, but it won’t run on NVIDIA.

And by the way, I was using the NVIDIA 535 drivers.

I’m using the 535 drivers because the newer drivers cause a crash while upgrading packages on Arch Linux.

At first the browser took a long time to launch. Now it launches quickly but doesn’t use the NVIDIA GPU; I tested that using nvtop.

nvtop shows both gpus
NVIDIA GeForce RTX 3050 Laptop GPU
and
AMD Radeon Graphics

prime-run used to work previously (don’t know when)
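
For context, offload wrappers like prime-run work by setting the environment variables from NVIDIA’s PRIME render offload documentation before starting the program. The function below is a minimal sketch of that idea, not the packaged script; the name offload_run is made up for illustration.

```shell
# Hypothetical stand-in for an offload wrapper: run the given command with
# the PRIME render offload environment variables set for that command only.
offload_run() {
    __NV_PRIME_RENDER_OFFLOAD=1 \
    __GLX_VENDOR_LIBRARY_NAME=nvidia \
    __VK_LAYER_NV_optimus=NVIDIA_only \
    "$@"
}
```

A quick check, assuming glxinfo is installed, is `offload_run glxinfo | grep "OpenGL renderer"`, which should name the NVIDIA GPU when offload works. Whether a Chromium-based browser honors these variables under Wayland may depend on its own GPU selection flags, so a working glxinfo result does not guarantee the browser will follow.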

I apologize beforehand for my obscene levels of ignorance and stupidity, but someone convinced me to post my findings on this thread, as they may have an off chance of shedding some light on the matter. I encountered this error while attempting to run Fedora 40 and some of its spins and derivatives, all of them having this issue with NVIDIA driver 550. However, I wasn’t experiencing it on NixOS 24.05, which uses that very same 550.78 driver according to NVIDIA X Server Settings. I found it odd that on Fedora, Chromium was trying to run on the NVIDIA card at all, as I have an iGPU that I use for most applications. That is how I believe I’ve set it up on NixOS: I only run specific applications on the dGPU. So why running Chromium on Fedora causes the NVIDIA driver to have issues is a mystery to me, as I would expect it to run using only the iGPU. Either way, these are all the settings I believe could be relevant in my Nix config file, where the dGPU works with no issues and I can even run Electron-based applications, even Chromium itself, with nvidia-offload, and it works perfectly fine:

services.xserver.videoDrivers = ["nvidia"];
hardware.nvidia = {
  modesetting.enable = true;
  powerManagement.enable = false;
  powerManagement.finegrained = false;
  open = false;
  nvidiaSettings = true;
  package = config.boot.kernelPackages.nvidiaPackages.beta;
};
hardware.nvidia.prime = {
  offload = {
    enable = true;
    enableOffloadCmd = true;
  };
  intelBusId = "PCI:0:2:0";
  nvidiaBusId = "PCI:1:0:0";
};

This is all probably completely useless information and it’s obvious to everyone but me, but there.

*The dGPU is Pascal architecture.

Hello! Exactly the same problems with driver 560.35.03. It looks like the problem is in the kernel configuration.

For example, a stock image works great cfg_default.txt (270.1 KB)

Also works well on the new kernel version cfg_custom_desktop.txt (253.6 KB)

but there is a problem with the server implementation cfg_custom_server_100hz.txt (240.2 KB)

On hybrid systems (nvidia+intel), setting the BIOS parameters such as Aperture Size and DVMT Pre-allocated can help.

I am getting the error kernel: [drm:__nv_drm_gem_nvkms_memory_prime_get_sg_table [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Cannot create sg_table for NvKmsKapiMemory 0x000000006527f86e on an AMD laptop with switchable graphics when running gamescope, which causes it to crash. This is on driver version 560.35.03 and it happens with both the proprietary and open source kernel modules.