Hello everyone,
I decided, unwisely, to upgrade from Debian 12 (stable) to Debian 13 (trixie) because the iwlwifi driver kept dropping the Wi-Fi connection on stable, apparently due to some regression. So far Wi-Fi is working on trixie, but a bunch of other problems have started cropping up.
The main one right now is that I cannot start GNOME.
The output of dmesg is full of this error:
[drm:__nv_drm_gem_nvkms_memory_prime_get_sg_table [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Cannot create sg_table for NvKmsKapiMemory 0x000000003c30ef86
Here is the system information:
GPU: RTX 3070 (Laptop), so this is in hybrid mode
OS: Debian 13 (trixie), current testing
Driver: 535.183.01
Kernel version: 6.9.10-amd64
I’m currently using Plasma on Wayland, but the frame rate on the external monitor is terrible, so I’m hoping to get GNOME working again.
Secure Boot is off because, if I turn it on, the NVIDIA driver doesn’t load at all.
This issue only appeared after the upgrade. If I turn Secure Boot on, so that the NVIDIA driver doesn’t load, GNOME starts without problems, but then I cannot use the GPU and the external monitors don’t work (because Lenovo).
I found another thread on here (“550.54.14 - Cannot create sg_table for NvKmsKapiMemory spammed when launching chrome on Wayland”) where the poster was using Arch, but I cannot replicate the solution because there is no mkinitcpio on Debian (whatever that is).
The poster in that thread also mentioned that the issue occurs for him when he starts Chrome. I tried that, and I don’t get the error message when I start Chrome. It only appears when I try to run GNOME; then the whole system (keyboard etc.) soft freezes and I have to ssh into the machine to restart gdm.
Hopefully someone can help.
Cheers
[edit]
Oh, another curious issue has appeared. Whenever I run nvidia-smi, the first part
$ nvidia-smi
Mon Jul 29 09:18:16 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
appears rather quickly, but everything after that
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3070 ...    Off | 00000000:01:00.0  On |                  N/A |
| N/A   46C    P8              16W /  90W |     17MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
appears only after a delay. It definitely was not like this before the upgrade.
Also, whenever I do that, I get this in dmesg:
[Mon Jul 29 09:17:58 2024] __vm_enough_memory: pid: 13787, comm: nvidia-smi, bytes: 51539607552 not enough memory for the allocation
[Mon Jul 29 09:17:58 2024] __vm_enough_memory: pid: 13787, comm: nvidia-smi, bytes: 51539644416 not enough memory for the allocation
[Mon Jul 29 09:17:58 2024] __vm_enough_memory: pid: 13787, comm: nvidia-smi, bytes: 51539742720 not enough memory for the allocation
[Mon Jul 29 09:18:18 2024] __vm_enough_memory: pid: 14189, comm: nvidia-smi, bytes: 51539607552 not enough memory for the allocation
[Mon Jul 29 09:18:18 2024] __vm_enough_memory: pid: 14189, comm: nvidia-smi, bytes: 51539644416 not enough memory for the allocation
[Mon Jul 29 09:18:18 2024] __vm_enough_memory: pid: 14189, comm: nvidia-smi, bytes: 51539742720 not enough memory for the allocation
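For scale, here is a rough Python sketch that pulls the byte counts out of those lines and converts them to GiB. The helper name `parse_refused_allocations` and the regular expression are my own assumptions, based only on the lines pasted above; nothing here comes from the driver or the kernel sources. The first request works out to exactly 48 GiB and the other two are only slightly larger, which is far more than the 8192MiB the card reports.

```python
import re

# Assumed line format, taken only from the dmesg lines pasted above.
PATTERN = re.compile(
    r"__vm_enough_memory: pid: (\d+), comm: ([^,]+), bytes: (\d+) not enough memory"
)

def parse_refused_allocations(dmesg_text):
    """Return (pid, comm, size_bytes, size_gib) for each refused allocation."""
    rows = []
    for line in dmesg_text.splitlines():
        match = PATTERN.search(line)
        if match:
            size = int(match.group(3))
            # 1 GiB = 2**30 bytes
            rows.append((int(match.group(1)), match.group(2), size, size / 2**30))
    return rows

# Three of the lines from the dmesg output above (pid 13787).
SAMPLE = """\
[Mon Jul 29 09:17:58 2024] __vm_enough_memory: pid: 13787, comm: nvidia-smi, bytes: 51539607552 not enough memory for the allocation
[Mon Jul 29 09:17:58 2024] __vm_enough_memory: pid: 13787, comm: nvidia-smi, bytes: 51539644416 not enough memory for the allocation
[Mon Jul 29 09:17:58 2024] __vm_enough_memory: pid: 13787, comm: nvidia-smi, bytes: 51539742720 not enough memory for the allocation
"""

if __name__ == "__main__":
    for pid, comm, size, gib in parse_refused_allocations(SAMPLE):
        print(f"{comm} (pid {pid}): {size} bytes is about {gib:.4f} GiB")
```

To run it against the live log instead of the sample, the output of `dmesg` can be read from a file or from standard input and passed to the same function.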
What is going on?