Hi,
I’m the friend in question. Here’s the output of sudo nvidia-bug-report.sh --extra-system-data, captured immediately after Discord froze and then crash-looped 3 times. I can reproduce this at any time with discord-canary with hardware acceleration on. (Discord stable, with hardware acceleration off, does not exhibit the same issue.) When the issue happens, the following error is logged:
[Tue Apr 8 20:38:06 2025] [drm:__nv_drm_gem_nvkms_map [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to map NvKmsKapiMemory 0x0000000041e744b9
[Tue Apr 8 20:38:22 2025] [drm:__nv_drm_gem_nvkms_map [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to map NvKmsKapiMemory 0x000000006ddb51d5
[Tue Apr 8 20:38:38 2025] [drm:__nv_drm_gem_nvkms_map [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to map NvKmsKapiMemory 0x00000000d7f128b0
[Tue Apr 8 20:38:52 2025] [drm:__nv_drm_gem_nvkms_map [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to map NvKmsKapiMemory 0x000000002b03a0af
[Tue Apr 8 21:04:56 2025] [drm:__nv_drm_gem_nvkms_map [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to map NvKmsKapiMemory 0x00000000af0cc908
[Tue Apr 8 21:05:12 2025] [drm:__nv_drm_gem_nvkms_map [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to map NvKmsKapiMemory 0x000000005507ab12
nvidia-bug-report.log.old.gz (1.4 MB)
And here is a more “normal” exhibition of the bug: while I was playing DIRT Rally 2, the game froze on the last drawn frame (presumably when it tried to allocate some VRAM and failed because VRAM was nearly full at the time, per nvtop).
This one is preceded by this message in dmesg:
[Tue Apr 8 21:33:56 2025] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to allocate NVKMS memory for GEM object
[Tue Apr 8 21:33:56 2025] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to allocate NVKMS memory for GEM object
[Tue Apr 8 21:33:56 2025] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to allocate NVKMS memory for GEM object
[Tue Apr 8 21:33:56 2025] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to allocate NVKMS memory for GEM object
[Tue Apr 8 21:33:56 2025] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to allocate NVKMS memory for GEM object
[Tue Apr 8 21:33:56 2025] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000b00] Failed to allocate NVKMS memory for GEM object
nvidia-bug-report.log.gz (1.3 MB)
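For anyone comparing the two failure modes, here is a minimal sketch that pulls the nvidia-drm error lines out of dmesg text and counts them by kernel function. It assumes the human-readable timestamp format shown above (as produced by something like `dmesg -T`); the names `parse_nvidia_drm_errors` and `count_by_function` are made up for illustration and are not part of any NVIDIA tool.

```python
import re
from collections import Counter
from datetime import datetime

# Matches nvidia-drm error lines in the format quoted above (assumed format).
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+\[drm:(?P<func>\w+) \[nvidia_drm\]\] \*ERROR\* "
    r"\[nvidia-drm\] \[GPU ID (?P<gpu>0x[0-9a-fA-F]+)\] (?P<msg>.*)$"
)


def parse_nvidia_drm_errors(lines):
    """Return (timestamp, function, message) tuples for matching lines."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip unrelated kernel messages
        when = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S %Y")
        events.append((when, m["func"], m["msg"]))
    return events


def count_by_function(events):
    """Count errors per reporting kernel function."""
    return Counter(func for _, func, _ in events)
```

Feeding it the two excerpts above would separate the map failures (`__nv_drm_gem_nvkms_map`) from the allocation failures (`nv_drm_gem_alloc_nvkms_memory_ioctl`), which makes it easier to see which one a given freeze produced.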
I was watching nvtop the entire time while gaming, trying to trigger the bug a second time, and can confirm that it seems most likely to occur at the very high end of memory usage, which is especially brutal on my 3080 with only 10 GiB of VRAM.
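Since watching nvtop by eye is hard to correlate with dmesg afterwards, a rough sketch of a timestamped VRAM logger follows. It assumes `nvidia-smi` is on the PATH and uses its `--query-gpu=memory.used,memory.total` CSV output; the function names and the 95% warning threshold are arbitrary choices for illustration, not anything the driver defines.

```python
import subprocess
import time
from datetime import datetime


def parse_memory_csv(text):
    """Parse 'used, total' MiB pairs, one GPU per line."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_vram():
    """Ask nvidia-smi for used/total VRAM in MiB for every GPU."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_memory_csv(out)


def log_vram(interval_s=1.0, warn_fraction=0.95):
    """Print one timestamped line per GPU until interrupted (Ctrl+C)."""
    while True:
        now = datetime.now().strftime("%a %b %d %H:%M:%S %Y")
        for index, (used, total) in enumerate(query_vram()):
            flag = " <-- near full" if used >= warn_fraction * total else ""
            print(f"[{now}] GPU {index}: {used}/{total} MiB{flag}", flush=True)
        time.sleep(interval_s)
```

Calling `log_vram()` in a terminal and redirecting the output to a file while reproducing the freeze should give a record that can be compared against the dmesg timestamps.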