I was opening some YouTube video tabs when the GPU driver crashed and froze all screens. I couldn’t recover graphics over SSH and had to reboot the system.
GPU: RTX 5070 Ti
Kernel: Arch Linux 6.14.6-arch1-1
Driver: nvidia-open 570.144-5
I saved this from the journal over ssh:
May 18 23:55:00 kernel: NVRM: dmaAllocMapping_GM107: can't alloc VA space for mapping.
May 18 23:55:00 kernel: NVRM: dmaAllocMapping_GM107: can't alloc VA space for mapping.
May 18 23:55:00 kernel: NVRM: dmaAllocMapping_GM107: can't alloc VA space for mapping.
May 18 23:55:00 kernel: NVRM: dmaAllocMapping_GM107: can't alloc VA space for mapping.
May 18 23:55:00 kernel: x86/PAT: ThreadPoolForeg:1595890 conflicting memory types 400f4a0000-4010090000 uncached-minus<->write-combining
May 18 23:55:00 kernel: x86/PAT: memtype_reserve failed [mem 0x400f4a0000-0x401008ffff], track write-combining, req write-combining
May 18 23:55:00 kernel: ioremap memtype_reserve failed -16
May 18 23:55:00 kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Failed to ioremap_wc NvKmsKapiMemory 0x00000000cccceb75
May 18 23:55:00 kernel: NVRM: GPU at PCI:0000:01:00: GPU-c56c7bb9-04bc-5ad6-729d-7cdf1783f039
May 18 23:55:00 kernel: NVRM: GPU Board Serial Number: 0
May 18 23:55:00 kernel: NVRM: Xid (PCI:0000:01:00): 31, pid=1582480, name=chromium, Ch 00000045, intr 00000000. MMU Fault: ENGINE GR_HOST0 HUBCLIENT_ESC0 faulted @ 0x0_0402a000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
May 18 23:55:00 kernel: NVRM: krcCheckBusError_KERNEL: PCI-E corelogic status has pending errors (CL_PCIE_DEV_CTRL_STATUS = 00010020):
May 18 23:55:00 kernel: NVRM: krcCheckBusError_KERNEL: _CORR_ERROR_DETECTED
May 18 23:55:01 chromium[1582480]: [53:53:0519/065500.999681:ERROR:gpu/command_buffer/service/shared_context_state.cc:1325] SharedContextState context lost via ARB/EXT_robustness. Reset status = GL_GUILTY_CONTEXT_RESET_KHR
May 18 23:55:01 chromium[1582480]: [53:53:0519/065501.001099:ERROR:components/viz/service/gl/gpu_service_impl.cc:1173] Exiting GPU process because some drivers can't recover from errors. GPU process will restart shortly.
May 18 23:55:01 chromium[1582419]: [2:2:0519/065501.019676:ERROR:content/browser/gpu/gpu_process_host.cc:956] GPU process exited unexpectedly: exit_code=8704
May 18 23:55:23 kwin_wayland[1476032]: kwin_scene_opengl: A graphics reset attributable to the current GL context occurred.
I’ve also seen a few mentions of “dmaAllocMapping_GM107” elsewhere in this thread.
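In case it helps anyone compare logs, here is a minimal sketch for pulling Xid events out of saved journal text. The regex is only inferred from the single Xid line above, so other driver versions may format the line differently, and the names `XID_RE` and `find_xid_events` are my own, not part of any NVIDIA tool.

```python
import re

# Pattern inferred from the one Xid line in the log above (an assumption,
# not an official format description).
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>\d+), name=(?P<name>[^,]+)"
)


def find_xid_events(journal_text):
    """Return one dict per Xid line found in the given journal text."""
    events = []
    for line in journal_text.splitlines():
        match = XID_RE.search(line)
        if match:
            events.append({
                "pci": match.group("pci"),
                "xid": int(match.group("xid")),
                "pid": int(match.group("pid")),
                "name": match.group("name"),
            })
    return events


# Sample taken from the log above (truncated after the interrupt field).
sample = (
    "May 18 23:55:00 kernel: NVRM: Xid (PCI:0000:01:00): 31, "
    "pid=1582480, name=chromium, Ch 00000045, intr 00000000."
)
print(find_xid_events(sample))
```

To use it on a real log, save the kernel messages to a file first (for example with `journalctl -k`) and pass the file contents to `find_xid_events`.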