PRIME usage of external HDMI monitor results in Xid errors with dxvk/vkd3d

Hi, I'm running an Acer Nitro AN515-45 (Ryzen 5800H + NVIDIA 3060) on Arch Linux with KWin on Wayland. I get Xid errors whenever I have an external monitor connected to the HDMI port and launch various vkd3d/dxvk games. It only seems to occur once they try to fullscreen themselves on launch. For example, this happens when I launch Hogwarts Legacy:

tom-acer kernel: NVRM: GPU at PCI:0000:01:00: GPU-58e586ab-a95c-b7fb-4f87-143605fb6aa2
tom-acer kernel: NVRM: GPU Board Serial Number: 0
tom-acer kernel: NVRM: Xid (PCI:0000:01:00): 56, pid='<unknown>', name=<unknown>, CMDre 00000001 00000200 00000001 00000005 0000001d

The very weird thing is that this doesn't occur if I don't use an external monitor, or if I run the games through gamescope.
I did some driver “bisecting” and came down to this:

530.30.02 Xid 56 on launch. always

525.89.02 Xid 56 on launch. always.

525.85.05 Xid 56 on launch. always.

525.78.01 Hogwarts launches but crashes on shader compilation; a window (from wine or the game engine?) appears with "Not enough video memory to allocate a render" on second launch. Xid 56.

525.60.11 gives a different Xid on launch:
NVRM: Xid (PCI:0000:01:00): 32, pid=2724, name=HogwartsLegacy., Channel ID 00000028 intr1 00000008 HCE_DBG0 00001b00 HCE_DBG1 00000001
NVRM: Xid (PCI:0000:01:00): 32, pid=2724, name=HogwartsLegacy., Channel ID 00000028 intr1 00000008 HCE_DBG0 00001b04 HCE_DBG1 00ce8010

520.56.06 runs the games with no Xid errors for hours of gameplay. However, this appears in dmesg on launch:
[drm:nv_drm_fence_context_create_ioctl [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Failed to allocate fence signaling event

But the games do run on 520.56.06.

It seems the games trigger this somehow when they try to fullscreen themselves at the native resolution of the external monitor. So far I've only managed to trigger this with dxvk/vkd3d games.
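
For anyone who wants to compare their own logs against the driver list above, here is a minimal sketch that tallies Xid codes from kernel log lines. The function name `count_xids` and the regex are my own assumptions based on the log format shown in this post, not anything provided by NVIDIA, and the sample lines are copied from the logs above.

```python
import re
from collections import Counter

# Assumed format, taken from the log lines in this post:
# "NVRM: Xid (PCI:0000:01:00): 56, pid=..., name=..."
XID_RE = re.compile(r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+),")


def count_xids(lines):
    """Return a Counter mapping Xid code -> number of occurrences."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts


# Sample lines copied from the logs in this thread.
SAMPLE = [
    "tom-acer kernel: NVRM: GPU Board Serial Number: 0",
    "tom-acer kernel: NVRM: Xid (PCI:0000:01:00): 56, pid='<unknown>', "
    "name=<unknown>, CMDre 00000001 00000200 00000001 00000005 0000001d",
    "NVRM: Xid (PCI:0000:01:00): 32, pid=2724, name=HogwartsLegacy., "
    "Channel ID 00000028 intr1 00000008 HCE_DBG0 00001b00 HCE_DBG1 00000001",
    "NVRM: Xid (PCI:0000:01:00): 32, pid=2724, name=HogwartsLegacy., "
    "Channel ID 00000028 intr1 00000008 HCE_DBG0 00001b04 HCE_DBG1 00ce8010",
    "NVRM: Xid (PCI:0000:01:00): 31, pid=254585, name=Game.exe, "
    "Ch 00000016, intr 00000000.",
]

for code, n in sorted(count_xids(SAMPLE).items()):
    print(f"Xid {code}: {n} occurrence(s)")
```

The same function could be fed the lines of a saved `journalctl -k` dump instead of `SAMPLE`, which makes it easier to see which Xid codes appear under each driver version.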

I noticed in my journal that I do still get an Xid error at random intervals:

NVRM: GPU at PCI:0000:01:00: GPU-58e586ab-a95c-b7fb-4f87-143605fb6aa2
NVRM: GPU Board Serial Number: 0
NVRM: Xid (PCI:0000:01:00): 31, pid=254585, name=Game.exe, Ch 00000016, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_ROP_0 faulted @ 0x0_24c4f000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE

This happens even when using gamescope on 530.30.02.

And with 530.41.03 now released, I still get this if I don't use gamescope on the external monitor. This is also with the recent vkd3d NVIDIA workarounds merged ("Workaround spurious GPU hangs on NV with concurrent submissions to different queues", Pull Request #1484 in HansKristian-Work/vkd3d-proton on GitHub), so those did not help.

NVRM: GPU at PCI:0000:01:00: GPU-58e586ab-a95c-b7fb-4f87-143605fb6aa2
NVRM: GPU Board Serial Number: 0
NVRM: Xid (PCI:0000:01:00): 56, pid='<unknown>', name=<unknown>, CMDre 00000007 00000200 00000001 00000005 0000001d

So the months fly by, drivers get released, and there is not a single fix for either this or https://forums.developer.nvidia.com/t/multiple-cuda-rtx-vulkan-application-crashing-with-xid-13-109-errors/, practically rendering NVIDIA GPUs useless. With the closed-source nature of NVIDIA, the only information we get is “I have filed a bug 3959156 internally for tracking purpose.” “Shall update once it is incorporated in future drivers.” I for sure have made up my mind about which brand I'm getting next time I'm buying. It seems the only serious companies trying for Linux are Valve and AMD.

So true!