Vulkan multi-GPU

Hello,

Could you please help me set up a multi-GPU system for Vulkan development?

The system configuration is a Ryzen 1800X on X370 with two GeForce GTX 1080 Ti cards.
vkEnumeratePhysicalDeviceGroups reports only one physical GPU in the system.

Under Windows 10, Vulkan multi-GPU works well on that system.

X configuration with --multigpu=On produces the following X server crash:

[   496.705] (EE) NVIDIA(GPU-0): Failed to initialize DMA.
[   496.705] (EE) NVIDIA(0): Failed to allocate push buffer
[  322.955811] resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
[  322.955903] caller os_map_kernel_space.part.7+0xd8/0x120 [nvidia] mapping multiple BARs
[  324.648865] nvidia 0000:1e:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0x00000000fffd3400 flags=0x0000]
[  324.649622] NVRM: GPU at PCI:0000:1e:00: GPU-132c5b38-ac3d-3332-345f-7cb49e2daa25
[  324.649623] NVRM: GPU Board Serial Number: 0323817097284
[  324.649625] NVRM: Xid (PCI:0000:1e:00): 32, Channel ID 00000000 intr 00008000
[  324.649631] nvidia 0000:1e:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0x00000000fffd3400 flags=0x0000]
[  324.649699] NVRM: Xid (PCI:0000:1e:00): 32, Channel ID 00000000 intr 00008000
[  325.001185] nvidia 0000:1e:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0x00000000ffe70000 flags=0x0000]
[  325.001216] NVRM: Xid (PCI:0000:1e:00): 32, Channel ID 00000008 intr 00008000
[  325.001314] NVRM: Xid (PCI:0000:1e:00): 32, Channel ID 00000008 intr 00008000
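
For anyone triaging similar output, here is a rough Python sketch that tallies the Xid codes and the distinct IOMMU fault addresses in kernel log lines like the ones above. The regular expressions and the `summarize` helper are illustrative assumptions written against these exact line shapes; they are not part of any NVIDIA tool.

```python
import re
from collections import Counter

# Patterns written against the line shapes quoted in this post (assumption:
# other driver or kernel versions may format these lines differently).
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): (\d+),")
FAULT_RE = re.compile(r"IO_PAGE_FAULT domain=(0x[0-9a-fA-F]+) address=(0x[0-9a-fA-F]+)")


def summarize(log_text):
    """Count Xid codes per PCI device and collect distinct IOMMU fault addresses."""
    xids = Counter()
    fault_addresses = set()
    for line in log_text.splitlines():
        xid = XID_RE.search(line)
        if xid:
            xids[(xid.group(1), int(xid.group(2)))] += 1
        fault = FAULT_RE.search(line)
        if fault:
            fault_addresses.add(int(fault.group(2), 16))
    return xids, sorted(fault_addresses)


# A few lines copied from the log above.
SAMPLE = """\
[  324.648865] nvidia 0000:1e:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0x00000000fffd3400 flags=0x0000]
[  324.649625] NVRM: Xid (PCI:0000:1e:00): 32, Channel ID 00000000 intr 00008000
[  325.001185] nvidia 0000:1e:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0x00000000ffe70000 flags=0x0000]
[  325.001216] NVRM: Xid (PCI:0000:1e:00): 32, Channel ID 00000008 intr 00008000
"""

if __name__ == "__main__":
    xids, faults = summarize(SAMPLE)
    for (device, code), count in xids.items():
        print(f"device {device}: Xid {code} x{count}")
    print("fault addresses:", [hex(a) for a in faults])
```

Feeding it the full dmesg output instead of `SAMPLE` shows quickly whether every fault comes from the same device and whether only one Xid code is involved.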

Is this a problem with my specific hardware configuration? If so, please tell me which spec is correct.
Or should I use a specific kernel or driver version to run the system with multi-GPU Vulkan support?
Explicit multi-GPU support is not required. But please don’t tell me that it’s impossible :)
nvidia-bug-report.log.gz (137 KB)

You could try an xorg.conf like this:

Section "Device"
    Identifier     "nvidia0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:29:0:0"
    Option         "AllowEmptyInitialConfiguration"
EndSection

Section "Device"
    Identifier     "nvidia1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:30:0:0"
    Option         "AllowEmptyInitialConfiguration"
EndSection
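
A note on the BusID values: xorg.conf expects decimal bus, device, and function numbers, while lspci and the kernel log print them in hexadecimal. The GPU logged as 0000:1e:00.0 is therefore "PCI:30:0:0", and "PCI:29:0:0" would correspond to 1d:00.0 (that second address is inferred from the config, not taken from the log). A small Python sketch of the conversion follows; the helper name `lspci_to_xorg_busid` is made up for illustration, and it assumes PCI domain 0.

```python
import re


def lspci_to_xorg_busid(address):
    """Convert an lspci-style address ('0000:1e:00.0' or '1e:00.0', hex fields)
    to the decimal 'PCI:bus:device:function' form used for BusID in xorg.conf.

    Assumption: the PCI domain is 0, so it is parsed but not emitted.
    """
    match = re.fullmatch(
        r"(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])",
        address,
    )
    if match is None:
        raise ValueError(f"unrecognized PCI address: {address!r}")
    _domain, bus, device, function = match.groups()
    return f"PCI:{int(bus, 16)}:{int(device, 16)}:{int(function, 16)}"


if __name__ == "__main__":
    for addr in ("0000:1e:00.0", "1d:00.0"):
        print(addr, "->", lspci_to_xorg_busid(addr))
```

Checking the BusID lines this way guards against the common mistake of pasting the hexadecimal lspci value straight into xorg.conf.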

To rule out a hardware issue:

  • connect the monitor to the other card and see whether the failing card changes.
  • check for the Ryzen bug.
  • check the memory, or remove modules.

I have these options in my xorg.conf and the X server works fine, but Vulkan reports only one physical GPU.

  • Connecting the monitor to the other card produces the same DMA error and endless X server restarts.
  • The system has the Ryzen bug (ryzen-test [loop-14] TIME TO FAIL: 38 s).
  • Different memory modules don’t solve the problem.

Then you should RMA your CPU first, as that bug can lead to Xid 32.

I disabled the OpCache option in the BIOS, and the kill-ryzen.sh script now runs without faults. So the Ryzen bug is “fixed”.
But it didn’t help with the X server problem when the multi-GPU option is enabled.

@frustum

Does your application need to present to an X11 swapchain?

We currently have a limitation where multiple GPUs can only be exposed to applications that won’t be interacting with X. We are hoping to fix that soon.

Yes, my applications require X11 swapchain access.

Please let me know when the development driver will be available.

Is there any alternative way to display Vulkan-rendered content on a multi-GPU system without X11?

Thank you.

If your application cannot connect to X (e.g. because DISPLAY is unset), it will be able to query and use all the GPUs in your system. You could conceivably have such an application do the rendering and then share the frames with another application, which presents them through an X11 swapchain.
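
One way to try the "unset DISPLAY" approach without touching the parent shell is to launch the renderer from a small wrapper that strips the variable from the child's environment only. This is a minimal Python sketch: `run_without_display` is a hypothetical helper, and the child command below is a stand-in that merely reports whether it can see DISPLAY. Whether the driver then exposes every GPU depends on the driver behaviour described above.

```python
import os
import subprocess
import sys


def run_without_display(command):
    """Run `command` with DISPLAY removed from its environment.

    The parent's environment is left untouched; stdout and stderr of the
    child are captured as text and returned in the CompletedProcess.
    """
    env = dict(os.environ)
    env.pop("DISPLAY", None)  # the child cannot locate the X server via DISPLAY
    return subprocess.run(command, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in child process; replace with the real offscreen renderer binary.
    child = [sys.executable, "-c", "import os; print('DISPLAY' in os.environ)"]
    result = run_without_display(child)
    print(result.stdout.strip())
```

A second process that keeps DISPLAY set would then receive the rendered frames and present them through its own X11 swapchain, as suggested above; how the frames are shared is left open here.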

I realize this is not ideal and we are working on this issue.

Finally, I have two devices in the system :) Thank you.

I will wait for the new driver :)

Hi, I have the same problem as you: vulkaninfo cannot find all my video cards. I have two GTX 1080s, but under Ubuntu vulkaninfo finds only one of them, while it finds both when run from a tty. Can you fix it?