GPU sharing PCIe bus

Dear all,

In the context of multi-GPU computing using OpenCL, is there any way to tell whether two GPUs are sharing the same PCIe channel?

Best regards,

Daniele

This is complicated, since the ways that PCI-Express resources can be shared vary:

  • Some motherboards halve the number of lanes to some slots when multiple cards are inserted. Each card then gets an x8 link, which has half the bandwidth of the normal x16, and that bandwidth stays fixed at the halved level whether or not both devices are active at the same time.

  • Some motherboards use a PCI-Express switch (like the NF200) to share an x16 link between two cards. In this case, each card can use the full x16 bandwidth on its own, but the bandwidth per card can drop to half when both devices are transferring at the same time. This is the closest approximation to “bus-style” sharing you see in PCIe; however, I’m not aware of any motherboard that shares an x16 link among more than two slots.

  • If you are using a GTX 295, then the two GPUs are already using an NF200 to share the single slot, much like above.
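To put rough numbers on the cases above, here is a minimal sketch. It assumes PCIe 2.0 at about 0.5 GB/s per lane per direction and an ideal even split when two cards contend for a switch; the function name and parameters are made up for illustration, and real transfers will come in below these figures.

```python
def per_gpu_bandwidth(scenario, both_active, lanes=16, per_lane_gbs=0.5):
    """Idealized host bandwidth per GPU in GB/s, one direction.

    scenario: "split"  -> the board halves the lanes per slot (fixed x8 each)
              "switch" -> two GPUs share one x16 uplink through a switch
                          (this also covers the dual-GPU card case)
    per_lane_gbs is an assumption: roughly 0.5 GB/s per lane for PCIe 2.0.
    """
    full = lanes * per_lane_gbs
    if scenario == "split":
        # Lanes are removed outright, so activity of the other card is irrelevant.
        return full / 2
    if scenario == "switch":
        # Full link when alone, shared evenly (best case) when both transfer.
        return full / 2 if both_active else full
    raise ValueError("unknown scenario: %r" % scenario)


print(per_gpu_bandwidth("split", both_active=False))
print(per_gpu_bandwidth("switch", both_active=False))
print(per_gpu_bandwidth("switch", both_active=True))
```

The practical difference is that the switch case only costs you bandwidth when both GPUs transfer at the same time, whereas the split case costs it always.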

Figuring out which scenario you are in is probably OS-specific, and requires some way to query the low-level PCI-Express topology of the system.
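On Linux, one way to get at that topology is sysfs: each entry under /sys/bus/pci/devices is a symlink whose resolved path lists every bridge between the root complex and the device. The sketch below is only an illustration under that assumption. The helper names are hypothetical, and grouping by the first bridge in the path (the root port) is a simplification, since several root ports may still share an uplink further upstream.

```python
import os
import re
from collections import defaultdict

# PCI address in domain:bus:device.function form, e.g. 0000:05:00.0
BDF = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def bridge_chain(resolved_path):
    """PCI addresses along a resolved sysfs path, root port first, device last."""
    return [part for part in resolved_path.split("/") if BDF.match(part)]


def group_by_root_port(gpu_paths):
    """Group GPUs by the root port they hang off.

    gpu_paths maps a GPU's PCI address to its resolved sysfs path.
    GPUs in the same group share at least the link at that root port.
    """
    groups = defaultdict(list)
    for bdf, path in gpu_paths.items():
        chain = bridge_chain(path)
        if chain:
            groups[chain[0]].append(bdf)
    return dict(groups)


def find_gpu_paths(sysfs_root="/sys/bus/pci/devices"):
    """Collect display-class devices (class code 0x03xxxx) from sysfs."""
    paths = {}
    if not os.path.isdir(sysfs_root):
        return paths  # not Linux, or sysfs is not mounted
    for bdf in sorted(os.listdir(sysfs_root)):
        dev = os.path.join(sysfs_root, bdf)
        try:
            with open(os.path.join(dev, "class")) as f:
                pci_class = f.read().strip()
        except OSError:
            continue
        if pci_class.startswith("0x03"):
            paths[bdf] = os.path.realpath(dev)
    return paths
```

Calling group_by_root_port(find_gpu_paths()) gives one list of GPU addresses per root port; `lspci -tv` shows the same tree by eye. Matching these addresses to OpenCL devices needs the runtime to expose the PCI address, which core OpenCL does not do. Some implementations offer it through extensions such as cl_khr_pci_bus_info or NVIDIA's CL_DEVICE_PCI_BUS_ID_NV query, so check what your platform reports.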

Some of the nodes in our cluster have up to 6 GPUs, connected pair-wise through PCIe.

I wonder whether OpenCL offers a way, unknown to me, to find out which GPUs are connected through the same PCIe link, so that I can select decoupled GPUs when using 2 or 4 GPUs instead of 6.

Daniele
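Once the pairing is known, from whatever source, choosing decoupled GPUs is a small scheduling problem. A minimal sketch, assuming the groups are already available as a mapping from a shared link to the GPUs behind it (the function name and the labels in the example are hypothetical):

```python
def pick_decoupled(groups, count):
    """Pick `count` GPUs, taking one from every link group before reusing a group.

    groups maps a shared-link identifier to the list of GPUs behind it.
    Returns fewer than `count` GPUs if there are not enough in total.
    """
    pools = [list(members) for members in groups.values()]
    chosen = []
    while len(chosen) < count and any(pools):
        for pool in pools:
            if pool and len(chosen) < count:
                chosen.append(pool.pop(0))
    return chosen


# Six GPUs connected pair-wise, as in the cluster nodes described above.
pairs = {
    "linkA": ["gpu0", "gpu1"],
    "linkB": ["gpu2", "gpu3"],
    "linkC": ["gpu4", "gpu5"],
}
print(pick_decoupled(pairs, 2))
print(pick_decoupled(pairs, 4))
```

With three pairs, 2 or 3 GPUs can each sit on their own link, but any choice of 4 necessarily puts two of them on the same link.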