How to enable P2P access?

kangliwei98 · February 5, 2023, 3:15pm

Our server has 8 RTX 3090 GPUs, but they cannot peer-access each other, which results in very slow P2P bandwidth (~3 GB/s).

Some details of the server are below. Please let me know if any other information is needed.

Result of “nvidia-smi topo -m”:
      GPU0  GPU1  GPU2  GPU3  GPU4  GPU5  GPU6  GPU7  CPU Affinity  NUMA Affinity
GPU0  X     PIX   PIX   PIX   PXB   PXB   PXB   PXB   0-23,48-71    0
GPU1  PIX   X     PIX   PIX   PXB   PXB   PXB   PXB   0-23,48-71    0
GPU2  PIX   PIX   X     PIX   PXB   PXB   PXB   PXB   0-23,48-71    0
GPU3  PIX   PIX   PIX   X     PXB   PXB   PXB   PXB   0-23,48-71    0
GPU4  PXB   PXB   PXB   PXB   X     PIX   PIX   PIX   0-23,48-71    0
GPU5  PXB   PXB   PXB   PXB   PIX   X     PIX   PIX   0-23,48-71    0
GPU6  PXB   PXB   PXB   PXB   PIX   PIX   X     PIX   0-23,48-71    0
GPU7  PXB   PXB   PXB   PXB   PIX   PIX   PIX   X     0-23,48-71    0

Result of “nvidia-smi topo -p2p r”:
      GPU0  GPU1  GPU2  GPU3  GPU4  GPU5  GPU6  GPU7
GPU0  X     CNS   CNS   CNS   CNS   CNS   CNS   CNS
GPU1  CNS   X     CNS   CNS   CNS   CNS   CNS   CNS
GPU2  CNS   CNS   X     CNS   CNS   CNS   CNS   CNS
GPU3  CNS   CNS   CNS   X     CNS   CNS   CNS   CNS
GPU4  CNS   CNS   CNS   CNS   X     CNS   CNS   CNS
GPU5  CNS   CNS   CNS   CNS   CNS   X     CNS   CNS
GPU6  CNS   CNS   CNS   CNS   CNS   CNS   X     CNS
GPU7  CNS   CNS   CNS   CNS   CNS   CNS   CNS   X

CUDA version: 11.3, NVIDIA driver version: 460.91.03
Server model: ASUS ESC8000 G4

The reported status seems to be “chipset not supported” (CNS), but these GPUs are connected via PIX or PXB and have the same architecture, so shouldn’t they be able to peer-access each other?

VT-d is disabled, but P2P bandwidth is still very low, and training with 8 GPUs is almost as slow as training with 1 GPU because of the communication overhead.
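
For anyone who wants to check a matrix like the “nvidia-smi topo -p2p r” one above without reading it cell by cell, here is a minimal Python sketch that parses the text and counts the status codes per GPU pair. The function names (parse_p2p_matrix, summarize) are hypothetical helpers written for this illustration, and the legend dictionary reflects the codes nvidia-smi is assumed to print; compare it against the legend your own driver version shows.

```python
# Legend for the P2P status matrix. This is an assumption based on the
# legend nvidia-smi prints; verify it against your driver's output.
P2P_LEGEND = {
    "X": "self",
    "OK": "status ok",
    "CNS": "chipset not supported",
    "GNS": "GPU not supported",
    "TNS": "topology not supported",
    "NS": "not supported",
    "U": "unknown",
}

# The matrix posted in this thread.
SAMPLE = """
      GPU0  GPU1  GPU2  GPU3  GPU4  GPU5  GPU6  GPU7
GPU0  X     CNS   CNS   CNS   CNS   CNS   CNS   CNS
GPU1  CNS   X     CNS   CNS   CNS   CNS   CNS   CNS
GPU2  CNS   CNS   X     CNS   CNS   CNS   CNS   CNS
GPU3  CNS   CNS   CNS   X     CNS   CNS   CNS   CNS
GPU4  CNS   CNS   CNS   CNS   X     CNS   CNS   CNS
GPU5  CNS   CNS   CNS   CNS   CNS   X     CNS   CNS
GPU6  CNS   CNS   CNS   CNS   CNS   CNS   X     CNS
GPU7  CNS   CNS   CNS   CNS   CNS   CNS   CNS   X
"""


def parse_p2p_matrix(text):
    """Return {(src, dst): status_code} for every cell of the matrix."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    header = rows[0]  # column labels: GPU0, GPU1, ...
    matrix = {}
    for row in rows[1:]:
        src, cells = row[0], row[1:]
        if not src.startswith("GPU"):
            continue  # ignore any trailing legend lines
        for dst, code in zip(header, cells):
            matrix[(src, dst)] = code
    return matrix


def summarize(matrix):
    """Count status codes over all off-diagonal (src != dst) pairs."""
    counts = {}
    for (src, dst), code in matrix.items():
        if src != dst:
            counts[code] = counts.get(code, 0) + 1
    return counts


if __name__ == "__main__":
    for code, n in summarize(parse_p2p_matrix(SAMPLE)).items():
        print(f"{code} ({P2P_LEGEND.get(code, 'unrecognized code')}): {n} pairs")
```

On the matrix from this post, every one of the 56 ordered GPU pairs comes back as CNS, so no pair has P2P read access, including pairs that sit under the same PCIe bridge (PIX).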

rs277 · February 5, 2023, 6:31pm

Seems to be somewhat of a challenging process. See this thread:

kangliwei98 · February 6, 2023, 12:16am

Thanks for the pointer. The server is not using NVLink. Does the RTX 3090 have to use NVLink to have P2P access?

rs277 · February 6, 2023, 1:15am

It’s certainly by far the best way if supported.

I was primarily replying to a previous post, since removed, that suggested using it and so I offered the thread above.

Given the apparent NVLink difficulties (and even if you can get it working, it appears to be limited to 2 cards), you’re stuck with PCIe transfers.

The ASUS ESC8000 G4 only supports PCIe Gen 3 x16 for the GPUs, while the 3090 has a Gen 4 interface, so there is a limitation from the start. I can’t offer detailed advice, as I have no direct experience with large multi-GPU setups. Perhaps njuffa will respond.
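
To put rough numbers on the Gen 3 versus Gen 4 point, the following Python sketch computes the theoretical one-direction peak of an x16 link from the per-lane signalling rate (8 GT/s for Gen 3, 16 GT/s for Gen 4) and the 128b/130b line encoding both generations use. The helper name pcie_peak_gb_per_s is made up for this illustration, and the result ignores packet and protocol overhead, so achievable bandwidth will be lower.

```python
# Per-lane raw signalling rate in GT/s for each PCIe generation considered.
PCIE_GT_PER_S = {3: 8.0, 4: 16.0}

# Gen 3 and Gen 4 both use 128b/130b encoding: 128 payload bits per 130 sent.
ENCODING_EFFICIENCY = 128 / 130


def pcie_peak_gb_per_s(gen, lanes=16):
    """Theoretical one-direction peak in GB/s (1 GB = 1e9 bytes).

    Packet headers and flow-control overhead are not modelled.
    """
    bits_per_s = PCIE_GT_PER_S[gen] * 1e9 * ENCODING_EFFICIENCY * lanes
    return bits_per_s / 8 / 1e9


gen3 = pcie_peak_gb_per_s(3)
gen4 = pcie_peak_gb_per_s(4)

if __name__ == "__main__":
    print(f"Gen 3 x16: {gen3:.2f} GB/s")
    print(f"Gen 4 x16: {gen4:.2f} GB/s")
    # ~3 GB/s is the P2P bandwidth reported at the top of this thread.
    print(f"Reported 3 GB/s is {3.0 / gen3:.0%} of the Gen 3 x16 peak")
```

This gives roughly 15.75 GB/s for Gen 3 x16 and twice that for Gen 4 x16. For comparison, the ~3 GB/s reported at the top of the thread is about a fifth of the Gen 3 x16 figure.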

Depending on the nature of your workload, this thread might be worth checking as well:
