Hi,
I have two servers: a Dell R730 with two P100s and a Dell R740 with three P100s. For some reason peer-to-peer works on the R740 but not on the R730. If anyone could advise how to enable it on the R730, or confirm that it simply isn't possible on that host, that would be helpful. Here is the output of the following commands on each machine:
=======================
nvidia-smi topo -m # ON R730
GPU0 GPU1 CPU Affinity
GPU0 X SYS 0-0,2-2,4-4,6-6,8-8,10-10,12-12,14-14,16-16,18-18,20-20,22-22,24-24,26-26,28-28,30-30,32-32,34-34,36-36,38-38
GPU1 SYS X 1-1,3-3,5-5,7-7,9-9,11-11,13-13,15-15,17-17,19-19,21-21,23-23,25-25,27-27,29-29,31-31,33-33,35-35,37-37,39-39
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe switches (without traversing the PCIe Host Bridge)
PIX = Connection traversing a single PCIe switch
NV# = Connection traversing a bonded set of # NVLinks
=======================
nvidia-smi topo -m # ON R740
GPU0 GPU1 GPU2 CPU Affinity
GPU0 X SYS SYS 0-0,2-2,4-4,6-6,8-8,10-10,12-12,14-14,16-16,18-18,20-20,22-22,24-24,26-26,28-28,30-30,32-32,34-34,36-36,38-38,40-40,42-42
GPU1 SYS X NODE 1-1,3-3,5-5,7-7,9-9,11-11,13-13,15-15,17-17,19-19,21-21,23-23,25-25,27-27,29-29,31-31,33-33,35-35,37-37,39-39,41-41,43-43
GPU2 SYS NODE X 1-1,3-3,5-5,7-7,9-9,11-11,13-13,15-15,17-17,19-19,21-21,23-23,25-25,27-27,29-29,31-31,33-33,35-35,37-37,39-39,41-41,43-43
Legend: same as above.
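For what it's worth, the SYS/NODE classification in these matrices can also be read programmatically through NVML's nvmlDeviceGetTopologyCommonAncestor. A sketch, assuming a driver recent enough to export that entry point (link with -lnvidia-ml; the file name is my own):

// topo_query.cpp -- print the common-ancestor topology level per GPU pair
#include <cstdio>
#include <nvml.h>

int main() {
    if (nvmlInit() != NVML_SUCCESS) return 1;
    unsigned int n = 0;
    nvmlDeviceGetCount(&n);
    for (unsigned int i = 0; i < n; ++i) {
        for (unsigned int j = i + 1; j < n; ++j) {
            nvmlDevice_t a, b;
            nvmlDeviceGetHandleByIndex(i, &a);
            nvmlDeviceGetHandleByIndex(j, &b);
            nvmlGpuTopologyLevel_t level;
            // Mirrors what nvidia-smi topo -m reports: NVML_TOPOLOGY_SYSTEM ~ SYS,
            // NVML_TOPOLOGY_NODE ~ NODE, NVML_TOPOLOGY_HOSTBRIDGE ~ PHB, ...
            if (nvmlDeviceGetTopologyCommonAncestor(a, b, &level) == NVML_SUCCESS)
                printf("GPU%u <-> GPU%u : topology level %d\n", i, j, (int)level);
        }
    }
    nvmlShutdown();
    return 0;
}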
=====================================================
/usr/local/cuda-9.1/samples/0_Simple/simpleP2P/simpleP2P # ON R730
[/usr/local/cuda-9.1/samples/0_Simple/simpleP2P/simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2
> GPU0 = "Tesla P100-PCIE-16GB" IS capable of Peer-to-Peer (P2P)
> GPU1 = "Tesla P100-PCIE-16GB" IS capable of Peer-to-Peer (P2P)
Checking GPU(s) for support of peer to peer memory access...
> Peer access from Tesla P100-PCIE-16GB (GPU0) -> Tesla P100-PCIE-16GB (GPU1) : No
> Peer access from Tesla P100-PCIE-16GB (GPU1) -> Tesla P100-PCIE-16GB (GPU0) : No
Two or more GPUs with SM 2.0 or higher capability are required for /usr/local/cuda-9.1/samples/0_Simple/simpleP2P/simpleP2P.
Peer to Peer access is not available amongst GPUs in the system, waiving test.
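In case it helps anyone reproduce this without building the samples tree, the check the sample performs boils down to the standard CUDA runtime call cudaDeviceCanAccessPeer. A minimal sketch (file name is my own; build with nvcc p2p_check.cu -o p2p_check):

// p2p_check.cu -- report P2P capability for every GPU pair
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    printf("CUDA-capable device count: %d\n", n);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            int ok = 0;
            // ok == 1 if device i can map and access memory on device j
            cudaDeviceCanAccessPeer(&ok, i, j);
            printf("GPU%d -> GPU%d : %s\n", i, j, ok ? "Yes" : "No");
        }
    }
    return 0;
}

On the R730 this prints "No" in both directions, matching the sample output above.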
=====================================================
/usr/local/cuda-9.1/samples/0_Simple/simpleP2P/simpleP2P # ON R740
Checking for multiple GPUs...
CUDA-capable device count: 3
> GPU0 = "Tesla P100-PCIE-16GB" IS capable of Peer-to-Peer (P2P)
> GPU1 = "Tesla P100-PCIE-16GB" IS capable of Peer-to-Peer (P2P)
> GPU2 = "Tesla P100-PCIE-16GB" IS capable of Peer-to-Peer (P2P)
Checking GPU(s) for support of peer to peer memory access...
> Peer access from Tesla P100-PCIE-16GB (GPU0) -> Tesla P100-PCIE-16GB (GPU1) : Yes
> Peer access from Tesla P100-PCIE-16GB (GPU0) -> Tesla P100-PCIE-16GB (GPU2) : Yes
> Peer access from Tesla P100-PCIE-16GB (GPU1) -> Tesla P100-PCIE-16GB (GPU0) : Yes
> Peer access from Tesla P100-PCIE-16GB (GPU1) -> Tesla P100-PCIE-16GB (GPU2) : Yes
> Peer access from Tesla P100-PCIE-16GB (GPU2) -> Tesla P100-PCIE-16GB (GPU0) : Yes
> Peer access from Tesla P100-PCIE-16GB (GPU2) -> Tesla P100-PCIE-16GB (GPU1) : Yes
Enabling peer access between GPU0 and GPU1...
Checking GPU0 and GPU1 for UVA capabilities...
> Tesla P100-PCIE-16GB (GPU0) supports UVA: Yes
> Tesla P100-PCIE-16GB (GPU1) supports UVA: Yes
Both GPUs can support UVA, enabling...
Allocating buffers (64MB on GPU0, GPU1 and CPU Host)...
Creating event handles...
cudaMemcpyPeer / cudaMemcpy between GPU0 and GPU1: 8.20GB/s
Preparing host buffer and memcpy to GPU0...
Run kernel on GPU1, taking source data from GPU0 and writing to GPU1...
Run kernel on GPU0, taking source data from GPU1 and writing to GPU0...
Copy data back to host from GPU0 and verify results...
Disabling peer access...
Shutting down...
Test passed
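For reference, the passing run above boils down to roughly this sequence; a sketch of the enable/copy/disable steps, not the sample's exact code, with error checking omitted:

// p2p_copy.cu -- enable peer access between GPU0 and GPU1, then copy
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 64 * 1024 * 1024;  // 64 MB, as in the sample
    int ok = 0;
    cudaDeviceCanAccessPeer(&ok, 0, 1);
    if (!ok) { printf("P2P not available, skipping\n"); return 0; }

    // Each device must enable access to the other explicitly
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);   // flags argument must be 0
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);

    void *d0 = nullptr, *d1 = nullptr;
    cudaSetDevice(0); cudaMalloc(&d0, bytes);
    cudaSetDevice(1); cudaMalloc(&d1, bytes);

    // With peer access enabled this copy goes GPU-to-GPU over PCIe,
    // without staging through host memory
    cudaMemcpyPeer(d1, 1, d0, 0, bytes);
    cudaDeviceSynchronize();

    cudaFree(d1);
    cudaDeviceDisablePeerAccess(0);     // still on device 1
    cudaSetDevice(0);
    cudaFree(d0);
    cudaDeviceDisablePeerAccess(1);
    printf("done\n");
    return 0;
}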