CUDA peer-to-peer (P2P) access between my two GPUs is apparently not working. The machine is a Dell Precision T3500 workstation with an Intel Xeon X5670 and an Intel X58 IOH, running Debian 10. There are two GeForce GTX 1050 graphics cards. According to https://www.intel.com/content/dam/doc/datasheet/x58-express-chipset-datasheet.pdf, local peer-to-peer traffic does not appear to cross the QPI (at least under certain circumstances); see also Figure 7-3 on p. 114. The relevant Dell BIOS option appears to be “High Performance IO”, and I have switched it on.
Can you kindly advise?
Is there a list of approved systems (boards) for P2P communication?
CUDA and driver version:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.32.00    Driver Version: 455.32.00    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
Compute capability: 6.1
$ nvidia-smi topo -m (partial)
        GPU0    GPU1    CPU Affinity    NUMA Affinity
GPU0     X      PHB     0-11            N/A
GPU1    PHB      X      0-11            N/A
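For anyone reading the matrix above, here is a minimal Python sketch that pulls the GPU-to-GPU link types out of the pasted `nvidia-smi topo -m` text. The function name `parse_topo` and the `LEGEND` dictionary are my own (hypothetical) names; the legend strings are paraphrased from the legend that `nvidia-smi topo -m` prints. Note that a `PHB` link only describes the path (through the PCIe host bridge, here presumably the X58 IOH); it does not by itself say whether the driver will enable P2P.

```python
# Paraphrased from the legend printed by `nvidia-smi topo -m`.
LEGEND = {
    "X": "self",
    "PIX": "at most a single PCIe bridge",
    "PXB": "multiple PCIe bridges, without the PCIe host bridge",
    "PHB": "PCIe plus a PCIe host bridge (typically the CPU)",
    "NODE": "PCIe plus the interconnect between host bridges in a NUMA node",
    "SYS": "PCIe plus the SMP interconnect between NUMA nodes (e.g. QPI)",
}

# The (partial) matrix exactly as pasted in this post.
TOPO_SAMPLE = """\
GPU0 GPU1 CPU Affinity NUMA Affinity
GPU0 X PHB 0-11 N/A
GPU1 PHB X 0-11 N/A
"""


def parse_topo(text):
    """Return {(src_gpu, dst_gpu): link_type} for every distinct GPU pair."""
    rows = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    # Column headers that name GPUs; the affinity columns are ignored.
    gpus = [tok for tok in rows[0] if tok.startswith("GPU")]
    links = {}
    for row in rows[1:]:
        if row[0] not in gpus:
            continue  # skip anything that is not a GPU row (e.g. NIC rows)
        src = row[0]
        # The first len(gpus) cells after the row label are the link types.
        for dst, link in zip(gpus, row[1:1 + len(gpus)]):
            if src != dst:
                links[(src, dst)] = link
    return links


if __name__ == "__main__":
    for (src, dst), link in sorted(parse_topo(TOPO_SAMPLE).items()):
        print(f"{src} -> {dst}: {link} ({LEGEND.get(link, 'unknown')})")
```

This is only a reading aid for the output above, assuming the whitespace-separated layout shown in this post; it does not query the driver.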
$ ./simpleP2P
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2
Checking GPU(s) for support of peer to peer memory access...
> Peer access from GeForce GTX 1050 (GPU0) -> GeForce GTX 1050 (GPU1) : No
> Peer access from GeForce GTX 1050 (GPU1) -> GeForce GTX 1050 (GPU0) : No
Two or more GPUs with Peer-to-Peer access capability are required for ./simpleP2P.
Peer to Peer access is not available amongst GPUs in the system, waiving test.
$ lspci -tv (partial)
-+-[0000:3f]-+-00.0  Intel Corporation Xeon 5600 Series QuickPath Architecture Generic Non-core Registers
 |           +-00.1  Intel Corporation Xeon 5600 Series QuickPath Architecture System Address Decoder
 |           +-02.0  Intel Corporation Xeon 5600 Series QPI Link 0
 |           +-02.1  Intel Corporation Xeon 5600 Series QPI Physical 0
 |           +-02.2  Intel Corporation Xeon 5600 Series Mirror Port Link 0
 |           +-02.3  Intel Corporation Xeon 5600 Series Mirror Port Link 1
 |           +-02.4  Intel Corporation Xeon 5600 Series QPI Link 1
 |           +-02.5  Intel Corporation Xeon 5600 Series QPI Physical 1
 |           +-03.0  Intel Corporation Xeon 5600 Series Integrated Memory Controller Registers
 |           +-03.1  Intel Corporation Xeon 5600 Series Integrated Memory Controller Target Address Decoder
 |           +-03.2  Intel Corporation Xeon 5600 Series Integrated Memory Controller RAS Registers
 |           +-03.4  Intel Corporation Xeon 5600 Series Integrated Memory Controller Test Registers
 |           +-04.0  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Control
 |           +-04.1  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Address
 |           +-04.2  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Rank
 |           +-04.3  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Thermal Control
 |           +-05.0  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Control
 |           +-05.1  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Address
 |           +-05.2  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Rank
 |           +-05.3  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Thermal Control
 |           +-06.0  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Control
 |           +-06.1  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Address
 |           +-06.2  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Rank
 |           \-06.3  Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Thermal Control
 \-[0000:00]-+-00.0  Intel Corporation 5520/5500/X58 I/O Hub to ESI Port
             +-01.0-[01]--+-00.0  Intel Corporation 82571EB Gigabit Ethernet Controller
             |            \-00.1  Intel Corporation 82571EB Gigabit Ethernet Controller
             +-03.0-[02]--+-00.0  NVIDIA Corporation GP107 [GeForce GTX 1050]
             |            \-00.1  NVIDIA Corporation GP107GL High Definition Audio Controller
             +-07.0-[03]--+-00.0  NVIDIA Corporation GP107 [GeForce GTX 1050]
             |            \-00.1  NVIDIA Corporation GP107GL High Definition Audio Controller
.
.
.
$ sudo lspci -s 0000:00:03.0 -vvvv | grep -i acs
ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
$ sudo lspci -s 0000:00:07.0 -vvvv | grep -i acs
ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
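To make the ACS result explicit, here is a small Python sketch that decodes an `ACSCtl:` line from `lspci -vvv` into on/off flags. The helper names `parse_acsctl` and `acs_redirects_p2p` are hypothetical, and the rule it applies is my assumption: I am treating `ReqRedir+` or `CmpltRedir+` as the settings that would force peer traffic upstream instead of letting it be routed directly. On both root ports above every flag is `-`, so ACS does not appear to be what blocks P2P here.

```python
# One of the ACSCtl lines exactly as pasted in this post (both ports match).
ACS_SAMPLE = (
    "ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- "
    "UpstreamFwd- EgressCtrl- DirectTrans-"
)


def parse_acsctl(line):
    """Map each ACS control flag name to True ('+') or False ('-')."""
    # Everything after the 'ACSCtl:' label; empty if the label is absent.
    _, _, rest = line.partition("ACSCtl:")
    flags = {}
    for tok in rest.split():
        if tok[-1] in "+-":
            flags[tok[:-1]] = tok.endswith("+")
    return flags


def acs_redirects_p2p(flags):
    """Assumption: request/completion redirect are the flags that matter."""
    return flags.get("ReqRedir", False) or flags.get("CmpltRedir", False)


if __name__ == "__main__":
    flags = parse_acsctl(ACS_SAMPLE)
    enabled = [name for name, on in flags.items() if on]
    print("enabled ACS controls:", enabled or "none")
    print("ACS redirecting peer traffic:", acs_redirects_p2p(flags))
```

The same function can be fed the output of the two `lspci ... | grep -i acs` commands above, one line at a time.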