I am running machine learning applications on my server, namely TensorFlow.
I have two GPUs, but I notice that only one GPU is used when I train models. After looking into it, I found that peer-to-peer (P2P) access is not enabled between them. Could someone help me enable this option so that both cards are used?
system details
os: ubuntu 16.04
kernel: 4.10.0-33
nvidia driver: 381.22
cuda: 8.0.61
cudnn: 6
mobo: asrock x99 ws-e
tensorflow output
tensorflow/core/common_runtime/gpu/gpu_device.cc:847] Peer access not supported between device ordinals 0 and 1
tensorflow/core/common_runtime/gpu/gpu_device.cc:847] Peer access not supported between device ordinals 1 and 0
tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 1
tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0: Y N
tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 1: N Y
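For anyone who wants to check this matrix from a script, here is a minimal sketch that reads the three DMA lines above and lists the device pairs without peer access. It assumes that "Y" means peer access is available and "N" means it is not, and that the log keeps this exact layout. The function names are made up for illustration and are not part of TensorFlow.

```python
import re

# Log lines copied from the TensorFlow output above.
LOG = """\
tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 1
tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0: Y N
tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 1: N Y
"""


def parse_dma_matrix(log_text):
    """Return {(src, dst): bool} built from the 'DMA:' header and its rows."""
    columns = None
    access = {}
    for line in log_text.splitlines():
        # Keep only the message after the 'file.cc:NNN] ' prefix.
        message = line.split("] ", 1)[-1].strip()
        header = re.match(r"DMA:\s+(.*)", message)
        if header:
            columns = [int(tok) for tok in header.group(1).split()]
            continue
        row = re.match(r"(\d+):\s+(.*)", message)
        if row and columns is not None:
            src = int(row.group(1))
            flags = row.group(2).split()
            for dst, flag in zip(columns, flags):
                # Assumption: "Y" marks a pair with peer access.
                access[(src, dst)] = (flag == "Y")
    return access


def pairs_without_peer_access(access):
    """List (src, dst) pairs of distinct devices that lack peer access."""
    return sorted((s, d) for (s, d), ok in access.items() if s != d and not ok)


matrix = parse_dma_matrix(LOG)
print(pairs_without_peer_access(matrix))  # [(0, 1), (1, 0)]
```

For the log above this reports both directions, which matches the two "Peer access not supported" messages.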
running cuda/samples/1_Utilities/p2pBandwidthLatencyTest/p2pBandwidthLatencyTest
root [p2pBandwidthLatencyTest]$ ./p2pBandwidthLatencyTest
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, GeForce GTX 1080 Ti, pciBusID: 4, pciDeviceID: 0, pciDomainID:0
Device: 1, GeForce GTX TITAN X, pciBusID: 9, pciDeviceID: 0, pciDomainID:0
Device=0 CANNOT Access Peer Device=1
Device=1 CANNOT Access Peer Device=0
nvidia-smi topology info
root [~]$ nvidia-smi topo -m
        GPU0    GPU1    CPU Affinity
GPU0     X      PHB     0-11
GPU1    PHB      X      0-11
Legend:
X = Self
SOC = Connection traversing PCIe as well as the SMP link between CPU sockets(e.g. QPI)
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe switches (without traversing the PCIe Host Bridge)
PIX = Connection traversing a single PCIe switch
NV# = Connection traversing a bonded set of # NVLinks
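To read the matrix against the legend without doing it by eye, here is a rough sketch that parses the topology rows and looks up each link code. The matrix and legend text are copied from the output above. The parsing assumes whitespace-separated columns and that every GPU column header starts with "GPU". The helper names are hypothetical.

```python
# Legend text copied from the nvidia-smi output above.
LEGEND = {
    "X": "Self",
    "SOC": "Connection traversing PCIe as well as the SMP link between CPU sockets(e.g. QPI)",
    "PHB": "Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)",
    "PXB": "Connection traversing multiple PCIe switches (without traversing the PCIe Host Bridge)",
    "PIX": "Connection traversing a single PCIe switch",
    "NV#": "Connection traversing a bonded set of # NVLinks",
}

# Matrix copied from `nvidia-smi topo -m` above.
TOPO = """\
GPU0 GPU1 CPU Affinity
GPU0 X PHB 0-11
GPU1 PHB X 0-11
"""


def parse_topology(text):
    """Return {(row_gpu, col_gpu): link_code} from the topology matrix."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Assumption: GPU columns are the header tokens that start with "GPU".
    columns = [tok for tok in lines[0].split() if tok.startswith("GPU")]
    links = {}
    for line in lines[1:]:
        tokens = line.split()
        row = tokens[0]
        # zip stops after the GPU columns, so the CPU affinity field is skipped.
        for col, code in zip(columns, tokens[1:]):
            links[(row, col)] = code
    return links


def describe(code):
    """Map a link code to its legend text; NV1, NV2, ... share the NV# entry."""
    if code.startswith("NV"):
        return LEGEND["NV#"]
    return LEGEND.get(code, "unknown")


links = parse_topology(TOPO)
for (a, b), code in sorted(links.items()):
    if a < b:  # print each unordered pair once
        print(a, b, code, "-", describe(code))
```

On my output this shows that GPU0 and GPU1 are linked as PHB, i.e. through a PCIe Host Bridge.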
lspci info
root [~]$ lspci | grep NVIDIA
04:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
04:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
09:00.0 VGA compatible controller: NVIDIA Corporation GM200 [GeForce GTX TITAN X] (rev a1)
09:00.1 Audio device: NVIDIA Corporation Device 0fb0 (rev a1)
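Since lspci lists the first card only as "Device 1b06", here is a small sketch that matches the lspci entries to the CUDA devices by PCI bus number. It assumes the bus field in lspci is hexadecimal and that the pciBusID values printed by p2pBandwidthLatencyTest are the same bus numbers in decimal. The device table is typed in by hand from the test output above, and the function name is made up.

```python
import re

# Output of `lspci | grep NVIDIA`, copied from above.
LSPCI = """\
04:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
04:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
09:00.0 VGA compatible controller: NVIDIA Corporation GM200 [GeForce GTX TITAN X] (rev a1)
09:00.1 Audio device: NVIDIA Corporation Device 0fb0 (rev a1)
"""

# CUDA device index -> (name, pciBusID), copied from the
# p2pBandwidthLatencyTest output above.
CUDA_DEVICES = {
    0: ("GeForce GTX 1080 Ti", 4),
    1: ("GeForce GTX TITAN X", 9),
}


def vga_bus_numbers(lspci_text):
    """Return {bus_number: description} for the VGA controller entries."""
    buses = {}
    for line in lspci_text.splitlines():
        m = re.match(r"([0-9a-f]{2}):([0-9a-f]{2})\.([0-7]) (.+?): (.+)", line)
        if m and m.group(4) == "VGA compatible controller":
            # Assumption: the bus field is hexadecimal.
            buses[int(m.group(1), 16)] = m.group(5)
    return buses


buses = vga_bus_numbers(LSPCI)
for index, (name, bus_id) in sorted(CUDA_DEVICES.items()):
    print(index, name, "->", buses.get(bus_id, "no matching lspci entry"))
```

With the data above, CUDA device 0 (the GTX 1080 Ti) lines up with the 04:00.0 entry and device 1 (the GTX TITAN X) with 09:00.0.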