Hello Forum,
I’m just starting to get my feet wet with multi-GPU programming. I’m running GNU/Linux x86_64 (Ubuntu 14.04 LTS) and have installed the latest version of the CUDA Toolkit (CUDA 7.5).
I attempted to run the simpleP2P sample from the CUDA samples and I’m getting the following output:
/usr/local/cuda-7.5/samples/bin/x86_64/linux/release$ ./simpleP2P
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2
> GPU0 = " Tesla K20c" IS capable of Peer-to-Peer (P2P)
> GPU1 = " Tesla C2070" IS capable of Peer-to-Peer (P2P)
Checking GPU(s) for support of peer to peer memory access...
> Peer access from Tesla K20c (GPU0) -> Tesla C2070 (GPU1) : No
> Peer access from Tesla C2070 (GPU1) -> Tesla K20c (GPU0) : No
Two or more GPUs with SM 2.0 or higher capability are required for ./simpleP2P.
Peer to Peer access is not available amongst GPUs in the system, waiving test.
Both GPUs are SM 2.0 or higher, so I guess that’s not the problem. Here is partial output from the deviceQuery sample:
Detected 2 CUDA Capable device(s)
Device 0: "Tesla K20c"
CUDA Driver Version / Runtime Version 7.5 / 7.5
CUDA Capability Major/Minor version number: 3.5
...
Device 1: "Tesla C2070"
CUDA Driver Version / Runtime Version 7.5 / 7.5
CUDA Capability Major/Minor version number: 2.0
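To double-check this outside the sample, I sketched out what I understand simpleP2P to be doing internally: calling cudaDeviceCanAccessPeer for each ordered pair of devices. This is my own reduction, not the sample’s actual code, so take it as an approximation; it should report exactly what the sample prints above.

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int a = 0; a < count; ++a) {
        for (int b = 0; b < count; ++b) {
            if (a == b) continue;
            int canAccess = 0;
            // Asks the driver whether device 'a' can directly map device 'b's memory
            cudaDeviceCanAccessPeer(&canAccess, a, b);
            printf("Peer access GPU%d -> GPU%d : %s\n", a, b, canAccess ? "Yes" : "No");
        }
    }
    return 0;
}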
I’ve also run the simpleMultiGPU sample and it fails, which I assume is due to the missing P2P support. Is my assumption correct?
/usr/local/cuda-7.5/samples/bin/x86_64/linux/release$ ./simpleMultiGPU
Starting simpleMultiGPU
CUDA-capable device count: 2
Generating input data...
CUDA error at simpleMultiGPU.cu:121 code=2(cudaErrorMemoryAllocation) "cudaStreamCreate(&plan[i].stream)"
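Judging by the message, the failing call is cudaStreamCreate on the per-GPU plan struct, wrapped in the sample’s error-checking macro. A stripped-down, error-checked version of that per-device loop (my own approximation, not the sample’s exact code) would be:

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaSetDevice(dev);  // streams are created on the currently selected device
        cudaStream_t stream;
        cudaError_t err = cudaStreamCreate(&stream);
        if (err != cudaSuccess) {
            // This is where the sample aborts with code=2 (cudaErrorMemoryAllocation)
            fprintf(stderr, "GPU%d: cudaStreamCreate failed: %s\n",
                    dev, cudaGetErrorString(err));
            continue;
        }
        printf("GPU%d: stream created OK\n", dev);
        cudaStreamDestroy(stream);
    }
    return 0;
}

Running something like this per device might at least show whether the failure is tied to one specific GPU.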
Another developer experiencing a similar problem also posted the following diagnostics:
- Checking whether the GPU cards share the same PCIe root complex:
$ lspci | grep NVIDIA
03:00.0 VGA compatible controller: NVIDIA Corporation GF100GL [Tesla C2050 / C2070] (rev a3)
03:00.1 Audio device: NVIDIA Corporation GF100 High Definition Audio Controller (rev a1)
04:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)
- Relevant output from lspci -t:
\-[0000:00]-+-00.0
+-01.0-[01-02]----00.0-[02]--
+-03.0-[03]--+-00.0
| \-00.1
+-07.0-[04]----00.0
- In addition, the topology information reported by the nvidia-smi tool:
$ nvidia-smi topo -m
GPU0 GPU1 CPU Affinity
GPU0 X PHB
GPU1 PHB X 0-5
Legend:
X = Self
SOC = Path traverses a socket-level link (e.g. QPI)
PHB = Path traverses a PCIe host bridge
PXB = Path traverses multiple PCIe internal switches
PIX = Path traverses a PCIe internal switch
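In case it helps to correlate the CUDA device ordinals with the PCI addresses from lspci above, here is a small snippet of my own (not part of the samples) that prints each device’s bus ID via cudaDeviceGetPCIBusId:

#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        char busId[64] = {0};
        // Produces an identifier like 0000:04:00.0, matching the lspci listing
        cudaDeviceGetPCIBusId(busId, (int)sizeof(busId), dev);
        printf("GPU%d: %s at %s\n", dev, prop.name, busId);
    }
    return 0;
}

Any insight into why peer access is reported as unavailable here would be much appreciated.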