I’m just starting to get my feet wet with multi-GPU programming. I’m running GNU/Linux x86_64 (Ubuntu 14.04 LTS).
I have installed the latest version of the CUDA Toolkit (CUDA 7.5).
I attempted to run the simpleP2P sample from the CUDA samples and I’m getting the following output:
/usr/local/cuda-7.5/samples/bin/x86_64/linux/release$ ./simpleP2P
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2
> GPU0 = " Tesla K20c" IS capable of Peer-to-Peer (P2P)
> GPU1 = " Tesla C2070" IS capable of Peer-to-Peer (P2P)
Checking GPU(s) for support of peer to peer memory access...
> Peer access from Tesla K20c (GPU0) -> Tesla C2070 (GPU1) : No
> Peer access from Tesla C2070 (GPU1) -> Tesla K20c (GPU0) : No
Two or more GPUs with SM 2.0 or higher capability are required for ./simpleP2P.
Peer to Peer access is not available amongst GPUs in the system, waiving test.
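For reference, the peer-access check the sample performs can be reproduced directly with cudaDeviceCanAccessPeer. This is a minimal sketch I put together (the device ordinals 0 and 1 are assumed to match the deviceQuery numbering below), and it reports "No" in both directions on my system as well:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess || count < 2) {
        printf("Need at least two CUDA devices (found %d)\n", count);
        return 1;
    }

    // Query peer access in both directions, as simpleP2P does.
    int can01 = 0, can10 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1);  // can device 0 access device 1?
    cudaDeviceCanAccessPeer(&can10, 1, 0);  // can device 1 access device 0?

    printf("Peer access 0 -> 1 : %s\n", can01 ? "Yes" : "No");
    printf("Peer access 1 -> 0 : %s\n", can10 ? "Yes" : "No");
    return 0;
}
```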
The GPUs are both SM 2.0 or higher so I guess that’s not the problem. This is a partial output from the deviceQuery code:
Detected 2 CUDA Capable device(s)

Device 0: "Tesla K20c"
  CUDA Driver Version / Runtime Version          7.5 / 7.5
  CUDA Capability Major/Minor version number:    3.5
...
Device 1: "Tesla C2070"
  CUDA Driver Version / Runtime Version          7.5 / 7.5
  CUDA Capability Major/Minor version number:    2.0
I’ve also run the simpleMultiGPU sample and it fails as well, which I assume is also due to the lack of P2P support. Is my assumption correct?
/usr/local/cuda-7.5/samples/bin/x86_64/linux/release$ ./simpleMultiGPU
Starting simpleMultiGPU
CUDA-capable device count: 2
Generating input data...
CUDA error at simpleMultiGPU.cu:121 code=2(cudaErrorMemoryAllocation) "cudaStreamCreate(&plan[i].stream)"
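To narrow down which device the cudaStreamCreate failure comes from, I also tried a minimal sketch that mirrors the per-device loop around line 121 of simpleMultiGPU.cu (the assumption here is that the failure is device-specific rather than global):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);

    // Create a stream on each device in turn and report which one fails,
    // as simpleMultiGPU does inside its per-GPU setup loop.
    for (int i = 0; i < count; ++i) {
        cudaSetDevice(i);
        cudaStream_t stream;
        cudaError_t err = cudaStreamCreate(&stream);
        printf("device %d: cudaStreamCreate -> %s\n", i, cudaGetErrorString(err));
        if (err == cudaSuccess) {
            cudaStreamDestroy(stream);
        }
    }
    return 0;
}
```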
Another developer experiencing a similar problem also posted the following diagnostics:
- Checking whether the GPU cards share the same PCI-E root complex.
$ lspci | grep NVIDIA
03:00.0 VGA compatible controller: NVIDIA Corporation GF100GL [Tesla C2050 / C2070] (rev a3)
03:00.1 Audio device: NVIDIA Corporation GF100 High Definition Audio Controller (rev a1)
04:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)
Relevant output from lspci -t:
\-[0000:00]-+-00.0
            +-01.0-[01-02]----00.0
            +-03.0---+-00.0
            |        \-00.1
            +-07.0-----00.0
In addition, here is the topology information reported by the nvidia-smi tool:
$ nvidia-smi topo -m
        GPU0   GPU1   CPU Affinity
GPU0     X     PHB
GPU1    PHB     X     0-5

Legend:
  X    = Self
  SOC  = Path traverses a socket-level link (e.g. QPI)
  PHB  = Path traverses a PCIe host bridge
  PXB  = Path traverses multiple PCIe internal switches
  PIX  = Path traverses a PCIe internal switch