Sorry to bother you. I am using Tesla K80 GPUs, and the simpleP2P sample fails verification. The problem is the same as:
I searched online and found that this problem may be solved by disabling ACSCtl. However, I am running Windows Server 2012 R2. Is there a solution for Windows? Many thanks!
[C:\ProgramData\NVIDIA Corporation\CUDA Samples\v8.0\0_Simple\simpleP2P\../../bin/win64/Debug/simpleP2P.exe] - Starting…
Checking for multiple GPUs…
CUDA-capable device count: 4
GPU0 = " Tesla K80" IS capable of Peer-to-Peer (P2P)
GPU1 = " Tesla K80" IS capable of Peer-to-Peer (P2P)
GPU2 = " Tesla K80" IS capable of Peer-to-Peer (P2P)
GPU3 = " Tesla K80" IS capable of Peer-to-Peer (P2P)
Checking GPU(s) for support of peer to peer memory access…
Peer access from Tesla K80 (GPU0) → Tesla K80 (GPU1) : Yes
Peer access from Tesla K80 (GPU0) → Tesla K80 (GPU2) : No
Peer access from Tesla K80 (GPU0) → Tesla K80 (GPU3) : No
Peer access from Tesla K80 (GPU1) → Tesla K80 (GPU0) : Yes
Peer access from Tesla K80 (GPU1) → Tesla K80 (GPU2) : No
Peer access from Tesla K80 (GPU1) → Tesla K80 (GPU3) : No
Peer access from Tesla K80 (GPU2) → Tesla K80 (GPU0) : No
Peer access from Tesla K80 (GPU2) → Tesla K80 (GPU1) : No
Peer access from Tesla K80 (GPU2) → Tesla K80 (GPU3) : Yes
Peer access from Tesla K80 (GPU3) → Tesla K80 (GPU0) : No
Peer access from Tesla K80 (GPU3) → Tesla K80 (GPU1) : No
Peer access from Tesla K80 (GPU3) → Tesla K80 (GPU2) : Yes
Enabling peer access between GPU0 and GPU1…
Checking GPU0 and GPU1 for UVA capabilities…
Tesla K80 (GPU0) supports UVA: Yes
Tesla K80 (GPU1) supports UVA: Yes
Both GPUs can support UVA, enabling…
Allocating buffers (64MB on GPU0, GPU1 and CPU Host)…
Creating event handles…
cudaMemcpyPeer / cudaMemcpy between GPU0 and GPU1: 1.05GB/s
Preparing host buffer and memcpy to GPU0…
Run kernel on GPU1, taking source data from GPU0 and writing to GPU1…
Run kernel on GPU0, taking source data from GPU1 and writing to GPU0…
Copy data back to host from GPU0 and verify results…
Verification error @ element 0: val = 1.#QNAN0, ref = 0.000000
Verification error @ element 1: val = 1.#QNAN0, ref = 4.000000
Verification error @ element 2: val = 1.#QNAN0, ref = 8.000000
Verification error @ element 3: val = 1.#QNAN0, ref = 12.000000
Verification error @ element 4: val = 1.#QNAN0, ref = 16.000000
Verification error @ element 5: val = 1.#QNAN0, ref = 20.000000
Verification error @ element 6: val = 1.#QNAN0, ref = 24.000000
Verification error @ element 7: val = 1.#QNAN0, ref = 28.000000
Verification error @ element 8: val = 1.#QNAN0, ref = 32.000000
Verification error @ element 9: val = 1.#QNAN0, ref = 36.000000
Verification error @ element 10: val = 1.#QNAN0, ref = 40.000000
Verification error @ element 11: val = 1.#QNAN0, ref = 44.000000
Disabling peer access…
Shutting down…
Test failed!
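
To narrow down whether the corruption comes from the direct peer path, a small round-trip test may help. The following is only a sketch, not part of the simpleP2P sample: the `CHECK` macro and the `roundTrip` helper are hypothetical names, and devices 0 and 1 are assumed because the log above reports peer access between GPU0 and GPU1. It copies a known pattern from GPU0 to GPU1 with `cudaMemcpyPeer`, first without peer access enabled (in which case, as far as I understand, the runtime stages the copy through host memory) and then with peer access enabled in both directions.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message on any CUDA runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical helper: host -> src GPU -> dst GPU -> host.
// Returns the number of elements that came back different (NaN counts as different).
static unsigned long long roundTrip(int src, int dst, bool enablePeer) {
    const size_t n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> in(n), out(n, -1.0f);
    for (size_t i = 0; i < n; ++i) in[i] = static_cast<float>(i % 4096);

    float *dSrc = nullptr, *dDst = nullptr;

    CHECK(cudaSetDevice(src));
    CHECK(cudaMalloc(&dSrc, bytes));
    if (enablePeer) CHECK(cudaDeviceEnablePeerAccess(dst, 0));  // src -> dst
    CHECK(cudaMemcpy(dSrc, in.data(), bytes, cudaMemcpyHostToDevice));

    CHECK(cudaSetDevice(dst));
    CHECK(cudaMalloc(&dDst, bytes));
    if (enablePeer) CHECK(cudaDeviceEnablePeerAccess(src, 0));  // dst -> src

    // Device-to-device copy, then wait for it before reading back.
    CHECK(cudaMemcpyPeer(dDst, dst, dSrc, src, bytes));
    CHECK(cudaDeviceSynchronize());
    CHECK(cudaMemcpy(out.data(), dDst, bytes, cudaMemcpyDeviceToHost));

    unsigned long long bad = 0;
    for (size_t i = 0; i < n; ++i)
        if (out[i] != in[i]) ++bad;

    if (enablePeer) {
        CHECK(cudaDeviceDisablePeerAccess(src));  // current device is dst
        CHECK(cudaSetDevice(src));
        CHECK(cudaDeviceDisablePeerAccess(dst));
    }
    CHECK(cudaSetDevice(dst));
    CHECK(cudaFree(dDst));
    CHECK(cudaSetDevice(src));
    CHECK(cudaFree(dSrc));
    return bad;
}

int main() {
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) {
        printf("Need at least two CUDA devices.\n");
        return 0;
    }

    const int a = 0, b = 1;  // assumption: the pair reported as peer-capable in the log
    int canAccess = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess, a, b));

    printf("peer access disabled: %llu mismatches\n", roundTrip(a, b, false));
    if (canAccess)
        printf("peer access enabled:  %llu mismatches\n", roundTrip(a, b, true));
    else
        printf("GPU%d cannot access GPU%d as a peer; skipping peer test.\n", a, b);
    return 0;
}
```

If the run without peer access reports zero mismatches while the run with peer access does not, that would suggest the direct PCIe path between the two GPUs is at fault rather than the devices themselves. In that case, leaving peer access disabled and relying on `cudaMemcpyPeer` alone may serve as a slower workaround, though this is an assumption to verify on the affected machine.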