Dear garywang,
P2P calls are not supported on the Drive platform.
As the CPU and dGPU are connected via NVLink, data transferred from the CPU to the dGPU goes over NVLink. You can verify this by running the bandwidthTest CUDA sample.
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 29250.3
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 29504.7
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 96765.0
Result = PASS
…
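In case it helps, below is a minimal sketch of a pinned host-to-device bandwidth measurement in the same spirit as the bandwidthTest sample. It only uses standard CUDA runtime calls; the buffer size and iteration count are illustrative choices, not taken from the sample itself.

// Minimal host-to-device bandwidth sketch (illustrative, not the bandwidthTest source).
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 32 * 1024 * 1024;  // 32 MiB, same transfer size as the output above
    const int iters = 100;

    void *h_buf = nullptr, *d_buf = nullptr;
    cudaHostAlloc(&h_buf, bytes, cudaHostAllocDefault);  // pinned host memory
    cudaMalloc(&d_buf, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    double mbps = (double)bytes * iters / (ms / 1000.0) / 1e6;  // MB/s
    printf("Host to Device bandwidth: %.1f MB/s\n", mbps);

    cudaFree(d_buf);
    cudaFreeHost(h_buf);
    return 0;
}

Running this with CUDA_VISIBLE_DEVICES=0 or CUDA_VISIBLE_DEVICES=1 lets you compare the host-to-device path for each GPU.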
Sorry, I’m a little confused about “Device to Device Bandwidth”. Could you help explain the difference between CUDA_VISIBLE_DEVICES=1 and CUDA_VISIBLE_DEVICES=0 (i.e., whether or not the transfer goes over NVLink)? Sorry for bothering you with this.
Dear Garywang,
Device-to-device bandwidth refers to the data transfer bandwidth within the GPU (from one memory location to another memory location on the same GPU; this does not involve NVLink).
CUDA_VISIBLE_DEVICES is an environment variable used to select which GPU devices are visible on the system. When you set CUDA_VISIBLE_DEVICES=0, your system behaves as if it has only the dGPU; similarly, if you set CUDA_VISIBLE_DEVICES=1, only the iGPU is visible.
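If you want to confirm which device is visible in each case, a small enumeration program like the sketch below can help. It only uses standard CUDA runtime calls; the 0 = dGPU, 1 = iGPU mapping is the enumeration on this platform and may differ elsewhere.

// Minimal sketch: list the CUDA devices visible to the process.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    printf("Visible CUDA devices: %d\n", count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("  device %d: %s\n", i, prop.name);
    }
    return 0;
}

Run it as CUDA_VISIBLE_DEVICES=0 ./a.out to see only the dGPU, and CUDA_VISIBLE_DEVICES=1 ./a.out to see only the iGPU.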
@SivaRamaKrishna,
So, to summarize from my side, is the following correct?
With CUDA_VISIBLE_DEVICES=0, the Host to Device bandwidth is ~18 GB/s, transferred via NVLink.
With CUDA_VISIBLE_DEVICES=1, the Host to Device bandwidth is ~29 GB/s, transferred via shared memory.