NVLink (V100)

I am working on a multi-GPU computation project with bulky data (around 200 GB). NVIDIA tech support asked us to use NVLink with a 3U+8X(16Gb) GPU card (V100).

Is anyone here using NVLink who can give me some hints about the data transfer speed between CPU and GPU (HostToDevice/DeviceToHost) when using NVLink versus PCIe?


Hi wswong10,

CPU-to-GPU transfers do not use NVLink. The exception is the IBM Power9 system, which has an NVLink interconnect to the Power9 processor.

In most workstations, servers, and even NVIDIA DGX products, CPU-to-GPU data transfers go across the PCIe bus (about 985 MB/s per lane, so roughly 15.75 GB/s for a 16-lane PCIe Gen3 slot).
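If you want to see what your own PCIe slot actually delivers, a small CUDA program can time a HostToDevice copy directly. This is a sketch I'm adding for illustration, not something from the thread; the buffer size and iteration count are arbitrary choices, and pinned (page-locked) host memory is used because pageable memory typically cannot reach full PCIe speed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 256u << 20;  // 256 MiB test buffer (arbitrary)
    const int    iters = 20;

    float *h_buf = nullptr, *d_buf = nullptr;
    // Pinned host memory is needed to approach the bus's peak bandwidth.
    cudaMallocHost(&h_buf, bytes);
    cudaMalloc(&d_buf, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.f;
    cudaEventElapsedTime(&ms, start, stop);
    double gbps = (double)bytes * iters / (ms / 1e3) / 1e9;
    printf("HostToDevice: %.1f GB/s\n", gbps);

    cudaFree(d_buf);
    cudaFreeHost(h_buf);
    return 0;
}
```

On a PCIe Gen3 x16 slot you would expect this to report somewhere in the low teens of GB/s after protocol overhead. The bundled `bandwidthTest` CUDA sample does the same measurement with more options.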

GPU-to-GPU transfers are much faster via a pair of NVLink bridges connecting a pair of Quadro GV100 cards (about 200 GB/s bidirectional).
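To take advantage of that GPU-to-GPU path in code, the GPUs need peer access enabled so that copies go directly over the interconnect instead of bouncing through host memory. A minimal sketch, assuming the two bridged GPUs are device IDs 0 and 1 (that numbering is my assumption, not from the thread):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int can01 = 0, can10 = 0;
    // Check whether each GPU can address the other's memory directly.
    cudaDeviceCanAccessPeer(&can01, 0, 1);
    cudaDeviceCanAccessPeer(&can10, 1, 0);
    printf("peer access 0->1: %d, 1->0: %d\n", can01, can10);

    if (can01 && can10) {
        const size_t bytes = 64u << 20;  // 64 MiB (arbitrary)
        float *d0 = nullptr, *d1 = nullptr;

        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);  // second argument is reserved flags
        cudaMalloc(&d0, bytes);

        cudaSetDevice(1);
        cudaDeviceEnablePeerAccess(0, 0);
        cudaMalloc(&d1, bytes);

        // Direct device-to-device copy; with an NVLink bridge in place this
        // uses the faster interconnect and avoids a host round-trip.
        cudaMemcpyPeer(d1, 1, d0, 0, bytes);
        cudaDeviceSynchronize();

        cudaFree(d1);
        cudaSetDevice(0);
        cudaFree(d0);
    }
    return 0;
}
```

If `cudaDeviceCanAccessPeer` reports 0, the copy still works but is staged through host memory at PCIe speed. The `p2pBandwidthLatencyTest` CUDA sample measures the achieved peer-to-peer rates.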


Ryan Park

Hi Ryan,

Thanks for your reply. This is exactly what I wanted to know about NVLink. Cheers!!!


Hi, I'm writing here because my question is related in a certain way. I hope to find an answer.

Which are the different NVLink capabilities between the Tesla V100 and the Quadro GV100?

If I'm correct about the Quadro GV100, it is possible to connect GPUs only in pairs (for example, on a 4-GPU workstation, there would be two pairs of Quadro GV100s, each connected through the proper bridge), and the speed of the connection can reach 200 GB/s.

What about the Tesla V100? Its speed can reach up to 300 GB/s, but how many GPU connections can be exploited? On the same 4-GPU workstation, would each Tesla V100 be connected to the 3 other GPUs?

Is this the relevant difference between the two GPUs? For the other main characteristics, there seem to be no differences.

Thank you in advance