DGX-1 using PCIe only instead of NVLink


As part of my thesis on on-node GPU interconnects for deep learning, I have been looking for a way to disable the NVLink connections, so that I can compare bandwidth and latency of memcopies over NVLink versus over PCIe.

So far, I haven’t found a way to force the system to use only PCIe instead of NVLink. Does anybody know how to do this?

Thanks for your support!

I don’t believe there is a way, if you intend to use P2P copies.

However, if you don’t intend to use P2P copies, then just make sure your code doesn’t enable P2P. In that case, all device-to-device transfers will flow through the CPU socket, traveling over PCIe to get to/from it.
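A minimal sketch of what that looks like (not the poster's code, and it needs a machine with at least two GPUs): a device-to-device copy where peer access is never enabled, so the transfer is staged through host memory over PCIe.

```cuda
// Device-to-device copy WITHOUT P2P: since neither device has peer
// access enabled, cudaMemcpyPeer stages the transfer through the host
// over PCIe rather than going directly over NVLink.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 64 << 20;  // 64 MiB test buffer
    void *src = nullptr, *dst = nullptr;

    cudaSetDevice(0);
    cudaMalloc(&src, bytes);
    cudaSetDevice(1);
    cudaMalloc(&dst, bytes);

    // Note: no cudaDeviceEnablePeerAccess() call anywhere, so this
    // copy is routed through the CPU socket, not over NVLink.
    cudaMemcpyPeer(dst, 1, src, 0, bytes);
    cudaDeviceSynchronize();

    cudaFree(dst);
    cudaSetDevice(0);
    cudaFree(src);
    printf("copy completed without peer access enabled\n");
    return 0;
}
```

Timing that copy (e.g. with CUDA events) and comparing against the same copy with P2P enabled gives the PCIe-vs-NVLink numbers you are after.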

The p2pBandwidthLatencyTest CUDA sample code demonstrates this: as one part of its overall output, it prints a measured bandwidth matrix for device-to-device transfers with P2P disabled.

If you are now asking how to disable P2P for a large software stack such as TensorFlow or PyTorch, I wouldn’t be able to tell you how to do that. But if you are writing your own code, it should be fairly straightforward.
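For the comparison case in your own code, a sketch of the NVLink path: check that the devices can reach each other and enable peer access in both directions before copying. Again, this assumes a multi-GPU machine such as the DGX-1.

```cuda
// Enabling P2P for the NVLink comparison case: with peer access
// enabled in both directions, a subsequent cudaMemcpyPeer between
// devices 0 and 1 goes directly over NVLink on a DGX-1.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int can01 = 0, can10 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1);  // can device 0 access 1?
    cudaDeviceCanAccessPeer(&can10, 1, 0);  // can device 1 access 0?

    if (can01 && can10) {
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);  // flags must currently be 0
        cudaSetDevice(1);
        cudaDeviceEnablePeerAccess(0, 0);
        printf("P2P enabled between devices 0 and 1\n");
    } else {
        printf("P2P not supported between devices 0 and 1\n");
    }
    return 0;
}
```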

Better late than never: to deactivate peer-to-peer for an entire application, use NVBit to hook into context creation and call cuCtxDisablePeerAccess. You may also have to hook other driver calls, in case peer access gets re-enabled later through an application call to cuCtxEnablePeerAccess.
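A rough, untested sketch of that NVBit approach, assuming NVBit's `nvbit_at_cuda_event` callback and its generated parameter-struct naming convention (both are assumptions worth checking against the NVBit headers for your version): intercept successful cuCtxEnablePeerAccess calls on exit and immediately undo them.

```cuda
// NVBit tool sketch (assumption-heavy, untested): whenever the
// application successfully calls cuCtxEnablePeerAccess, immediately
// disable peer access again so P2P never stays active.
#include "nvbit.h"   // NVBit tool API (not a standard CUDA header)
#include <cuda.h>

void nvbit_at_cuda_event(CUcontext ctx, int is_exit, nvbit_api_cuda_t cbid,
                         const char *name, void *params, CUresult *pStatus) {
    if (cbid == API_CUDA_cuCtxEnablePeerAccess && is_exit &&
        *pStatus == CUDA_SUCCESS) {
        // Struct name follows NVBit's generated CUDA metadata
        // convention (assumption -- verify in generated_cuda_meta.h).
        cuCtxEnablePeerAccess_params *p =
            (cuCtxEnablePeerAccess_params *)params;
        cuCtxDisablePeerAccess(p->peerContext);
    }
}
```

The tool is then injected into the unmodified application via LD_PRELOAD, which is what makes this work for a large stack like TF or PyTorch without touching its source.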