Need an example of how to disable NVLink

I am running a test comparing NVLink enabled vs. disabled.

I only know that I need to write some value into a register, but I don't know how, or the exact procedure for disabling NVLink.

https://devtalk.nvidia.com/default/topic/1043497/cuda-programming-and-performance/dgx-1-using-pcie-only-instead-of-nvlink/

Has any feature been added for this lately? I was wondering whether there is an implicit way to have the CUDA driver transparently switch from NVLink to GPUDirect RDMA. I am looking to do some benchmarking between the two using CUDA-aware MPI (the Open MPI implementation).

There is no way to disable NVLink. If Peer activity is enabled between two GPUs that are directly connected by NVLink, then NVLink will be used for peer transfers. There are no alternatives.

What happens if I call cudaDeviceDisablePeerAccess? Will the driver honor that setting and bypass NVLink? Will it also bypass PCIe peer-to-peer?

Yes, it will honor it for the process/context that called it (wouldn't it be broken otherwise?). That doesn't mean it will apply to other activity from, say, other processes. In that case, for transfers within that process/context, the data would flow as described in the previously linked thread. I also wouldn't be surprised if it returned an error code if you had not previously called the enable function, but I haven't tested that. It certainly would not make much sense to call the disable function without having previously called the enable function.
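
For reference, here is a minimal sketch of the enable/disable sequence being discussed. The device numbering (0 and 1) and the minimal error handling are illustrative only:

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int canAccess = 0;
        // Check whether device 0 can directly access device 1's memory.
        cudaDeviceCanAccessPeer(&canAccess, 0, 1);
        if (!canAccess) { printf("no peer path between devices 0 and 1\n"); return 0; }

        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);   // peer transfers may now go over NVLink/PCIe P2P

        // ... peer copies or kernels dereferencing device-1 pointers would go here ...

        // Disabling only affects this process/context. As noted above, calling
        // this without a prior enable is expected to return an error code.
        cudaError_t err = cudaDeviceDisablePeerAccess(1);
        printf("cudaDeviceDisablePeerAccess returned: %s\n", cudaGetErrorString(err));
        return 0;
    }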

Cool. And just to confirm: if I were to use a managed allocation (using USM) and do a memcpy (kind=cudaMemcpyDefault) between allocations on different devices, you're saying the driver WILL do an intermediate copy back to the host instead of a direct GPU-to-GPU DMA transfer when peer functionality is deactivated?
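
For concreteness, the scenario described might look like the sketch below. The device IDs, sizes, and the use of cudaMemAdvise to tie each managed allocation to a different GPU are my assumptions; whether the copy goes over NVLink, PCIe P2P, or bounces through the host is the driver's decision, as discussed below.

    #include <cuda_runtime.h>

    int main() {
        const size_t bytes = 1 << 20;
        float *a = nullptr, *b = nullptr;
        cudaMallocManaged(&a, bytes);
        cudaMallocManaged(&b, bytes);

        // Hint a different preferred location for each managed allocation,
        // as a rough analogue of "allocations on different devices".
        cudaMemAdvise(a, bytes, cudaMemAdviseSetPreferredLocation, 0);
        cudaMemAdvise(b, bytes, cudaMemAdviseSetPreferredLocation, 1);

        // With cudaMemcpyDefault the driver infers the transfer direction;
        // the actual data path is chosen by the driver/managed-memory system.
        cudaMemcpy(b, a, bytes, cudaMemcpyDefault);

        cudaFree(a);
        cudaFree(b);
        return 0;
    }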

Since the forum limits replies and suggests editing the previous submission instead, let me elaborate here on the response below:

3:55pm: Source and destination are managed memory. I would hope that managed memory does exactly what application code would do to determine the best copy behavior, prioritizing the most optimal path (NVLink, PCIe P2P, or host copies). Are you saying there is a considerable performance disparity between managed and unmanaged memory? If so, do you recommend not using USM?

The managed memory system may make its own decisions about what to do. I wouldn't use managed memory if I were interested in the most precise control over copy behavior. For a complicated scenario like that, I'm not going to try to read a single English sentence and assume that my picture of the code is the same as yours. To give one example, it's not clear to me whether you are talking about a single managed allocation alongside non-managed allocations, or whether all the allocations in question are managed. In any case, I probably won't be able to respond further here.

Do you have a working example of it, using NVBit?

Hi Andrei,

Unfortunately, I don't have an example, because we stopped pursuing the NVBIT route due to an easier option using Open MPI. Open MPI has flags to turn this off when using CUDA-aware MPI: you can pass the -mca btl_smcuda_use_cuda_ipc 0 flag.
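
A launch line would look something like this (the rank count and application name are placeholders); setting the MCA parameter to 0 disables CUDA IPC in the smcuda BTL, so intra-node GPU-to-GPU traffic is staged through host memory instead:

    mpirun -np 2 -mca btl_smcuda_use_cuda_ipc 0 ./my_cuda_aware_app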

Maybe the driver can disable NVLink. Try the following steps:

  1. touch /etc/modprobe.d/nvidia.conf
  2. echo "options nvidia NVreg_NvLinkDisable=1" >> /etc/modprobe.d/nvidia.conf
  3. Reboot (or unload and reload the nvidia kernel module) so the option takes effect.
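
If the module option takes effect, one way to sanity-check it (just a sketch; the output format varies by driver version) is to query the link status and topology after the reboot:

    # should report the links as inactive when NVLink is disabled
    nvidia-smi nvlink --status

    # GPU-to-GPU connectivity matrix (NVLink vs. PCIe paths)
    nvidia-smi topo -m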