Is there a way to perform cross-node GPU memory copying without using NCCL?

307166554 · May 7, 2025, 11:25am

I want to implement cross-node GPU memory copying without using NCCL’s P2P, since I expect a more direct approach to be more efficient. However, I haven’t found a suitable way to do this. Do you have any recommended approaches?

Thank you for all the replies.

ssimcoejr · May 8, 2025, 7:21pm

Hi 307166554,

Thank you for posting your inquiry to the NVIDIA Developer Forums.

You’ll want to look into GPUDirect as a starting point:
https://developer.nvidia.com/gpudirect

The GPUDirect APIs and libraries include GPUDirect RDMA, which lets your network adapter read and write GPU memory directly instead of staging the data through host memory.

More information is available at that link, or by contacting the mailing list (gpudirect@nvidia.com).

Best,
NVIDIA Enterprise Experience

