Peer-to-peer copy using cuMemcpy rather than cuMemcpyPeer

On Fermi GPUs that support UVA but don’t support direct peer-to-peer memory access (e.g., a GF100 and a GF104) with CUDA 4.0.17, it appears possible to perform peer-to-peer copies using a plain cuMemcpy when the source and destination pointers refer to memory on different GPUs. Does cuMemcpyPeer effectively perform the same operation as cuMemcpy, or is there some performance difference? I can’t run any comparisons at present because I don’t have access to multiple Teslas on a system with CUDA 4.0.17. (The system I have contains a GTX 460 and a GTX 470 and runs version 280.13 of the Linux NVIDIA driver.)
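For concreteness, here is a minimal sketch (untested, since it needs two UVA-capable GPUs) of the two copy paths being compared. The buffer size and device ordinals are arbitrary assumptions; error checking is omitted for brevity:

```c
#include <cuda.h>   /* CUDA driver API */

int main(void)
{
    CUdevice dev0, dev1;
    CUcontext ctx0, ctx1;
    CUdeviceptr src, dst;
    size_t bytes = 1 << 20;  /* arbitrary 1 MiB buffer */

    cuInit(0);
    cuDeviceGet(&dev0, 0);
    cuDeviceGet(&dev1, 1);

    /* One context (and one allocation) per device. */
    cuCtxCreate(&ctx0, 0, dev0);
    cuMemAlloc(&src, bytes);
    cuCtxCreate(&ctx1, 0, dev1);
    cuMemAlloc(&dst, bytes);

    /* With UVA the driver can infer which device each pointer
     * belongs to, so a plain cuMemcpy works across devices: */
    cuMemcpy(dst, src, bytes);

    /* The explicit peer copy names both contexts instead: */
    cuMemcpyPeer(dst, ctx1, src, ctx0, bytes);

    cuMemFree(dst);
    cuCtxDestroy(ctx1);
    cuMemFree(src);
    cuCtxDestroy(ctx0);
    return 0;
}
```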

This is mentioned in section 3.2.6.5 of the Programming Guide:

Peer-to-Peer Memory Copy

Memory copies can be performed between the memories of two different devices. When a unified address space is used for both devices (see Section 3.2.7), this is done using the regular memory copy functions mentioned in Section 3.2.2. Otherwise, this is done using cudaMemcpyPeer(), cudaMemcpyPeerAsync(), cudaMemcpy3DPeer(), or cudaMemcpy3DPeerAsync() […]

If your devices do not support peer-to-peer memory access, or if it is not enabled with cudaDeviceEnablePeerAccess(), the peer-to-peer copies are staged through the host, which entails a performance penalty.
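The check-and-enable step looks roughly like this with the runtime API (an untested sketch; device ordinals 0 and 1 are assumed, and the flags argument to cudaDeviceEnablePeerAccess() must currently be 0):

```c
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    int canAccess = 0;

    /* Can device 0 directly access device 1's memory? */
    cudaDeviceCanAccessPeer(&canAccess, 0, 1);

    if (canAccess) {
        /* Enable direct access from the current device (0) to peer 1,
         * so subsequent copies avoid staging through the host. */
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);
    } else {
        printf("No direct peer access; copies will be staged via host.\n");
    }
    return 0;
}
```

Note that access is directional: to let device 1 also read device 0's memory, you would repeat the call with the devices swapped.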