cudaMemcpyAsync behavior

dmsuvorov · August 17, 2015, 12:44pm

Hi.

In the cudaMemcpyAsync API function reference (http://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1g85073372f776b4c4d5f89f7124b7bf79) it is written that:

“If kind is cudaMemcpyHostToDevice or cudaMemcpyDeviceToHost and the stream is non-zero, the copy may overlap with operations in other streams”

This seems strange to me, because I was able to overlap a kernel with a device-to-device copy issued in a different (non-default) stream, and the quote mentions only host-to-device and device-to-host copies. Sorry if it is a silly question, but I can’t understand the meaning of the above quote, and I’m used to thinking that each and every word in documentation is meaningful.

Many thanks.

little_jimmy · August 18, 2015, 4:46am

“operations in other streams”

i suppose one should then define ‘operations’

in plain words, i think the paragraph intends to point out that a) the copy engines of a device may operate independently of the SMs of the device, so a copy may run while a kernel executes, and b) a device may have more than one copy engine, and may use them simultaneously
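
To make the two points above concrete, here is a minimal sketch, not a definitive test. It prints the `asyncEngineCount` field of `cudaDeviceProp` (the number of copy engines the runtime reports), then launches a kernel in one non-default stream and a device-to-device `cudaMemcpyAsync` in another. The names `busy_kernel`, `run_overlap_demo`, and `CHECK`, along with the buffer sizes and iteration count, are made up for illustration. The code only makes overlap possible; whether the two operations actually overlap depends on the device and driver and has to be checked in a profiler timeline.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Report a failing runtime call and bail out of the enclosing function.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Hypothetical kernel: does enough arithmetic to stay resident for a while.
__global__ void busy_kernel(float *data, int n, int iters)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < iters; ++k)
            v = v * 1.000001f + 0.5f;
        data[i] = v;
    }
}

// Returns 0 on success, 1 if any CUDA runtime call fails.
int run_overlap_demo()
{
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    printf("asyncEngineCount = %d, concurrentKernels = %d\n",
           prop.asyncEngineCount, prop.concurrentKernels);

    const int n = 1 << 22;                  // arbitrary size for the sketch
    const size_t bytes = n * sizeof(float);
    float *work = nullptr, *src = nullptr, *dst = nullptr;
    CHECK(cudaMalloc(&work, bytes));
    CHECK(cudaMalloc(&src, bytes));
    CHECK(cudaMalloc(&dst, bytes));
    CHECK(cudaMemset(work, 0, bytes));
    CHECK(cudaMemset(src, 0, bytes));

    // Two non-default streams: work in one may overlap work in the other.
    cudaStream_t s_kernel, s_copy;
    CHECK(cudaStreamCreate(&s_kernel));
    CHECK(cudaStreamCreate(&s_copy));

    busy_kernel<<<(n + 255) / 256, 256, 0, s_kernel>>>(work, n, 1000);
    CHECK(cudaGetLastError());

    // Device-to-device copy on buffers the kernel does not touch.
    CHECK(cudaMemcpyAsync(dst, src, bytes, cudaMemcpyDeviceToDevice, s_copy));

    CHECK(cudaDeviceSynchronize());         // wait for both streams

    CHECK(cudaStreamDestroy(s_kernel));
    CHECK(cudaStreamDestroy(s_copy));
    CHECK(cudaFree(work));
    CHECK(cudaFree(src));
    CHECK(cudaFree(dst));
    return 0;
}

int main()
{
    return run_overlap_demo();
}
```

Build it with `nvcc` and look at the timeline in a profiler such as Nsight Systems: if the copy and the kernel rows overlap in time, you are seeing the same behaviour as in the original question.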
