While reading the guide about overlapping data transfers, I noticed this statement:
"The host memory involved in the data transfer must be pinned memory."
However, this requirement is not mentioned anywhere in the NVIDIA runtime API documentation: https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html
My program has multiple independent threads (1 thread = 1 job), so each thread uses its own cudaStream_t. In this case, do I need pinned memory for every host buffer that is transferred with cudaMemcpyAsync?
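For context, here is a minimal sketch of what each of my threads does. The kernel, the buffer size, and the 4-thread count are just placeholders, not my real workload:

```cpp
// Sketch of the per-thread pattern: one thread = one job = one stream.
#include <cuda_runtime.h>
#include <cstdio>
#include <thread>
#include <vector>

__global__ void scale(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

void jobWorker(int id)
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Each thread owns its own stream.
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Pinned (page-locked) host buffer -- this is the part my question is about.
    // Would a plain malloc() buffer be acceptable here instead?
    float* hostBuf = nullptr;
    cudaMallocHost(&hostBuf, bytes);
    for (int i = 0; i < n; ++i) hostBuf[i] = static_cast<float>(i);

    float* devBuf = nullptr;
    cudaMalloc(&devBuf, bytes);

    // H2D copy, kernel, D2H copy, all queued on this thread's stream.
    cudaMemcpyAsync(devBuf, hostBuf, bytes, cudaMemcpyHostToDevice, stream);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(devBuf, n);
    cudaMemcpyAsync(hostBuf, devBuf, bytes, cudaMemcpyDeviceToHost, stream);

    cudaStreamSynchronize(stream);
    printf("thread %d done, hostBuf[1] = %f\n", id, hostBuf[1]);

    cudaFree(devBuf);
    cudaFreeHost(hostBuf);
    cudaStreamDestroy(stream);
}

int main()
{
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t) workers.emplace_back(jobWorker, t);
    for (auto& w : workers) w.join();
    return 0;
}
```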