accessing device memory during kernel execution

m_habanero · March 9, 2010, 5:41pm

I’m curious, what happens when issuing a cudaMalloc, cudaMemcpy, or cudaFree while a kernel is executing on the device? Does the memory call block until the kernel completes, or does CUDA allow concurrent host and device memory accesses? Thanks for any answers :)

tmurray · March 9, 2010, 6:01pm

cudaMalloc and cudaFree require the GPU to complete all pending work before they complete (for the most part). cudaMemcpyAsync can overlap with kernels in different streams, which implies that you, the developer, are guaranteeing that a memcpy and a kernel never access the same region simultaneously (unless both are only reading).
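A minimal sketch of that overlap, under some assumptions: the kernel `scale`, the helper `run_overlap_demo`, and the buffer names are hypothetical, the host buffers are pinned with cudaMallocHost so the copy can actually be asynchronous, and the device supports concurrent copy and execution. The kernel in one stream touches only `d_a` while the copy in the other stream touches only `d_b`, so the two never access the same region.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: doubles every element of data.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical helper: returns true if both buffers hold the expected values.
bool run_overlap_demo() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *h_in = nullptr, *h_out = nullptr, *d_a = nullptr, *d_b = nullptr;

    // Allocate everything up front, before any kernel is in flight.
    if (cudaMallocHost((void **)&h_in, bytes) != cudaSuccess) return false;
    if (cudaMallocHost((void **)&h_out, bytes) != cudaSuccess) return false;
    if (cudaMalloc((void **)&d_a, bytes) != cudaSuccess) return false;
    if (cudaMalloc((void **)&d_b, bytes) != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) h_in[i] = 1.0f;
    cudaMemcpy(d_a, h_in, bytes, cudaMemcpyHostToDevice);

    cudaStream_t s0, s1;
    cudaStreamCreate(&s0);
    cudaStreamCreate(&s1);

    // Kernel works on d_a in stream s0 ...
    scale<<<(n + 255) / 256, 256, 0, s0>>>(d_a, n);
    // ... while the copy fills d_b in stream s1. The two may overlap.
    cudaMemcpyAsync(d_b, h_in, bytes, cudaMemcpyHostToDevice, s1);

    cudaStreamSynchronize(s0);
    cudaStreamSynchronize(s1);

    // d_a was doubled by the kernel; d_b is an untouched copy of h_in.
    cudaMemcpy(h_out, d_a, bytes, cudaMemcpyDeviceToHost);
    bool ok = (h_out[0] == 2.0f) && (h_out[n - 1] == 2.0f);
    cudaMemcpy(h_out, d_b, bytes, cudaMemcpyDeviceToHost);
    ok = ok && (h_out[0] == 1.0f) && (h_out[n - 1] == 1.0f);
    ok = ok && (cudaGetLastError() == cudaSuccess);

    // Free only after all work has finished.
    cudaStreamDestroy(s0);
    cudaStreamDestroy(s1);
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFreeHost(h_in);
    cudaFreeHost(h_out);
    return ok;
}

int main() {
    printf("%s\n", run_overlap_demo() ? "ok" : "failed");
    return 0;
}
```

Whether the copy and the kernel really run concurrently depends on the hardware and driver; a profiler timeline is the usual way to check.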

m_habanero · March 9, 2010, 8:06pm

OK, I see in the CUDA programming guide how to use streams, but only as a way to choose which stream a memcpy runs in. Is there any way to use streams with cudaMalloc or cudaFree?

gshi · March 10, 2010, 4:44pm

Not that I know of.

You won’t find a function like cudaMallocAsync(), but you can find cudaMemcpyAsync().

After all, cudaMalloc() usually takes very little time, so it is not worth overlapping it with something else.
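Since the allocation calls cannot be placed in a stream here, one common way to keep them off the critical path is to allocate once before the kernels start and free once after they finish. The sketch below assumes a hypothetical kernel `step` and helper `run_preallocated`; it illustrates the pattern only.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: adds 1 to every element of buf.
__global__ void step(float *buf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] += 1.0f;
}

// Hypothetical helper: runs `iterations` kernel launches on one buffer
// that is allocated before the loop and freed after it.
bool run_preallocated(int iterations) {
    const int n = 1 << 16;
    const size_t bytes = n * sizeof(float);
    float *d_buf = nullptr;

    // Allocate once, before any kernel is in flight.
    if (cudaMalloc((void **)&d_buf, bytes) != cudaSuccess) return false;
    cudaMemset(d_buf, 0, bytes);

    // No cudaMalloc or cudaFree inside the loop, so no allocation call
    // has to wait for pending kernels here.
    for (int it = 0; it < iterations; ++it) {
        step<<<(n + 255) / 256, 256>>>(d_buf, n);
    }
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(d_buf);
        return false;
    }

    float first = -1.0f;
    cudaMemcpy(&first, d_buf, sizeof(float), cudaMemcpyDeviceToHost);

    // Free once, after all work has completed.
    cudaFree(d_buf);
    return first == static_cast<float>(iterations);
}

int main() {
    printf("%s\n", run_preallocated(10) ? "ok" : "failed");
    return 0;
}
```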
