Accessing memory from both kernel and host side

flori · February 14, 2008, 3:35pm

Hi all,
my question goes in the same direction as this one, but that was posted in November, and now there is CUDA 1.1.
I understand that 1.1 makes it possible to overlap asynchronous memory copies and kernel execution by using multiple streams. But all the examples I have found so far operate on strictly separate segments of device memory.

I wonder what happens if you write device memory via MemCpyAsync while a running kernel is reading it, or vice versa, as in the case of GPU<->CPU polling or updating the environmental conditions of an evolutionary simulation?

I was not able to find the answer in the CUDA 1.1 programming guide or other posts. So any help would be highly appreciated :)

Thanks in advance,
flori

MisterAnderson42 · February 17, 2008, 3:52pm

If a kernel is reading the same memory that cudaMemcpyAsync is writing to, expect race conditions to cause problems. There are no synchronization mechanisms for this.
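
One way to avoid the race, sketched here with the current CUDA runtime API, is to issue the update and the kernel in the same stream, so the copy finishes before the kernel that reads it starts. The kernel `step`, the function `run_steps`, and the one-float "environment" are hypothetical names chosen for illustration, loosely modelled on the simulation use case in the question.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: each step adds the current environment value to every cell.
__global__ void step(float* cells, const float* env, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) cells[i] += env[0];
}

// Runs `steps` iterations and returns the final value of cell 0,
// or -1.0f if any CUDA call reported an error.
float run_steps(int n, int steps) {
    float *d_cells = nullptr, *d_env = nullptr, *h_env = nullptr;
    cudaStream_t s;
    if (cudaStreamCreate(&s) != cudaSuccess) return -1.0f;
    cudaMalloc(&d_cells, n * sizeof(float));
    cudaMalloc(&d_env, sizeof(float));
    // Pinned host memory, one slot per step, so the host never overwrites
    // a value that an in-flight asynchronous copy may still be reading.
    cudaMallocHost(&h_env, steps * sizeof(float));
    cudaMemsetAsync(d_cells, 0, n * sizeof(float), s);

    for (int t = 0; t < steps; ++t) {
        h_env[t] = float(t + 1);
        // Same stream: this copy runs after the previous kernel has finished
        // and completes before the next kernel starts, so there is no race.
        cudaMemcpyAsync(d_env, &h_env[t], sizeof(float),
                        cudaMemcpyHostToDevice, s);
        step<<<(n + 255) / 256, 256, 0, s>>>(d_cells, d_env, n);
    }

    float result = -1.0f;
    cudaError_t err = cudaStreamSynchronize(s);
    if (err == cudaSuccess)
        err = cudaMemcpy(&result, d_cells, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_cells);
    cudaFree(d_env);
    cudaFreeHost(h_env);
    cudaStreamDestroy(s);
    return err == cudaSuccess ? result : -1.0f;
}
```

Calling `run_steps(1024, 4)` adds the environment values 1, 2, 3 and 4 to each cell in turn. The trade-off is that the update and the kernel no longer overlap; the stream order is what makes the result well defined.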

Async memory copies are useful in pipeline situations where you can have kernel A working on memory A while copying memory B for the next kernel at the same time.
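
A minimal sketch of such a pipeline, assuming two streams that each own a separate device buffer. The kernel `scale` and the function `pipeline_scale` are hypothetical names, and whether the copies actually overlap with kernel execution depends on the hardware and on the host buffer being page-locked (for example, allocated with `cudaMallocHost`).

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies every element of one chunk by `factor`.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Scales `chunks` chunks of `chunk_n` floats stored contiguously in `h_data`.
// Returns 0 on success, otherwise the CUDA error code as an int.
int pipeline_scale(float* h_data, int chunks, int chunk_n, float factor) {
    cudaStream_t stream[2];
    float* d_buf[2] = {nullptr, nullptr};
    size_t bytes = chunk_n * sizeof(float);
    for (int b = 0; b < 2; ++b) {
        cudaStreamCreate(&stream[b]);
        cudaMalloc(&d_buf[b], bytes);
    }

    for (int c = 0; c < chunks; ++c) {
        int b = c % 2;  // alternate between the two buffer/stream pairs
        float* h_chunk = h_data + (size_t)c * chunk_n;
        // While stream b copies and processes this chunk in buffer b, the
        // other stream may still be working on the previous chunk in its own
        // buffer. No kernel ever touches a buffer that is being copied.
        cudaMemcpyAsync(d_buf[b], h_chunk, bytes,
                        cudaMemcpyHostToDevice, stream[b]);
        scale<<<(chunk_n + 255) / 256, 256, 0, stream[b]>>>(d_buf[b], chunk_n, factor);
        cudaMemcpyAsync(h_chunk, d_buf[b], bytes,
                        cudaMemcpyDeviceToHost, stream[b]);
    }

    // Wait for both streams to drain before the host reads the results.
    cudaError_t err = cudaDeviceSynchronize();
    for (int b = 0; b < 2; ++b) {
        cudaFree(d_buf[b]);
        cudaStreamDestroy(stream[b]);
    }
    return (int)err;
}
```

Buffer reuse is safe here because chunk c + 2 is issued in the same stream as chunk c, so its host-to-device copy cannot start until the earlier chunk's result has been copied back.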
