cudaThreadSynchronize() and Device-Mapped Host Memory in CUDA 2.2

seibert · March 18, 2009, 4:29pm

I’m thinking about using the new feature that maps host memory into the device address space for an output buffer that I read on the host after my kernel finishes. Is it sufficient to call cudaThreadSynchronize() after the kernel launch to ensure that all writes by the device have been flushed back to host memory? The documentation mentions streams and events, but cudaThreadSynchronize() would be more straightforward for my simple CUDA usage.

tmurray · March 18, 2009, 5:10pm

I believe the answer is yes. I’ll double check, but it should be fine.
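
For reference, here is a minimal sketch of the pattern being asked about, not a confirmed recipe. It assumes a device that supports mapped host memory; the kernel `fill_indices` and the helper `run_mapped_demo` are hypothetical names made up for illustration. The host reads the buffer only after cudaThreadSynchronize() returns, which is the ordering the question relies on.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread writes its own index into the output buffer.
__global__ void fill_indices(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = i;
    }
}

// Hypothetical helper: returns the number of elements the host sees with the
// wrong value, or -1 if any CUDA call fails.
int run_mapped_demo(int n)
{
    int *h_out = NULL;  // host pointer to the pinned, mapped buffer
    int *d_out = NULL;  // device-side alias of the same memory

    // Enable host memory mapping; this has to happen before the context is created.
    if (cudaSetDeviceFlags(cudaDeviceMapHost) != cudaSuccess) return -1;

    // Allocate pinned host memory that is mapped into the device address space.
    if (cudaHostAlloc((void **)&h_out, n * sizeof(int),
                      cudaHostAllocMapped) != cudaSuccess) return -1;

    // Get the device pointer that refers to the host allocation.
    if (cudaHostGetDevicePointer((void **)&d_out, h_out, 0) != cudaSuccess) {
        cudaFreeHost(h_out);
        return -1;
    }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    fill_indices<<<blocks, threads>>>(d_out, n);

    // Block until the kernel has finished before touching h_out on the host.
    // (Later toolkits deprecate this call in favor of cudaDeviceSynchronize().)
    if (cudaThreadSynchronize() != cudaSuccess) {
        cudaFreeHost(h_out);
        return -1;
    }

    // Read the results directly from host memory; no cudaMemcpy is involved.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (h_out[i] != i) ++mismatches;
    }

    cudaFreeHost(h_out);
    return mismatches;
}
```

Calling `run_mapped_demo(1024)` from `main` and checking for a return value of 0 is one way to try this out. With several streams in flight, the stream and event calls mentioned in the documentation would give finer-grained control than a full synchronize.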