Memory-safety of async memcpy

Hi all,

I am wondering whether a code snippet like this:

void call_h2d_async(obj* d_out) {
  HostObjs host_obj; // some RAII structure: malloc or cudaMallocHost called here
 
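  // size_bytes() is an assumed accessor; the byte-count argument was missing from the call.
  cudaMemcpyAsync(d_out, host_obj.data(), host_obj.size_bytes(), cudaMemcpyHostToDevice, stream);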
} // free or cudaFreeHost called here.

is safe if the host memory inside HostObjs is page-locked?

This document (CUDA Driver API :: CUDA Toolkit Documentation) states that if the memory is pageable, the memcpy returns only after the data has been staged in a page-locked buffer. So presumably the data cannot be yanked away before the copy out of the pageable buffer has finished.

But if host_obj.data() returns a pointer to page-locked host memory and cudaFreeHost is called (by HostObjs’s destructor), can the data transfer still complete?

Cheers,

– vmz

If you require that host_obj not be destroyed until the memcpy is complete, you should express this requirement by explicitly synchronizing the stream before returning from the function.
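A minimal sketch of what that could look like, assuming a pinned buffer of floats. `PinnedBuffer`, its `bytes()` accessor, and `call_h2d_sync` are hypothetical names standing in for the `HostObjs` type and function from the question; only the CUDA runtime calls are real API.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical stand-in for HostObjs: an RAII owner of page-locked host memory.
struct PinnedBuffer {
  float* ptr = nullptr;
  std::size_t count = 0;

  explicit PinnedBuffer(std::size_t n) : count(n) {
    if (cudaMallocHost(reinterpret_cast<void**>(&ptr), n * sizeof(float)) != cudaSuccess) {
      ptr = nullptr;  // allocation failed; callers check data() for nullptr
    }
  }
  ~PinnedBuffer() { if (ptr) cudaFreeHost(ptr); }
  PinnedBuffer(const PinnedBuffer&) = delete;
  PinnedBuffer& operator=(const PinnedBuffer&) = delete;

  float* data() const { return ptr; }
  std::size_t bytes() const { return count * sizeof(float); }
};

// Enqueue the host-to-device copy, then wait on the stream so the copy has
// finished before host_obj's destructor releases the source buffer.
cudaError_t call_h2d_sync(float* d_out, std::size_t n, cudaStream_t stream) {
  PinnedBuffer host_obj(n);
  if (host_obj.data() == nullptr) return cudaErrorMemoryAllocation;

  // Fill the buffer with 0, 1, 2, ... so the result can be checked.
  for (std::size_t i = 0; i < n; ++i) host_obj.data()[i] = static_cast<float>(i);

  cudaError_t err = cudaMemcpyAsync(d_out, host_obj.data(), host_obj.bytes(),
                                    cudaMemcpyHostToDevice, stream);
  if (err != cudaSuccess) return err;

  // The lifetime requirement, stated explicitly rather than left to the
  // behavior of cudaFreeHost.
  return cudaStreamSynchronize(stream);
}
```

Synchronizing only the stream keeps the wait limited to work queued on that stream, and the function no longer depends on what the destructor happens to do.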

That being said, it should currently be safe to not synchronize the stream because cudaFreeHost currently synchronizes the device, and copies from pageable memory should be blocking as per the linked document.


Thank you.

Is it documented anywhere that cudaFree/cudaFreeHost implies device synchronization?
