Memory Migration to Host

Is it unsafe to call cudaMemcpyAsync or cudaMemPrefetchAsync and then read the data on the host without any synchronization?

Let’s say I do something like this:

cudaMalloc(a);
kernel<<<...>>>(a);
cudaMemcpyAsync(a, ToHost);

// then check a on the host
checkFunction(a);

or this:

cudaMalloc(a);
kernel<<<...>>>(a);
cudaMemPrefetchAsync(a, ToHost);

// then check a on the host
checkFunction(a);

I’m still a beginner in this. Trying to wrap my head around memory migration.

If an operation on the host depends on the result of an asynchronous operation, you need to wait until the asynchronous operation is finished.
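
To make that concrete, here is a minimal sketch of the cudaMemcpyAsync case with the wait added. The kernel fill, the helper runCopyDemo, the buffer names, and the sizes are made up for illustration, and checkFunction is only a stand-in for the check in the question. It assumes a pinned host buffer (cudaMallocHost) and a single stream, so the copy is ordered after the kernel and the host only has to wait for the stream before reading.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: writes 2*i into element i.
__global__ void fill(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = 2 * i;
}

// Stand-in for the host-side check; returns the number of wrong elements.
int checkFunction(const int *data, int n) {
    int bad = 0;
    for (int i = 0; i < n; ++i)
        if (data[i] != 2 * i) ++bad;
    return bad;
}

// Returns the number of mismatches, or -1 on a CUDA error.
int runCopyDemo() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(int);
    int *d_a = nullptr;   // device buffer
    int *h_a = nullptr;   // pinned host buffer (assumption: lets the copy run asynchronously)
    cudaStream_t stream;

    if (cudaStreamCreate(&stream) != cudaSuccess) return -1;
    if (cudaMalloc(&d_a, bytes) != cudaSuccess) return -1;
    if (cudaMallocHost(&h_a, bytes) != cudaSuccess) return -1;

    // Kernel and copy go into the same stream, so the copy starts
    // only after the kernel has finished.
    fill<<<(n + 255) / 256, 256, 0, stream>>>(d_a, n);
    cudaMemcpyAsync(h_a, d_a, bytes, cudaMemcpyDeviceToHost, stream);

    // The host must wait here; without this, h_a may not be filled yet.
    cudaError_t err = cudaStreamSynchronize(stream);
    int bad = (err == cudaSuccess) ? checkFunction(h_a, n) : -1;

    cudaFreeHost(h_a);
    cudaFree(d_a);
    cudaStreamDestroy(stream);
    return bad;
}

int main() {
    int bad = runCopyDemo();
    printf("mismatches: %d\n", bad);
    return bad == 0 ? 0 : 1;
}
```

Calling cudaDeviceSynchronize() instead of cudaStreamSynchronize(stream) would also work here; it just waits for all streams on the device rather than one.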


I tested this with a kernel that does some operations. I did something like this:

cudaMallocManaged(a)
kernel<<<...>>>()
cudaDeviceSynchronize()
cudaMemPrefetchAsync(a)
hostFunc(a)

hostFunc() does some checks, and it works. Could Unified Memory be the reason this program does not crash?
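
For reference, a minimal sketch of the managed-memory sequence above. The likely reason it works is that cudaDeviceSynchronize() already waits for the kernel, and a cudaMallocManaged pointer can be read on the host after that point, so the prefetch acts only as a migration hint rather than something correctness depends on. The kernel fill, the helper runManagedDemo, and the sizes are made up for illustration. The sketch assumes the four-argument cudaMemPrefetchAsync signature from CUDA 12 and earlier (newer toolkits may need a different call), and it adds a stream synchronize after the prefetch to stay on the conservative side.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: writes 2*i into element i.
__global__ void fill(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = 2 * i;
}

// Returns the number of mismatches, or -1 on a CUDA error.
int runManagedDemo() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(int);
    int *a = nullptr;

    // One pointer, valid on both host and device.
    if (cudaMallocManaged(&a, bytes) != cudaSuccess) return -1;

    fill<<<(n + 255) / 256, 256>>>(a, n);

    // Wait for the kernel; after this the results are visible to the host.
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(a);
        return -1;
    }

    // Optional hint: move the pages to host memory ahead of the reads.
    // Assumption: pre-CUDA 13 signature (ptr, bytes, dstDevice, stream).
    // The return value is ignored because the prefetch is only a hint here.
    (void)cudaMemPrefetchAsync(a, bytes, cudaCpuDeviceId, 0);
    (void)cudaGetLastError();        // clear any error from an unsupported prefetch
    cudaStreamSynchronize(0);        // conservative: wait for the prefetch too

    // Host-side check, standing in for hostFunc(a).
    int bad = 0;
    for (int i = 0; i < n; ++i)
        if (a[i] != 2 * i) ++bad;

    cudaFree(a);
    return bad;
}

int main() {
    int bad = runManagedDemo();
    printf("mismatches: %d\n", bad);
    return bad == 0 ? 0 : 1;
}
```

Note that this only applies to managed memory. With a plain cudaMalloc pointer, as in the first post, the host cannot dereference the pointer at all, and an explicit copy plus synchronization is needed.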
