hubdw
1
Is it unsafe to call cudaMemcpyAsync or cudaMemPrefetchAsync and then use the data on the host without any synchronization?
Let’s say I do something like this:
cudaMalloc(a);
kernel<<<...>>>(a);
cudaMemcpyAsync(a, ToHost);
// then check a on the host
checkFunction(a);
or this:
cudaMalloc(a);
kernel<<<...>>>(a);
cudaMemPrefetchAsync(a, ToHost);
// then check a on the host
checkFunction(a);
I’m still a beginner in this. Trying to wrap my head around memory migration.
If an operation on the host depends on the result of an asynchronous operation, you need to wait until the asynchronous operation is finished.
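To make the point above concrete, here is a minimal sketch of the first pattern with the missing wait added. The kernel `fill`, the helper `run_copy_check`, and the array size are hypothetical names chosen for illustration, not taken from the question. It assumes the kernel and the copy are issued into the same stream, so the copy is ordered after the kernel, and that the host buffer is pinned so the copy can actually run asynchronously.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: writes 2*i into element i.
__global__ void fill(int *a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] = 2 * i;
}

// Runs kernel -> async copy -> synchronize -> host check.
// Returns true if every element holds the expected value.
bool run_copy_check() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(int);
    int *d_a = nullptr;
    int *h_a = nullptr;
    cudaStream_t stream;

    if (cudaStreamCreate(&stream) != cudaSuccess) return false;
    if (cudaMalloc(&d_a, bytes) != cudaSuccess) return false;
    // Pinned host memory, so the device-to-host copy can be asynchronous.
    if (cudaMallocHost(&h_a, bytes) != cudaSuccess) return false;

    // Same stream: the copy starts only after the kernel has finished.
    fill<<<(n + 255) / 256, 256, 0, stream>>>(d_a, n);
    cudaMemcpyAsync(h_a, d_a, bytes, cudaMemcpyDeviceToHost, stream);

    // The host must wait here; without this, h_a may not be filled yet.
    bool ok = (cudaStreamSynchronize(stream) == cudaSuccess);

    // Host-side check, safe only after the synchronization above.
    for (int i = 0; ok && i < n; ++i) ok = (h_a[i] == 2 * i);

    cudaFreeHost(h_a);
    cudaFree(d_a);
    cudaStreamDestroy(stream);
    return ok;
}

int main() {
    bool ok = run_copy_check();
    printf("%s\n", ok ? "check passed" : "check failed");
    return ok ? 0 : 1;
}
```

cudaDeviceSynchronize() or an event with cudaEventSynchronize() would also work in place of cudaStreamSynchronize(); the stream-level wait is used here only because it waits for nothing beyond the work the host check depends on.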
hubdw
3
I tested this with a kernel that does some operations. I did something like this:
cudaMallocManaged(a);
kernel<<<...>>>(a);
cudaDeviceSynchronize();
cudaMemPrefetchAsync(a);
hostFunc(a);
hostFunc() does some checks, and it works. Could Unified Memory be the reason this program does not crash?
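For reference, a runnable sketch of the managed-memory pattern described above might look like the following. The kernel `increment` and the helper `run_managed_check` are hypothetical names. In this sketch the cudaDeviceSynchronize() call is what guarantees the kernel has finished before the host reads the data; the prefetch only moves pages toward the host and is treated as an optional optimization. The prefetch call assumes the four-argument signature used by toolkits before CUDA 13, so it is compiled only for those versions, and it is skipped on devices that do not report concurrent managed access.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1 to every element.
__global__ void increment(int *a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] += 1;
}

// Managed allocation -> kernel -> device sync -> optional prefetch -> host check.
bool run_managed_check() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(int);
    int *a = nullptr;

    if (cudaMallocManaged(&a, bytes) != cudaSuccess) return false;
    for (int i = 0; i < n; ++i) a[i] = i;  // initialize on the host

    increment<<<(n + 255) / 256, 256>>>(a, n);

    // Wait for the kernel. After this, the results are complete and the
    // host may read the managed buffer.
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);

#if CUDART_VERSION < 13000
    // Optional: ask the driver to migrate the pages to host memory up front.
    // Assumption: pre-CUDA-13 signature (ptr, count, dstDevice, stream).
    int dev = 0, concurrent = 0;
    cudaGetDevice(&dev);
    cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess, dev);
    if (ok && concurrent) {
        cudaMemPrefetchAsync(a, bytes, cudaCpuDeviceId, 0);
    }
#endif

    // Host-side check on the managed pointer.
    for (int i = 0; ok && i < n; ++i) ok = (a[i] == i + 1);

    cudaFree(a);
    return ok;
}

int main() {
    bool ok = run_managed_check();
    printf("%s\n", ok ? "check passed" : "check failed");
    return ok ? 0 : 1;
}
```

Note the difference from the cudaMalloc case: a pointer from cudaMalloc cannot be dereferenced on the host at all, whereas a managed pointer can, as long as the GPU work that writes to it has completed first.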