Is it necessary to call cudaDeviceSynchronize() after cudaMemcpy(host, device, size, cudaMemcpyDeviceToHost)?
cudaMemcpy(host, device, size, cudaMemcpyDeviceToHost);
cudaDeviceSynchronize(); // Is this required here?
I am asking because:
- cudaMemcpy() is a blocking call, unlike cudaMemcpyAsync(), which returns control to the host while the copy may still be in progress.
I understand that cudaMemcpyAsync() needs a synchronization call such as cudaDeviceSynchronize() before the host reads the data, but does the normal cudaMemcpy() also need one?
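To make the question concrete, here is a minimal sketch of the two cases I am comparing. The kernel `fill`, the `CHECK` macro, the buffer size, and the use of a pinned host buffer are all made up for illustration and are not from my real code. In case 2 I wait with cudaStreamSynchronize() on the stream I used; cudaDeviceSynchronize() would also work there, it just waits for more.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err), __FILE__, __LINE__);     \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical kernel: writes `value` into every element of `data`.
__global__ void fill(int *data, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

int main() {
    const int n = 1 << 20;
    const size_t size = n * sizeof(int);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    int *device = nullptr;
    int *host = nullptr;
    CHECK(cudaMalloc(&device, size));
    CHECK(cudaMallocHost(&host, size));  // pinned host buffer (assumption)

    // Case 1: kernel and blocking copy, both on the default stream.
    fill<<<blocks, threads>>>(device, n, 1);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(host, device, size, cudaMemcpyDeviceToHost));
    // No cudaDeviceSynchronize() here: this is the spot I am asking about.
    printf("case 1: host[n-1] = %d\n", host[n - 1]);

    // Case 2: kernel and asynchronous copy on an explicit stream.
    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));
    fill<<<blocks, threads, 0, stream>>>(device, n, 2);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpyAsync(host, device, size, cudaMemcpyDeviceToHost, stream));
    CHECK(cudaStreamSynchronize(stream));  // wait for the copy before reading host
    printf("case 2: host[n-1] = %d\n", host[n - 1]);

    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFreeHost(host));
    CHECK(cudaFree(device));
    return 0;
}
```

This should build with something like nvcc example.cu -o example (the file name is just a placeholder).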
If someone can clarify this it would be very helpful.
So far I have never called cudaDeviceSynchronize() after a normal cudaMemcpy() on the default stream, and I have never seen an error or an incorrect result.
Is cudaDeviceSynchronize() mandatory after cudaMemcpy()?