Most lectures say that cudaMemcpy() is synchronous, and that is my understanding as well.
But when I read the “CUDA C Programming Guide Version 3.2”, I got really confused by the section on Asynchronous Concurrent Execution, which lists the following among the asynchronous operations:
“Host <-> device memory copies of a memory block of 64 KB or less”.
Do these two statements conflict? Thanks.