cudaThreadSynchronize and cudaMemcpy

Hi, all.
I have a question about timing with cudaThreadSynchronize and cudaMemcpy.

Some sample programs don't call cudaThreadSynchronize before cudaMemcpy.
In that case, does cudaMemcpy wait for the GPU kernel to finish, or does it run asynchronously?
Can I find the specifics in the programming guide?


Yes, the memory copy will wait for the kernel to finish before running. Take a look at section in the guide.
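
For what it's worth, here is a minimal sketch of that behavior, assuming the kernel and the copy both go through the default stream. The kernel addOne and the helper copyBackSeesKernelResult are made-up names for illustration, not taken from the SDK samples.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only: adds 1 to every element.
__global__ void addOne(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] += 1;
    }
}

// Hypothetical helper: launches the kernel, copies the result back without
// an explicit synchronize call, and reports whether every element was updated.
// Assumes n > 0.
bool copyBackSeesKernelResult(int n)
{
    std::vector<int> host(n, 0);
    int *dev = NULL;
    size_t bytes = n * sizeof(int);

    if (cudaMalloc((void **)&dev, bytes) != cudaSuccess) {
        return false;
    }
    if (cudaMemcpy(dev, &host[0], bytes, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dev);
        return false;
    }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    addOne<<<blocks, threads>>>(dev, n);  // the launch returns to the host right away

    // No cudaThreadSynchronize() here. This blocking copy is issued on the
    // same default stream, so it starts only after addOne has finished.
    cudaError_t err = cudaMemcpy(&host[0], dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) {
        return false;
    }

    // Every element started at 0, so each one should now be 1.
    for (int i = 0; i < n; ++i) {
        if (host[i] != 1) {
            return false;
        }
    }
    return true;
}

int main()
{
    printf("%s\n", copyBackSeesKernelResult(1 << 20) ? "ok" : "failed");
    return 0;
}
```

An explicit synchronize is still worth adding when you time a kernel with a host-side timer, because the launch itself returns before the kernel has finished.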

I found the description, and noticed that the section in programming guide 1.1 has been extended.

Thanks a lot!