During the copy, can the CPU and GPU work?

During a data transfer from CPU to GPU or the reverse, can the CPU work at the same time? And can the GPU?

thanks :P

Yes it can!

Check out CUDA streams (cudaStream_t) and the async memcpy, cudaMemcpyAsync. There have actually been very similar threads to this one.

deja vu :wacko:

Yes to both questions, with some restrictions.
See cudaMemcpyAsync in the programming guide for the details.
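
As far as I know, the main restrictions are that the host buffer has to be page-locked (pinned) memory, that the copy and the kernel have to go into different non-default streams, and that the device needs at least one copy engine. Here is a rough sketch of the GPU side under those assumptions. The kernel `scale`, the buffer size, and the stream names are made up for illustration; they are not from the programming guide.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only to keep the GPU busy during the copy.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;                 // illustrative size
    const size_t bytes = n * sizeof(float);

    // A device with zero copy engines cannot overlap copies with kernels.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("copy engines: %d\n", prop.asyncEngineCount);

    // Pinned host memory: needed for the copy to be truly asynchronous.
    float *h_in = nullptr;
    cudaMallocHost(&h_in, bytes);
    for (int i = 0; i < n; ++i) h_in[i] = 1.0f;

    float *d_a = nullptr, *d_b = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMemset(d_a, 0, bytes);

    cudaStream_t copyStream, computeStream;
    cudaStreamCreate(&copyStream);
    cudaStreamCreate(&computeStream);

    // The copy into d_b and the kernel on d_a are in different streams,
    // so the hardware is allowed (not guaranteed) to run them concurrently.
    cudaMemcpyAsync(d_b, h_in, bytes, cudaMemcpyHostToDevice, copyStream);
    scale<<<(n + 255) / 256, 256, 0, computeStream>>>(d_a, n, 2.0f);

    // Wait for both streams before touching the results.
    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    cudaStreamDestroy(copyStream);
    cudaStreamDestroy(computeStream);
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFreeHost(h_in);
    return err == cudaSuccess ? 0 : 1;
}
```

Whether the two operations really overlap depends on the hardware; a profiler timeline is the easiest way to check.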

If they can run at the same time, might there be some minor or major impact on performance, say from the extra load on the memory bus? Maybe the impact is minor for the GPU but huge for the CPU?

Also, the cudaMemcpyAsync section in the programming guide shows how to use the GPU while transferring data. But how can I use the CPU at the same time? Is forking a new thread the only way?

Thanks :P

The cudaMemcpyAsync call is asynchronous: it returns immediately after you call it, so no extra thread is needed. Just put the instructions you want to run on the CPU after the call to the memcpy.
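
A minimal sketch of that pattern, assuming the host buffer is pinned with cudaMallocHost (with ordinary pageable memory the call may not overlap with anything). The buffer size and the CPU loop are placeholders for whatever work you actually have.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const int n = 1 << 22;                 // illustrative size
    const size_t bytes = n * sizeof(float);

    // Pinned host buffer, so the copy can proceed while the CPU does other work.
    float *h_data = nullptr;
    cudaMallocHost(&h_data, bytes);
    for (int i = 0; i < n; ++i) h_data[i] = 1.0f;

    float *d_data = nullptr;
    cudaMalloc(&d_data, bytes);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Returns right away; the transfer runs in the background.
    cudaMemcpyAsync(d_data, h_data, bytes, cudaMemcpyHostToDevice, stream);

    // Unrelated CPU work, executed while the copy is in flight.
    // It must not write to h_data until the copy has finished.
    double cpuSum = 0.0;
    for (int i = 0; i < 1000000; ++i) cpuSum += i * 0.5;

    // Block until everything queued in the stream (here, the copy) is done.
    cudaError_t err = cudaStreamSynchronize(stream);
    printf("cpuSum = %f, copy status: %s\n", cpuSum, cudaGetErrorString(err));

    cudaStreamDestroy(stream);
    cudaFree(d_data);
    cudaFreeHost(h_data);
    return err == cudaSuccess ? 0 : 1;
}
```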

Then how can I know when the GPU has finished its work?

It is explained in the programming guide. You should also be able to see an example in the SDK, asyncAPI.cu.
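
Roughly, the idea is to record an event after the GPU work and poll it with cudaEventQuery, which returns cudaErrorNotReady while the work is still pending. The sketch below is written from memory and is not a copy of the SDK sample; the kernel `increment` and the sizes are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the real GPU work.
__global__ void increment(int *data, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += value;
}

int main() {
    const int n = 1 << 22;                 // illustrative size
    const size_t bytes = n * sizeof(int);

    int *h_data = nullptr;
    cudaMallocHost(&h_data, bytes);        // pinned, so the copies are asynchronous
    for (int i = 0; i < n; ++i) h_data[i] = 0;

    int *d_data = nullptr;
    cudaMalloc(&d_data, bytes);

    cudaStream_t stream;
    cudaStreamCreate(&stream);
    cudaEvent_t done;
    cudaEventCreate(&done);

    // Queue copy in, kernel, copy out, then an event marking the end.
    cudaMemcpyAsync(d_data, h_data, bytes, cudaMemcpyHostToDevice, stream);
    increment<<<(n + 255) / 256, 256, 0, stream>>>(d_data, n, 1);
    cudaMemcpyAsync(h_data, d_data, bytes, cudaMemcpyDeviceToHost, stream);
    cudaEventRecord(done, stream);

    // The CPU keeps working and checks the event now and then.
    unsigned long cpuIterations = 0;
    while (cudaEventQuery(done) == cudaErrorNotReady) {
        ++cpuIterations;                   // stand-in for useful CPU work
    }

    // The event has completed (or an error occurred), so h_data is safe to read.
    cudaError_t err = cudaGetLastError();
    printf("CPU iterations while waiting: %lu\n", cpuIterations);
    printf("h_data[0] = %d, status: %s\n", h_data[0], cudaGetErrorString(err));

    cudaEventDestroy(done);
    cudaStreamDestroy(stream);
    cudaFree(d_data);
    cudaFreeHost(h_data);
    return err == cudaSuccess ? 0 : 1;
}
```

If you do not need to do anything in the meantime, cudaEventSynchronize or cudaStreamSynchronize simply blocks until the work is done.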