Memory Copy Threads

I was wondering if you can copy memory from the GPU while a kernel is running.

In other words, I would like the GPU to have two result buffers, and to read the unused one from the GPU while the kernel writes to the other.

The CPU threads will be synchronized so that the same buffer is never read and written at the same time.


CUDA memory transfers are currently blocking, so the answer is: not at the moment.
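
To make that concrete, here is a rough sketch of the two-buffer scheme from the question as it would behave with a blocking copy. The kernel `produce`, the helper `ping_pong`, and the buffer size are made up for illustration, and error checking is left out. The kernel launch returns to the host right away, but `cudaMemcpy` does not start copying until the kernel issued before it has finished, so the read of the idle buffer and the write to the other buffer run one after the other rather than at the same time.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: fills a result buffer with one value per iteration.
__global__ void produce(float *out, int n, float value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = value;
}

// Hypothetical helper: runs the two-buffer loop and returns the first
// element of the last buffer read back to the host.
float ping_pong(int iterations)
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float *d_buf[2];
    cudaMalloc(&d_buf[0], bytes);
    cudaMalloc(&d_buf[1], bytes);
    float *h_result = (float *)malloc(bytes);
    float last = -1.0f;

    // Fill buffer 0 first so the first read has something to return.
    produce<<<blocks, threads>>>(d_buf[0], n, 0.0f);

    for (int iter = 1; iter <= iterations; ++iter) {
        int write = iter & 1;   // buffer the kernel writes this iteration
        int read  = write ^ 1;  // buffer written in the previous iteration

        // The launch itself returns immediately on the host...
        produce<<<blocks, threads>>>(d_buf[write], n, (float)iter);

        // ...but this copy blocks: it waits for the kernel above to finish
        // before transferring, so the copy and the kernel do not overlap.
        cudaMemcpy(h_result, d_buf[read], bytes, cudaMemcpyDeviceToHost);
        last = h_result[0];
    }

    free(h_result);
    cudaFree(d_buf[0]);
    cudaFree(d_buf[1]);
    return last;
}

int main()
{
    // Each read returns the value written in the previous iteration.
    printf("last value read: %.1f\n", ping_pong(4));
    return 0;
}
```

The results come out correct either way; the only thing lost is the overlap you were hoping for.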


Are you planning to support this in the future, and if so, how soon do you expect it to be available?