Asynchronous H2D transfer during kernel execution

RedBull · April 25, 2011, 12:56pm

Hi,

I am trying to develop an application where I need to refresh a buffer that is passed to a kernel. I tried asynchronous transfers with streams and double buffering, but I suspect it is not possible to make a kernel see the refreshed data while it is executing. I might be wrong, but I feel async transfer helps only for D2H (device to host) transfers.

Does async transfer work only for D2H, or for H2D (host to device) as well? If it does, can I do the following:

// The stream is the fourth launch parameter; the third is the dynamic shared memory size.
Call_kernel<<<…, …, 0, stream0>>>(d_a, d_b);
while (cudaEventQuery(stop) == cudaErrorNotReady)
{
    cudaMemcpyAsync(d_a, &h_a, sizeof(int), cudaMemcpyHostToDevice, stream0);
}

Is it possible to call cudaMemcpy() from CPU threads while the kernel is executing to do the same as above?

Please advise…

Thanks

seibert · April 25, 2011, 2:18pm

Note that CUDA preserves the ordering of operations in the same CUDA stream. If you want your async copy to overlap with kernel execution, it needs to be on a different stream.
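
A minimal sketch of that idea with double buffering, not a definitive implementation. The kernel `scale`, the helper `run_overlap_demo`, the `CHECK` macro, and the buffer and stream names are all hypothetical. It assumes pinned host memory (pageable memory generally prevents a truly asynchronous copy) and a device with at least one copy engine; whether the copy actually overlaps the kernel depends on the hardware. The first kernel reads `d_in[0]` on one stream while the refreshed data is copied into `d_in[1]` on another, and an event makes the next launch wait for that copy.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Return 1 from the enclosing function on any CUDA runtime error.
#define CHECK(call)                                              \
    do {                                                         \
        cudaError_t err_ = (call);                               \
        if (err_ != cudaSuccess) {                               \
            fprintf(stderr, "%s failed: %s\n", #call,            \
                    cudaGetErrorString(err_));                   \
            return 1;                                            \
        }                                                        \
    } while (0)

// Hypothetical kernel: out[i] = in[i] * factor.
__global__ void scale(const int *in, int *out, int n, int factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * factor;
}

// Returns 0 if the final result matches the refreshed input, 1 otherwise.
int run_overlap_demo() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(int);
    const int threads = 256, blocks = (n + threads - 1) / threads;

    int *h_in = nullptr, *h_out = nullptr;
    int *d_in[2] = {nullptr, nullptr}, *d_out = nullptr;
    CHECK(cudaMallocHost((void **)&h_in, bytes));   // pinned host memory
    CHECK(cudaMallocHost((void **)&h_out, bytes));
    CHECK(cudaMalloc((void **)&d_in[0], bytes));
    CHECK(cudaMalloc((void **)&d_in[1], bytes));
    CHECK(cudaMalloc((void **)&d_out, bytes));

    cudaStream_t compute, copy;
    cudaEvent_t refreshed;
    CHECK(cudaStreamCreate(&compute));
    CHECK(cudaStreamCreate(&copy));
    CHECK(cudaEventCreate(&refreshed));

    // Initial input, uploaded with a blocking copy.
    for (int i = 0; i < n; ++i) h_in[i] = 1;
    CHECK(cudaMemcpy(d_in[0], h_in, bytes, cudaMemcpyHostToDevice));

    // Kernel 1 reads d_in[0] on the compute stream; the launch returns at once.
    scale<<<blocks, threads, 0, compute>>>(d_in[0], d_out, n, 2);
    CHECK(cudaGetLastError());

    // Meanwhile, refresh the host data and upload it to the OTHER buffer
    // on a different stream, so the copy may overlap kernel 1.
    for (int i = 0; i < n; ++i) h_in[i] = 3;
    CHECK(cudaMemcpyAsync(d_in[1], h_in, bytes, cudaMemcpyHostToDevice, copy));
    CHECK(cudaEventRecord(refreshed, copy));

    // Kernel 2 must not start before the refreshed buffer has arrived.
    CHECK(cudaStreamWaitEvent(compute, refreshed, 0));
    scale<<<blocks, threads, 0, compute>>>(d_in[1], d_out, n, 2);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpyAsync(h_out, d_out, bytes, cudaMemcpyDeviceToHost, compute));
    CHECK(cudaStreamSynchronize(compute));

    int bad = 0;
    for (int i = 0; i < n; ++i) if (h_out[i] != 6) bad = 1;

    cudaEventDestroy(refreshed);
    cudaStreamDestroy(copy);
    cudaStreamDestroy(compute);
    cudaFree(d_out); cudaFree(d_in[1]); cudaFree(d_in[0]);
    cudaFreeHost(h_out); cudaFreeHost(h_in);
    return bad;
}

int main() {
    int rc = run_overlap_demo();
    printf(rc == 0 ? "ok\n" : "failed\n");
    return rc;
}
```

Note that in this sketch the refreshed data is picked up by the next kernel launch, not by the kernel that is already running; the buffer the running kernel reads is never overwritten.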

RedBull · April 26, 2011, 7:09am

Agreed… It's my mistake to put the same stream in the code…
So, if I do something like this:

// Kernel on stream0, copy on stream1 (stream is the fourth launch parameter).
Call_kernel<<<…, …, 0, stream0>>>(d_a, d_b);

while (cudaEventQuery(stop) == cudaErrorNotReady)
{
    cudaMemcpyAsync(d_a, &h_a, sizeof(int), cudaMemcpyHostToDevice, stream1);
}

How can the running kernel see the fresh value of d_a after h_a is copied to it asynchronously?
Do I have to use a separate buffer for this purpose?

Thanks for the suggestions…
