Data transfers are not overlapping

martin-wr · February 6, 2018, 7:13pm

Hello,

I am currently working on a project that implements an abstraction layer for the CUDA API.
I am trying to implement an example with it that focuses on overlapping data transfers:
https://github.com/parallel-forall/code-samples/blob/master/series/cuda-cpp/overlap-data-transfers/async.cu

The implementation works, but the data transfers do not overlap, as the NVIDIA Visual Profiler results below show:

Reference (code in the link above):

Implementation:

The abstraction layer does not call any synchronization functions, and the memcpy calls are asynchronous. Still, the data transfers do not overlap. Does anyone know, without looking at the code, what might cause this?

Robert_Crovella · February 6, 2018, 7:37pm

If you are not using pinned buffers, you will not get overlap.
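
For readers following along, the pattern in question looks roughly like the sketch below: a pinned host buffer, one non-default stream per chunk, and asynchronous copies issued into those streams. This is only a minimal sketch using the runtime API, not the poster's abstraction layer. The kernel `scale`, the `CHECK` macro, and the sizes are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative error check (hypothetical helper): print and bail out of main.
#define CHECK(call)                                                        \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            fprintf(stderr, "%s failed: %s\n", #call,                      \
                    cudaGetErrorString(err_));                             \
            return 1;                                                      \
        }                                                                  \
    } while (0)

// Hypothetical kernel standing in for the real per-chunk work.
__global__ void scale(float *data, int offset)
{
    int i = offset + blockIdx.x * blockDim.x + threadIdx.x;
    data[i] = 2.0f * data[i];
}

int main()
{
    const int nStreams  = 4;
    const int blockSize = 256;
    const int chunk     = 1024 * blockSize;      // elements per stream
    const int n         = chunk * nStreams;
    const size_t chunkBytes = chunk * sizeof(float);

    float *h = nullptr;
    float *d = nullptr;
    // Pinned (page-locked) host memory; pageable memory from malloc/new
    // would not give copy/compute overlap.
    CHECK(cudaMallocHost((void **)&h, n * sizeof(float)));
    CHECK(cudaMalloc((void **)&d, n * sizeof(float)));
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    cudaStream_t streams[nStreams];
    for (int s = 0; s < nStreams; ++s) CHECK(cudaStreamCreate(&streams[s]));

    // Each chunk gets its own stream, so work on different chunks may overlap.
    for (int s = 0; s < nStreams; ++s) {
        int offset = s * chunk;
        CHECK(cudaMemcpyAsync(d + offset, h + offset, chunkBytes,
                              cudaMemcpyHostToDevice, streams[s]));
        scale<<<chunk / blockSize, blockSize, 0, streams[s]>>>(d, offset);
        CHECK(cudaMemcpyAsync(h + offset, d + offset, chunkBytes,
                              cudaMemcpyDeviceToHost, streams[s]));
    }
    CHECK(cudaDeviceSynchronize());

    printf("h[0] = %f\n", h[0]);

    for (int s = 0; s < nStreams; ++s) CHECK(cudaStreamDestroy(streams[s]));
    CHECK(cudaFree(d));
    CHECK(cudaFreeHost(h));
    return 0;
}
```

If the host buffer came from `malloc` instead of `cudaMallocHost`, the same code would still run correctly, but the profiler timeline would typically show the copies serialized.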

martin-wr · February 7, 2018, 12:31pm

We are already using pinned buffers. We are allocating them with “cuMemAllocHost”.

Any other ideas why it is not overlapping?
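
One way to double-check the pinned-buffer claim from inside the program is to ask the driver what it knows about the pointer. The sketch below uses the driver API, since `cuMemAllocHost` is a driver API call. The helper `isPinnedHost` is a made-up name, and the copy-engine query is an extra check not discussed in this thread. Treat it as a diagnostic sketch under those assumptions, not as a diagnosis of the poster's code.

```cuda
#include <cstdio>
#include <cstdint>
#include <cstdlib>
#include <cuda.h>   // driver API; link with -lcuda

// Hypothetical helper: true if the driver reports p as page-locked host memory.
// For an ordinary pageable pointer the query either fails or reports a
// different memory type, so both cases return false.
static bool isPinnedHost(const void *p)
{
    unsigned int memType = 0;
    CUresult r = cuPointerGetAttribute(&memType,
                                       CU_POINTER_ATTRIBUTE_MEMORY_TYPE,
                                       (CUdeviceptr)(uintptr_t)p);
    return r == CUDA_SUCCESS && memType == CU_MEMORYTYPE_HOST;
}

int main()
{
    CUdevice  dev;
    CUcontext ctx;
    if (cuInit(0) != CUDA_SUCCESS ||
        cuDeviceGet(&dev, 0) != CUDA_SUCCESS ||
        cuDevicePrimaryCtxRetain(&ctx, dev) != CUDA_SUCCESS ||
        cuCtxSetCurrent(ctx) != CUDA_SUCCESS) {
        fprintf(stderr, "CUDA driver initialization failed\n");
        return 1;
    }

    // Number of asynchronous copy engines on the device (extra check,
    // not something raised in the thread).
    int engines = 0;
    cuDeviceGetAttribute(&engines, CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT, dev);
    printf("async copy engines: %d\n", engines);

    const size_t bytes = 1 << 20;
    void *pinned = nullptr;
    if (cuMemAllocHost(&pinned, bytes) != CUDA_SUCCESS) {
        fprintf(stderr, "cuMemAllocHost failed\n");
        return 1;
    }
    void *pageable = malloc(bytes);

    printf("cuMemAllocHost buffer pinned: %s\n",
           isPinnedHost(pinned) ? "yes" : "no");
    printf("malloc buffer pinned:         %s\n",
           isPinnedHost(pageable) ? "yes" : "no");

    free(pageable);
    cuMemFreeHost(pinned);
    cuDevicePrimaryCtxRelease(dev);
    return 0;
}
```

Running the same check on the exact pointers the abstraction layer passes to its asynchronous copy calls would show whether the pinned allocation is really the buffer being transferred.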
