cudaDeviceSynchronize is very slow

cheer37 · July 31, 2014, 8:18am

Hi.
I need to use the result of a CUDA kernel in the CPU host code that follows it,
so I call cudaDeviceSynchronize right after the kernel launch.
That call is very slow, and it cancels out the time saved by using the kernel.
Using the CUDA kernel brought the time down to below 100 ms, but cudaDeviceSynchronize takes 150 ms.
It makes me doubt whether CUDA programming is worthwhile at all.
Please help me understand this. Thanks in advance.

Milliarde · July 31, 2014, 9:47pm

Are you using managed memory? I’ve found that when I use managed memory (for example, memory allocated with cudaMallocManaged), cudaDeviceSynchronize forces all of its contents to be copied over.
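To see where the time actually goes, here is a minimal sketch that times the kernel launch and the cudaDeviceSynchronize call separately on a managed buffer. The kernel `scale`, the buffer size, and the launch configuration are hypothetical and are not taken from the original poster's code. Kernel launches are asynchronous, so the launch itself usually returns almost immediately and the synchronize call absorbs the kernel's run time (plus any managed-memory migration the driver performs).

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (an assumption for illustration): doubles each element.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 1 << 20;  // assumed problem size
    float *data = nullptr;

    // Managed allocation: the same pointer is usable on host and device.
    if (cudaMallocManaged(&data, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed\n");
        return 1;
    }
    for (int i = 0; i < n; ++i) data[i] = 1.0f;

    using clock = std::chrono::steady_clock;
    auto t0 = clock::now();
    scale<<<(n + 255) / 256, 256>>>(data, n);  // asynchronous: returns before the kernel finishes
    auto t1 = clock::now();
    cudaError_t err = cudaDeviceSynchronize();  // blocks until the kernel has completed
    auto t2 = clock::now();

    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(data);
        return 1;
    }

    double launch_ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    double sync_ms = std::chrono::duration<double, std::milli>(t2 - t1).count();

    // Host access to managed memory is safe only after the synchronize above.
    printf("launch: %.3f ms, sync: %.3f ms, data[0] = %.1f\n",
           launch_ms, sync_ms, data[0]);

    cudaFree(data);
    return 0;
}
```

If the host timer around the launch reports a very small number while the synchronize reports a large one, the synchronize is most likely just waiting for the kernel rather than adding overhead of its own.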

So if your buffer is larger than your data requires, try making it as small as possible. I’m not sure whether 150 ms is normal, but with enough managed memory allocated I can see it being possible.
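As a point of comparison, the sketch below avoids managed memory: it uses cudaMalloc with explicit cudaMemcpy calls, copies back only the bytes the host needs, and measures the kernel itself with CUDA events. The kernel `scale`, the sizes, and the `CHECK` macro are assumptions made for illustration, not the original poster's code.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message on any CUDA runtime error.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t e_ = (call);                                       \
        if (e_ != cudaSuccess) {                                       \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,         \
                    cudaGetErrorString(e_));                           \
            exit(1);                                                   \
        }                                                              \
    } while (0)

// Hypothetical kernel (an assumption for illustration): out[i] = 2 * in[i].
__global__ void scale(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * in[i];
}

int main() {
    const int n = 1 << 20;  // assumed problem size
    const size_t bytes = n * sizeof(float);
    std::vector<float> h_in(n, 1.0f), h_out(n);

    float *d_in = nullptr, *d_out = nullptr;
    CHECK(cudaMalloc(&d_in, bytes));
    CHECK(cudaMalloc(&d_out, bytes));
    CHECK(cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice));

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    CHECK(cudaEventRecord(start));
    scale<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
    CHECK(cudaGetLastError());  // catches launch configuration errors
    CHECK(cudaEventRecord(stop));

    // This copy is ordered after the kernel and blocks the host until the
    // data has arrived, so no separate cudaDeviceSynchronize is needed here.
    CHECK(cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost));

    CHECK(cudaEventSynchronize(stop));
    float kernel_ms = 0.0f;
    CHECK(cudaEventElapsedTime(&kernel_ms, start, stop));  // GPU-side kernel time
    printf("kernel: %.3f ms, h_out[0] = %.1f\n", kernel_ms, h_out[0]);

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(d_in));
    CHECK(cudaFree(d_out));
    return 0;
}
```

Because the events are recorded on the GPU timeline, the reported time covers only the kernel. Comparing it with the synchronize time seen in the host code should show whether the 150 ms is kernel execution or something else, such as memory transfers.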
