Cuda context and cudaDeviceSynchronize

mickael.gpu · February 27, 2023, 4:36pm

Hello,

My program creates two CPU threads that use the same GPU.
Thread 1 launches CUDA kernels on the default CUDA context and the default stream.
Thread 2 creates a new CUDA context (with the driver API) and a new stream. It performs only H2D data transfers.

My goal is for the operations launched on the GPU by the two CPU threads to run concurrently.

Does a cudaDeviceSynchronize call in thread 1 wait for the CUDA calls of thread 2 to finish?

Moreover, how many concurrent cudaMemcpy operations (H2D or D2H) can be performed between a given CPU and a given GPU?

Thank you for your help.

Robert_Crovella · February 27, 2023, 6:10pm

Generally speaking, for the GPU to switch from doing work for one context to work for another context, a context switch is required. This is an expensive operation, meaning it does not happen in a nanosecond or picosecond. It might take microseconds or even milliseconds.

A design involving 2 different contexts is not a smart design choice if your goal is to overlap, or run concurrently, activities in both contexts.
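For illustration, here is a minimal sketch of the single-context alternative, using only the runtime API. It assumes that both CPU threads use the same device, so they share that device's primary context, and that each thread creates its own non-blocking stream. The kernel `scale` and the helper `run_overlapped` are hypothetical names, not taken from the original program. The host buffer is pinned with cudaMallocHost, since asynchronous copies generally need pinned memory to overlap with kernels. Whether the copy and the kernel actually overlap still depends on the hardware and driver.

```cuda
#include <cuda_runtime.h>
#include <thread>

// Hypothetical kernel standing in for the compute work of thread 1.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: issues a kernel and an H2D copy from two CPU threads
// that share one context. Returns true if every CUDA call succeeded.
bool run_overlapped(int n) {
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *d_work = nullptr, *d_in = nullptr, *h_in = nullptr;

    bool ok = cudaMalloc(&d_work, bytes) == cudaSuccess
           && cudaMalloc(&d_in, bytes) == cudaSuccess
           && cudaMallocHost(&h_in, bytes) == cudaSuccess;  // pinned host memory

    cudaError_t err1 = cudaSuccess, err2 = cudaSuccess;
    if (ok) {
        for (int i = 0; i < n; ++i) h_in[i] = 1.0f;

        // Thread 1: kernel work on its own non-blocking stream.
        std::thread compute([&] {
            cudaStream_t s;
            err1 = cudaStreamCreateWithFlags(&s, cudaStreamNonBlocking);
            if (err1 != cudaSuccess) return;
            cudaMemsetAsync(d_work, 0, bytes, s);
            scale<<<(n + 255) / 256, 256, 0, s>>>(d_work, n, 2.0f);
            err1 = cudaGetLastError();          // launch configuration errors
            if (err1 == cudaSuccess)
                err1 = cudaStreamSynchronize(s);  // waits for this stream only
            cudaStreamDestroy(s);
        });

        // Thread 2: H2D transfer on a second non-blocking stream.
        std::thread transfer([&] {
            cudaStream_t s;
            err2 = cudaStreamCreateWithFlags(&s, cudaStreamNonBlocking);
            if (err2 != cudaSuccess) return;
            err2 = cudaMemcpyAsync(d_in, h_in, bytes, cudaMemcpyHostToDevice, s);
            if (err2 == cudaSuccess)
                err2 = cudaStreamSynchronize(s);  // waits for this stream only
            cudaStreamDestroy(s);
        });

        compute.join();
        transfer.join();
    }

    if (h_in) cudaFreeHost(h_in);
    cudaFree(d_in);
    cudaFree(d_work);
    return ok && err1 == cudaSuccess && err2 == cudaSuccess;
}
```

Each thread waits with cudaStreamSynchronize on its own stream rather than cudaDeviceSynchronize, so neither thread blocks on the other's work. Calling `run_overlapped(1 << 20)` from `main` and inspecting the timeline in a profiler is one way to check whether the two operations overlap on a given system.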

cudaDeviceSynchronize() certainly waits until the device is idle.

And that is a runtime API call, not a driver API call. I’d generally advise non-experts not to try to carefully interleave driver API and runtime API. It brings additional complexity, and in the general case I don’t know why it would be needed. Certainly there is no indication in your posting why 2 different contexts are needed.

Overall this looks like a bad design to me.
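On the question of how many copies can run at once: the number of hardware copy engines is a device property, reported by the runtime API as `asyncEngineCount`. The sketch below queries it; the helper name `copy_engine_count` is hypothetical. As a rough guide, a value of 1 means a copy can overlap with kernel execution, and a value of 2 or more means an H2D and a D2H copy can also run at the same time. Several copies in the same direction are generally serialized, since they share the same link.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: returns the number of asynchronous copy engines
// reported for `device`, or -1 if the query fails.
int copy_engine_count(int device) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return -1;
    return prop.asyncEngineCount;
}
```

Printing `copy_engine_count(0)` for the GPU in question gives an upper bound on transfer concurrency for that device; the actual behavior also depends on pinned memory and on the streams used.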
