cudnnCreate() / cublasCreate() blocked while gpu kernels run in parallel (irrespective of process)

vivekguptha · June 24, 2021, 9:38am

I am running a GPU-based algorithm that executes a kernel on the GPU. At the same time, I am running another cuDNN/cuBLAS-based algorithm in parallel. I see that cudnnCreate() / cublasCreate() blocks until the GPU kernel completes, whether that kernel runs in the same process or in another one.

From the cuDNN/cuBLAS documentation it is clear that these functions call cudaDeviceSynchronize() internally, and hence they block until the GPU completes all queued tasks. But cudaDeviceSynchronize() waits only for the tasks from the current context to complete, right? So why do these functions block even when the kernels run in a different context, and even in a different process?
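A minimal sketch of how the same-process case can be measured, in case it helps to reproduce this. The names `spin_kernel` and `time_cublas_create_ms` are hypothetical helpers written for this post, not part of any library; the sketch assumes device 0 and linking with `-lcublas`. It only times the call and does not claim any particular result.

```cuda
#include <chrono>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Hypothetical helper: keeps one GPU thread busy for roughly `cycles` clock ticks.
__global__ void spin_kernel(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

// Hypothetical helper: returns the wall-clock time of cublasCreate() in
// milliseconds, optionally while a kernel of about `busy_ms` ms is running.
// Returns -1.0 if any CUDA or cuBLAS call fails.
double time_cublas_create_ms(int busy_ms) {
    // Warm-up: absorb one-time context and library initialization costs
    // so they do not show up in the measurement below.
    cublasHandle_t warm;
    if (cublasCreate(&warm) != CUBLAS_STATUS_SUCCESS) return -1.0;
    cublasDestroy(warm);

    if (busy_ms > 0) {
        int khz = 0;  // device clock rate in kHz, i.e. ticks per millisecond
        if (cudaDeviceGetAttribute(&khz, cudaDevAttrClockRate, 0) != cudaSuccess)
            return -1.0;
        // Kernel launches are asynchronous: the host continues immediately.
        spin_kernel<<<1, 1>>>(static_cast<long long>(khz) * busy_ms);
        if (cudaGetLastError() != cudaSuccess) return -1.0;
    }

    cublasHandle_t handle;
    auto t0 = std::chrono::steady_clock::now();
    cublasStatus_t st = cublasCreate(&handle);  // the call under test
    auto t1 = std::chrono::steady_clock::now();
    if (st != CUBLAS_STATUS_SUCCESS) return -1.0;

    cublasDestroy(handle);
    cudaDeviceSynchronize();  // let the spin kernel finish before returning
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Calling `time_cublas_create_ms(0)` and `time_cublas_create_ms(1000)` from `main` and printing both values shows whether the create call waits for the running kernel. For the cross-process case, the same measurement can be taken while a second process runs only `spin_kernel`.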
