Calling from multiple threads leads to a kernel not being launched (CUDA 10.0)

I have two different shared libraries (say lib_a and lib_b) that each define a kernel with the same name (say sameKernel).
In the main process, I create two threads: one calls sameKernel in lib_a, the other calls sameKernel in lib_b.
When I check the result with nvvp, I find that only one sameKernel has been launched.

When the sameKernel functions are declared static, both are launched correctly.
When they have different parameter lists, both are also launched correctly.
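
For reference, here is a minimal sketch of what the static variant might look like in lib_a. It is not taken from the attached code: the file name, the wrapper name lib_a_run and the kernel body are hypothetical, and lib_b would be the same apart from the wrapper name. The comment about linkage is my assumption about why static helps (presumably the two exported kernel symbols otherwise collide between the libraries), not something I have confirmed.

```cuda
// lib_a.cu (hypothetical file name); lib_b.cu would mirror it with lib_b_run.
#include <cuda_runtime.h>

// Assumption: 'static' gives the kernel internal linkage, so its host-side
// symbol is not exported from the shared library and cannot be confused
// with the sameKernel defined in the other library.
static __global__ void sameKernel(int *out, int value)
{
    out[threadIdx.x] = value;
}

// Hypothetical exported entry point with a name unique to this library.
// Launches one block of n threads (n must not exceed the block size limit)
// and copies the result back. Returns 0 on success, -1 on any CUDA error.
extern "C" int lib_a_run(int *host_out, int n)
{
    int *dev_out = nullptr;
    if (cudaMalloc(&dev_out, n * sizeof(int)) != cudaSuccess)
        return -1;

    sameKernel<<<1, n>>>(dev_out, 1);

    // Check the launch itself, then copy back; cudaMemcpy waits for the kernel.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaMemcpy(host_out, dev_out, n * sizeof(int),
                         cudaMemcpyDeviceToHost);

    cudaFree(dev_out);
    return err == cudaSuccess ? 0 : -1;
}
```

Something like `nvcc -shared -Xcompiler -fPIC lib_a.cu -o lib_a.so` should build it as a shared library; each thread in the main process would then call its own library's wrapper.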

Attached is my test code (Ubuntu 16.04, CUDA 10.0).

CudaKernelIssue.tar.gz (3.57 KB)

wow that must be confusing

you may wish to file a bug

instructions are contained in a sticky post at the top of this forum

I have filed a bug report [BUG ID 2543695].