Thread safety of cudaMalloc and cudaFree with multiple GPUs

I am experiencing host stack memory corruption when calling cudaMalloc and cudaFree from multiple host threads of a single process. The threads use cudaSetDevice() to select a specific GPU (0, 1, or 2), and device memory allocated in one thread on a specific device is later released in a different thread, with the same device set. Basically, I have a few worker threads allocating memory and passing the pointers through a thread-safe queue to other threads, which, after doing some work, release them. Currently no actual work is done, no host-to-device or device-to-host copying, only allocation in one thread and, later, deallocation in another. There may be simultaneous calls to cudaMalloc and cudaFree in different threads with different devices set.
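
To make the pattern concrete, here is a stripped-down sketch of what the threads do (illustrative only; the real code passes the pointers through a thread-safe queue, and the device IDs and allocation sizes differ):

#include <cuda_runtime.h>
#include <cstdio>
#include <thread>
#include <vector>

// Illustrative sketch: each worker binds to a device, allocates, and hands the
// pointer to a second thread that frees it (the real code uses a queue).
static void worker(int device)
{
    cudaSetDevice(device);
    void* p = nullptr;
    cudaError_t err = cudaMalloc(&p, 1 << 20);   // 1 MiB, size is arbitrary
    if (err != cudaSuccess) {
        std::printf("cudaMalloc on device %d failed: %s\n",
                    device, cudaGetErrorString(err));
        return;
    }
    std::thread freer([p, device] {
        cudaSetDevice(device);                   // same device, different thread
        cudaFree(p);
    });
    freer.join();
}

int main()
{
    std::vector<std::thread> workers;
    for (int dev = 0; dev < 3; ++dev)            // GPUs 0, 1, 2
        workers.emplace_back(worker, dev);
    for (auto& t : workers) t.join();
    return 0;
}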

When I use only a single GPU device in my process, there are no problems.
When I put a mutex around the cudaMalloc and cudaFree calls, there are no problems.
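
The mutex workaround is just a process-wide lock around both calls, roughly like this (a sketch; guardedMalloc and guardedFree are my own names, not CUDA API):

#include <cuda_runtime.h>
#include <mutex>

// Workaround sketch: serialize every cudaMalloc/cudaFree in the process.
static std::mutex g_cudaAllocMutex;

cudaError_t guardedMalloc(void** ptr, size_t bytes)
{
    std::lock_guard<std::mutex> lock(g_cudaAllocMutex);
    return cudaMalloc(ptr, bytes);
}

cudaError_t guardedFree(void* ptr)
{
    std::lock_guard<std::mutex> lock(g_cudaAllocMutex);
    return cudaFree(ptr);
}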

Without the mutex lock, I randomly see host heap corruption, a SIGSEGV inside libcuda.so, or a cudaErrorAlreadyMapped error code returned from cudaMalloc.

My belief was that cudaMalloc and cudaFree are THREAD SAFE. Is that really true? What am I doing wrong?

Thanks for any help,
Radek

cudaMalloc and cudaFree should be thread safe.
Defects are always possible, though. I generally recommend that if you suspect a defect, a good first step is to move your software stack forward to the latest GPU driver and the latest CUDA version, and repeat the test.
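
As a quick sanity check (not a diagnostic), you can print what the installed driver and the runtime report, for example:

#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    int driverVer = 0, runtimeVer = 0;
    // CUDA version supported by the installed driver, and the runtime version
    // this application was built against (encoded as 1000*major + 10*minor)
    cudaDriverGetVersion(&driverVer);
    cudaRuntimeGetVersion(&runtimeVer);
    std::printf("driver supports CUDA %d, runtime version %d\n", driverVer, runtimeVer);
    return 0;
}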

Beyond that, I cannot surmise what is wrong based on your description.

I am testing this on CUDA 11.0 with driver 450.80.02.

I minimized the example and the problem persists. Trying to debug the session with cuda-gdb resulted in:

cuda-gdb/8.2/gdb/cuda-context.c:64: internal-error: uint64_t context_get_id(context_t): Assertion `ctx' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n

This is a bug, please report it.  For instructions, see:
<http://www.gnu.org/software/gdb/bugs/>.

I was finally able to upgrade our GPU servers.

On driver version 455.45.01 I am not able to reproduce the above issue, so it must have been fixed. I would still love to know what the problem was.