Thread safety of cudaMalloc and cudaFree with multiple GPUs

Freeman.ix · December 1, 2020, 5:59pm

I am seeing host memory corruption when calling cudaMalloc and cudaFree from multiple host threads of a single process. Each thread calls cudaSetDevice() to select a specific GPU (0, 1, or 2). Device memory allocated in one thread on a given device is later released in a different thread that has set the same device. Basically, I have a few worker threads that allocate memory and pass the pointers through a thread-safe queue to other threads, which do some work and then release them. Currently no work is done at all, no host-to-device or device-to-host copies, only allocation in one thread and later deallocation in another. Calls to cudaMalloc and cudaFree can happen simultaneously in different threads with different devices set.
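For readers who want to picture the hand-off described above, here is a minimal host-only sketch. It is not the original poster's code. `Block`, `BlockQueue`, `alloc_on`, `free_on`, and `run_handoff` are hypothetical names, and the two allocation helpers use `malloc`/`free` as stand-ins so the sketch builds without the CUDA toolkit; in the real program they would call `cudaSetDevice` followed by `cudaMalloc` or `cudaFree`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// One allocation travelling from the allocating thread to the freeing thread.
struct Block {
    int device;  // device that was current when the block was allocated
    void* ptr;   // the allocation itself
    bool done;   // true marks "producer finished", ptr is ignored
};

// Minimal thread-safe queue (hypothetical; the post does not show its queue).
class BlockQueue {
public:
    void push(Block b) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(b);
        }
        cv_.notify_one();
    }
    Block pop() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        Block b = q_.front();
        q_.pop();
        return b;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Block> q_;
};

// Host stand-in. Real program: cudaSetDevice(device); cudaMalloc(&p, bytes);
void* alloc_on(int /*device*/, std::size_t bytes) { return std::malloc(bytes); }
// Host stand-in. Real program: cudaSetDevice(device); cudaFree(p);
void free_on(int /*device*/, void* p) { std::free(p); }

// One allocating thread and one freeing thread per device.
// Returns the total number of blocks released by the freeing threads.
int run_handoff(int devices, int blocks_per_device) {
    assert(devices > 0 && blocks_per_device >= 0);
    std::vector<BlockQueue> queues(devices);
    std::vector<int> freed(devices, 0);
    std::vector<std::thread> threads;
    for (int d = 0; d < devices; ++d) {
        threads.emplace_back([&queues, d, blocks_per_device] {
            for (int i = 0; i < blocks_per_device; ++i)
                queues[d].push({d, alloc_on(d, 1024), false});
            queues[d].push({d, nullptr, true});  // tell the consumer to stop
        });
        threads.emplace_back([&queues, &freed, d] {
            for (;;) {
                Block b = queues[d].pop();
                if (b.done) break;
                free_on(b.device, b.ptr);  // freed in a different thread
                ++freed[d];                // only this thread writes freed[d]
            }
        });
    }
    for (auto& t : threads) t.join();
    int total = 0;
    for (int n : freed) total += n;
    return total;
}
```

Calling `run_handoff(3, 100)` mirrors the three-GPU setup: allocations and releases for different devices overlap in time, and no lock is held around the allocation calls themselves.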

When I use only a single GPU in my process, there are no problems.
When I put a mutex around the cudaMalloc and cudaFree calls, there are no problems.
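A rough sketch of what such a mutex workaround could look like follows. This is an illustration under assumptions, not the code from the original program. `SerializedAllocator`, `host_alloc`, `host_free`, and `hammer` are hypothetical names, and the callbacks use `malloc`/`free` as stand-ins; in a CUDA build they would call `cudaSetDevice(device)` and then `cudaMalloc` or `cudaFree`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Funnels every allocate/release call through one process-wide mutex,
// so no two threads are ever inside the allocation calls at the same time.
class SerializedAllocator {
public:
    using AllocFn = std::function<void*(int device, std::size_t bytes)>;
    using FreeFn = std::function<void(int device, void* ptr)>;

    SerializedAllocator(AllocFn alloc_fn, FreeFn free_fn)
        : alloc_fn_(std::move(alloc_fn)), free_fn_(std::move(free_fn)) {
        assert(alloc_fn_ && free_fn_);
    }

    void* allocate(int device, std::size_t bytes) {
        std::lock_guard<std::mutex> lock(mutex_);
        return alloc_fn_(device, bytes);
    }

    void release(int device, void* ptr) {
        std::lock_guard<std::mutex> lock(mutex_);
        free_fn_(device, ptr);
    }

private:
    std::mutex mutex_;
    AllocFn alloc_fn_;
    FreeFn free_fn_;
};

// Host stand-ins (assumption). Real program: cudaSetDevice + cudaMalloc / cudaFree.
void* host_alloc(int /*device*/, std::size_t bytes) { return std::malloc(bytes); }
void host_free(int /*device*/, void* ptr) { std::free(ptr); }

// Runs `threads` workers that each allocate and release `iterations` times.
// The thread index doubles as the device id. Returns the number of
// allocations that succeeded.
int hammer(SerializedAllocator& allocator, int threads, int iterations) {
    std::atomic<int> ok{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&allocator, &ok, t, iterations] {
            for (int i = 0; i < iterations; ++i) {
                void* p = allocator.allocate(t, 256);
                if (p != nullptr) {
                    ++ok;
                    allocator.release(t, p);
                }
            }
        });
    }
    for (auto& th : pool) th.join();
    return ok.load();
}
```

A single lock shared by all devices costs some parallelism, but it matches the workaround as described: the failures were only seen when calls for different devices overlapped.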

Without the mutex, I randomly see host heap corruption, a SIGSEGV in libcuda.so, or a cudaErrorAlreadyMapped error code from cudaMalloc.

My belief was that cudaMalloc and cudaFree are thread safe. Is that really true? What am I doing wrong?

Thanks for the help
Radek

Robert_Crovella · December 1, 2020, 7:40pm

cudaMalloc and cudaFree should be thread safe. However, defects are always possible. I generally recommend that if you suspect a defect, a good first step is to move your software stack forward to the latest GPU driver and latest CUDA version, and repeat the test.

Beyond that, I cannot surmise what is wrong based on your description.

Freeman.ix · December 3, 2020, 11:14am

I am testing this on CUDA 11.0 with driver 450.80.02.

I minimized the example and the problem persists. Trying to debug the session with cuda-gdb resulted in:

cuda-gdb/8.2/gdb/cuda-context.c:64: internal-error: uint64_t context_get_id(context_t): Assertion `ctx' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n

This is a bug, please report it.  For instructions, see:
<http://www.gnu.org/software/gdb/bugs/>.

Freeman.ix · December 4, 2020, 11:31am

I was finally able to upgrade our GPU servers.

On driver version 455.45.01 I am not able to reproduce the above issue, so it must have been fixed. I would still love to know what the problem was.
