cudaHostAlloc and thread safety problems with pinned, portable memory

dgwsoft · March 28, 2011, 6:46pm

I am trying to use pinned, portable memory (CUDA runtime 2.3).

The example I am working from allocates pinned memory (with cudaHostAlloc((void**)&addr, n_bytes, cudaHostAllocPortable) )
in the main thread, then launches separate host threads for each GPU.

That is OK. But I get problems when I try to allocate pinned/portable memory within each host thread. So:

A) is it OK to allocate/free pinned memory within several different host threads? (And what if each had a different CUDA context?)
B) is it OK for one thread to allocate, and another thread to free a chunk of pinned/portable memory?

B) may seem like a strange thing to do. Basically I have a class that caches arrays, up to a global limit on the cached data.
Cached data is freed up when the limit is reached. Cached data may be used by different threads, and any thread may do the freeing.
(Users of the data must always check to see if it has been freed up and if so recalculate the array. But if it is still there,
time is saved). Anyway, this is all made thread safe with mutexes, and works fine as long as I use “new” and “delete”.
It passes its unit tests.

Then I thought, why waste time and space copying that data into pinned memory before copying to the device? Why not
put it in pinned memory to start with? So I replaced “new” with “cudaHostAlloc(… cudaHostAllocPortable)” and “delete”
with “cudaFreeHost()”.
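
For readers who want something concrete, here is a rough sketch of the kind of cache described above. It is not the original class: the names ArrayCache, host_alloc and host_free are made up for illustration, and eviction in key order is only a placeholder policy. The two hooks use the C heap so the sketch runs without a GPU; the substitution described in the post would replace their bodies with cudaHostAlloc(..., cudaHostAllocPortable) and cudaFreeHost().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <map>
#include <mutex>
#include <string>

// Hypothetical allocation hooks (assumption, not from the post). They use the
// C heap here; a CUDA build would call cudaHostAlloc(&p, n_bytes,
// cudaHostAllocPortable) and cudaFreeHost(p) in their place.
inline void* host_alloc(std::size_t n_bytes) { return std::malloc(n_bytes); }
inline void host_free(void* p) { std::free(p); }

// Hypothetical cache of byte arrays with a global limit on cached bytes.
class ArrayCache {
public:
    explicit ArrayCache(std::size_t limit_bytes) : limit_(limit_bytes) {}
    ~ArrayCache() {
        for (auto& kv : entries_) host_free(kv.second.data);
    }
    ArrayCache(const ArrayCache&) = delete;
    ArrayCache& operator=(const ArrayCache&) = delete;

    // Copies n_bytes from src into a cached block stored under key.
    // Evicts other entries (in key order, a placeholder policy) until it fits.
    // Returns false if the block exceeds the limit or allocation fails.
    bool store(const std::string& key, const void* src, std::size_t n_bytes) {
        assert(src != nullptr && n_bytes > 0);
        std::lock_guard<std::mutex> lock(mutex_);
        auto old = entries_.find(key);
        if (old != entries_.end()) erase_locked(old);
        if (n_bytes > limit_) return false;
        while (used_ + n_bytes > limit_) erase_locked(entries_.begin());
        void* p = host_alloc(n_bytes);
        if (p == nullptr) return false;
        std::memcpy(p, src, n_bytes);
        entries_[key] = Entry{p, n_bytes};
        used_ += n_bytes;
        return true;
    }

    // Runs fn(data, n_bytes) under the lock if key is still cached.
    // Returns false if the entry was evicted; the caller then recalculates.
    template <typename Fn>
    bool with(const std::string& key, Fn fn) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = entries_.find(key);
        if (it == entries_.end()) return false;
        fn(static_cast<const void*>(it->second.data), it->second.n_bytes);
        return true;
    }

    std::size_t used_bytes() {
        std::lock_guard<std::mutex> lock(mutex_);
        return used_;
    }

private:
    struct Entry { void* data; std::size_t n_bytes; };
    using Map = std::map<std::string, Entry>;

    // Frees one entry; the caller must already hold mutex_.
    void erase_locked(Map::iterator it) {
        host_free(it->second.data);
        used_ -= it->second.n_bytes;
        entries_.erase(it);
    }

    std::mutex mutex_;
    Map entries_;
    std::size_t limit_;
    std::size_t used_ = 0;
};
```

Note that in this sketch whichever thread calls store() may end up freeing a block that another thread allocated, which is exactly the situation question B) asks about.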

And now it passes all the unit tests apart from the one that exercises thread safety, where it crashes (seg faults) randomly.

NB in this unit test I am not even copying to the device - just repeatedly allocating, writing, reading and freeing, in
two different threads. (I call cudaSetDevice(0) before launching threads so in fact the CUDA context should be the same).

After scratching my head a bit, I wonder if perhaps cudaHostAlloc() is not quite thread safe to the extent I need?
Though, since I am (trying to) provide thread safety with mutexes, unless CUDA is explicitly using the thread id in some
way, it should be OK.

All I can find out for sure is that pinned/portable memory can be accessed by multiple devices which are controlled
from different host threads. Whether it is safe to allocate/free in different host threads, I am not sure.

Does anyone know?

seibert · March 29, 2011, 1:02pm

I have no idea what the allowed behavior was for this in CUDA 2.3, but I could imagine the pinned memory allocation information getting attached to the CUDA context associated with the allocating thread.

This has almost certainly been fixed in CUDA 4.0. The CUDA runtime implementation (and semantics) have been totally restructured for thread safety. If you have a registered developer account, I would take a look at the CUDA 4.0 RC1.

dgwsoft · April 8, 2011, 10:52am

It seems to be OK to allocate and deallocate pinned memory in different threads, as long as you allocate and deallocate any given chunk in the same thread. So you may well be right about pinned memory being tied to the CUDA context. Also, allocating pinned memory in lots of small chunks seems to be very slow, so this was not a good solution. Best to do the small allocations with malloc/free (or new/delete) and copy everything to one big chunk of pinned memory later.
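
A minimal sketch of that last idea, for anyone following along: keep the small arrays on the ordinary heap and pack them into one large staging block that is allocated once. StagingBuffer, staging_alloc and staging_free are hypothetical names, and the hooks use malloc/free so the sketch runs anywhere; in a CUDA build they would presumably wrap cudaHostAlloc() and cudaFreeHost(), called from a single thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical hooks (assumption): heap-backed here, pinned memory in a CUDA
// build via cudaHostAlloc(..., cudaHostAllocPortable) and cudaFreeHost().
inline void* staging_alloc(std::size_t n_bytes) { return std::malloc(n_bytes); }
inline void staging_free(void* p) { std::free(p); }

// One big block, allocated once and freed once by the object that owns it.
class StagingBuffer {
public:
    explicit StagingBuffer(std::size_t capacity)
        : base_(static_cast<unsigned char*>(staging_alloc(capacity))),
          capacity_(base_ ? capacity : 0) {}
    ~StagingBuffer() { staging_free(base_); }
    StagingBuffer(const StagingBuffer&) = delete;
    StagingBuffer& operator=(const StagingBuffer&) = delete;

    // Copies one array into the block at the next aligned offset.
    // Writes that offset to *offset_out; returns false if it does not fit.
    bool append(const void* src, std::size_t n_bytes, std::size_t* offset_out) {
        assert(src != nullptr && offset_out != nullptr && n_bytes > 0);
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t start = (used_ + align - 1) / align * align;
        if (start > capacity_ || n_bytes > capacity_ - start) return false;
        std::memcpy(base_ + start, src, n_bytes);
        *offset_out = start;
        used_ = start + n_bytes;
        return true;
    }

    // Forgets the packed arrays so the block can be reused for the next batch.
    void reset() { used_ = 0; }

    const unsigned char* data() const { return base_; }
    std::size_t size() const { return used_; }

private:
    unsigned char* base_;
    std::size_t capacity_;
    std::size_t used_ = 0;
};
```

After packing, a single host-to-device copy of size() bytes starting at data() would replace many small transfers, and the recorded offsets locate each array inside the device buffer.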

Not tried it yet - next iteration maybe. Thanks.
