CUDA runtime calls that require a context implicitly create one if none exists yet.
I was wondering whether the pinned memory allocation functions, cudaMallocHost and cudaHostAlloc, also create a context implicitly. I’m looking to create portable pinned memory, if that makes a difference.
Also, if they do create a context, does that mean that I need to free the memory before the context dies or not? What happens if I don’t?
This matters for getting a multithreaded program to work correctly.
As far as I know, it’s a CUDA runtime call, so it should create a context. I think you can test it by calling cudaSetDevice right after the allocation: if a context has already been created, cudaSetDevice will return an error (see CUDA Toolkit Documentation 12.3 Update 1).
It took me a couple of minutes after writing the post to think of how to test that. I have only tested under Linux so far, but I came to two conclusions:
a. These calls do create a context implicitly (the driver API seems to imply as much, judging by the possible error codes).
b. The memory is implicitly released on a call to cudaThreadExit (which was more surprising, but expected once you know that the allocation requires a context).
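For anyone who wants to repeat the probe suggested above, here is a minimal sketch. The device index 0, the 1 MiB size, and the `report` helper are my own choices for illustration, not something from the posts. It only prints what each call returns. As far as I know, newer runtimes let cudaSetDevice succeed even when a context already exists, so a success there should be read as inconclusive rather than as proof that no context was created.

```cuda
// Build (assumed): nvcc -o ctx_probe ctx_probe.cu
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: print the outcome of a runtime call
// without assuming what that outcome will be.
static void report(const char* what, cudaError_t err) {
    std::printf("%-20s -> %s\n", what, cudaGetErrorString(err));
}

int main() {
    void* pinned = nullptr;
    const size_t bytes = 1 << 20;  // 1 MiB, arbitrary size for the probe

    // First CUDA call in this thread: a portable pinned allocation.
    cudaError_t err = cudaHostAlloc(&pinned, bytes, cudaHostAllocPortable);
    report("cudaHostAlloc", err);
    if (err != cudaSuccess) return 1;

    // The probe: on the runtime discussed in this thread, an error here
    // would suggest that the allocation already created a context.
    report("cudaSetDevice(0)", cudaSetDevice(0));

    // Release explicitly instead of relying on context teardown.
    report("cudaFreeHost", cudaFreeHost(pinned));
    return 0;
}
```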
Regarding freeing the memory: what happens if you release the context from the host thread that created the portable pinned memory? Can the other contexts still access it?
I haven’t checked that, but I don’t see how they can. All threads in the process (and thus all contexts) see the same host memory space. A segmentation fault is raised by the memory management unit, which works at the process level, not the thread level. This means that once the memory is released, no thread has access to it any more.
I’m not sure if there is any other way for the driver to manage that without causing memory leaks.
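A rough sketch of the safe side of this scenario, under some assumptions of mine: one thread allocates the portable pinned buffer, another thread uses it as a copy source, and the buffer is freed explicitly before anything is torn down. The buffer size and thread layout are arbitrary. It deliberately does not release the context while the buffer is still in use, since that is exactly the case whose behaviour is unclear. Also note that on recent runtimes the threads of a process may end up sharing one context per device, so this may not reproduce the separate per-thread contexts discussed here.

```cuda
// Build (assumed): nvcc -o portable_pinned portable_pinned.cu -Xcompiler -pthread
#include <cstdio>
#include <thread>
#include <cuda_runtime.h>

int main() {
    const size_t n = 1024;                 // arbitrary element count
    const size_t bytes = n * sizeof(float);
    float* pinned = nullptr;

    // Thread A: allocates the portable pinned buffer and fills it.
    std::thread producer([&] {
        if (cudaHostAlloc((void**)&pinned, bytes,
                          cudaHostAllocPortable) != cudaSuccess) {
            pinned = nullptr;
            return;
        }
        for (size_t i = 0; i < n; ++i) pinned[i] = static_cast<float>(i);
    });
    producer.join();  // after join, A's writes are visible to this thread
    if (pinned == nullptr) {
        std::printf("pinned allocation failed\n");
        return 1;
    }

    // Main thread: uses the buffer allocated by thread A as a copy source.
    float* dev = nullptr;
    cudaError_t err = cudaMalloc((void**)&dev, bytes);
    if (err == cudaSuccess)
        err = cudaMemcpy(dev, pinned, bytes, cudaMemcpyHostToDevice);
    std::printf("copy from other thread's buffer -> %s\n",
                cudaGetErrorString(err));

    // Free both allocations explicitly, before any context goes away.
    cudaFree(dev);  // no-op if dev is still nullptr
    cudaFreeHost(pinned);
    return err == cudaSuccess ? 0 : 1;
}
```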