How to share memory between different CUDA *runtime* contexts?

Greetings. I’m using OpenMP parallel sections for functional parallelism. The code fails when one CUDA context tries to access memory allocated in another context.
Until now I have simply used the OMP master directive so that only thread 0 executes the CUDA code. I’ve heard the driver API allows memory to be shared between
contexts, but how can I do this with the runtime API?
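For context, the workaround described above can be sketched roughly like this (illustrative only; the buffer size and the work done inside the block are placeholders):

```cuda
#include <omp.h>
#include <cuda_runtime.h>

int main(void) {
    #pragma omp parallel
    {
        // Only thread 0 enters this block, so a single CUDA context
        // owns every allocation and kernel launch.
        #pragma omp master
        {
            float *d_buf;
            cudaMalloc((void **)&d_buf, 1024 * sizeof(float));
            /* ... launch kernels, copy results back ... */
            cudaFree(d_buf);
        }
        #pragma omp barrier  // other threads wait for the GPU work
    }
    return 0;
}
```

This serializes all GPU work through one thread, which is exactly the limitation I’m trying to get rid of.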

cudaHostAlloc with the cudaHostAllocPortable flag.
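A minimal sketch of what that looks like with OpenMP, assuming two devices and leaving the actual per-thread work as a placeholder:

```cuda
#include <omp.h>
#include <cuda_runtime.h>

#define N 1024

int main(void) {
    float *h_shared;
    // Portable pinned host memory: treated as pinned by *every* CUDA
    // context in the process, not just the one that allocated it.
    cudaHostAlloc((void **)&h_shared, N * sizeof(float),
                  cudaHostAllocPortable);

    #pragma omp parallel num_threads(2)
    {
        int dev = omp_get_thread_num();
        cudaSetDevice(dev);  // each thread gets its own context
        // Each thread can cudaMemcpy to/from h_shared at pinned-memory
        // speed, regardless of which context allocated it.
        /* ... per-thread kernels and copies ... */
    }

    cudaFreeHost(h_shared);
    return 0;
}
```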

I want to share device RAM, not host RAM. Sorry for not being clear.

Device memory is always limited to a single context. Because the runtime API doesn’t support context migration at the moment, the only way to do this is to use the driver API with cuCtxPush/PopCurrent.
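A rough sketch of the driver-API approach: one thread creates the context and allocation, pops the context so it is no longer bound to that thread, and another thread pushes it to reuse the same device pointer. Thread creation and error checking are omitted; the function names `thread_a`/`thread_b` are placeholders.

```cuda
#include <cuda.h>

CUcontext  ctx;
CUdeviceptr d_buf;

// Runs on thread A: create the context, allocate, then detach.
void thread_a(void) {
    CUdevice dev;
    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);   // ctx becomes current on thread A
    cuMemAlloc(&d_buf, 1024);    // d_buf belongs to ctx
    cuCtxPopCurrent(NULL);       // float the context (no thread owns it)
}

// Runs later on thread B: attach the same context and use d_buf.
void thread_b(void) {
    cuCtxPushCurrent(ctx);       // make ctx current on this thread
    cuMemsetD8(d_buf, 0, 1024);  // d_buf is valid here again
    cuCtxPopCurrent(NULL);
}
```

Only one thread can have the context current at a time, so you still have to serialize access to it; push/pop just lets you hand it around.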

I see, just as a process’s memory is local to that process. I’ll try cuCtxPopCurrent/cuCtxAttach, assuming these functions can be mixed with the CUDA runtime.

They cannot; the runtime will become very confused and not work correctly. To be fixed in a future release…