Sharing CUDA memory between processes

As stated in the CUDA docs, the CUDA IPC functions are not supported on Tegra platforms, which of course includes Jetson.

Is there any other way to create a CUDA memory buffer that is shared between two separate processes?


Could you share more details about your use case?

Do you want to share the GPU buffer while both processes are still alive?
Or is it possible that one of the processes will terminate before the other accesses the buffer?


Yes, both processes are still alive. The use case is that one process is a “producer” and the second is a “consumer”: the first process fills the shared CUDA buffer and signals the other process that the buffer is ready, and then the second process reads it.

In other words, it is just a zero-copy problem: the two processes need to exchange large amounts of data, and copying it through conventional IPC mechanisms is rather expensive.
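
To make the handshake concrete, here is a minimal sketch of the "fill, signal, read in place" pattern in plain C. It covers only the signalling side and uses ordinary host shared memory as a stand-in for the CUDA buffer; it does not show how to make the region visible to the GPU on Jetson. The names `shared_buf` and `run_handoff` are made up for illustration, and `fork()` is used only to keep the demo in one file. Two unrelated processes would need a named object (for example one created with `shm_open`) instead of an anonymous mapping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout of the shared region: a length followed by the payload. */
typedef struct {
    size_t  len;
    uint8_t data[];
} shared_buf;

/* Runs one producer -> consumer hand-off over a shared mapping.
 * Returns 0 if the consumer saw exactly what the producer wrote, -1 otherwise. */
int run_handoff(size_t payload_bytes)
{
    /* Precondition: the total size must not overflow size_t. */
    assert(payload_bytes <= SIZE_MAX - sizeof(shared_buf));
    size_t total = sizeof(shared_buf) + payload_bytes;

    /* One mapping, visible to both processes after fork(); no copy is made. */
    shared_buf *buf = mmap(NULL, total, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* The pipe carries only a one-byte "buffer is ready" token. */
    int ready[2];
    if (pipe(ready) != 0) {
        munmap(buf, total);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        munmap(buf, total);
        return -1;
    }

    if (pid == 0) {
        /* Consumer: block until signalled, then read the data in place. */
        close(ready[1]);
        char token;
        if (read(ready[0], &token, 1) != 1)
            _exit(2);
        if (buf->len != payload_bytes)
            _exit(3);
        for (size_t i = 0; i < buf->len; i++)
            if (buf->data[i] != (uint8_t)(i & 0xff))
                _exit(4);
        _exit(0);
    }

    /* Producer: fill the buffer first, signal afterwards. */
    close(ready[0]);
    for (size_t i = 0; i < payload_bytes; i++)
        buf->data[i] = (uint8_t)(i & 0xff);
    buf->len = payload_bytes;

    char token = 1;
    int ok = (write(ready[1], &token, 1) == 1);
    close(ready[1]);

    /* Collect the consumer's verdict from its exit status. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        ok = 0;
    munmap(buf, total);

    return (ok && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling `run_handoff(4096)` performs one hand-off of a 4 KiB payload. Only the one-byte token travels through the pipe; the payload itself is never copied between the processes.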

I am completely ignorant of Tegra/Jetson.
Is this an SoC architecture?
For SoC architectures, Vulkan has a special memory category called ‘coherent or cohesive or something’ (I think).
This is memory that is directly addressable by both the CPU and the GPU.
If so, is Tegra/Jetson RAM shared via a local address/data bus?
Can applications mmap this type of memory and share it between apps as CUDA device memory, eliminating the need for the cudaIpc*() functions?
It may be possible to address this memory via a Vulkan binding but use it only in your kernels.

I would also like to know the right way to share NvBuffers between multiple processes without copying.

For example, one producer (e.g. a decoder process) and multiple other processes that process the decoded buffers, possibly running in different containers.


Jetson is a platform with an integrated GPU, which is different from a desktop GPU that connects to the host over PCIe.
For the memory question, you can find some information below:

You can try to use EGLStream or NvSci to communicate between CUDA contexts in two processes.


As I understand it, the EGL* functions require a running desktop environment, which is not our case.

Thank you for your suggestion. I’m also trying to use shared GPU memory across several processes on Jetson Xavier, and it seems that NvSci is the way to go. However, I couldn’t find the library. It seems that, at the moment, NvSci ships exclusively in the DRIVE package, which requires a membership for autonomous driving development. Am I mistaken?