Share Cuda memory between different system processes

I have some PyTorch tensors in one script and want to share them with other scripts.
How can I do that?

I have tensors in CUDA memory, and I think the best solution would be to get something like an address in CUDA memory that another process could use to access these tensors.

Any ideas?
PS: this looks like a task for Triton Inference Server.

Hi @kuskov.stanislav
This question might be better suited for the CUDA Programming and Performance forum branch, so I have moved it there.


The CUDA IPC mechanism allows sharing of device memory between processes. There are CUDA sample codes that demonstrate it. I won’t be able to give you a roadmap for whatever you are trying to do in PyTorch. However, a simple Google search for “pytorch cuda ipc” turned up articles like this which may be of interest.
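As a side note: if both processes are Python, PyTorch already wraps CUDA IPC for you. Sending a CUDA tensor through a `torch.multiprocessing` queue shares the underlying device allocation instead of copying it. A minimal sketch (function names are mine; this requires a CUDA-capable machine):

```python
import torch
import torch.multiprocessing as mp

def consumer(queue):
    # The tensor received here refers to the *same* device memory as in
    # the producer; PyTorch exchanged a CUDA IPC handle under the hood.
    t = queue.get()
    t += 1  # an in-place update, visible to the producer as well

def main():
    mp.set_start_method("spawn", force=True)  # required for CUDA tensors
    queue = mp.Queue()
    t = torch.zeros(4, device="cuda")
    p = mp.Process(target=consumer, args=(queue,))
    p.start()
    queue.put(t)   # shares the allocation, does not copy it
    p.join()
    torch.cuda.synchronize()
    print(t)       # reflects the consumer's in-place update

if __name__ == "__main__":
    main()
```

The sender must stay alive while the receiver uses the tensor, since the memory is owned by the producing process.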


Thank you for the link! It looks very similar to my task. I will try it and report the results.

@Robert_Crovella Hi Robert! I read the article: it explains how to create a handle, but not how to read it. I also tried to implement the transfer using numba, but it gave me a strange result: when I read the tensor through the handle with numba, I get an incorrect result from the network from time to time. After I added a delay after converting the tensor, it started working correctly.

A sample of my code (missing parentheses restored; the argument list of was lost in the original post):

import pickle
import numpy as np
import torch
from numba import cuda
from numba.cuda.cudadrv import devices
from numba.cuda.api import _prepare_shape_strides_dtype  # numba-internal helper; its location varies by version

backbone_features =, dim=0)  # the list of tensors was elided in the original

# Describe the tensor through the CUDA Array Interface
desc = backbone_features.__cuda_array_interface__
shape = desc["shape"]
strides = desc.get("strides")
dtype = np.dtype(desc["typestr"])
shape, strides, dtype = _prepare_shape_strides_dtype(shape, strides, dtype, order="C")
size = cuda.driver.memory_size_from_info(shape, strides, dtype.itemsize)

# Wrap the raw device pointer; desc["data"][0] is the pointer exposed by the interface
devptr = cuda.driver.get_devptr_for_active_ctx(desc["data"][0])
data = cuda.driver.MemoryPointer(
    cuda.current_context(), devptr, size=size, owner=backbone_features
)

# Export an IPC handle together with the array metadata
ipch = devices.get_context().get_ipc_handle(data)
desc = dict(shape=shape, strides=strides, dtype=dtype)
handle = pickle.dumps([ipch, desc])

and send this handle to another process

You might want to study a CUDA IPC sample code. Sorry, I won’t be able to debug your torch/python/CUDA/IPC code for you.

@Robert_Crovella I created another topic about it. This code is just an example using the numba library; maybe one of the numba developers can help me with it.