Hi!
I have some PyTorch tensors in one script and want to share them with other scripts. How can I do that?
The tensors live in CUDA memory (in backbone.py), and I think the best solution would be to get something like an address in CUDA memory that I could use (in head.py) to access these tensors.
Any ideas?
PS: This looks like a task for Triton Inference Server.
Hi @kuskov.stanislav
This question might be better suited for the CUDA Programming and Performance category of the NVIDIA Developer Forums, so I have moved it there.
The CUDA IPC mechanism allows sharing of device memory between processes, and there are CUDA sample codes that demonstrate it. I won't be able to give you a roadmap for whatever you are trying to do in PyTorch, but a simple Google search for "pytorch cuda ipc" turned up articles like this, which may be of interest.
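If both scripts are Python/PyTorch processes, the simplest route may be torch.multiprocessing, which uses CUDA IPC under the hood whenever a CUDA tensor crosses a process boundary. Here is a minimal sketch (the queue/worker structure and names are my own, not from the thread; it needs a CUDA device to actually run):

```python
import torch
import torch.multiprocessing as mp

def consumer(q):
    # The tensor arrives as a zero-copy view of the producer's CUDA
    # allocation: torch pickles an IPC handle plus metadata, not the bytes.
    t = q.get()
    print(t.shape, t.device)

if __name__ == "__main__" and torch.cuda.is_available():
    # CUDA tensors can only cross process boundaries with the "spawn" method.
    mp.set_start_method("spawn", force=True)
    q = mp.Queue()
    p = mp.Process(target=consumer, args=(q,))
    p.start()
    t = torch.randn(4, 8, device="cuda")
    torch.cuda.synchronize()  # make sure the producer's writes have landed
    q.put(t)
    p.join()
```

The consumer must hold its reference to the tensor for as long as it needs the data; the underlying memory stays owned by the producer process.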
Thank you for the link! It looks similar to my task. I will try it and write back with the results.
@Robert_Crovella Hi Robert! I read the article; it explains how to create a handle, but not how to read it. I also tried to implement the transfer using numba, but got a strange result: when I read the handle with numba, I occasionally get an incorrect result from the network. After I added a delay after converting the tensor, it started working correctly.
A sample of my code:
import pickle

import numpy as np
import torch
# numba internals; the exact module paths vary between numba versions
from numba.cuda.api import _prepare_shape_strides_dtype
from numba.cuda.cudadrv import devices, driver
from numba.cuda.cudadrv.devices import get_context as current_context

backbone_features = torch.cat(backbone_features_list, dim=0)

# Wait for every kernel that writes into backbone_features to finish before
# exporting the handle; otherwise the consumer can observe half-written data
# (which is likely why adding a delay made it work).
torch.cuda.synchronize()

desc = backbone_features.__cuda_array_interface__
shape = desc["shape"]
strides = desc.get("strides")
dtype = np.dtype(desc["typestr"])
shape, strides, dtype = _prepare_shape_strides_dtype(shape, strides, dtype, order="C")

size = driver.memory_size_from_info(shape, strides, dtype.itemsize)
devptr = driver.get_devptr_for_active_ctx(desc["data"][0])
data = driver.MemoryPointer(current_context(), devptr, size=size, owner=backbone_features)

ipch = devices.get_context().get_ipc_handle(data)
desc = dict(shape=shape, strides=strides, dtype=dtype)
handle = pickle.dumps([ipch, desc])
and I send this handle to another process.
You might want to study a CUDA IPC sample code. Sorry, I won't be able to debug your torch/python/CUDA/IPC code for you.
@Robert_Crovella I created another topic about it. This code is just an example with the numba library; maybe one of the numba developers can help me.