How can I share a TensorGPU object between multiple processes in Python?

I am using the NVIDIA DALI pipeline and get its output as a TensorGPU object.
I would like to share those TensorGPU objects (nvidia.dali.backend_impl.TensorGPU) with other processes,
so I am following these steps:

  • Get a device pointer to the TensorGPU object's data.
  • Share that pointer with the other processes via an IPC handle.
  • Use the handle in the other processes to retrieve the actual tensor data.

Please refer to the following code:


from nvidia.dali.pipeline import Pipeline
import nvidia.dali.fn as fn
import pycuda.autoinit  # creates a CUDA context on device 0
import pycuda.driver as drv

# batch_size and eii_list (a list of external-source callables) are defined elsewhere
pipe = Pipeline(batch_size=batch_size, num_threads=1, device_id=0)
with pipe:
    for i in eii_list:
        images, = fn.external_source(source=i, num_outputs=1, device="gpu", batch=False)
        enhance = fn.brightness_contrast(images, contrast=1)
        pipe.set_outputs(enhance)

pipe.build()
(batch_gpu,) = pipe.run()
data = batch_gpu.at(0)
# data is of type <class 'nvidia.dali.backend_impl.TensorGPU'>
h = drv.mem_get_ipc_handle(data.data_ptr())  # data_ptr() is the raw device pointer to the tensor's data

However, creating the handle for data_ptr fails with the error: cuIpcGetMemHandle failed: invalid argument

I would like to ask for help on two points:

  • How can I create an IPC handle for the TensorGPU object's memory so I can share it with other processes, and
  • How can I read the data back from that GPU pointer in the receiving process?
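For context, my current guess is that cuIpcGetMemHandle only accepts the base pointer of an allocation made directly with cudaMalloc, while DALI tensors may live inside an internal memory pool, which would explain the invalid argument. Here is an untested sketch of the workaround I am considering (assuming pycuda and an initialized CUDA context): copy the tensor into my own allocation first, then export that allocation.

```python
def export_tensor_copy(tensor_gpu, size_bytes):
    """Producer side (sketch): copy a DALI TensorGPU into a fresh
    cudaMalloc'd buffer and export an IPC handle for that buffer.
    Assumes pycuda is installed and a CUDA context is active."""
    import pycuda.driver as drv
    dst = drv.mem_alloc(size_bytes)                           # our own cudaMalloc allocation
    drv.memcpy_dtod(dst, tensor_gpu.data_ptr(), size_bytes)   # device-to-device copy
    # Return dst as well: the allocation must stay alive while it is shared.
    return drv.mem_get_ipc_handle(dst), dst

def open_tensor(handle):
    """Consumer side (sketch): map the exported memory in another process.
    The returned object is usable as a device pointer, e.g. as the source
    of drv.memcpy_dtoh into a NumPy array of the right shape and dtype."""
    import pycuda.driver as drv
    return drv.IPCMemoryHandle(handle)
```

I have not been able to verify this end to end yet; the function names export_tensor_copy and open_tensor are just my own wrappers, not DALI or pycuda API.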

Hi,
The links below might be useful for you:
https://docs.nvidia.com/deeplearning/tensorrt/best-practices/index.html#thread-safety
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#stream-priorities
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__STREAM.html
For multi-threading/streaming, we suggest using DeepStream or Triton.
For more details, we recommend raising the query on the DeepStream or Triton forum.

Thanks!