Triton GPU shared memory on Orin

Hi, I see the following in Triton Inference Server Support for Jetson and JetPack — NVIDIA Triton Inference Server:

  • Onnx Runtime backend does not support the OpenVino and TensorRT execution providers. The CUDA execution provider is in Beta.

  • The Python backend does not support GPU Tensors and Async BLS.

  • CUDA IPC (shared memory) is not supported. System shared memory however is supported.

  • GPU metrics, GCS storage, S3 storage and Azure storage are not supported.

CUDA IPC would be great to have. Any idea if it will be supported in the coming months? Alternatively, how can I enable GPU memory sharing between the client and server with the C++ Triton client libraries on NVIDIA Orin (JetPack 5 or 6)?
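
For context, here is a minimal sketch of the system-shared-memory flow that does work for me today, using POSIX shm plus the Triton C++ gRPC client. The region name, shm key, tensor name/shape, and server address are placeholders from my setup, and error checking is omitted for brevity:

```cpp
// Client-side system shared memory with the Triton C++ gRPC client.
// Placeholder names: "/triton_input_shm", "input_region", "INPUT0".
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstring>
#include <memory>
#include <string>

#include "grpc_client.h"  // from the tritonclient C++ package

namespace tc = triton::client;

int main() {
  const std::string shm_key = "/triton_input_shm";
  const size_t byte_size = 4096 * sizeof(float);  // 1x4096 FP32 tensor

  // Create and map a POSIX shared memory region on the client side.
  int shm_fd = shm_open(shm_key.c_str(), O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
  ftruncate(shm_fd, byte_size);
  void* shm_addr =
      mmap(nullptr, byte_size, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0);

  // Fill the region with input data. This host-side staging is exactly
  // what I'd like to avoid with CUDA IPC / GPU memory sharing.
  std::memset(shm_addr, 0, byte_size);

  // Register the region with the server and point an input tensor at it.
  std::unique_ptr<tc::InferenceServerGrpcClient> client;
  tc::InferenceServerGrpcClient::Create(&client, "localhost:8001");
  client->RegisterSystemSharedMemory("input_region", shm_key, byte_size);

  tc::InferInput* input = nullptr;
  tc::InferInput::Create(&input, "INPUT0", {1, 4096}, "FP32");
  input->SetSharedMemory("input_region", byte_size, 0 /* offset */);

  // ... build InferOptions and call client->Infer() as usual ...

  client->UnregisterSystemSharedMemory("input_region");
  munmap(shm_addr, byte_size);
  close(shm_fd);
  shm_unlink(shm_key.c_str());
  return 0;
}
```

Since Orin's integrated GPU shares physical DRAM with the CPU, I'm hoping there is some zero-copy path even without CUDA IPC, rather than staging inputs through a host-side copy as above.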