Is there a plan to support MIG on Orin AGX?

We are currently deploying multiple AI models on the AGX Orin using TensorRT, but we have observed a notable decline in inference performance after launching multiple servers. We believe that MIG (Multi-Instance GPU) could potentially be an excellent solution to address this issue. I was wondering if there are any plans to support MIG on the AGX Orin in the future? Alternatively, if there are any other recommendations or best practices for efficiently running multiple AI models on a single AGX Orin, we would be very grateful for your guidance. Thank you!

Hi,

MIG requires hardware support, which Orin does not have.

To work around the issue, you can try deploying the engines on different CUDA streams.
Work submitted to different streams shares the GPU resources in a time-sliced manner.
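For reference, here is a minimal, illustrative sketch of the idea (not taken from this thread): two dummy workloads stand in for two engines and are launched on separate non-blocking streams so the GPU can interleave them. The kernel, buffer sizes, and iteration counts are placeholders; with TensorRT you would instead pass each stream to that engine's execution context, e.g. `contextA->enqueueV3(streamA)`.

```cpp
// streams_demo.cu -- illustrative sketch only: two dummy "model" workloads
// sharing the GPU through separate CUDA streams. With TensorRT you would
// instead enqueue each engine on its own stream, e.g.
//   contextA->enqueueV3(streamA);  contextB->enqueueV3(streamB);
#include <cuda_runtime.h>
#include <cstdio>

// Stand-in for one model's inference work.
__global__ void dummyInference(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < 2000; ++k)   // burn some cycles
            v = v * 1.0001f + 0.5f;
        data[i] = v;
    }
}

int main()
{
    const int n = 1 << 20;
    float *bufA = nullptr, *bufB = nullptr;
    cudaMalloc(&bufA, n * sizeof(float));
    cudaMalloc(&bufB, n * sizeof(float));

    // One stream per model so the GPU can interleave their work
    // instead of serializing everything on the default stream.
    cudaStream_t streamA, streamB;
    cudaStreamCreateWithFlags(&streamA, cudaStreamNonBlocking);
    cudaStreamCreateWithFlags(&streamB, cudaStreamNonBlocking);

    dim3 block(256), grid((n + 255) / 256);
    for (int iter = 0; iter < 10; ++iter) {
        dummyInference<<<grid, block, 0, streamA>>>(bufA, n);  // "model A"
        dummyInference<<<grid, block, 0, streamB>>>(bufB, n);  // "model B"
    }

    cudaStreamSynchronize(streamA);
    cudaStreamSynchronize(streamB);
    printf("both streams finished: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaStreamDestroy(streamA);
    cudaStreamDestroy(streamB);
    cudaFree(bufA);
    cudaFree(bufB);
    return 0;
}
```

Compile with nvcc and check the timeline in Nsight Systems to confirm that the two streams actually overlap on your workload.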

Thanks.

Thank you for the information regarding the current lack of MIG support on the AGX Orin. We have some questions about the approach of assigning engines to different CUDA streams. If different tasks share the GPU resources in a time-sliced manner, each task's performance will still degrade under contention. Are there any strategies or mechanisms for allocating and scheduling compute resources (e.g., SMs) so that the performance drop for each task is minimized?

Hi,

If the tasks all run in the same process, you can check whether green contexts meet your requirements.
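In case it helps others: green contexts are a CUDA driver API feature (CUDA 12.4 and later) that splits the device's SMs into groups and lets you create a stream bound to each group, so each engine gets its own SM partition instead of relying purely on time-slicing. Below is a rough sketch under the assumption that your JetPack release ships a CUDA version with green-context support on Orin; the group count and the 4-SM minimum are placeholders to tune for your models.

```cpp
// green_ctx_demo.cpp -- rough sketch of SM partitioning with CUDA green
// contexts (driver API, CUDA 12.4+). The 2-group / 4-SM split below is a
// placeholder, not a tuned value.
#include <cuda.h>
#include <cstdio>

#define CHECK(call) do { CUresult r = (call); if (r != CUDA_SUCCESS) {      \
    const char* s; cuGetErrorString(r, &s);                                  \
    printf("%s failed: %s\n", #call, s); return 1; } } while (0)

int main()
{
    CHECK(cuInit(0));
    CUdevice dev;
    CHECK(cuDeviceGet(&dev, 0));

    // Query the device's total SM resource.
    CUdevResource smAll;
    CHECK(cuDeviceGetDevResource(dev, &smAll, CU_DEV_RESOURCE_TYPE_SM));
    printf("total SMs: %u\n", smAll.sm.smCount);

    // Split into two groups of at least 4 SMs each (rounded up to hardware
    // granularity); 'remaining' receives any leftover SMs.
    CUdevResource groups[2], remaining;
    unsigned int nbGroups = 2;
    CHECK(cuDevSmResourceSplitByCount(groups, &nbGroups, &smAll,
                                      &remaining, 0, /*minCount=*/4));

    // One green context + one stream per SM group; each TensorRT engine
    // would then enqueue on "its" stream (e.g. context->enqueueV3(stream)).
    for (unsigned int i = 0; i < nbGroups; ++i) {
        CUdevResourceDesc desc;
        CHECK(cuDevResourceGenerateDesc(&desc, &groups[i], 1));

        CUgreenCtx gctx;
        CHECK(cuGreenCtxCreate(&gctx, desc, dev, CU_GREEN_CTX_DEFAULT_STREAM));

        CUstream stream;
        CHECK(cuGreenCtxStreamCreate(&stream, gctx, CU_STREAM_NON_BLOCKING, 0));

        printf("group %u: %u SMs, stream %p\n",
               i, groups[i].sm.smCount, (void*)stream);
        // ... hand 'stream' to the corresponding model's inference loop ...
    }
    return 0;
}
```

All models still have to live in the same process for this to apply, as noted above, and SMs left in the leftover partition go unused unless you create another group for them.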

Thanks.
