TensorRT Triton server for multiple model instances

Dear Team,

I have a question about running multiple models in parallel with TensorRT optimizations.

Could you please answer the questions below:

  1. Is it possible to use Triton Inference Server with TensorRT?
  2. Where can we find the Triton C++ APIs that we would use from a TensorRT sample application?
  3. Do you have a reference TensorRT sample application in which multiple models are tested with Triton server?
  4. Is it feasible to run a TensorRT sample application with the Triton APIs on
    NVIDIA evaluation kits such as the NVIDIA AGX Orin?

Thanks and Regards,
Vyom Mishra

Dear @vyom.mishra ,
Triton uses the TensorRT APIs internally to prepare models and perform inference. We have not tested Triton on DRIVE AGX Orin, but you can try building it from source on DRIVE AGX Orin, since ARM is supported.
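For the multiple-instance part of the question, Triton controls this through the model's `config.pbtxt` in the model repository. Below is a minimal sketch for a TensorRT engine (`model.plan`); the model name, batch size, instance count, and GPU index are illustrative assumptions, not values from this thread:

```
name: "my_trt_model"          # assumed name; must match the model's directory
platform: "tensorrt_plan"     # tells Triton to load a serialized TensorRT engine
max_batch_size: 8             # illustrative; must not exceed the engine's max batch

# Run two copies of this model concurrently on GPU 0, so Triton can
# serve overlapping requests with separate execution contexts.
instance_group [
  {
    count: 2
    kind: KIND_GPU
    gpus: [ 0 ]
  }
]
```

The expected repository layout is `<model-repository>/my_trt_model/config.pbtxt` plus `<model-repository>/my_trt_model/1/model.plan`; to serve several models in parallel, place each one in its own subdirectory of the same repository and point `tritonserver --model-repository=...` at it.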
