I have a query regarding running multiple models in parallel with TensorRT optimizations.
Could you please reply to my questions below:
- Is it possible to use the Triton server with TensorRT?
- Where can we find the Triton C++ APIs that need to be used with a TensorRT sample application? (A rough sketch of what I have in mind is shown after this list.)
- Do you have any reference TensorRT sample application in which multiple models are tested with the Triton server?
- What is the feasibility of running such a TensorRT sample application with the Triton APIs on NVIDIA evaluation kits such as the AGX Orin?
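To make the use case concrete, here is a rough sketch of the kind of multi-model C++ client I am hoping to write. This is only my assumption of how the Triton C++ gRPC client library (grpc_client.h from the triton-inference-server/client repo) would be used; the server URL, the model names "model_a"/"model_b", and the tensor names, shapes, and datatypes are placeholders for my own TensorRT engines.

```cpp
// Sketch only: assumes the Triton C++ gRPC client library; the models are
// TensorRT "plan" models already loaded in the server's model repository.
#include <iostream>
#include <memory>
#include <string>
#include <vector>

#include "grpc_client.h"

namespace tc = triton::client;

int main()
{
  // Connect to a running Triton server (default gRPC port assumed).
  std::unique_ptr<tc::InferenceServerGrpcClient> client;
  tc::Error err = tc::InferenceServerGrpcClient::Create(&client, "localhost:8001");
  if (!err.IsOk()) {
    std::cerr << "failed to create client: " << err << std::endl;
    return 1;
  }

  // Dummy input data; name, shape, and datatype must match each model's config.pbtxt.
  std::vector<float> data(1 * 3 * 224 * 224, 0.5f);

  // Send the same request to two different models to exercise parallel serving.
  std::vector<std::string> models = {"model_a", "model_b"};
  for (const auto& model : models) {
    tc::InferInput* input = nullptr;
    tc::InferInput::Create(&input, "input", {1, 3, 224, 224}, "FP32");
    input->AppendRaw(
        reinterpret_cast<const uint8_t*>(data.data()), data.size() * sizeof(float));

    tc::InferRequestedOutput* output = nullptr;
    tc::InferRequestedOutput::Create(&output, "output");

    tc::InferOptions options(model);
    tc::InferResult* result = nullptr;
    err = client->Infer(&result, options, {input}, {output});
    if (!err.IsOk()) {
      std::cerr << "inference on " << model << " failed: " << err << std::endl;
    } else {
      std::cout << "inference on " << model << " succeeded" << std::endl;
    }

    delete result;
    delete output;
    delete input;
  }
  return 0;
}
```

Is this roughly the intended way to drive multiple TensorRT models through Triton from C++, or is there a recommended sample I should follow instead?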
Thanks and Regards,