I have a query regarding running multiple models in parallel with TensorRT optimizations.
Could you please reply to my questions below:
- Is it possible to use the Triton server with TensorRT?
- Where can we find the Triton C++ APIs that need to be used with a TensorRT sample application? (A rough sketch of what I have in mind is shown after this list.)
- Do you have any reference TensorRT sample application in which multiple models are tested with the Triton server?
- What is the feasibility of running such a TensorRT sample application with the Triton APIs on NVIDIA evaluation kits such as the AGX Orin?
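To make the use case concrete, here is a rough sketch of the kind of multi-model C++ client I am hoping to write. This is only my assumption of how the Triton C++ gRPC client library (grpc_client.h from the triton-inference-server/client repo) would be used; the server URL, the model names "model_a"/"model_b", and the tensor names, shapes, and datatypes are placeholders for my own TensorRT engines.

```cpp
// Sketch only: assumes the Triton C++ gRPC client library; the models are
// TensorRT "plan" models already loaded in the server's model repository.
#include <iostream>
#include <memory>
#include <string>
#include <vector>

#include "grpc_client.h"

namespace tc = triton::client;

int main()
{
  // Connect to a running Triton server (default gRPC port assumed).
  std::unique_ptr<tc::InferenceServerGrpcClient> client;
  tc::Error err = tc::InferenceServerGrpcClient::Create(&client, "localhost:8001");
  if (!err.IsOk()) {
    std::cerr << "failed to create client: " << err << std::endl;
    return 1;
  }

  // Dummy input data; name, shape, and datatype must match each model's config.pbtxt.
  std::vector<float> data(1 * 3 * 224 * 224, 0.5f);

  // Send the same request to two different models to exercise parallel serving.
  std::vector<std::string> models = {"model_a", "model_b"};
  for (const auto& model : models) {
    tc::InferInput* input = nullptr;
    tc::InferInput::Create(&input, "input", {1, 3, 224, 224}, "FP32");
    input->AppendRaw(
        reinterpret_cast<const uint8_t*>(data.data()), data.size() * sizeof(float));

    tc::InferRequestedOutput* output = nullptr;
    tc::InferRequestedOutput::Create(&output, "output");

    tc::InferOptions options(model);
    tc::InferResult* result = nullptr;
    err = client->Infer(&result, options, {input}, {output});
    if (!err.IsOk()) {
      std::cerr << "inference on " << model << " failed: " << err << std::endl;
    } else {
      std::cout << "inference on " << model << " succeeded" << std::endl;
    }

    delete result;
    delete output;
    delete input;
  }
  return 0;
}
```

Is this roughly the intended way to drive multiple TensorRT models through Triton from C++, or is there a recommended sample I should follow instead?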
Thanks and Regards,