How to quickly build a TensorRT engine for the exact version of Triton Inference Server

Description

Hi everyone,
I have a question about TensorRT (i.e., TRT) engines on Triton Inference Server (i.e., TRTIS).
We all know that we can use Docker to deploy Triton Inference Server very easily and serve our models (TRT, ONNX, etc.) locally. However, a TRT engine has to be generated in the same environment as the Triton Inference Server that will run it, so I have to create another container just to generate a TRT engine for TRTIS.

On NGC, each tag has two images (one for the server and one for the client). I checked the client image, and it does not include the TensorRT packages (libraries), so I cannot build a TensorRT engine with it directly.

Is there a faster way to generate a TensorRT engine that matches the TRTIS environment?
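
For context, the build step I mean is roughly the following Python sketch using the TensorRT Python API, run inside a container whose TensorRT version matches the server. This is only a minimal sketch: the exact calls (e.g., builder.build_engine, config.max_workspace_size) depend on the TensorRT version shipped in that container, and model.onnx / model.plan are placeholder file names.

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

def build_engine(onnx_path, engine_path, workspace_gb=1):
    """Parse an ONNX model and serialize a TensorRT engine to disk."""
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(EXPLICIT_BATCH)
    parser = trt.OnnxParser(network, TRT_LOGGER)

    # Parse the ONNX model; report parser errors if it fails.
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("ONNX parsing failed")

    config = builder.create_builder_config()
    config.max_workspace_size = workspace_gb << 30  # workspace in bytes

    # build_engine is the TRT 7.x-style API; newer releases use a
    # serialized-network builder instead.
    engine = builder.build_engine(network, config)
    with open(engine_path, "wb") as f:
        f.write(engine.serialize())

if __name__ == "__main__":
    build_engine("model.onnx", "model.plan")  # placeholder paths

The resulting model.plan is what I then copy into the Triton model repository, which is why the TensorRT version used to build it has to match the server.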

Thank you so much!!!

Best regards,
Chieh

Hi @Chieh,
This issue looks more related to TRTIS, so please raise your query on the link below:
https://forums.developer.nvidia.com/c/ai-deep-learning/libraries-sdk/inference-server/97

Thanks!

No problem. Thanks for the reminder!