How to quickly build a TensorRT engine for the exact version of Triton Inference Server


Hi everyone,
I have a question about building TensorRT (TRT) engines for Triton Inference Server (TRTIS).
We all know that we can use Docker to deploy Triton Inference Server locally very easily and serve our models (TRT, ONNX, etc.). However, a TRT engine must be generated in the same environment as the Triton Inference Server that will run it, so I currently have to create another container just to generate a TRT engine for TRTIS.

On NGC, we can see that there are two Triton images (one for the server, and the other for the client). I checked the client image, but it doesn't include the TensorRT packages (libraries), so I cannot build a TensorRT engine with it directly.

I wonder whether there is a faster way to generate a TensorRT engine that matches the TRTIS environment.

Thank you so much!!!

Best regards,


Build the engine directly in the TensorRT images from NGC.
BTW, you have to use the same release tag as your Triton image.
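As a minimal sketch of the workflow above: NGC publishes a TensorRT container alongside each monthly Triton release, and containers with the same release tag ship the same TensorRT version. The tag `23.10` and the file names `model.onnx`/`model.plan` below are placeholder assumptions; substitute the tag matching your `tritonserver` image and your own model paths.

```shell
# Sketch only -- requires Docker, NVIDIA Container Toolkit, and a GPU.
# The release tag (23.10 here) is an example; use the same tag as your
# tritonserver image so the TensorRT versions match.

# 1. Pull the TensorRT container with the same tag as your Triton server.
docker pull nvcr.io/nvidia/tensorrt:23.10-py3

# 2. Build the engine with trtexec inside that container
#    (model.onnx is a placeholder for your model file).
docker run --rm --gpus all -v "$PWD":/workspace \
  nvcr.io/nvidia/tensorrt:23.10-py3 \
  trtexec --onnx=/workspace/model.onnx --saveEngine=/workspace/model.plan

# 3. Place model.plan in your model repository and serve it with
#    nvcr.io/nvidia/tritonserver:23.10-py3.
```

You can confirm that the TensorRT versions of the two containers actually match by checking the release notes / support matrix for that month's NGC release before building.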