I have a question about TensorRT (TRT) engines on Triton Inference Server (TRTIS).
We all know that we can easily deploy Triton Inference Server with Docker and serve our models (TRT, ONNX, etc.) locally. However, a TRT engine must be generated in the same environment (TensorRT version, GPU) as the Triton server that will run it, so I have to create another container just to build a TRT engine for TRTIS.
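Concretely, my extra step looks roughly like this (container tags, model names, and paths below are just illustrative; I pick a TensorRT container whose yy.mm release tag matches the Triton release):

```shell
# Build the engine inside a TensorRT container matching the Triton release
# (23.05 here is only an example tag):
docker run --gpus all --rm -v $(pwd):/workspace \
  nvcr.io/nvidia/tensorrt:23.05-py3 \
  trtexec --onnx=/workspace/model.onnx \
          --saveEngine=/workspace/model.plan

# Then copy model.plan into the Triton model repository, e.g.:
#   model_repository/my_model/1/model.plan
# and start the matching Triton server:
docker run --gpus all --rm -p 8000:8000 \
  -v $(pwd)/model_repository:/models \
  nvcr.io/nvidia/tritonserver:23.05-py3 \
  tritonserver --model-repository=/models
```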
On NGC, there are two images per release (one for the server, one for the client). I checked the client image, but it doesn’t include the TensorRT packages (libraries), so I cannot build a TensorRT engine in it directly.
I wonder: is there a faster way to generate a TensorRT engine that matches the TRTIS environment?
Thank you so much!!!