Using TLT models with Triton Inference Server


I have trained an SSD ResNet-18 model using TLT 2.0 (Transfer Learning Toolkit).

docker pull

I have 10 checkpoints for the same model, e.g.


What I want to do is load all of these models on TRTIS (Triton Inference Server) at once and perform inference.


  1. What TRTIS version/image tag should I use to support models trained with TLT 2.0?
  2. Can I load a .tlt or .etlt model directly on TRTIS?
  3. According to the documentation we need:

Is it necessary to have the GitHub repository / client container?

Is there any basic script that can help to communicate directly with the inference server via HTTP?

How can I quickly load/infer/unload TLT models on TRTIS using a single container and the HTTP API?

Which versions of TRTIS are compatible with TLT v2.0_dp_py2 models?


For question 1, please make sure that TLT and TRTIS use the same TensorRT version.
TLT v2.0_dp_py2 uses TensorRT 7.0+.
So please select a TRTIS release built with the same TRT version.
Refer to and
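
If you want to double-check which TensorRT version your TLT container actually ships, a quick check from inside the container is below (this assumes the TensorRT Python bindings are installed in the TLT image). Compare the printed version against the TensorRT version listed for each TRTIS/Triton release in the support matrix.

```python
# Minimal check of the TensorRT version inside the TLT container (assumes the
# TensorRT Python bindings are available there). Match this against the TRT
# version listed for the TRTIS/Triton image you plan to use.
import tensorrt

print(tensorrt.__version__)  # expected to start with 7.0 for TLT v2.0_dp_py2
```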

Thanks for the support-matrix link.
Can you help with how to load/unload/infer TLT models using HTTP API requests?
I am trying to follow the documentation but haven't succeeded yet.

Are you trying to follow the Triton documentation?


Sorry for the late reply.
For loading a TRT engine, see
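
In case it helps, below is a rough sketch of the model-repository layout Triton expects for a TensorRT engine, written as a small Python script. It assumes you have already exported the .etlt and generated a TensorRT engine with tlt-converter; the model name, tensor names, and dims ("ssd_resnet18", "Input", "NMS", the 3x300x300 input, and the output shape) are placeholders that you would replace with the actual bindings of your exported SSD model.

```python
# Sketch only: create a Triton model repository entry for a TensorRT engine that was
# generated from the TLT .etlt with tlt-converter. Model/tensor names and dims are
# placeholders -- use the actual bindings of your exported SSD ResNet-18 engine.
import os
import shutil

repo = "model_repository"
model_name = "ssd_resnet18"   # placeholder
version = "1"

model_dir = os.path.join(repo, model_name)
os.makedirs(os.path.join(model_dir, version), exist_ok=True)

# Triton serves the TensorRT engine, not the .tlt/.etlt checkpoint itself; for the
# tensorrt_plan platform the engine file is expected to be named model.plan.
shutil.copy("ssd_resnet18.engine", os.path.join(model_dir, version, "model.plan"))

config = '''name: "ssd_resnet18"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  {
    name: "Input"            # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 3, 300, 300 ]    # placeholder dims
  }
]
output [
  {
    name: "NMS"              # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 1, 200, 7 ]      # placeholder dims
  }
]
'''
with open(os.path.join(model_dir, "config.pbtxt"), "w") as f:
    f.write(config)
```

Each model gets its own directory in the repository, so the engines built from your 10 checkpoints can sit side by side under model_repository/ and be served together.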

More references:
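
And since you asked specifically about driving the server over HTTP, here is a rough sketch using the tritonclient Python package (pip install tritonclient[http]). It assumes a Triton release that supports the v2 (KServe) HTTP protocol and that the server was started with --model-control-mode=explicit so models can be loaded and unloaded on request; older TRTIS 1.x releases use a different HTTP API and client library, so this would need to be adapted there. The model, input, and output names are placeholders.

```python
# Sketch only: load / run inference / unload a model over Triton's v2 HTTP API.
# Assumes tritonserver was started with --model-control-mode=explicit and that the
# model repository contains "ssd_resnet18" (placeholder name); tensor names, shapes
# and dtypes below are placeholders for your exported SSD engine's real bindings.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

model = "ssd_resnet18"
client.load_model(model)              # POST /v2/repository/models/ssd_resnet18/load
assert client.is_model_ready(model)

# Dummy input just to exercise the endpoint.
data = np.zeros((1, 3, 300, 300), dtype=np.float32)
inp = httpclient.InferInput("Input", list(data.shape), "FP32")
inp.set_data_from_numpy(data)
out = httpclient.InferRequestedOutput("NMS")

result = client.infer(model, inputs=[inp], outputs=[out])
print(result.as_numpy("NMS").shape)

client.unload_model(model)            # POST /v2/repository/models/ssd_resnet18/unload
```

The same load/infer/unload loop can be repeated over the engines built from your 10 checkpoints from a single client script, so as far as I can tell a separate client container or the GitHub examples are not strictly required once the Python client package is installed.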

Hi Nitin, did you find a solution to using TLT Models with Triton Inference Server? I’m having the same issue as you!