Using TLT models with Triton Inference Server

Hi,

I have trained an SSD ResNet-18 model using TLT 2.0 (Transfer Learning Toolkit)

docker pull nvcr.io/nvidia/tlt-streamanalytics:v2.0_dp_py2

I have 10 checkpoints for the same model e.g.

ssd_resnet_epoch1.tlt
ssd_resnet_epoch2.tlt
ssd_resnet_epoch3.tlt
.
.
ssd_resnet_epoch10.tlt

What I want to do is load all of these models on TRTIS (Triton Inference Server) at once and perform inference.

Questions

  1. What TRTIS version/image tag should I use to support models trained with TLT 2.0?
  2. Can I load a .tlt or .etlt model directly on TRTIS?
  3. According to the documentation we need:

Is it necessary to have the GitHub repository / client container?

Is there a basic script that can communicate directly with the inference server over HTTP?

How can I quickly load/infer/unload TLT models on TRTIS using a single container and the HTTP API?

Which TRTIS versions are compatible with TLT v2.0_dp_py2 models?
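For the load/infer/unload part, here is a minimal sketch of how the requests can be built against Triton's KFServing v2 HTTP API (available in newer Triton releases; older TRTIS tags expose a different v1 API, so check the release notes for your image). The server must be started with explicit model control (`--model-control-mode=explicit`) for the load/unload endpoints to work. The model name, input tensor name, and shape below are assumptions for illustration, not taken from the actual SSD model.

```python
import json

# Assumption: Triton's HTTP endpoint on the default port.
TRITON_URL = "http://localhost:8000"

def load_url(model_name):
    # Explicit model-control load endpoint (v2 repository API).
    return f"{TRITON_URL}/v2/repository/models/{model_name}/load"

def unload_url(model_name):
    # Counterpart unload endpoint.
    return f"{TRITON_URL}/v2/repository/models/{model_name}/unload"

def infer_request(model_name, input_name, shape, datatype, data):
    """Build the URL and JSON body for POST /v2/models/<name>/infer."""
    url = f"{TRITON_URL}/v2/models/{model_name}/infer"
    body = {
        "inputs": [
            {"name": input_name, "shape": shape,
             "datatype": datatype, "data": data}
        ]
    }
    return url, json.dumps(body)

# Hypothetical example: model and tensor names are placeholders.
url, payload = infer_request("ssd_resnet_epoch1", "Input",
                             [1, 3, 300, 300], "FP32",
                             [0.0] * (3 * 300 * 300))
```

To actually issue the calls, POST an empty body to `load_url(...)`/`unload_url(...)` and POST `payload` to the infer URL (e.g. with `requests.post(url, data=payload)`).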

Thanks

For question 1, please make sure that TLT and TRTIS use the same TensorRT version.
TLT v2.0_dp_py2 uses TensorRT 7.0+, so please select a TRTIS release built against the same TRT version.
Refer to https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html#framework-matrix-2020 and https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/rel_20-03-1.html#rel_20-03-1

Thanks for the support-matrix link.
Can you help with how to load/unload/infer TLT models using HTTP API requests?
I am trying to follow the documentation but haven't succeeded yet.

Are you trying to follow the Triton quickstart document https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/quickstart.html ?
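Following that quickstart, the server reads models from a model repository directory; Triton serves TensorRT engines rather than .tlt checkpoints, so each checkpoint would first need to be converted to an engine (e.g. with tlt-converter) and placed in a layout like the sketch below. Directory and file names here are illustrative.

```
model_repository/
└── ssd_resnet_epoch1/        # one directory per model to be served
    ├── config.pbtxt          # model name, platform, input/output tensor specs
    └── 1/                    # numeric version subdirectory
        └── model.plan        # TensorRT engine (e.g. produced by tlt-converter)
```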

yes