Using TLT models with Triton Inference Server

NitinRai · June 29, 2020, 1:45pm

Hi,

I have trained SSD ResNet-18 model using TLT 2.0 (transfer learning toolkit)

docker pull nvcr.io/nvidia/tlt-streamanalytics:v2.0_dp_py2

I have 10 checkpoints for the same model e.g.

ssd_resnet_epoch1.tlt
ssd_resnet_epoch2.tlt
ssd_resnet_epoch3.tlt
.
.
ssd_resnet_epoch10.tlt

What I want to do is load all these models on TRTIS (triton inference server) at once and perform inference.

Questions

What TRTIS version/image tag should I use to support the models trained on TLT 2.0
Can I load .tlt or .etlt model directly on TRTIS
According the documentation we need:

github repository GitHub - triton-inference-server/server: The Triton Inference Server provides an optimized cloud and edge inferencing solution.

TRTIS container
docker pull nvcr.io/nvidia/tritonserver:20.03.1-py3

TRTIS client container
docker pull nvcr.io/nvidia/tritonserver:20.03.1-py3-clientsdk

Is it necessary to have github repository / client container ?

Is there any basic script which can help to directly communicate with the inference server via HTTP ?

How to quickly load/infer/unload TLT models on TRTIS using single container and HTTP API ?

What versions of TRTIS are compatible with TLT v2.0_dp_py2 models ?

Thanks

Morganh · June 29, 2020, 4:40pm

For question 1, please make sure that TLT and TRTIS use same version TRT.
TLT v2.0_dp_py2: TensorRT 7.0+
So please select TRTIS with the same TRT version.
Refer to Frameworks Support Matrix :: NVIDIA Deep Learning Frameworks Documentation and Release Notes :: NVIDIA Deep Learning Triton Inference Server Documentation

NitinRai · July 1, 2020, 5:57am

Thanks, for the support-matrix link.
Can you help with how to load/unload/infer using TLT models, using HTTP API requests ?
I am trying to follow the documentation but didn’t succeed yet.

Morganh · July 1, 2020, 6:33am

Are you trying to follow triton document Documentation - Latest Release :: NVIDIA Deep Learning Triton Inference Server Documentation ?

NitinRai · July 1, 2020, 6:36am

yes

Morganh · July 13, 2020, 9:11am

Sorry for late reply.
For loading trt engine, see Documentation - Latest Release :: NVIDIA Deep Learning Triton Inference Server Documentation

Morganh · July 13, 2020, 9:21am

More reference: https://developer.nvidia.com/blog/nvidia-serves-deep-learning-inference/

sebastian.gonzalez · January 28, 2021, 3:20pm

Hi Nitin, did you find a solution to using TLT Models with Triton Inference Server? I’m having the same issue as you!

Topic		Replies	Views
Using TensorRT Inference Server with TLT models TAO Toolkit	6	1275	October 12, 2021
Tensorrt engine file generated by TLT is not acceptable to inference server TensorRT	3	624	August 16, 2020
Transfer Learning Toolkit for Jetson Nano TAO Toolkit	14	1313	October 12, 2021
Using TLT trained models in TF (or Keras) to run inference TAO Toolkit	9	1428	October 12, 2021
Model exported from tlt2 fails to load on tritonis TAO Toolkit tensorrt	6	725	October 12, 2021
TF-TRT5: How to run tensorflow-tensorrt inferences with multiple GPUs TensorRT	10	3577	September 3, 2019
TensorRT Inference Server rejecting valid trt.engine file generated by TLT Triton Inference Server - archived	0	689	August 16, 2020
GPU difference inference TAO Toolkit	11	985	February 18, 2022
Upgrading TLT exported models to work with TensorRT 7.1.2 TAO Toolkit	23	1750	October 12, 2021
How can I perform inference using a TLT output detectnet_v2 .trt model with in custom tensorflow and python TAO Toolkit tensorflow	4	772	October 12, 2021

Using TLT models with Triton Inference Server

Related topics