Using TensorRT Inference Server with TLT models

mbufi · March 13, 2020, 5:38pm

Hello,

I am posting this here because I am not sure if this is a TLT question or inference server question…

To test out TensorRT Inference Server, I trained a quick Resnet50 Classification model with TLT.

Everything worked out great, so I exported it and then converted it into a trt model on my x86 machine with the exporter from the docker container.

The converter creates a .trt model.
TensorRT server expects a .plan.

How can I get the server to load a .trt or get TLT to create a .plan?

Thank you!

mbufi · March 13, 2020, 5:57pm

For more information,

TLT Docker = tlt-streamanalytics:v1.0_py2

TRTIS Docker = tensorrtserver:19.10-py3

For TRT models the easiest way to get the correct model configuration for TRTIS is to not provide a config.pbtxt and instead use --strict-model-config=false. See https://docs.nvidia.com/deeplearning/sdk/tensorrt-inference-server-guide/docs/model_configuration.html#generated-model-configuration

If I do the following command:

sudo docker run --gpus all --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -v/tensorrt-inference-server/docs/examples/model_repo_test:/tmp/models nvcr.io/nvidia/tensorrtserver:19.10-py3 /opt/tensorrtserver/bin/trtserver --model-store=/tmp/models --strict-model-config=false

I get the error:

~/ai/tensorrt-inference-server$ sudo docker run --gpus all --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -v/home/tensorrt-inference-server/docs/examples/model_repo_test:/tmp/models nvcr.io/nvidia/tensorrtserver:19.10-py3 /opt/tensorrtserver/bin/trtserver --model-store=/tmp/models --strict-model-config=false 

===============================
== TensorRT Inference Server ==
===============================

NVIDIA Release 19.10 (build 8266503)

Copyright (c) 2018-2019, NVIDIA CORPORATION.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

I0313 17:56:15.270401 1 metrics.cc:160] found 1 GPUs supporting NVML metrics
I0313 17:56:15.276123 1 metrics.cc:169]   GPU 0: GeForce RTX 2080
I0313 17:56:15.276324 1 server.cc:110] Initializing TensorRT Inference Server
E0313 17:56:17.103776 1 logging.cc:43] ../rtSafe/coreReadArchive.cpp (31) - Serialization Error in verifyHeader: 0 (Magic tag does not match)
E0313 17:56:17.103841 1 logging.cc:43] INVALID_STATE: std::exception
E0313 17:56:17.103847 1 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
E0313 17:56:17.110454 1 model_repository_manager.cc:1453] must specify platform for model 'resnet50_steel'
E0313 17:56:17.110505 1 main.cc:1099] error: creating server: INTERNAL - failed to load all models

Do you know why I can't get this config file generated properly?

Morganh · March 16, 2020, 4:43am

Hi Martin,
tlt-export generates an etlt model.
tlt-converter generates a TRT engine. That is exactly the TRT plan file.
From your comment, “exported it and then converted it into a trt model on my x86 machine”, did you convert it via tlt-converter?

For more info, please see TRT engine deployment.
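
Since the engine written by tlt-converter is already a plan file, the remaining step is to place it where the server looks for it. By default TRTIS expects a TensorRT model at `<model-name>/<version>/model.plan` inside the model repository. A minimal sketch of that step follows; the function name `install_engine`, the version number `1`, and the file names in the example call are illustrative assumptions, not something taken from this thread.

```shell
#!/bin/sh
# Sketch: place a tlt-converter engine into a TRTIS model repository.
# The helper name, the version directory "1", and all paths are assumptions.

# install_engine ENGINE_FILE MODEL_REPO MODEL_NAME
# Copies ENGINE_FILE to MODEL_REPO/MODEL_NAME/1/model.plan.
install_engine() {
    engine_file=$1
    model_repo=$2
    model_name=$3

    # Refuse to continue if the converter output is missing.
    if [ ! -f "$engine_file" ]; then
        echo "engine file not found: $engine_file" >&2
        return 1
    fi

    # Default layout for a TensorRT model: <repo>/<model-name>/<version>/model.plan
    mkdir -p "$model_repo/$model_name/1" || return 1
    cp "$engine_file" "$model_repo/$model_name/1/model.plan"
}

# Example call (hypothetical file name for the converter output):
# install_engine resnet50_steel.trt ./model_repo_test resnet50_steel
```

The file extension itself carries no meaning for the engine contents, so copying under the new name is enough; no conversion from .trt to .plan is involved.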

mbufi · March 16, 2020, 11:44am

Hi Morganh,

Yes, I used tlt-converter on my desktop computer to create the TensorRT .trt model.

It still does not work with TRTIS.

andrliu · March 17, 2020, 3:00pm

Hi Martin,

You have to make sure that TLT and TRTIS use the same version of TRT.
In your setup:
TLT Docker = tlt-streamanalytics:v1.0_py2: TRT 5.1.5
TRTIS Docker = tensorrtserver:19.10-py3: TRT 6.0.1
In that case, TRTIS will not be able to deserialize the TRT engine, because a serialized engine can only be loaded by the TensorRT version that built it.

tensorrtserver:19.08-py3 should be able to work with your TLT TRT engine.
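
As a rough sketch of the suggested change, the launch command from earlier in the thread can be parameterised on the image tag so that only the tag differs between runs. The helper name `trtis_run_cmd` is hypothetical, and the sketch assumes the 19.08 image accepts the same trtserver path and flags as the 19.10 command above. It only prints the command; it does not start a container.

```shell
#!/bin/sh
# Sketch: build the TRTIS launch command for a given image tag.
# trtis_run_cmd is a hypothetical helper; flags are copied from the
# 19.10 command in this thread and assumed to work unchanged on 19.08.

# trtis_run_cmd TAG MODEL_DIR
# Prints (does not execute) the docker command line.
trtis_run_cmd() {
    tag=$1
    model_dir=$2
    echo "docker run --gpus all --rm --shm-size=1g" \
         "--ulimit memlock=-1 --ulimit stack=67108864" \
         "-p8000:8000 -p8001:8001" \
         "-v${model_dir}:/tmp/models" \
         "nvcr.io/nvidia/tensorrtserver:${tag}" \
         "/opt/tensorrtserver/bin/trtserver" \
         "--model-store=/tmp/models --strict-model-config=false"
}

# Print the command for the 19.08 image, using the model path from the thread.
trtis_run_cmd 19.08-py3 /tensorrt-inference-server/docs/examples/model_repo_test
```

Printing first makes it easy to check the tag and the volume mount before running the command with sudo.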

Good luck!

mbufi · March 29, 2020, 12:44pm

@andrliu
Great! Thank you for the update. Could you please provide me a config file that would work with loading my TLT trained model with TensorRT Server?

That way I know I am doing it correctly.

Many thanks,
Martin
