TensorRT Inference Server rejecting valid trt.engine file generated by TLT

ghazni · August 16, 2020, 7:50pm

Summary

TensorRT Inference Server does not accept a valid trt.engine file generated by TLT (confirmed by the TensorRT support team; ref: the topic "Tensorrt engine file generated by TLT is not acceptable to inference server").

I’ve been advised to describe the issue in this forum.

Description

Note: I haven’t changed any code anywhere, and none of the commands were altered. All commands are as described in the online guides or Jupyter notebooks.

1: Trained yolo3 (packaged with TLT) using TLT (nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3)
2: Converted yolo_resnet18_epoch_080.etlt to a trt.engine file (using the commands in the Jupyter notebook)
3: Renamed the trt.engine file to model.plan and moved it into the TensorRT Inference Server model_repository (command and output below):

nvidia-docker run --gpus=1 --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/infer_models/test_model_repository:/models -e LD_PRELOAD="/preloadlibs/libnvinfer_plugin.so.7.0.0.1 /preloadlibs/libnvds_infercustomparser_yolov3_tlt.so" nvcr.io/nvidia/tritonserver:20.03-py3 trtserver --model-repository=/models --strict-model-config=false

=============================

== Triton Inference Server ==

NVIDIA Release 20.03 (build 11042949)

Various files include modifications © NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

2020-08-15 15:03:24.627236: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
I0815 15:03:24.657388 1 metrics.cc:164] found 1 GPUs supporting NVML metrics
I0815 15:03:24.663151 1 metrics.cc:173] GPU 0: GeForce RTX 2080 Ti
I0815 15:03:24.663566 1 server.cc:120] Initializing Triton Inference Server
E0815 15:03:26.924412 1 model_repository_manager.cc:1519] model output must specify 'dims' for yolov3_resnet18
error: creating server: INTERNAL - failed to load all models
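
For reference, step 3 above relies on the model repository layout that Triton expects for TensorRT engines: <repository>/<model-name>/<version>/model.plan. A minimal shell sketch of that layout follows. The model name yolov3_resnet18 is taken from the log above; the REPO and ENGINE variables and their defaults are assumptions for illustration and should point at the real repository (/home/infer_models/test_model_repository on the host in the command above) and the real engine file.

```shell
#!/bin/sh
set -e

# Assumed defaults for illustration only: a scratch directory and an engine
# file in the current directory. Override REPO and ENGINE for a real setup.
REPO="${REPO:-$(mktemp -d)}"
ENGINE="${ENGINE:-trt.engine}"
MODEL_NAME="yolov3_resnet18"   # model name as it appears in the server log

# Triton looks for <repository>/<model-name>/<version>/model.plan.
mkdir -p "$REPO/$MODEL_NAME/1"

if [ -f "$ENGINE" ]; then
    # The engine is copied under the file name Triton expects.
    cp "$ENGINE" "$REPO/$MODEL_NAME/1/model.plan"
else
    echo "engine file not found: $ENGINE" >&2
fi

# List the resulting tree so the layout can be checked by eye.
find "$REPO" | sort
```

The REPO directory is what gets mounted into the container as /models.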

As per link:
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/model_repository.html#section-tensorrt-models

According to that page, no model configuration file is required for TensorRT engine files; the inference server should be able to read all necessary information from the file itself. The TensorRT support team has confirmed that this is an issue with the inference layer (see the topic "Tensorrt engine file generated by TLT is not acceptable to inference server").
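
Since the error says the output 'dims' could not be determined, one thing that might be worth trying is an explicit config.pbtxt placed in the model directory instead of relying on the auto-generated configuration. The sketch below is an untested assumption, not a confirmed fix: the tensor names, data types, dims, and max_batch_size are placeholders and would have to be replaced with the actual bindings of the TLT-generated engine. MODEL_DIR and its default are also hypothetical.

```shell
#!/bin/sh
set -e

# Assumed location for illustration; point MODEL_DIR at the real model
# directory inside the repository (the one that contains 1/model.plan).
MODEL_DIR="${MODEL_DIR:-$(mktemp -d)/yolov3_resnet18}"
mkdir -p "$MODEL_DIR/1"

# Write a model configuration that states input and output dims explicitly.
cat > "$MODEL_DIR/config.pbtxt" <<'EOF'
name: "yolov3_resnet18"
platform: "tensorrt_plan"
max_batch_size: 1            # placeholder; must match how the engine was built
input [
  {
    name: "INPUT_TENSOR_NAME"    # placeholder; use the engine's input binding name
    data_type: TYPE_FP32
    dims: [ 3, 384, 1248 ]       # placeholder CHW shape
  }
]
output [
  {
    name: "OUTPUT_TENSOR_NAME"   # placeholder; add one entry per output binding
    data_type: TYPE_FP32
    dims: [ 200, 4 ]             # placeholder shape
  }
]
EOF

echo "wrote $MODEL_DIR/config.pbtxt"
```

Each output binding of the engine needs its own entry with matching name, data type, and dims.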

Could you please advise what I need to do to resolve this issue?

Environment

TensorRT Version : 7.0 (same under TLT container and on host Ubuntu 18.04 system)
GPU Type : RTX 2080Ti
Nvidia Driver Version : 450.57
CUDA Version : 10.2 (same under TLT container and on host Ubuntu 18.04 system)
CUDNN Version : 7.6.5 (TLT container environment), 7.6.3 (Host Ubuntu 18.04 system )
Operating System + Version : Ubuntu 18.04
Python Version (if applicable) : 3.6.9
TensorFlow Version (if applicable) :
PyTorch Version (if applicable) :
Baremetal or Container (if container which image + tag) :

Relevant Files

The etlt, trt.engine, libnvinfer_plugin, and libinfercustomparser files are in the following folder:

https://drive.google.com/drive/folders/1DkZjYIu1TmUZAuqIZZuiBrtS9a6SDWyw?usp=sharing

Steps To Reproduce

- No code changes were made inside TLT (the default commands were run to train and convert)
- The command to run the inference server is in the Description section, along with the errors
