Upgrading TLT exported models to work with TensorRT 7.1.2

Hello. I am trying to use the latest 20.06 release of Triton Inference Server (with an NVIDIA T4). However, the new version is incompatible with models created with the tlt-streamanalytics:v2.0_dp_py2 container due to the version mismatch. Please advise how to create models that will work with TensorRT 7.1.2.

Refer to https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html#framework-matrix-2020 and https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/rel_20-03-1.html#rel_20-03-1

Suggest trying the 20.03 release.

Thank you for the prompt reply @Morganh. You’re correct, the 20.03.1 release works well with my current TLT exported models. However, I have new models that require TensorRT 7.1. The 20.03.1 container does not work with engines created in TensorRT 7.1 (because of the upgrade to CUDA 11.0, I imagine). I’d like to avoid running two copies of Triton server at once if possible.

I noticed that there is a tlt-converter available for the Jetson platforms with TensorRT 7.1. Is there a way it could be compiled for x86 platforms that support a T4, so I could create my engine file?

The tlt-converter is inside the tlt_2.0_dp docker. If you run the tlt_2.0_dp docker on an x86 platform with a T4, you can use its tlt-converter to generate a trt engine.
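Roughly, and assuming the NGC image tag mentioned in this thread with the exported .etlt model mounted from ./export on the host, starting the container looks like:

docker run --runtime=nvidia -it \
    -v $(pwd)/export:/workspace/export \
    nvcr.io/nvidia/tlt-streamanalytics:v2.0_dp_py2 /bin/bash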

The tlt_2.0_dp docker image does not use TensorRT 7.1, so it is unsuitable for me.

See https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html#gen_eng_tlt_converter
The tlt_2.0_dp docker should support trt7.1.

I have pulled nvcr.io/nvidia/tlt-streamanalytics:v2.0_dp_py2 from NVIDIA NGC, and it uses TRT 7.0 and CUDA 10.2. Are you using a different version? I am confused.

In the TLT 2.0_dp docker, the tlt-converter for the trt7.1 version is not available.
But for the Jetpack platform, we provide a trt7.1 version of tlt-converter. You can generate the trt engine on an edge device like a Xavier or Nano, then configure the trt engine for Triton inference server.
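The converter invocation itself is the same on both platforms. A rough sketch only; the key, input dimensions, and output node names are placeholders that must match the model you trained (the -d and -o values below follow the usual DetectNet_v2 example):

tlt-converter -k $NGC_KEY \
              -d 3,544,960 \
              -o output_cov/Sigmoid,output_bbox/BiasAdd \
              -t fp16 \
              -m 16 \
              -e trt.engine \
              your_model.etlt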

Does that have the potential to mess up the neural network, or will it be OK? (Running an engine built for a Jetson Xavier on a Tesla T4.)

Jetpack 4.4 also has CUDA 10.2 instead of CUDA 11.0. I’m not sure if that will cause an incompatibility issue when I move to Triton, which is built on CUDA 11.0. Can you answer that?

Actually, I did not try this trt engine in the 20.06 Triton inference server, so I am not sure of its status.
For 20.03.1, it should work.

So, could you please give 20.06 a quick try, just swapping the new trt engine in for the old one:

$ tree plan_model/
plan_model/
├── 1
│   └── trt.engine
├── config.pbtxt
└── tlt_labels.txt
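
For a tensorrt_plan model the config.pbtxt looks roughly like the sketch below; the input/output names and dims are placeholders that must match your exported model, and default_model_filename is needed because the file is named trt.engine instead of the default model.plan:

name: "plan_model"
platform: "tensorrt_plan"
max_batch_size: 16
default_model_filename: "trt.engine"
input [
  {
    name: "input_1"            # placeholder; must match the engine's input binding
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 544, 960 ]
  }
]
output [
  {
    name: "output_cov/Sigmoid" # placeholder; must match the engine's output binding
    data_type: TYPE_FP32
    dims: [ 4, 34, 60 ]
  }
]

Then point the server at the repository with something like this (assuming Docker 19.03+ with the NVIDIA container toolkit):

docker run --gpus all --rm -p8000:8000 -p8001:8001 -p8002:8002 \
    -v /path/to/model_repository:/models \
    nvcr.io/nvidia/tritonserver:20.06-py3 \
    tritonserver --model-repository=/models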

If you could try it with Triton 20.06, I would appreciate it. I am upgrading my Jetson to Jetpack 4.4 now. It may take me a while to get a working model, and it would be good to know in advance.

My host PC has not been upgraded to NVIDIA driver release 450.36, so I am afraid I cannot try 20.06.

And according to https://developer.nvidia.com/cuda-gpus#compute, I may have made a mistake: if you build the trt engine on a Nano (compute capability 5.3), the GPU compute capability is different from your T4's (7.5).
If the engine plan file is generated on an incompatible device, it will not work.

Is it the major or the minor version that has to match? I have a Jetson Xavier, which is 7.2.

Should be both.
I’m still thinking about a solution for your particular case.

Oh darn okay. Thank you for still looking into this.

One tip: I suggest you still run the 2.0_dp docker on the T4, then remove the old TensorRT version and install TensorRT 7.1.2.
Then try to generate the trt engine.
Since the trt engine would be built with the same TensorRT version as the 20.06 Triton server, it should work.
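Very roughly, and only a sketch (the TensorRT 7.1 packages must be downloaded separately from developer.nvidia.com, and the exact package names depend on that download):

# inside the running tlt-streamanalytics container
apt-get remove --purge "libnvinfer*"     # remove the TensorRT 7.0.x libraries
# ...install the downloaded TensorRT 7.1.x packages for the matching CUDA version...
ldconfig                                 # refresh the shared-library cache
ldconfig -p | grep libnvinfer            # should now list only 7.1.x libraries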

Is it possible to change the underlying dependencies like that? I can download CUDA 11.0 and TensorRT 7.1 and just install them in the container. I’m a bit sceptical but am willing to try.

I have the TensorRT 7.1 package installed. When I try to run tlt-converter, I get the following:
tlt-converter: error while loading shared libraries: libnvinfer_plugin.so.7.0.0: cannot open shared object file: No such file or directory
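
Listing the plugin libraries (in the standard /usr/lib/x86_64-linux-gnu location) shows only the 7.1.3 files, not the 7.0.0 version that tlt-converter was built against:

ls -l /usr/lib/x86_64-linux-gnu/libnvinfer_plugin.so*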

Please advise.

I tried to work around this by creating a symbolic link from libnvinfer_plugin.so.7.0.0 to libnvinfer_plugin.so.7.1.3.
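(Roughly, assuming the libraries sit in the standard /usr/lib/x86_64-linux-gnu directory:)

ln -s /usr/lib/x86_64-linux-gnu/libnvinfer_plugin.so.7.1.3 \
      /usr/lib/x86_64-linux-gnu/libnvinfer_plugin.so.7.0.0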

But that led to the following error:

[ERROR] CUDA initialization failure with error 35. Please check your CUDA installation:  http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
Segmentation fault (core dumped)

So I endeavoured to install CUDA 11.0. I got it installed, but ran into some issues installing it within the docker container. Regardless, it gave me a similar but new error:

[ERROR] CUDA initialization failure with error 999. Please check your CUDA installation:  http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
Segmentation fault (core dumped)
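
From what I can tell, error 35 usually means the driver is older than the CUDA runtime being loaded, while 999 is a generic initialization failure, so I assume comparing the two inside the container is the next step, roughly:

nvidia-smi        # driver version visible inside the container
nvcc --version    # CUDA toolkit version installed in the container (if the toolkit is present)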

I am reaching the limits of my Linux/Docker skills; I will need more guidance to make further progress.