The tlt-converter is inside the tlt_2.0_dp docker. If you run the tlt_2.0_dp docker on an x86 platform with a T4, you can use its tlt-converter to generate a TRT engine.
The tlt_2.0_dp docker image does not use TensorRT 7.1, so it is unsuitable for me.
The tlt_2.0_dp docker should support trt7.1.
I have pulled nvcr.io/nvidia/tlt-streamanalytics:v2.0_dp_py2 from NVIDIA NGC, and it uses TRT 7.0 and CUDA 10.2. Are you using a different version? I am confused.
In the TLT 2.0_dp docker, a tlt-converter built for TRT 7.1 is not available.
But for the Jetpack platform, we provide a tlt-converter for TRT 7.1. You can generate the TRT engine on an edge device like a Xavier or Nano, then configure the TRT engine for the Triton Inference Server.
Does that have the potential to mess up the neural network, or will it be OK? (Running an engine built for a Jetson Xavier on a Tesla T4.)
Jetpack 4.4 also has CUDA 10.2 instead of CUDA 11.0. I’m not sure if that will cause an incompatibility issue when I move to Triton, which is built on CUDA 11.0. Can you answer that?
Actually, I did not try this TRT engine in the 20.06 Triton Inference Server, so I am not sure of its status.
For 20.03.1, it should work.
So, could you please have a quick try with 20.06? Just replace the old TRT engine with the new one.
$ tree plan_model/
│ └── trt.engine
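For reference, a Triton model repository for a TensorRT plan normally keeps the engine under a numbered version directory next to a config.pbtxt. A minimal sketch follows; the model name, tensor names, dims, and data types here are placeholders, not values taken from this thread:

```
plan_model/
├── config.pbtxt
└── 1/
    └── trt.engine

# plan_model/config.pbtxt
name: "plan_model"
platform: "tensorrt_plan"
default_model_filename: "trt.engine"
max_batch_size: 1
input [
  { name: "input_1", data_type: TYPE_FP32, dims: [ 3, 384, 1248 ] }
]
output [
  { name: "output_bbox/BiasAdd", data_type: TYPE_FP32, dims: [ 8, 24, 78 ] }
]
```

The `default_model_filename` field lets Triton pick up `trt.engine` instead of the default `model.plan`.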
If you could try it with Triton 20.06, I would appreciate it. I am upgrading my Jetson to Jetpack 4.4 now. It may take me a while to get a working model, and it would be good to know in advance.
My host PC is not upgraded to NVIDIA driver release 450.36, so I am afraid I cannot try 20.06.
And according to https://developer.nvidia.com/cuda-gpus#compute, I may have made a mistake: if you build the TRT engine on a Nano (compute capability 5.3), the GPU compute capability is different from your T4’s (7.5).
If the engine plan file is generated on an incompatible device, it will not work.
Major or minor version incompatible? I have a Jetson Xavier with 7.2.
Should be both.
I’m still thinking about a solution for your special case.
Oh darn okay. Thank you for still looking into this.
One tip: I suggest you still run the 2.0_dp docker on the T4, then remove the old TensorRT version and install TensorRT 7.1.2.
Then try to generate the trt engine.
Since the TRT engine would then be built with the same TensorRT version as the 20.06 Triton server, it should work.
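If it helps, the swap inside the container could look roughly like the sketch below. The package names and the TensorRT 7.1.2 tarball filename are assumptions on my part; adjust them to whatever the NVIDIA developer site actually provides:

```
# Sketch only: remove the TensorRT 7.0 packages shipped in the container
apt-get purge "libnvinfer*" "libnvparsers*" "libnvonnxparsers*"

# Unpack the TensorRT 7.1.2 tarball (downloaded separately from the
# NVIDIA developer site) and point the dynamic loader at its libraries
tar -xzf <TensorRT-7.1.2.x.linux.x86_64.cuda-10.2.tgz> -C /opt
export LD_LIBRARY_PATH=/opt/TensorRT-7.1.2.x/lib:$LD_LIBRARY_PATH
```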
Is it possible to change the underlying dependencies like that? I can download CUDA 11.0 and TensorRT 7.1 and just install them in the container. I’m a bit sceptical but am willing to try.
I have the TensorRT 7.1 package installed. When I try to run tlt-converter, I get the following:
tlt-converter: error while loading shared libraries: libnvinfer_plugin.so.7.0.0: cannot open shared object file: No such file or directory
I tried to work around it by creating a symbolic link, but that led to the following error:
[ERROR] CUDA initialization failure with error 35. Please check your CUDA installation: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html Segmentation fault (core dumped)
So I endeavoured to install CUDA 11.0. I got it installed, but ran into some issues installing it within the docker container. Regardless, it gave me a similar but new error:
[ERROR] CUDA initialization failure with error 999. Please check your CUDA installation: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html Segmentation fault (core dumped)
I am reaching the limits of my Linux/Docker skills; I will need more guidance and assistance to get any further.
So, the tlt-converter for the 7.1.2 version is still the bottleneck. I will sync with the internal team.
For deployment on x86 with an NVIDIA GPU, download the tlt-converter for the appropriate CUDA/TensorRT version to convert the model from UFF to a TensorRT engine.
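As an illustration, a typical tlt-converter invocation looks like the following. The encryption key, input dimensions, and output node names are placeholders; use the ones that match the exported model:

```
tlt-converter <model.etlt> \
  -k <encryption_key> \
  -d 3,384,1248 \
  -o output_cov/Sigmoid,output_bbox/BiasAdd \
  -t fp16 \
  -m 16 \
  -e trt.engine
```

Here `-d` gives the input dimensions, `-o` the output blob names, `-t` the engine precision, `-m` the max batch size, and `-e` the path for the generated engine.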
Thank you Morganh. That works with 20.06.