TensorRT L4T docker image Python version Issue

In the TensorRT L4T docker image, the default python version is 3.8, but apt packages like python3-dev install the 3.6 versions (so package building is broken), and any python3-foo packages installed through apt aren’t found by the default python. For some packages, like python3-opencv, building from source takes prohibitively long on Tegra, so software that relies on both it and TensorRT can’t work, at least with the default python3 version.

Example:

root@ab4490a9c568:/app# apt-get install python3-opencv
Reading package lists... Done
Building dependency tree       
Reading state information... Done
python3-opencv is already the newest version (3.2.0+dfsg-4ubuntu0.1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
root@ab4490a9c568:/app# python3
Python 3.8.0 (default, Feb 25 2021, 22:10:10) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'cv2'
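The root cause, as far as I can tell: bionic’s python3-opencv ships cv2 as an extension module tagged for CPython 3.6’s ABI (cv2.cpython-36m-aarch64-linux-gnu.so), which python 3.8 will never load, even though /usr/lib/python3/dist-packages is on its sys.path. A quick illustrative check for such mismatches (the helper name is mine, not part of any tooling):

```python
import re
import sys

def abi_mismatch(filename):
    """Return True if a compiled extension module's filename carries a
    CPython version tag that doesn't match the running interpreter."""
    m = re.search(r"\.cpython-(\d)(\d+)", filename)
    if not m:
        # untagged or abi3 modules are assumed compatible
        return False
    return (int(m.group(1)), int(m.group(2))) != sys.version_info[:2]

# the module python3-opencv installs on bionic is 3.6-only:
print(abi_mismatch("cv2.cpython-36m-aarch64-linux-gnu.so"))
```

Pure-python apt packages keep working under 3.8; it’s only these ABI-tagged extensions that silently vanish as ModuleNotFoundError.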

Also, it looks like a manual symlink was made from python3.8 to python3 instead of using update-alternatives. You can set it like this instead (for 3.6):

# update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.6 2
update-alternatives: using /usr/bin/python3.6 to provide /usr/bin/python3 (python3) in auto mode

That way the package manager knows what the default Python version is. Still won’t provide missing python-foo packages, but it’s a start.
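A fuller sketch of what I mean, assuming both interpreters exist at /usr/bin/python3.6 and /usr/bin/python3.8: register both so you can switch cleanly rather than hard-coding a symlink.

```shell
# register both interpreters; the higher priority wins in auto mode
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.6 1
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 2

# or pin one explicitly, regardless of priority
update-alternatives --set python3 /usr/bin/python3.8
```

These commands need root and are really image-build (Dockerfile RUN) material, not something to run in an entrypoint.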

It would be nice if TensorRT images worked the same on Tegra and x86. Unfortunately this can’t be relied on. Every time I have to port a Dockerfile to Tegra it’s at least a day of if-tegra workarounds, leading to build scripts and --build-arg=blabla, when ideally the same Dockerfile should just work on any platform, as is the case with images derived from ubuntu:latest. It would be really nice to just docker-compose up and have everything work, as there is nothing technically prohibiting this.

Hi,
We recommend you check the supported features at the link below.
https://docs.nvidia.com/deeplearning/tensorrt/support-matrix/index.html
You can refer to the link below for the full list of supported operators.
For unsupported operators, you need to create a custom plugin to support the operation.

Thanks!

Thanks for your reply,

I apologize for being unclear. This is about a misconfigured python setup in your L4T TensorRT base image, unfortunately. It has nothing to do with the model. The model we are using works fine outside the container on the same version of TensorRT.

The issue is that our software uses OpenCV for some preprocessing (not my decision), and that isn’t easily installable with the default version of python used (which does not match L4T’s).

I appreciate the move towards fully containerized solutions for Tegra, but your base image has too many differences from the x86 version to make it useful.

A Dockerfile written for x86 should “just work” on Tegra, as it does with “ubuntu:latest” or alpine or any number of other base images that manage multiple architectures (with a manifest). The base image should be the same. The package set and repos should all be the same. If they’re not, it should be considered broken.

I do this regularly with non-Nvidia docker images: FROM ubuntu:latest ... and it works on everything. When I am forced to work with Nvidia base images on Tegra, on the other hand, I spend more time working around breakage like this than actually developing software.

That’s not an exaggeration. It seems as if there is no quality control at all on this. You should have a series of Dockerfiles you test against both arm64 and amd64 base images. It’s clear that’s not done, and it’s very, very frustrating. On x86, your images are a pleasure. On arm, very much the opposite.

Hi,

Have you tried the TensorRT NGC containers (NVIDIA NGC) to see if they serve your purpose?

Thank you.

I am referring to the NGC image. Sorry I wasn’t clear on this:

nvcr.io/nvidia/l4t-tensorrt:r8.0.1-runtime

Hi,

This looks out of scope for TensorRT. Maybe the following link will be helpful to you: python - “No module named ‘cv2’” but it is installed - Stack Overflow

You can also try using TensorRT NGC container, if it serves your purpose.
https://ngc.nvidia.com/containers/nvidia:tensorrt

Thank you.

It looks like I’m not being clear enough. The problem is with your TensorRT NGC image, nvcr.io/nvidia/l4t-tensorrt:r8.0.1-runtime

The problem is, again, that there are two python3 installs in that image. Python packages installed through apt-get do not always work with this setup. Yes, you can pip install them, but there are supply chain issues there, and the packages are not tested against other system packages. Further, if a wheel isn’t found, it can take hours to build packages on Tegra.

Your x86 image uses Ubuntu 20.04 as a base. If you did the same for Tegra it would solve the issue entirely, since its Python is 3.8, but the Tegra image is Ubuntu 18.04 based (python 3.6). If this somehow isn’t clear, I apologize (and give up, since it seems I’m not getting through).

@mdegans, are you saying that the L4T container has python 3.8 installed even though the base OS (Ubuntu 18.04) is using python 3.6? On the bare-metal (outside the container) TensorRT Debian packages only support python 3.6 on JetPack since it’s the default system python. See Support Matrix :: NVIDIA Deep Learning TensorRT Documentation

Yes. Exactly that. Some python3-foo packages will still work depending on where their installer places them (see sys.path), but not all:

 $ sudo docker run -it --rm nvcr.io/nvidia/l4t-tensorrt:r8.0.1-runtime
root@888659f49bd5:/# python3
Python 3.8.0 (default, Feb 25 2021, 22:10:10) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/usr/local/lib/python3.8/dist-packages', '/usr/lib/python3/dist-packages']
>>> exit()
root@888659f49bd5:/# cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"

Again, apologies for not being clearer before. The image is apparently ubuntu:bionic based (like l4t) but has python 3.8. That’s doable, but it breaks some python packages distributed through apt-get. I assumed this was intentional, for compat with the x86 image, which is based off ubuntu:focal. I guess not.

I was originally going to say it was not intentional, but I think the build system is common between x86 and aarch64. This might have motivated the choice for using python 3.8. If it’s like the x86 container then the python bindings for TensorRT and other libraries are in /usr/local/... and installed using a whl file rather than Debian packages. As you stated this means you will not be able to install packages using apt and have them work in the default python3 environment. You’ll need to use pip or other methods to install what you need.

The issue is that not all packages have pip wheels on arm64, and building them can take a very long time. Could you use ubuntu:focal as a base image on Tegra instead?

Also, I tend to trust packages from Canonical more than PyPI, given supply chain issues. The former are signed with Canonical’s gpg key. With the latter, I’m one typo away from installing malware.

I kind of doubt that. There are a bunch more differences between the arm64 and amd64 images. A test suite of common Dockerfiles might help.

I guess the only option for now would be to install the Debian packages for TensorRT yourself into the container, which will be for 3.6. I’ll see if I can figure out why they did this.

Thanks. Appreciate it. Ideally, I’d like to be able to use the same Dockerfile for x86 and Tegra, with a build script just supplying the base image as an argument, a bit like this:

ARG BASE_IMAGE

FROM ${BASE_IMAGE}
...

Unfortunately I’m finding it never goes smoothly, since there are too many differences between the x86 and aarch64 images. In the same Dockerfile there’s a line like this, for example:

RUN if [ -f "/opt/tensorrt/install_opensource.sh" ] ; then /opt/tensorrt/install_opensource.sh ; fi

That’s kind of ugly, and this stuff ends up taking up half the Dockerfile. I would greatly appreciate it if you had a test suite to check these kinds of things.
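For comparison, here’s roughly what the wrapper script looks like today. The image tags are only examples of what each platform currently needs; substitute whatever release you target.

```shell
#!/bin/sh
# pick_base: map a machine architecture to the TensorRT base image.
# Tags below are illustrative, not authoritative.
pick_base() {
    case "$1" in
        x86_64)  echo "nvcr.io/nvidia/tensorrt:21.07-py3" ;;
        aarch64) echo "nvcr.io/nvidia/l4t-tensorrt:r8.0.1-runtime" ;;
        *)       echo "unsupported arch: $1" >&2; return 1 ;;
    esac
}

# usage (same Dockerfile everywhere, only the base image differs):
#   docker build --build-arg BASE_IMAGE="$(pick_base "$(uname -m)")" -t myapp .
```

This is exactly the kind of platform-specific glue that wouldn’t be needed if both images shared a base and package set under one manifest.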

Yes, but that can’t be automated because the downloads are behind a login wall. I could COPY it into the image, but that would increase the image size since docker layers are COW. Also, a bunch of nvidia l4t packages refuse to install on a non-l4t-base rootfs, and I don’t have the time to tear apart a bunch of debian packages to find which preinst script is breaking things.

Likewise, l4t-base has no nvidia apt sources enabled, so an apt-get install tensorrt is out of the question. It’s possible to add the apt sources, but again, it’s a ton of ugliness and hacking, and it shouldn’t be necessary, because that image should have the sources out of the box (not relying on bind mounting all of the things). Please test both x86 and aarch64 images against common Dockerfiles and consider any differences breakage.
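For reference, enabling those sources by hand looks something like this. The release string (r32.6) and the SoC repo (t194, for Xavier) are assumptions for JetPack 4.6-era images; adjust both for your module and release.

```dockerfile
# enable the L4T apt repos inside the container (hack; the image
# should really ship with these out of the box)
RUN apt-key adv --fetch-keys https://repo.download.nvidia.com/jetson/jetson-ota-public.asc && \
    echo "deb https://repo.download.nvidia.com/jetson/common r32.6 main" \
      > /etc/apt/sources.list.d/nvidia-l4t.list && \
    echo "deb https://repo.download.nvidia.com/jetson/t194 r32.6 main" \
      >> /etc/apt/sources.list.d/nvidia-l4t.list && \
    apt-get update
```

Even then, some packages’ preinst scripts still expect an l4t-base rootfs, which is the earlier problem all over again.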

@mdegans, I’ve found out who maintains this container and have been discussing the issue with them. Hopefully we can take a look at this python incompatibility in the next release.

Thanks. If there’s one bit of feedback I’d love to give your nvidia-docker on Tegra team, it’s that the bind mounting approach should be scrapped. It means you need different base image Dockerfiles for each architecture and for each JetPack release, and this divergence leads to countless bugs. Heck, it even breaks “docker build” unless you make the nvidia runtime the default.

If I pull ubuntu:latest from docker hub, I get the same thing on x86 and aarch64. My Dockerfiles that are FROM ubuntu:latest will “just work”. That means I write less platform specific code which is really, really nice. As soon as that breaks, so does everything downstream.

Yes, it means the base image will be larger and you won’t be able to “cheat” by bind mounting everything at runtime, but this could be solved by releasing an “l4t-slim” or “core” image dedicated to just running containers. It’d be more repeatable. It’d be more reliable. The system attack surface would be minimized, and you’d avoid privilege escalation CVEs like this one, a direct result of the bind mounting approach.

This has been a pain point for our team as well, because we have to inform them which directories/files to mount for our developed libraries. There are plans to move away from this approach to plain stand-alone containers, but I’m not sure what the timeline is for that.