Torch2trt fails when building a container from jetson-containers

Platform: Jetson Orin AGX
L4T: 36.4.0
JetPack: 6.1
CUDA: 12.6.68

Hello, I'm trying to build a Docker container using jetson-containers, and I keep getting an error when building a custom container.

When I try to build using jetson-containers build --name=amigo_detect/ ros:humble-ros-base nanoowl, the ROS part builds, but the nanoowl part fails with this error:

-- Building container amigo_detect/nanoowl:r36.4.0-torch2trt

DOCKER_BUILDKIT=0 docker build --network=host --tag amigo_detect/nanoowl:r36.4.0-torch2trt \
--file /home/castej-jetson/jetson-containers/packages/pytorch/torch2trt/Dockerfile \
--build-arg BASE_IMAGE=amigo_detect/nanoowl:r36.4.0-torchvision \
/home/castej-jetson/jetson-containers/packages/pytorch/torch2trt \
2>&1 | tee /home/castej-jetson/jetson-containers/logs/20250120_183217/build/amigo_detect_nanoowl_r36.4.0-torch2trt.txt; exit ${PIPESTATUS[0]}

DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
            environment-variable.

Sending build context to Docker daemon  15.36kB
Step 1/6 : ARG BASE_IMAGE
Step 2/6 : FROM ${BASE_IMAGE}
 ---> 882d0e2cf731
Step 3/6 : ADD https://api.github.com/repos/NVIDIA-AI-IOT/torch2trt/git/refs/heads/master /tmp/torch2trt_version.json

 ---> 433fbe242b68
Step 4/6 : COPY patches/ /tmp/patches/
 ---> 6dd6285cdde7
Step 5/6 : RUN cd /opt &&     git clone --depth=1 https://github.com/NVIDIA-AI-IOT/torch2trt &&     cd torch2trt &&     cp /tmp/patches/flattener.py torch2trt &&     pip3 install --verbose . &&     sed 's|^set(CUDA_ARCHITECTURES.*|#|g' -i CMakeLists.txt &&     sed 's|Catch2_FOUND|False|g' -i CMakeLists.txt &&     cmake -B build -DCUDA_ARCHITECTURES=${CUDA_ARCHITECTURES} . &&     cmake --build build --target install &&     ldconfig &&     pip3 install --no-cache-dir --verbose nvidia-pyindex &&     pip3 install --no-cache-dir --verbose onnx-graphsurgeon
 ---> Running in f0f57afeed4b
Cloning into 'torch2trt'...
Using pip 24.3.1 from /usr/local/lib/python3.10/dist-packages/pip (python 3.10)
Looking in indexes: https://pypi.jetson-ai-lab.dev/jp6/cu126
Processing /opt/torch2trt
  Preparing metadata (setup.py): started
  Running command python setup.py egg_info
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/opt/torch2trt/setup.py", line 2, in <module>
      import tensorrt
    File "/usr/local/lib/python3.10/dist-packages/tensorrt/__init__.py", line 75, in <module>
      from .tensorrt import *
  ImportError: libnvdla_compiler.so: cannot open shared object file: No such file or directory
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /usr/bin/python3.10 -c '
  exec(compile('"'"''"'"''"'"'
  # This is <pip-setuptools-caller> -- a caller that pip uses to run setup.py
  #
  # - It imports setuptools before invoking setup.py, to enable projects that directly
  #   import from `distutils.core` to work with newer packaging standards.
  # - It provides a clear error message when setuptools is not installed.
  # - It sets `sys.argv[0]` to the underlying `setup.py`, when invoking `setup.py` so
  #   setuptools doesn'"'"'t think the script is `-c`. This avoids the following warning:
  #     manifest_maker: standard file '"'"'-c'"'"' not found".
  # - It generates a shim setup.py, for handling setup.cfg-only projects.
  import os, sys, tokenize
  
  try:
      import setuptools
  except ImportError as error:
      print(
          "ERROR: Can not execute `setup.py` since setuptools is not available in "
          "the build environment.",
          file=sys.stderr,
      )
      sys.exit(1)
  
  __file__ = %r
  sys.argv[0] = __file__
  
  if os.path.exists(__file__):
      filename = __file__
      with tokenize.open(__file__) as f:
          setup_py_code = f.read()
  else:
      filename = "<auto-generated setuptools caller>"
      setup_py_code = "from setuptools import setup; setup()"
  
  exec(compile(setup_py_code, filename, "exec"))
  '"'"''"'"''"'"' % ('"'"'/opt/torch2trt/setup.py'"'"',), "<pip-setuptools-caller>", "exec"))' egg_info --egg-base /tmp/pip-pip-egg-info-0dvngrhj
  cwd: /opt/torch2trt/
  Preparing metadata (setup.py): finished with status 'error'
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
The command '/bin/bash -c cd /opt &&     git clone --depth=1 https://github.com/NVIDIA-AI-IOT/torch2trt &&     cd torch2trt &&     cp /tmp/patches/flattener.py torch2trt &&     pip3 install --verbose . &&     sed 's|^set(CUDA_ARCHITECTURES.*|#|g' -i CMakeLists.txt &&     sed 's|Catch2_FOUND|False|g' -i CMakeLists.txt &&     cmake -B build -DCUDA_ARCHITECTURES=${CUDA_ARCHITECTURES} . &&     cmake --build build --target install &&     ldconfig &&     pip3 install --no-cache-dir --verbose nvidia-pyindex &&     pip3 install --no-cache-dir --verbose onnx-graphsurgeon' returned a non-zero code: 1
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/castej-jetson/jetson-containers/jetson_containers/build.py", line 112, in <module>
    build_container(args.name, args.packages, args.base, args.build_flags, args.build_args, args.simulate, args.skip_tests, args.test_only, args.push, args.no_github_api, args.skip_packages)
  File "/home/castej-jetson/jetson-containers/jetson_containers/container.py", line 147, in build_container
    status = subprocess.run(cmd.replace(_NEWLINE_, ' '), executable='/bin/bash', shell=True, check=True)  
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'DOCKER_BUILDKIT=0 docker build --network=host --tag amigo_detect/nanoowl:r36.4.0-torch2trt --file /home/castej-jetson/jetson-containers/packages/pytorch/torch2trt/Dockerfile --build-arg BASE_IMAGE=amigo_detect/nanoowl:r36.4.0-torchvision /home/castej-jetson/jetson-containers/packages/pytorch/torch2trt 2>&1 | tee /home/castej-jetson/jetson-containers/logs/20250120_183217/build/amigo_detect_nanoowl_r36.4.0-torch2trt.txt; exit ${PIPESTATUS[0]}' returned non-zero exit status 1.

I definitely have libnvdla_compiler.so in /usr/lib/aarch64-linux-gnu/tegra/, so I'm not sure what is happening.

Hi,
Here are some suggestions for common issues:

1. Performance

Please run the commands below before benchmarking a deep learning use case:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

2. Installation

Installation guide of deep learning frameworks on Jetson:

3. Tutorial

Startup deep learning tutorial:

4. Report issue

If these suggestions don’t help and you want to report an issue to us, please share the model, the commands/steps, and any customized app with us so we can reproduce the issue locally.

Thanks!

Hi,

Step 4/6 : COPY patches/ /tmp/patches/
 ---> 6dd6285cdde7

Could you launch the 6dd6285cdde7 image and check if libnvdla_compiler.so is mounted into the container as well?

Thanks.

Hello, yes, it is found inside the container:

castej-jetson@castejjetson-desktop:~$ sudo docker run --runtime nvidia -it --rm --network=host 6dd6285cdde7
sourcing   /opt/ros/humble/install/setup.bash
ROS_DISTRO humble
ROS_ROOT   /opt/ros/humble
root@castejjetson-desktop:/# find / -name "libnvdla_compiler.so" 2>/dev/null
/usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so
root@castejjetson-desktop:/# 

Hi,

Thanks for the testing.

Could you also check if you can import tensorrt correctly?
If you can reproduce a similar error when importing tensorrt, could you check whether adding the libnvdla_compiler.so location to LD_LIBRARY_PATH helps?
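
For example, a quick check inside the container, using the library path from your find output above (just a sketch):

$ export LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu/nvidia:$LD_LIBRARY_PATH
$ python3 -c "import tensorrt; print(tensorrt.__version__)"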

If so, please help to update the Dockerfile accordingly and build it again.
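
A minimal sketch of such a Dockerfile change; the exact placement is an assumption, it just needs to come before the torch2trt build step:

# Hypothetical addition to packages/pytorch/torch2trt/Dockerfile
ENV LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu/nvidia:${LD_LIBRARY_PATH}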

Thanks.

Hello @AastaLLL, thanks for the reply.
I am able to import tensorrt correctly and cannot reproduce a similar error.

root@castejjetson-desktop:/# python3
Python 3.10.12 (main, Jan 17 2025, 14:35:34) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorrt
>>> tensorrt.__version__
'10.4.0'
>>> 
KeyboardInterrupt
>>> 
root@castejjetson-desktop:/# find / -name "libnvdla_compiler.so"
/usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so

So is the rest of the container looking for libnvdla_compiler.so in a different path?

Hi,

We have more information about this issue.

Since libnvdla_compiler.so is mounted by the NVIDIA runtime, it doesn’t exist during the build stage.
A possible WAR (workaround) is to build torch2trt in a container launched with the NVIDIA runtime and save the resulting package for direct installation.
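
A rough sketch of that approach; the image tag and git URL come from the build log above, but treat the exact steps (in particular the /wheels mount) as illustrative:

# Launch the last successfully built stage with the NVIDIA runtime,
# so libnvdla_compiler.so gets mounted in:
$ sudo docker run --runtime nvidia -it --rm -v $(pwd)/wheels:/wheels \
    amigo_detect/nanoowl:r36.4.0-torchvision

# Inside the container, build a torch2trt wheel into the mounted folder:
$ git clone --depth=1 https://github.com/NVIDIA-AI-IOT/torch2trt /opt/torch2trt
$ cd /opt/torch2trt
$ pip3 wheel --no-deps -w /wheels .

The wheel saved in ./wheels can then be COPY'd into the image and installed with pip3 during the Docker build.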

Below is a related discussion for your reference:

Thanks.

@AastaLLL I’m a bit confused about how to proceed: is the problem inherently fixed, or would I have to reflash the Jetson for it to be fixed?
Thanks

That file is installed via the nvidia-l4t-dla-compiler package.

dpkg -S /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so
nvidia-l4t-dla-compiler: /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so

You could try installing that package to see if it helps.
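
For instance, something along these lines inside the container (this assumes the L4T apt repositories are configured, which may not be the case in a minimal image):

$ sudo apt-get update
$ sudo apt-get install nvidia-l4t-dla-compiler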

Hi,

The issue comes from TensorRT requiring the DLA driver (libnvdla_compiler.so).
The DLA driver used to ship as part of the TensorRT library, but it was moved to an out-of-tree (OOT) driver starting with JetPack 6.1.

On Jetson, the driver is mounted through the NVIDIA Container Toolkit (--runtime nvidia).
This means the libraries are only present when a container is launched and don’t exist while a Dockerfile is being built.

A WAR for this is to either copy the driver into the image manually or build torch2trt beforehand in a running container, so that libnvdla_compiler.so exists when it is needed.
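
A minimal sketch of the manual-copy option, assuming libnvdla_compiler.so has first been copied from the host into the Docker build context:

# Hypothetical Dockerfile snippet: bake the DLA compiler into the image
# so that importing tensorrt works during the build stage.
COPY libnvdla_compiler.so /usr/lib/aarch64-linux-gnu/nvidia/
RUN ldconfig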

Thanks.