Upgrading CUDA for Autoware Compatibility and TensorRT Libs Not Accessible Inside the l4t-jetpack Container

Please provide the following info (tick the boxes after creating this topic):
Software Version
DRIVE OS 6.0.8.1
[*] DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
[*] Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-300)
DRIVE AGX Orin Developer Kit (940-63710-0010-200)
DRIVE AGX Orin Developer Kit (940-63710-0010-100)
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
[*] DRIVE AGX Orin Developer Kit (not sure of its number)
other

SDK Manager Version
1.9.3.10904
[*] other

Host Machine Version
[*] native Ubuntu Linux 20.04 Host installed with SDK Manager
native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

I am trying to build a Docker container on the NVIDIA DRIVE AGX Orin using a multi-stage build method, where I first use nvcr.io/nvidia/l4t-jetpack:<tag_version> as the base image. In the second stage I use arm64v8/ros:humble as the base image, leveraging CUDA/cuDNN/TensorRT and ROS to build a specific Autoware environment to run on the DRIVE AGX Orin. (PS: I am copying all the necessary libs and recreating the symlinks so that they are available in the next stage; a rough sketch follows.)
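
For reference, a rough sketch of the multi-stage Dockerfile (the tag, the copied paths, and the ENV values are illustrative assumptions; the real file copies more libs and recreates the symlinks):

# Stage 1: l4t-jetpack provides CUDA/cuDNN/TensorRT (tag is an assumption)
FROM nvcr.io/nvidia/l4t-jetpack:r35.1.0 AS jetpack

# Stage 2: ROS 2 Humble base for the Autoware build
FROM arm64v8/ros:humble
# Copy the GPU stacks across (illustrative subset of the required files)
COPY --from=jetpack /usr/local/cuda /usr/local/cuda
COPY --from=jetpack /usr/lib/aarch64-linux-gnu/libcudnn* /usr/lib/aarch64-linux-gnu/
COPY --from=jetpack /usr/lib/aarch64-linux-gnu/libnvinfer* /usr/lib/aarch64-linux-gnu/
ENV PATH=/usr/local/cuda/bin:${PATH}
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64:${LD_LIBRARY_PATH}
# Refresh the linker cache so the copied libs resolve at run time
RUN ldconfig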

Currently I am facing some issues during the build process:

1. Initially I tested the existing tags of the l4t-jetpack image. For a first test with just stage 1, I ran a simple CUDA program (sketched below) inside the container: r36 was not compatible and r35 led to some JIT compiler errors, but in all the other cases (r35.3.1, r35.2.1, r35.1.0) I was able to access the GPU from inside the container. cuDNN tests also ran fine. However, I am facing issues with TensorRT, which is failing my whole build of the Autoware environment because some packages need TensorRT.
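
For illustration, the simple CUDA check looked something like this (a minimal sketch; the file name is arbitrary and sm_87 targets the Orin GPU):

cat > gpu_check.cu <<'EOF'
#include <cstdio>

int main() {
    int count = 0;
    // CUDA runtime API; nvcc pulls in cuda_runtime.h automatically for .cu files
    cudaError_t err = cudaGetDeviceCount(&count);
    std::printf("%s, devices: %d\n", cudaGetErrorString(err), count);
    return err == cudaSuccess ? 0 : 1;
}
EOF
nvcc -arch=sm_87 gpu_check.cu -o gpu_check && ./gpu_check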

To test the JetPack image versions standalone, I ran the containers like this:

docker run -it --gpus all --runtime nvidia nvcr.io/nvidia/l4t-jetpack:<tag_version> /bin/bash

Inside each container I ran

find / -name libnvdla_compiler.so

but the library was nowhere to be found (r35.3.1, r35.2.1, r35.1.0).

I tried to mount only the specific libraries that were missing inside the container, like this:

docker run -it -v /usr/lib/libnvdla_compiler.so:/usr/lib/libnvdla_compiler.so --gpus all --runtime nvidia nvcr.io/nvidia/l4t-jetpack:r35.1.0 /bin/bash 

I read that these low-level libraries are generally flashed via SDK Manager. Soon I encountered more issues related to other missing libs (libnvmedia.so, libnvmedia_tensor.so, libnvmedia_dla.so, etc.).

Why are they not accessible from the host system inside the container in the first place?

2. I require assistance in updating CUDA from version 11.4 to 12.2 on my system, to ensure compatibility with the latest version of Autoware. Could you please provide guidance on the upgrade process?

JetPack is for Jetson platforms, and there are specific considerations when running Docker containers on DRIVE AGX Orin. Please refer to Running Docker Containers Directly on NVIDIA DRIVE AGX Orin | NVIDIA Technical Blog for initial guidance on running Docker on DRIVE AGX Orin.

Regarding the upgrade of CUDA from version 11.4 to 12.2: upgrading the CUDA version on DRIVE AGX Orin is not supported. The current version in the latest release, 6.0.8.1, is CUDA 11.4. Upgrading CUDA beyond the supported version may lead to compatibility issues.
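
(For reference, the CUDA toolkit version shipped on the target can be confirmed with nvcc, assuming /usr/local/cuda/bin is on the PATH:)

nvcc --version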

So, in that case, if I use l4t-tensorrt and install a compatible cuDNN, then I don't need to run my application inside a Docker container as mentioned in this post: https://developer.nvidia.com/blog/running-docker-containers-directly-on-nvidia-drive-agx-orin/. I guess L4T (Linux for Tegra) containers are supported on the NVIDIA DRIVE AGX Orin.

I tried running different versions of l4t-tensorrt and installing cuDNN manually. I am able to access the GPU inside the Docker container and cuDNN also works, but there is an issue with TensorRT: the container cannot find some BSP libs that TensorRT needs. Is there any workaround for that?
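
For reference, the containers were started the same way as before, just with the l4t-tensorrt image (the tag placeholder follows the convention used above):

docker run -it --gpus all --runtime nvidia nvcr.io/nvidia/l4t-tensorrt:<tag_version> /bin/bash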

Apart from that, I tried to compile some simple TensorRT code directly on the NVIDIA DRIVE AGX Orin (not inside the Docker container). It fails with the commands below; I checked that TensorRT is installed after flashing the DRIVE:

nvidia@tegra-ubuntu:~$ sudo find / -name "libnvinfer*" ! -path "/mnt/external-ssd/*"
[sudo] password for nvidia: 
/usr/share/doc/libnvinfer-plugin8
/usr/share/doc/libnvinfer-bin
/usr/share/doc/libnvinfer8
/usr/lib/aarch64-linux-gnu/libnvinfer_builder_resource.so.8.5.10
/usr/lib/aarch64-linux-gnu/libnvinfer.so.8
/usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.8.5.10
/usr/lib/aarch64-linux-gnu/libnvinfer.so.8.5.10
/usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.8
/var/lib/dpkg/info/libnvinfer-plugin8.md5sums
/var/lib/dpkg/info/libnvinfer8.list
/var/lib/dpkg/info/libnvinfer-plugin8.list
/var/lib/dpkg/info/libnvinfer8.md5sums
/var/lib/dpkg/info/libnvinfer-bin.md5sums
/var/lib/dpkg/info/libnvinfer8.shlibs
/var/lib/dpkg/info/libnvinfer-plugin8.shlibs
/var/lib/dpkg/info/libnvinfer8.triggers
/var/lib/dpkg/info/libnvinfer-bin.list
/var/lib/dpkg/info/libnvinfer-plugin8.triggers
nvidia@tegra-ubuntu:~$ dpkg -l | grep TensorRT
ii  libnvinfer-bin                       8.5.10-1+cuda11.4                       arm64        TensorRT binaries
ii  libnvinfer-plugin8                   8.5.10-1+cuda11.4                       arm64        TensorRT plugin libraries
ii  libnvinfer8                          8.5.10-1+cuda11.4                       arm64        TensorRT runtime libraries
ii  libnvonnxparsers8                    8.5.10-1+cuda11.4                       arm64        TensorRT ONNX libraries
ii  libnvparsers8                        8.5.10-1+cuda11.4                       arm64        TensorRT parsers libraries

and the error while compiling the code:

 nvcc -arch=sm_87 test1_tensorrt.cpp -o tensorrt_test -lnvinfer
test1_tensorrt.cpp:1:10: fatal error: NvInfer.h: No such file or directory
    1 | #include <NvInfer.h>
      |          ^~~~~~~~~~~
compilation terminated.

I tried to look for this particular header file but it does not exist.

nvidia@tegra-ubuntu:~$ sudo find / -name "NvInfer.h" ! -path "/mnt/external-ssd/*"
find: ‘/proc/2439332’: No such file or directory
find: ‘/proc/2439333’: No such file or directory
find: ‘/proc/2439337’: No such file or directory

and I even tried setting the path variables:

nvidia@tegra-ubuntu:~$ export LIBRARY_PATH=/usr/lib/aarch64-linux-gnu:$LIBRARY_PATH
nvidia@tegra-ubuntu:~$ export LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu:$LD_LIBRARY_PATH

But the issue persists.

Header files are not available on the target. Please copy them from the host/Docker environment.

Can you please elaborate or link to a source? I am confused about which files to copy, and to which location in the rootfs on the DRIVE.

Dear @gautam.kumar.jain1,
Please check /usr/include/aarch64-linux-gnu/ on the host (if SDK Manager was used to flash) or in Docker (the DRIVE OS 6.0.6 container), and copy the headers to /usr/include/aarch64-linux-gnu on the target.
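
For example, a minimal sketch assuming SDK Manager was used on the host and the target is reachable over SSH as nvidia@tegra-ubuntu (hostname and glob are illustrative; other TensorRT headers such as NvOnnxParser.h follow the same pattern):

# on the host used for flashing
scp /usr/include/aarch64-linux-gnu/NvInfer*.h nvidia@tegra-ubuntu:/tmp/
# on the target
sudo mv /tmp/NvInfer*.h /usr/include/aarch64-linux-gnu/

After copying, the nvcc command above should be able to find NvInfer.h.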

Dear @SivaRamaKrishnaNV,

I am trying to find a workaround for the lib-related issues so that I can run TensorRT inside a container on the NVIDIA DRIVE AGX Orin using l4t-tensorrt as the base image rather than l4t-jetpack. I mount the specific missing libs that TensorRT requires at compile time, like this:

docker run -it -v /usr/lib/libnvdla_compiler.so:/usr/lib/libnvdla_compiler.so --gpus all --runtime nvidia nvcr.io/nvidia/l4t-jetpack:r35.1.0 /bin/bash

When I compile the code, it fails because of some missing libs. I tried searching for them on the DRIVE AGX Orin:

find / -name libnvmedia.so
find / -name libnvmedia_tensor.so
find / -name libnvmedia_dla.so

They are not available on the target, but I was able to find these libs on the host system from which I flashed the DRIVE via SDK Manager, under a path like:

 ttz_ad@TTZ-ad  ~/nvidia/nvidia_sdk/JetPack_5.1.2_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/rootfs/usr/lib/aarch64-linux-gnu/tegra  ls | grep libnvmedia
libnvmedia_2d.so
libnvmedia2d.so
libnvmedia_dla.so
libnvmedia_eglstream.so
libnvmedia_ide_parser.so
libnvmedia_ide_sci.so
libnvmedia_iep_sci.so
libnvmedia_ijpd_sci.so
libnvmedia_ijpe_sci.so
libnvmedia_iofa_sci.so
libnvmedia_isp_ext.so
libnvmedialdc.so
libnvmedia_sci_overlay.so
libnvmedia.so
libnvmedia_tensor.so

Can I copy these libs too and mount them while running the Docker containers? Will that make l4t-tensorrt usable on the NVIDIA DRIVE AGX Orin? Only TensorRT is an issue here: CUDA works (I am able to access the GPU inside the container) and cuDNN also works fine.

Dear @gautam.kumar.jain1,
You can try copying them and see if it works.

Regarding CUDA 12.x, as Vick clarified, it is not possible. Note that CUDA 12.x requires NVIDIA driver >= 525.60.13 (CUDA Compatibility :: NVIDIA Data Center GPU Driver Documentation), but DRIVE OS comes with 470.x.

@SivaRamaKrishnaNV

Yes, after copying all the libnvmedia_*.so files and mounting them while running the Docker container, I am able to access the GPU and compile the CUDA, cuDNN, and TensorRT code from inside the container. Thank you for your guidance.
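
For anyone hitting the same issue, the resulting workflow looks roughly like this (the exact set of libraries is release-dependent; these are the ones named above):

# on the host used for flashing: copy the missing BSP libs to the target
scp ~/nvidia/nvidia_sdk/JetPack_5.1.2_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/rootfs/usr/lib/aarch64-linux-gnu/tegra/libnvmedia*.so nvidia@tegra-ubuntu:/tmp/
# on the target: install them and mount them into the container
sudo mv /tmp/libnvmedia*.so /usr/lib/
docker run -it --gpus all --runtime nvidia \
  -v /usr/lib/libnvdla_compiler.so:/usr/lib/libnvdla_compiler.so \
  -v /usr/lib/libnvmedia.so:/usr/lib/libnvmedia.so \
  -v /usr/lib/libnvmedia_tensor.so:/usr/lib/libnvmedia_tensor.so \
  -v /usr/lib/libnvmedia_dla.so:/usr/lib/libnvmedia_dla.so \
  nvcr.io/nvidia/l4t-tensorrt:<tag_version> /bin/bash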
