Install necessary CUDA and TensorRT components in the NGC container to build source code

Hey, we want to use ROS2 on a Jetson Xavier NX (JetPack 4.6) for our work. For several reasons, our work relies on ROS2 Galactic and Ubuntu Focal. We want to compile the source code with the colcon toolkit in the container and run the executable, which may rely on some JetPack components and CUDA-related header files (for example, NvInfer.h).

We have seen some of the work the NVIDIA community has done, but it does not meet our needs. We tried the following approaches and ran into problems:

  1. We tried using a community-built ROS container, but it did not contain the TensorRT and CUDA related files, so our source code could not be compiled inside the container.

  2. We tried to build our own container based on the l4t-base container, as its page on the container website describes:

Starting with the r34.1 release (JetPack 5.0 Developer Preview), the l4t-base will not bring CUDA, CuDNN and TensorRT from the host file system. … Users can apt install Jetson packages and other software of their choice to extend the l4t-base dockerfile (see above) while building application containers. All JetPack components are hosted in the Debian Package Management server here.

But I couldn't find any further documentation on installing these JetPack components from the Debian package server (the kind of Dockerfile we expected would work is sketched after this list).

  3. We tried building our own container based on the L4T TensorRT image, but since that image only provides the runtime, we couldn't build our source code in the container.

  4. We noticed similar problems reported elsewhere, and our attempts ran into similar errors. I also tried building containers based on DeepStream-l4t, but still couldn't build our project.
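For concreteness, the kind of Dockerfile we expected would work for approach 2 is sketched below. This is only a sketch: the apt suite (r34.1), the t194 repo (for Xavier), the key URL, and the nvidia-jetpack meta-package are our assumptions from the JetPack documentation and may need adjusting for the actual L4T release.

# sketch only; suite, SoC repo, and package name are assumptions
FROM nvcr.io/nvidia/l4t-base:r34.1
RUN echo "deb https://repo.download.nvidia.com/jetson/common r34.1 main" \
      >  /etc/apt/sources.list.d/nvidia-l4t-apt-source.list && \
    echo "deb https://repo.download.nvidia.com/jetson/t194 r34.1 main" \
      >> /etc/apt/sources.list.d/nvidia-l4t-apt-source.list && \
    apt-key adv --fetch-keys https://repo.download.nvidia.com/jetson/jetson-ota-public.asc && \
    apt-get update && \
    apt-get install -y nvidia-jetpack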

Can you help me?

Hi @ligcoxx, those containers should in fact contain CUDA and TensorRT. On JetPack 4.x, these are mounted in dynamically from the device, and you should set your default docker runtime to ‘nvidia’ if you need those files during docker build operations. On JetPack 5.x, those ROS containers are based on the jetpack container, which includes the CUDA and TensorRT packages pre-installed. There are also PyTorch variants of the ROS containers that come with PyTorch pre-installed.
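For reference, making ‘nvidia’ the default runtime is done in /etc/docker/daemon.json on the device; a minimal sketch (assuming nvidia-container-runtime is already installed, as it is on a standard JetPack flash) looks like this:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}

Then restart the daemon with sudo systemctl restart docker before re-running docker build.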

For example, when I check for TensorRT headers inside the ros:galactic-ros-base-l4t-r34.1.1 container, they are there:

ls -ll /usr/include/aarch64-linux-gnu/NvInfer*
-rw-r--r-- 1 root root   4544 Apr 30 22:43 /usr/include/aarch64-linux-gnu/NvInferConsistency.h
-rw-r--r-- 1 root root   1177 Apr 30 22:43 /usr/include/aarch64-linux-gnu/NvInferConsistencyImpl.h
-rw-r--r-- 1 root root 301919 Apr 30 22:43 /usr/include/aarch64-linux-gnu/NvInfer.h
-rw-r--r-- 1 root root  44542 Apr 30 22:43 /usr/include/aarch64-linux-gnu/NvInferImpl.h
-rw-r--r-- 1 root root   3573 Apr 30 22:43 /usr/include/aarch64-linux-gnu/NvInferLegacyDims.h
-rw-r--r-- 1 root root   9402 Apr 30 22:43 /usr/include/aarch64-linux-gnu/NvInferPlugin.h
-rw-r--r-- 1 root root   9894 Apr 30 22:43 /usr/include/aarch64-linux-gnu/NvInferPluginUtils.h
-rw-r--r-- 1 root root  76216 Apr 30 22:43 /usr/include/aarch64-linux-gnu/NvInferRuntimeCommon.h
-rw-r--r-- 1 root root 106364 Apr 30 22:43 /usr/include/aarch64-linux-gnu/NvInferRuntime.h
-rw-r--r-- 1 root root  22182 Apr 30 22:43 /usr/include/aarch64-linux-gnu/NvInferSafeRuntime.h
-rw-r--r-- 1 root root   1219 Apr 30 22:43 /usr/include/aarch64-linux-gnu/NvInferVersion.h

With your help, I went through my source code carefully. I found that my source code uses part of the TensorRT sample common files, which led me to mistakenly think that the container did not contain the CUDA and TensorRT packages. I did the following:

# on the device, start the container with my workspace mounted:
docker run -it -v MY_WORKSPACE:/home dustynv/ros:galactic-ros-base-l4t-r34.1.1 /bin/bash
# from another terminal on the device, copy the TensorRT sample common files into the container:
docker cp /usr/src/tensorrt/samples/common MY_CONTAINER_ID:/usr/src/tensorrt/samples/

When I compiled with the colcon toolkit, several outputs like the following were printed:

Finished <<< my_package[7.31s]                                                                                                                                                                       
--- stderr: my_package
In file included from /usr/src/tensorrt/samples/common/logging.h:20,
                 from /usr/src/tensorrt/samples/common/logger.h:20,
                 from /usr/src/tensorrt/samples/common/logger.cpp:17:
/usr/include/aarch64-linux-gnu/NvInferRuntimeCommon.h:19:10: fatal error: cuda_runtime_api.h: No such file or directory
   19 | #include <cuda_runtime_api.h>

I checked the files reported in these errors; they are under /usr/local/cuda/targets/aarch64-linux/include/:

ls -ll /usr/local/cuda/targets/aarch64-linux/include/ | grep cuda_runtime
-rw-r--r--  1 root root  540860 Nov 15  2021 cuda_runtime_api.h
-rw-r--r--  1 root root  106069 Nov 15  2021 cuda_runtime.h
-rw-r--r--  1 root root   64002 Nov 15  2021 generated_cuda_runtime_api_meta.h

After setting up soft links, I solved the problem:

ln -s /usr/local/cuda/targets/aarch64-linux/include/* /usr/include/

Maybe setting up soft links is the right way?

I’m not sure what additional environment configuration colcon needs to build CUDA code, but typically with CMake you would use cuda_add_executable() and cuda_add_library() to automatically add the -I/usr/local/cuda/include build flag for gcc/g++. But in this case perhaps you need a CMake include_directories() or target_include_directories() call (or create the softlinks, as you have done).
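For example, a minimal CMakeLists.txt fragment along those lines might look like the following (the my_node target and source paths are hypothetical; adjust to your package):

find_package(CUDA REQUIRED)               # classic FindCUDA module; defines cuda_add_executable()/cuda_add_library()
include_directories(${CUDA_INCLUDE_DIRS}  # -I/usr/local/cuda/include, where cuda_runtime_api.h lives
                    /usr/src/tensorrt/samples/common)
cuda_add_executable(my_node
                    src/my_node.cpp
                    /usr/src/tensorrt/samples/common/logger.cpp)
target_link_libraries(my_node nvinfer ${CUDA_LIBRARIES})

Here the include_directories() line makes cuda_runtime_api.h visible to the host-compiled sample sources, which is the include path the softlinks were standing in for.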

Hi @dusty_nv, using cuda_add_executable() and cuda_add_library() solved this problem.

Thanks for your time!

No problem at all, glad you got it working!
