Running TensorRT inference in docker container on Drive Orin AGX

Please provide the following info (tick the boxes after creating this topic):
Software Version
DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-300)
DRIVE AGX Orin Developer Kit (940-63710-0010-200)
DRIVE AGX Orin Developer Kit (940-63710-0010-100)
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
DRIVE AGX Orin Developer Kit (not sure its number)
other

SDK Manager Version
1.9.2.10884
other

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

Hello,

I would like to run TensorRT inference inside a Docker container on a DRIVE Orin AGX. I have tried multiple approaches, but none of them worked, so I have a few questions:

  1. Where should I create the TensorRT model?
    If I create the TRT model on the host system, it is built with TensorRT 8.5.10.4, but I cannot install TensorRT 8.5.10.4 inside the Docker container because I cannot find that version anywhere. If I instead try to create the model inside a container with TensorRT 8.5.2.2, trtexec returns the following error (a short sketch of my exact commands follows this list):

“[TRT] 1: [cudlaUtils.cpp::operator()::32] Error Code 1: DLA (Failed to get number of available DLA devices)”.

  2. Which Docker base image can I use to run the inference?

  3. What is the recommended way to handle this use case?
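
For reference, this is roughly what I am doing. The model path and container image are placeholders, and the docker options are what I assumed from the L4T world rather than something confirmed for DRIVE OS:

# On the Orin itself (TensorRT 8.5.10.4) the engine builds fine:
trtexec --onnx=model.onnx --saveEngine=model.trt

# Inside the container (TensorRT 8.5.2.2) the same command fails with the DLA error quoted above:
sudo docker run -it --rm --runtime nvidia -v $(pwd):/work -w /work <container_image> \
  trtexec --onnx=model.onnx --saveEngine=model.trt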

Thanks,
David

Dear @david030,
Could you please confirm whether you are testing Docker on the host or on the target? Is your TRT model prepared to run on DLA?

Sorry, I am not sure what you mean by host and target. I created the TRT model on the DRIVE Orin and also tested Docker on the Orin. To create the model I used “trtexec --onnx=<model_path> --saveEngine=<engine_path>”.
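
In other words, the engine is built for the GPU only; I did not pass any DLA options. For comparison, a build prepared for DLA would look roughly like this (the DLA core index is just an example):

# GPU-only engine (what I actually built):
trtexec --onnx=<model_path> --saveEngine=<engine_path>

# Engine targeting a DLA core, which I did not do:
trtexec --onnx=<model_path> --saveEngine=<engine_path> --useDLACore=0 --allowGPUFallback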

Dear @david030,
Thank you for the clarification. Could you share the Docker container details?

I tried to use this one: dustynv/ros:noetic-pytorch-l4t-r35.2.1

Dear @david030,
Is the Docker container you are using based on L4T?

Yes, it’s based on L4T.

But it doesn’t have to be this Docker image. I would be happy with any description of how to run TensorRT inside a Docker container on a DRIVE Orin.

Dear @david030,
How about mounting the TRT samples into the Docker container, similar to how the CUDA samples are mounted in https://developer.nvidia.com/blog/running-docker-containers-directly-on-nvidia-drive-agx-orin/ ?
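
For illustration only: the bind-mount idea mirrors what the blog does for the CUDA samples, while the image name and docker options here are placeholders:

# Bind-mount the target's TensorRT samples (and the libraries they link against) into the container:
sudo docker run -it --rm \
  -v /usr/src/tensorrt:/usr/src/tensorrt \
  -v /usr/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu:ro \
  <ubuntu_base_image> /bin/bash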

Dear SivaRamaKrishnaNV,

I tried, but the TensorRT repository (https://github.com/NVIDIA/TensorRT/tree/release/8.5) recommends creating a TensorRT-OSS build container to compile the samples. I cannot create this TensorRT-OSS build container because it cannot find the TensorRT version (8.5.10.4) that is installed on the DRIVE Orin.

And if I create the TensorRT-OSS build container with a lower TensorRT version (8.3.0.5), I get the following error when I run the samples:

[W] [TRT] Unable to determine GPU memory usage
[06/15/2023-14:22:53] [W] [TRT] Unable to determine GPU memory usage
[06/15/2023-14:22:53] [I] [TRT] [MemUsageChange] Init CUDA: CPU +130, GPU +0, now: CPU 158, GPU 0 (MiB)
[06/15/2023-14:22:53] [W] [TRT] CUDA initialization failure with error: 999. Please check your CUDA installation: NVIDIA CUDA Installation Guide for Linux
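
If it helps with debugging, these are the kinds of checks I can run inside that container (just my guess at what is relevant for a CUDA error 999):

# Can the container see the GPU / Tegra device nodes at all?
ls -l /dev/nv*

# Which CUDA and TensorRT libraries are visible inside the container?
ldconfig -p | grep -E 'libcuda|libnvinfer'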

Dear @david030,
You may try cross-compiling the TRT samples installed by SDK Manager on the host (at /usr/src/tensorrt/).

Dear @SivaRamaKrishnaNV,

thank you for your answer. Can you give me some tips on how to cross-compile the TRT samples installed by SDK Manager, or point me to the right documentation? The TRT documentation in the repository only describes how to compile the samples from source, not from the installed version.

I tried with
sudo PROTOBUF_INSTALL_DIR=/usr/lib/aarch64-linux-gnu make TARGET=aarch64
but at some point it failed to pick up the right includes:

In file included from /usr/aarch64-linux-gnu/include/wchar.h:30,
                 from /usr/aarch64-linux-gnu/include/c++/9/cwchar:44,
                 from /usr/aarch64-linux-gnu/include/c++/9/bits/postypes.h:40,
                 from /usr/aarch64-linux-gnu/include/c++/9/bits/char_traits.h:40,
                 from /usr/aarch64-linux-gnu/include/c++/9/string:40,
                 from ../common/argsParser.h:20,
                 from sampleOnnxMNIST.cpp:27:
/usr/include/x86_64-linux-gnu/bits/floatn.h:87:9: error: ‘__float128’ does not name a type; did you mean ‘__cfloat128’?
   87 | typedef __float128 _Float128;
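
From the include chain it looks like host (x86_64) headers are leaking into the aarch64 build. A purely diagnostic way to inspect the cross compiler's default include search path would be:

# Print the include directories the cross compiler searches by default:
aarch64-linux-gnu-g++ -E -x c++ - -v </dev/null 2>&1 | sed -n '/search starts here/,/End of search list/p'
# If /usr/include/x86_64-linux-gnu is not in that list, it is probably being added
# by -I flags in the samples Makefile.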

Dear @david030,
Does this issue still need support?

Yes please.

@SivaRamaKrishnaNV is this issue resolved? If so, how?

Dear @david030,
I misread your later comments. If the requirement is to run the TRT samples in the Ubuntu Docker container (mentioned in the blog) on the Orin, you can compile the TRT samples directly on the target and mount them into the Ubuntu container, just like the CUDA samples are mounted.
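
Roughly like this; the sample name and docker options are placeholders, and the TensorRT/CUDA libraries also have to be visible inside the container, just as for the CUDA samples in the blog:

# Build the samples natively on the Orin; the binaries land in /usr/src/tensorrt/bin:
cd /usr/src/tensorrt/samples
sudo make -j"$(nproc)"

# Mount the built samples into the Ubuntu container and run one of them:
sudo docker run -it --rm \
  -v /usr/src/tensorrt:/usr/src/tensorrt \
  <ubuntu_image> /usr/src/tensorrt/bin/sample_onnx_mnist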

Dear @SivaRamaKrishnaNV,

thank you for your answer, but as far as I know, compiling the TRT samples requires the TRT header files, which are not available on the target.

In order to compile the TRT samples for your target, you will need to perform cross-compilation. You can find detailed instructions for cross-compilation in Build and Run Sample Applications for DRIVE OS 6.x Linux | NVIDIA Docs.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.