Error executing TensorRT samples in a Docker container environment

Please provide the following info (tick the boxes after creating this topic):
Software Version
DRIVE OS 6.0.10.0
DRIVE OS 6.0.8.1
DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-300)
DRIVE AGX Orin Developer Kit (940-63710-0010-200)
DRIVE AGX Orin Developer Kit (940-63710-0010-100)
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
DRIVE AGX Orin Developer Kit (not sure its number)
other

SDK Manager Version
2.1.0
other

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

Issue Description

I want a TensorRT Docker container environment on DRIVE Orin to run our DNN models. For that I used the following TensorRT container available on the NGC website: nvcr.io/nvidia/tensorrt:21.12-py3

The reason for using this specific version is that, starting from this release, the container supports the Linux arm64 architecture.
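For completeness, the image was pulled and started with the standard NGC commands (a sketch; the run flags shown here are the ones used later in this thread):

docker pull nvcr.io/nvidia/tensorrt:21.12-py3
# --runtime nvidia and --gpus all expose the Orin iGPU to the container via the NVIDIA container runtime
sudo docker run -it --rm --network host --runtime nvidia --gpus all nvcr.io/nvidia/tensorrt:21.12-py3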

The container is pulled and launched successfully. But when I try to run its sample C++ programs to verify the container environment, every sample fails with the error below.

The error message points to a CUDA installation problem, but according to the container image description, CUDA is already part of the container environment.

PS: I have run the same container samples in the x86 host environment for DRIVE Orin, and they execute successfully without any error.

Requesting your help in resolving this error.

Error String
[10/09/2024-11:47:11] [E] [TRT] 6: [cudaDeviceProfile.cpp::isCudaInstalledCorrectly::119] Error Code 6: Internal Error (CUDA initialization failure with error 999. Please check your CUDA installation: CUDA Installation Guide for Linux)

&&&& FAILED TensorRT.sample_onnx_mnist [TensorRT v8201] # ./sample_onnx_mnist

Logs

root@1b9425014e73:/workspace/tensorrt/bin# ./sample_onnx_mnist

&&&& RUNNING TensorRT.sample_onnx_mnist [TensorRT v8201] # ./sample_onnx_mnist

[10/09/2024-11:47:11] [I] Building and running a GPU inference engine for Onnx MNIST

[10/09/2024-11:47:11] [W] [TRT] Unable to determine GPU memory usage

[10/09/2024-11:47:11] [W] [TRT] Unable to determine GPU memory usage

[10/09/2024-11:47:11] [I] [TRT] [MemUsageChange] Init CUDA: CPU +7, GPU +0, now: CPU 17, GPU 0 (MiB)

[10/09/2024-11:47:11] [E] [TRT] 6: [cudaDeviceProfile.cpp::isCudaInstalledCorrectly::119] Error Code 6: Internal Error (CUDA initialization failure with error 999. Please check your CUDA installation: CUDA Installation Guide for Linux)

&&&& FAILED TensorRT.sample_onnx_mnist [TensorRT v8201] # ./sample_onnx_mnist

Dear @saurabh.sinalkar,
Are there any issues running the TensorRT samples directly on the target? Note that the Docker environment on Orin is intended for experimentation only.
Also, I notice a WARNING that the Orin GPU is not supported by this container:

nvidia@tegra-ubuntu:~$ sudo docker run -it --rm --privileged --network host --runtime nvidia --gpus all -v $(pwd):$(pwd) -w $(pwd) nvcr.io/nvidia/tensorrt:21.12-py3

=====================
== NVIDIA TensorRT ==
=====================

NVIDIA Release 21.12 (build 29870938)
NVIDIA TensorRT Version 8.2.1
Copyright (c) 2016-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Container image Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

https://developer.nvidia.com/tensorrt

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

To install Python sample dependencies, run /opt/tensorrt/python/python_setup.sh

To install the open-source samples corresponding to this TensorRT release version
run /opt/tensorrt/install_opensource.sh.  To build the open source parsers,
plugins, and samples for current top-of-tree on master or a different branch,
run /opt/tensorrt/install_opensource.sh -b <branch>
See https://github.com/NVIDIA/TensorRT for more information.
WARNING: Detected NVIDIA Orin GPU, which is not yet supported in this version of the container
ERROR: No supported GPU(s) detected to run this container
/opt/nvidia/entrypoint.d/52-gpu-driver-version-check.sh: line 14: nvidia-smi: command not found

Failed to detect NVIDIA driver version.

Dear @SivaRamaKrishnaNV,

There are no issues running it directly on the target after cross-compiling it on the host.
As mentioned earlier, our requirement is to have a TensorRT Docker container environment to run custom DNNs.
Which TensorRT container version supports the DRIVE Orin GPU?
As per the website, the container supports both the x86 and arm64 architectures.

We have not tested TensorRT containers on the target. Make sure the TensorRT and CUDA versions match the DRIVE OS release.
How about launching an Ubuntu 20.04 Docker container and mounting the TensorRT sample folder into it to test running inside Docker? Does that fit your requirement?

Dear @SivaRamaKrishnaNV,

For DRIVE OS 6.0.6, the TensorRT version is 8.5.10 and the CUDA version is 11.4. Searching the TensorRT NGC container page, there is no container version matching this configuration.
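For reference, the versions shipped on the target can be checked with something like the following (assuming the Debian packaging DRIVE OS uses; exact package names may vary per release):

# on the Orin target
dpkg -l | grep -i nvinfer    # lists installed TensorRT packages and their versions
nvcc --version               # reports the CUDA toolkit version, if nvcc is installed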

The base-container approach can fit our requirement.
Could you please guide us on starting with Ubuntu 20.04 as the base image: which dependencies need to be installed for TensorRT so that its samples run successfully within the container?

Dear @saurabh.sinalkar,
The following was tested on DRIVE OS 6.0.10.

nvidia@tegra-ubuntu:/usr/src/tensorrt$ sudo docker run -it --rm --privileged --network host --runtime nvidia --gpus all -v /usr/src/tensorrt/:/usr/src/tensorrt -v /home/nvidia/:/home/nvidia ubuntu:20.04
root@tegra-ubuntu:/# cd /usr/src/tensorrt/
root@tegra-ubuntu:/usr/src/tensorrt# cd bin/
root@tegra-ubuntu:/usr/src/tensorrt/bin# ./trtexec --onnx=/home/nvidia/mnist.onnx
&&&& RUNNING TensorRT.trtexec [TensorRT v8613] # ./trtexec --onnx=/home/nvidia/mnist.onnx
[10/16/2024-12:20:39] [I] === Model Options ===
[10/16/2024-12:20:39] [I] Format: ONNX
[10/16/2024-12:20:39] [I] Model: /home/nvidia/mnist.onnx
[10/16/2024-12:20:39] [I] Output:
[10/16/2024-12:20:39] [I] === Build Options ===
[10/16/2024-12:20:39] [I] Max batch: explicit batch
[10/16/2024-12:20:39] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[10/16/2024-12:20:39] [I] minTiming: 1
[10/16/2024-12:20:39] [I] avgTiming: 8
[10/16/2024-12:20:39] [I] Precision: FP32
[10/16/2024-12:20:39] [I] LayerPrecisions:
[10/16/2024-12:20:39] [I] Layer Device Types:
[10/16/2024-12:20:39] [I] Calibration:
[10/16/2024-12:20:39] [I] Refit: Disabled
[10/16/2024-12:20:39] [I] Version Compatible: Disabled
[10/16/2024-12:20:39] [I] TensorRT runtime: full
[10/16/2024-12:20:39] [I] Lean DLL Path:
[10/16/2024-12:20:39] [I] Tempfile Controls: { in_memory: allow, temporary: allow }
[10/16/2024-12:20:39] [I] Exclude Lean Runtime: Disabled
[10/16/2024-12:20:39] [I] Sparsity: Disabled
[10/16/2024-12:20:39] [I] Safe mode: Disabled
[10/16/2024-12:20:39] [I] Build DLA standalone loadable: Disabled
[10/16/2024-12:20:39] [I] Allow GPU fallback for DLA: Disabled
[10/16/2024-12:20:39] [I] DirectIO mode: Disabled
[10/16/2024-12:20:39] [I] Restricted mode: Disabled
[10/16/2024-12:20:39] [I] Skip inference: Disabled
[10/16/2024-12:20:39] [I] Save engine:
[10/16/2024-12:20:39] [I] Load engine:
[10/16/2024-12:20:39] [I] Profiling verbosity: 0
[10/16/2024-12:20:39] [I] Tactic sources: Using default tactic sources
[10/16/2024-12:20:39] [I] timingCacheMode: local
[10/16/2024-12:20:39] [I] timingCacheFile:
[10/16/2024-12:20:39] [I] Heuristic: Disabled
[10/16/2024-12:20:39] [I] Preview Features: Use default preview flags.
[10/16/2024-12:20:39] [I] MaxAuxStreams: -1
[10/16/2024-12:20:39] [I] BuilderOptimizationLevel: -1
[10/16/2024-12:20:39] [I] Calibration Profile Index: 0
[10/16/2024-12:20:39] [I] Input(s)s format: fp32:CHW
[10/16/2024-12:20:39] [I] Output(s)s format: fp32:CHW
[10/16/2024-12:20:39] [I] Input build shapes: model
[10/16/2024-12:20:39] [I] Input calibration shapes: model
[10/16/2024-12:20:39] [I] === System Options ===
[10/16/2024-12:20:39] [I] Device: 0
[10/16/2024-12:20:39] [I] DLACore:
[10/16/2024-12:20:39] [I] Plugins:
[10/16/2024-12:20:39] [I] setPluginsToSerialize:
[10/16/2024-12:20:39] [I] dynamicPlugins:
[10/16/2024-12:20:39] [I] ignoreParsedPluginLibs: 0
[10/16/2024-12:20:39] [I]
[10/16/2024-12:20:39] [I] === Inference Options ===
[10/16/2024-12:20:39] [I] Batch: Explicit
[10/16/2024-12:20:39] [I] Input inference shapes: model
[10/16/2024-12:20:39] [I] Iterations: 10
[10/16/2024-12:20:39] [I] Duration: 3s (+ 200ms warm up)
[10/16/2024-12:20:39] [I] Sleep time: 0ms
[10/16/2024-12:20:39] [I] Idle time: 0ms
[10/16/2024-12:20:39] [I] Inference Streams: 1
[10/16/2024-12:20:39] [I] ExposeDMA: Disabled
[10/16/2024-12:20:39] [I] Data transfers: Enabled
[10/16/2024-12:20:39] [I] Spin-wait: Disabled
[10/16/2024-12:20:39] [I] Multithreading: Disabled
[10/16/2024-12:20:39] [I] CUDA Graph: Disabled
[10/16/2024-12:20:39] [I] Separate profiling: Disabled
[10/16/2024-12:20:39] [I] Time Deserialize: Disabled
[10/16/2024-12:20:39] [I] Time Refit: Disabled
[10/16/2024-12:20:39] [I] NVTX verbosity: 0
[10/16/2024-12:20:39] [I] Persistent Cache Ratio: 0
[10/16/2024-12:20:39] [I] Optimization Profile Index: 0
[10/16/2024-12:20:39] [I] Inputs:
[10/16/2024-12:20:39] [I] === Reporting Options ===
[10/16/2024-12:20:39] [I] Verbose: Disabled
[10/16/2024-12:20:39] [I] Averages: 10 inferences
[10/16/2024-12:20:39] [I] Percentiles: 90,95,99
[10/16/2024-12:20:39] [I] Dump refittable layers:Disabled
[10/16/2024-12:20:39] [I] Dump output: Disabled
[10/16/2024-12:20:39] [I] Profile: Disabled
[10/16/2024-12:20:39] [I] Export timing to JSON file:
[10/16/2024-12:20:39] [I] Export output to JSON file:
[10/16/2024-12:20:39] [I] Export profile to JSON file:
[10/16/2024-12:20:39] [I]
[10/16/2024-12:20:39] [I] === Device Information ===
[10/16/2024-12:20:39] [I] Selected Device: Orin
[10/16/2024-12:20:39] [I] Compute Capability: 8.7
[10/16/2024-12:20:39] [I] SMs: 16
[10/16/2024-12:20:39] [I] Device Global Memory: 28953 MiB
[10/16/2024-12:20:39] [I] Shared Memory per SM: 164 KiB
[10/16/2024-12:20:39] [I] Memory Bus Width: 256 bits (ECC disabled)
[10/16/2024-12:20:39] [I] Application Compute Clock Rate: 1.275 GHz
[10/16/2024-12:20:39] [I] Application Memory Clock Rate: 1.275 GHz
[10/16/2024-12:20:39] [I]
[10/16/2024-12:20:39] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[10/16/2024-12:20:39] [I]
[10/16/2024-12:20:39] [I] TensorRT version: 8.6.13
[10/16/2024-12:20:39] [I] Loading standard plugins
[10/16/2024-12:20:40] [I] [TRT] [MemUsageChange] Init CUDA: CPU +418, GPU +0, now: CPU 438, GPU 2521 (MiB)
[10/16/2024-12:20:42] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +833, GPU +990, now: CPU 1303, GPU 3544 (MiB)
[10/16/2024-12:20:42] [I] Start parsing network model.
[10/16/2024-12:20:42] [I] [TRT] ----------------------------------------------------------------
[10/16/2024-12:20:42] [I] [TRT] Input filename:   /home/nvidia/mnist.onnx
[10/16/2024-12:20:42] [I] [TRT] ONNX IR version:  0.0.3
[10/16/2024-12:20:42] [I] [TRT] Opset version:    8
[10/16/2024-12:20:42] [I] [TRT] Producer name:    CNTK
[10/16/2024-12:20:42] [I] [TRT] Producer version: 2.5.1
[10/16/2024-12:20:42] [I] [TRT] Domain:           ai.cntk
[10/16/2024-12:20:42] [I] [TRT] Model version:    1
[10/16/2024-12:20:42] [I] [TRT] Doc string:
[10/16/2024-12:20:42] [I] [TRT] ----------------------------------------------------------------
[10/16/2024-12:20:42] [I] Finished parsing network model. Parse time: 0.0187303
[10/16/2024-12:20:42] [I] [TRT] Graph optimization time: 0.00524188 seconds.
[10/16/2024-12:20:42] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[10/16/2024-12:20:45] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[10/16/2024-12:20:45] [I] [TRT] Total Host Persistent Memory: 24224
[10/16/2024-12:20:45] [I] [TRT] Total Device Persistent Memory: 0
[10/16/2024-12:20:45] [I] [TRT] Total Scratch Memory: 0
[10/16/2024-12:20:45] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 4 MiB
[10/16/2024-12:20:45] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 6 steps to complete.
[10/16/2024-12:20:45] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 0.032384ms to assign 3 blocks to 6 nodes requiring 32256 bytes.
[10/16/2024-12:20:45] [I] [TRT] Total Activation Memory: 31744
[10/16/2024-12:20:45] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +0, GPU +4, now: CPU 0, GPU 4 (MiB)
[10/16/2024-12:20:45] [I] Engine built in 5.74856 sec.
[10/16/2024-12:20:45] [I] [TRT] Loaded engine size: 0 MiB
[10/16/2024-12:20:45] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +0, now: CPU 0, GPU 0 (MiB)
[10/16/2024-12:20:45] [I] Engine deserialized in 0.00805533 sec.
[10/16/2024-12:20:45] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +0, now: CPU 0, GPU 0 (MiB)
[10/16/2024-12:20:45] [I] Setting persistentCacheLimit to 0 bytes.
[10/16/2024-12:20:45] [I] Using random values for input Input3
[10/16/2024-12:20:45] [I] Input binding for Input3 with dimensions 1x1x28x28 is created.
[10/16/2024-12:20:45] [I] Output binding for Plus214_Output_0 with dimensions 1x10 is created.
[10/16/2024-12:20:45] [I] Starting inference
[10/16/2024-12:20:48] [I] Warmup completed 2844 queries over 200 ms
[10/16/2024-12:20:48] [I] Timing trace has 43831 queries over 3.00014 s
[10/16/2024-12:20:48] [I]
[10/16/2024-12:20:48] [I] === Trace details ===
[10/16/2024-12:20:48] [I] Trace averages of 10 runs:
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0366837 ms - Host latency: 0.0504517 ms (enqueue 0.0321884 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.037558 ms - Host latency: 0.0517944 ms (enqueue 0.0330215 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0370209 ms - Host latency: 0.0498688 ms (enqueue 0.0324585 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0374924 ms - Host latency: 0.050473 ms (enqueue 0.0329025 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0377975 ms - Host latency: 0.0517593 ms (enqueue 0.0332657 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0369522 ms - Host latency: 0.050325 ms (enqueue 0.0323166 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0369568 ms - Host latency: 0.0499893 ms (enqueue 0.0321518 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0376144 ms - Host latency: 0.0519073 ms (enqueue 0.032869 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0372635 ms - Host latency: 0.0511383 ms (enqueue 0.0327148 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0369919 ms - Host latency: 0.0505646 ms (enqueue 0.0323563 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0395462 ms - Host latency: 0.052916 ms (enqueue 0.0344376 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0368561 ms - Host latency: 0.0505753 ms (enqueue 0.0322845 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0369522 ms - Host latency: 0.050827 ms (enqueue 0.032225 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0366974 ms - Host latency: 0.0497192 ms (enqueue 0.0318558 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0367722 ms - Host latency: 0.0499268 ms (enqueue 0.0317825 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.037323 ms - Host latency: 0.0508362 ms (enqueue 0.032756 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0382553 ms - Host latency: 0.052533 ms (enqueue 0.0336761 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0367325 ms - Host latency: 0.0498199 ms (enqueue 0.0321274 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0369675 ms - Host latency: 0.0501755 ms (enqueue 0.0321136 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0369537 ms - Host latency: 0.050502 ms (enqueue 0.0320572 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0372925 ms - Host latency: 0.0503967 ms (enqueue 0.0328186 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0401718 ms - Host latency: 0.0536148 ms (enqueue 0.0356628 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0374878 ms - Host latency: 0.0510742 ms (enqueue 0.0327362 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.037294 ms - Host latency: 0.050853 ms (enqueue 0.0325378 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0365662 ms - Host latency: 0.0496338 ms (enqueue 0.0321136 ms)
[10/16/2024-12:20:48] [I] Average on 10 runs - GPU latency: 0.0373306 ms - Host latency: 0.0503799 ms (enqueue 0.0329514 ms)
[10/16/2024-12:20:48] [I] === Performance summary ===
[10/16/2024-12:20:48] [I] Throughput: 14609.6 qps
[10/16/2024-12:20:48] [I] Latency: min = 0.0466309 ms, max = 0.145081 ms, mean = 0.0502668 ms, median = 0.0498047 ms, percentile(90%) = 0.0512695 ms, percentile(95%) = 0.0522461 ms, percentile(99%) = 0.0639648 ms
[10/16/2024-12:20:48] [I] Enqueue Time: min = 0.0292969 ms, max = 0.124634 ms, mean = 0.0324217 ms, median = 0.0322266 ms, percentile(90%) = 0.0334473 ms, percentile(95%) = 0.0341797 ms, percentile(99%) = 0.0415039 ms
[10/16/2024-12:20:48] [I] H2D Latency: min = 0.00512695 ms, max = 0.0356445 ms, mean = 0.00625038 ms, median = 0.00616455 ms, percentile(90%) = 0.00646973 ms, percentile(95%) = 0.0065918 ms, percentile(99%) = 0.00708008 ms
[10/16/2024-12:20:48] [I] GPU Compute Time: min = 0.0338135 ms, max = 0.130402 ms, mean = 0.037107 ms, median = 0.0368042 ms, percentile(90%) = 0.0380859 ms, percentile(95%) = 0.0387268 ms, percentile(99%) = 0.0472412 ms
[10/16/2024-12:20:48] [I] D2H Latency: min = 0.00585938 ms, max = 0.0387573 ms, mean = 0.00690968 ms, median = 0.00683594 ms, percentile(90%) = 0.00714111 ms, percentile(95%) = 0.00732422 ms, percentile(99%) = 0.00793457 ms
[10/16/2024-12:20:48] [I] Total Host Walltime: 3.00014 s
[10/16/2024-12:20:48] [I] Total GPU Compute Time: 1.62644 s
[10/16/2024-12:20:48] [W] * Throughput may be bound by Enqueue Time rather than GPU Compute and the GPU may be under-utilized.
[10/16/2024-12:20:48] [W]   If not already in use, --useCudaGraph (utilize CUDA graphs where possible) may increase the throughput.
[10/16/2024-12:20:48] [W] * GPU compute time is unstable, with coefficient of variance = 5.58083%.
[10/16/2024-12:20:48] [W]   If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
[10/16/2024-12:20:48] [I] Explanations of the performance metrics are printed in the verbose logs.
[10/16/2024-12:20:48] [I]
&&&& PASSED TensorRT.trtexec [TensorRT v8613] # ./trtexec --onnx=/home/nvidia/mnist.onnx

I tried the same on DRIVE OS 6.0.6 and am getting the error below:

adas_coe@tegra-ubuntu:/usr/src/tensorrt$ sudo docker run -it --rm --privileged --network host --runtime nvidia --gpus all -v /usr/src/tensorrt/:/usr/src/tensorrt -v /home/adas_coe/Container-share:/home/adas_coe/Container-share ubuntu:20.04
[sudo] password for adas_coe:
root@tegra-ubuntu:/# cd /usr/src/tensorrt/
root@tegra-ubuntu:/usr/src/tensorrt# cd bin
root@tegra-ubuntu:/usr/src/tensorrt/bin# ./trtexec -onnx=/home/adas_coe/Container-share/mnist-12.onnx
./trtexec: error while loading shared libraries: libnvinfer.so.8: cannot open shared object file: No such file or directory
root@tegra-ubuntu:/usr/src/tensorrt/bin#

Could you let me know what the issue is?

Could you check updating LD_LIBRARY_PATH to include the path to libnvinfer.so.8 and see if that fixes it?
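A minimal sketch of that check, assuming the libraries are visible somewhere inside the container (the path below is a placeholder for wherever libnvinfer.so.8 actually resides):

# inside the container; replace /path/to/trt/libs with the directory containing libnvinfer.so.8
export LD_LIBRARY_PATH=/path/to/trt/libs:$LD_LIBRARY_PATH
./trtexec --onnx=/home/adas_coe/Container-share/mnist-12.onnx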

libnvinfer.so.8 is not present in the /usr/lib/aarch64-linux-gnu directory inside the container, so there is nothing to add to LD_LIBRARY_PATH.

So I am unable to fix this.

Could you find it using the locate command in the Docker container or on the target, and make sure the containing folder is mounted into Docker using the -v parameter of the docker run command?
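For example (a sketch; updatedb may need to be run first for locate to see recent files, and the mount path is whatever directory locate reports):

# on the target, outside docker
sudo updatedb && locate libnvinfer.so.8
# then expose that directory to the container, e.g.:
sudo docker run -it --rm --privileged --network host --runtime nvidia --gpus all -v /usr/src/tensorrt/:/usr/src/tensorrt -v <dir-with-libnvinfer>:<dir-with-libnvinfer> ubuntu:20.04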

Is your development tied to DRIVE OS 6.0.6?

I couldn't find the library using the locate command either.

Currently, development is tied to DRIVE OS 6.0.6.

I can find them in the SDK Docker container (see below). Is trtexec running on the target without Docker? Please double-check /usr/lib/aarch64-linux-gnu on the target. If the libraries are not there, you can copy them from the Docker container to the target and test.

root@6.0.6.0-0004-build-linux-sdk:/# locate libnvinfer.so.8
/drive/drive-linux/filesystem/targetfs/usr/lib/aarch64-linux-gnu/libnvinfer.so.8
/drive/drive-linux/filesystem/targetfs/usr/lib/aarch64-linux-gnu/libnvinfer.so.8.5.10

Yes, trtexec is running on the target without Docker.
The lib files are present on the target at /usr/lib/aarch64-linux-gnu.

Are you suggesting copying the libs from the target into the Docker environment?

Copying the libs from the target into the Docker container worked.
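For reference, an equivalent approach that avoids copying is to bind-mount the target's library directory into the container at a separate path (so it does not shadow the container's own libraries) and extend LD_LIBRARY_PATH. A sketch, assuming the target path found above:

sudo docker run -it --rm --privileged --network host --runtime nvidia --gpus all -v /usr/src/tensorrt/:/usr/src/tensorrt -v /usr/lib/aarch64-linux-gnu:/target-libs:ro ubuntu:20.04
# inside the container:
export LD_LIBRARY_PATH=/target-libs:$LD_LIBRARY_PATH
/usr/src/tensorrt/bin/trtexec --onnx=<model.onnx>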

Could you confirm whether the following method explained in the developer guide works in our case:

Yes. It helps identify dependencies and update the config files accordingly.
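For example, running ldd against the binary inside the container lists exactly which shared objects are still unresolved:

# inside the container
ldd /usr/src/tensorrt/bin/trtexec | grep "not found"    # prints each missing shared library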
