Convert ONNX model using trtexec in DRIVE OS

Please provide the following info (tick the boxes after creating this topic):
Software Version
[+] DRIVE OS 6.0.8.1
DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
[+] Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-300)
DRIVE AGX Orin Developer Kit (940-63710-0010-200)
DRIVE AGX Orin Developer Kit (940-63710-0010-100)
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
[+] DRIVE AGX Orin Developer Kit (not sure of its number)
other

SDK Manager Version
1.9.3.10904
other

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
[+] native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

I am trying to convert an ONNX model using TensorRT, but I get the following error. Do you have any feedback on this?

/usr/src/tensorrt/bin/trtexec_debug --onnx=objectDetector_ALPS.onnx
&&&& RUNNING TensorRT.trtexec [TensorRT v8205] # /usr/src/tensorrt/bin/trtexec_debug --onnx=objectDetector_ALPS.onnx
[07/23/2024-10:29:12] [I] === Model Options ===
[07/23/2024-10:29:12] [I] Format: ONNX
[07/23/2024-10:29:12] [I] Model: objectDetector_ALPS.onnx
[07/23/2024-10:29:12] [I] Output:
[07/23/2024-10:29:12] [I] === Build Options ===
[07/23/2024-10:29:12] [I] Max batch: explicit batch
[07/23/2024-10:29:12] [I] Workspace: 16 MiB
[07/23/2024-10:29:12] [I] minTiming: 1
[07/23/2024-10:29:12] [I] avgTiming: 8
[07/23/2024-10:29:12] [I] Precision: FP32
[07/23/2024-10:29:12] [I] Calibration:
[07/23/2024-10:29:12] [I] Refit: Disabled
[07/23/2024-10:29:12] [I] Sparsity: Disabled
[07/23/2024-10:29:12] [I] Safe mode: Disabled
[07/23/2024-10:29:12] [I] DirectIO mode: Disabled
[07/23/2024-10:29:12] [I] Restricted mode: Disabled
[07/23/2024-10:29:12] [I] Save engine:
[07/23/2024-10:29:12] [I] Load engine:
[07/23/2024-10:29:12] [I] Profiling verbosity: 0
[07/23/2024-10:29:12] [I] Tactic sources: Using default tactic sources
[07/23/2024-10:29:12] [I] timingCacheMode: local
[07/23/2024-10:29:12] [I] timingCacheFile:
[07/23/2024-10:29:12] [I] Input(s)s format: fp32:CHW
[07/23/2024-10:29:12] [I] Output(s)s format: fp32:CHW
[07/23/2024-10:29:12] [I] Input build shapes: model
[07/23/2024-10:29:12] [I] Input calibration shapes: model
[07/23/2024-10:29:12] [I] === System Options ===
[07/23/2024-10:29:12] [I] Device: 0
[07/23/2024-10:29:12] [I] DLACore:
[07/23/2024-10:29:12] [I] Plugins:
[07/23/2024-10:29:12] [I] === Inference Options ===
[07/23/2024-10:29:12] [I] Batch: Explicit
[07/23/2024-10:29:12] [I] Input inference shapes: model
[07/23/2024-10:29:12] [I] Iterations: 10
[07/23/2024-10:29:12] [I] Duration: 3s (+ 200ms warm up)
[07/23/2024-10:29:12] [I] Sleep time: 0ms
[07/23/2024-10:29:12] [I] Idle time: 0ms
[07/23/2024-10:29:12] [I] Streams: 1
[07/23/2024-10:29:12] [I] ExposeDMA: Disabled
[07/23/2024-10:29:12] [I] Data transfers: Enabled
[07/23/2024-10:29:12] [I] Spin-wait: Disabled
[07/23/2024-10:29:12] [I] Multithreading: Disabled
[07/23/2024-10:29:12] [I] CUDA Graph: Disabled
[07/23/2024-10:29:12] [I] Separate profiling: Disabled
[07/23/2024-10:29:12] [I] Time Deserialize: Disabled
[07/23/2024-10:29:12] [I] Time Refit: Disabled
[07/23/2024-10:29:12] [I] Skip inference: Disabled
[07/23/2024-10:29:12] [I] Inputs:
[07/23/2024-10:29:12] [I] === Reporting Options ===
[07/23/2024-10:29:12] [I] Verbose: Disabled
[07/23/2024-10:29:12] [I] Averages: 10 inferences
[07/23/2024-10:29:12] [I] Percentile: 99
[07/23/2024-10:29:12] [I] Dump refittable layers:Disabled
[07/23/2024-10:29:12] [I] Dump output: Disabled
[07/23/2024-10:29:12] [I] Profile: Disabled
[07/23/2024-10:29:12] [I] Export timing to JSON file:
[07/23/2024-10:29:12] [I] Export output to JSON file:
[07/23/2024-10:29:12] [I] Export profile to JSON file:
[07/23/2024-10:29:12] [I]
[07/23/2024-10:29:12] [I] === Device Information ===
[07/23/2024-10:29:12] [I] Selected Device: Orin
[07/23/2024-10:29:12] [I] Compute Capability: 8.7
[07/23/2024-10:29:12] [I] SMs: 16
[07/23/2024-10:29:12] [I] Compute Clock Rate: 1.275 GHz
[07/23/2024-10:29:12] [I] Device Global Memory: 28902 MiB
[07/23/2024-10:29:12] [I] Shared Memory per SM: 164 KiB
[07/23/2024-10:29:12] [I] Memory Bus Width: 128 bits (ECC disabled)
[07/23/2024-10:29:12] [I] Memory Clock Rate: 1.275 GHz
[07/23/2024-10:29:12] [I]
[07/23/2024-10:29:12] [I] TensorRT version: 8.2.5
[07/23/2024-10:35:58] [I] [TRT] [MemUsageChange] Init CUDA: CPU +366, GPU +0, now: CPU 377, GPU 4944 (MiB)
[07/23/2024-10:35:59] [I] [TRT] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 377 MiB, GPU 4944 MiB
[07/23/2024-10:35:59] [I] [TRT] [MemUsageSnapshot] End constructing builder kernel library: CPU 448 MiB, GPU 4945 MiB
[07/23/2024-10:35:59] [I] Start parsing network model
[07/23/2024-10:35:59] [I] [TRT] ----------------------------------------------------------------
[07/23/2024-10:35:59] [I] [TRT] Input filename: objectDetector_ALPS.onnx
[07/23/2024-10:35:59] [I] [TRT] ONNX IR version: 0.0.6
[07/23/2024-10:35:59] [I] [TRT] Opset version: 11
[07/23/2024-10:35:59] [I] [TRT] Producer name: pytorch
[07/23/2024-10:35:59] [I] [TRT] Producer version: 1.13.1
[07/23/2024-10:35:59] [I] [TRT] Domain:
[07/23/2024-10:35:59] [I] [TRT] Model version: 0
[07/23/2024-10:35:59] [I] [TRT] Doc string:
[07/23/2024-10:35:59] [I] [TRT] ----------------------------------------------------------------
[07/23/2024-10:35:59] [W] [TRT] onnx2trt_utils.cpp:366: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[07/23/2024-10:35:59] [W] [TRT] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[07/23/2024-10:35:59] [W] [TRT] Tensor DataType is determined at build time for tensors not marked as input or output.
[07/23/2024-10:35:59] [I] Finish parsing network model
[07/23/2024-10:36:00] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +964, GPU +435, now: CPU 1571, GPU 5380 (MiB)
[07/23/2024-10:41:24] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +408, GPU +1154, now: CPU 1979, GPU 6534 (MiB)
[07/23/2024-10:41:24] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[07/23/2024-10:41:24] [E] Error[2]: [utils.cpp::checkMemLimit::380] Error Code 2: Internal Error (Assertion upperBound != 0 failed. Unknown embedded device detected. Please update the table with the entry: {{2055, 16, 32}, 23121},)
[07/23/2024-10:41:24] [E] Error[2]: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )
[07/23/2024-10:41:24] [E] Engine could not be created from network
[07/23/2024-10:41:24] [E] Building engine failed
[07/23/2024-10:41:24] [E] Failed to create engine from model.
[07/23/2024-10:41:24] [E] Engine set up failed

Dear @andreas.kloukiniotis,
Could you confirm the DRIVE OS version? Please check the output of cat /etc/nvidia/version-*.txt. DRIVE OS 6.0.8.1 ships with TensorRT 8.6.11, but the log indicates you are using TensorRT 8.2.

Hello,
thanks for the quick reply.

cat /etc/nvidia/version-*.txt
6.0.8.1-34171226

dpkg-query -W tensorrt
tensorrt 8.2.5.1-1+cuda11.4

I installed TensorRT 8.2 because it was the only version compatible with CUDA 11.4.
Should I try to install TensorRT 8.6?

Dear @andreas.kloukiniotis,
TensorRT already comes with DRIVE OS. Please use the TensorRT version that ships with it.
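For reference, a quick way to confirm which TensorRT version the preinstalled binary was built against is its startup banner (a sketch; the trtexec path is taken from your log above, and the libnvinfer package names assume the standard Debian packaging on the target):

# The first banner line of any trtexec run prints the linked TensorRT version,
# e.g. "&&&& RUNNING TensorRT.trtexec [TensorRT v8611] ..."
/usr/src/tensorrt/bin/trtexec --onnx=objectDetector_ALPS.onnx
# List the installed TensorRT runtime packages and their versions
dpkg-query -W 'libnvinfer*'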

The conversion indeed worked with the preinstalled TensorRT version.
However, the TensorRT that comes with DRIVE OS has no include folder, so I cannot find the header files.
Would downloading them from the link below and copying them over be okay?
https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/secure/8.6.1/tars/TensorRT-8.6.1.6.Ubuntu-20.04.aarch64-gnu.cuda-12.0.tar.gz
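If so, once copied I would sanity-check that the header version matches the runtime library, along these lines (a sketch; the include path is wherever that tarball gets extracted, and the library path assumes the standard aarch64 location):

# The copied headers declare their version in NvInferVersion.h
grep -m1 '#define NV_TENSORRT_MAJOR' TensorRT-8.6.1.6/include/NvInferVersion.h
grep -m1 '#define NV_TENSORRT_MINOR' TensorRT-8.6.1.6/include/NvInferVersion.h
# Compare with the runtime library the target actually loads (path assumed)
ls -l /usr/lib/aarch64-linux-gnu/libnvinfer.so*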

Dear @andreas.kloukiniotis,
Please check Tensorrt and cudnn on drive os 6.0.8 - #10 by SivaRamaKrishnaNV and let us know if you have any issues.

Dear @andreas.kloukiniotis,
Is the issue resolved? Could you provide an update?

Yes, the issue was resolved by following your instructions. Thanks!
