Convert ONNX model using trtexec in DRIVE OS

Please provide the following info (tick the boxes after creating this topic):
Software Version
[+] DRIVE OS 6.0.8.1
DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
[+] Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-300)
DRIVE AGX Orin Developer Kit (940-63710-0010-200)
DRIVE AGX Orin Developer Kit (940-63710-0010-100)
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
[+] DRIVE AGX Orin Developer Kit (not sure of its number)
other

SDK Manager Version
1.9.3.10904
other

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
[+] native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

I am trying to convert an ONNX model using TensorRT, but I get the following error. Do you have any feedback on this?

/usr/src/tensorrt/bin/trtexec_debug --onnx=objectDetector_ALPS.onnx
&&&& RUNNING TensorRT.trtexec [TensorRT v8205] # /usr/src/tensorrt/bin/trtexec_debug --onnx=objectDetector_ALPS.onnx
[07/23/2024-10:29:12] [I] === Model Options ===
[07/23/2024-10:29:12] [I] Format: ONNX
[07/23/2024-10:29:12] [I] Model: objectDetector_ALPS.onnx
[07/23/2024-10:29:12] [I] Output:
[07/23/2024-10:29:12] [I] === Build Options ===
[07/23/2024-10:29:12] [I] Max batch: explicit batch
[07/23/2024-10:29:12] [I] Workspace: 16 MiB
[07/23/2024-10:29:12] [I] minTiming: 1
[07/23/2024-10:29:12] [I] avgTiming: 8
[07/23/2024-10:29:12] [I] Precision: FP32
[07/23/2024-10:29:12] [I] Calibration:
[07/23/2024-10:29:12] [I] Refit: Disabled
[07/23/2024-10:29:12] [I] Sparsity: Disabled
[07/23/2024-10:29:12] [I] Safe mode: Disabled
[07/23/2024-10:29:12] [I] DirectIO mode: Disabled
[07/23/2024-10:29:12] [I] Restricted mode: Disabled
[07/23/2024-10:29:12] [I] Save engine:
[07/23/2024-10:29:12] [I] Load engine:
[07/23/2024-10:29:12] [I] Profiling verbosity: 0
[07/23/2024-10:29:12] [I] Tactic sources: Using default tactic sources
[07/23/2024-10:29:12] [I] timingCacheMode: local
[07/23/2024-10:29:12] [I] timingCacheFile:
[07/23/2024-10:29:12] [I] Input(s)s format: fp32:CHW
[07/23/2024-10:29:12] [I] Output(s)s format: fp32:CHW
[07/23/2024-10:29:12] [I] Input build shapes: model
[07/23/2024-10:29:12] [I] Input calibration shapes: model
[07/23/2024-10:29:12] [I] === System Options ===
[07/23/2024-10:29:12] [I] Device: 0
[07/23/2024-10:29:12] [I] DLACore:
[07/23/2024-10:29:12] [I] Plugins:
[07/23/2024-10:29:12] [I] === Inference Options ===
[07/23/2024-10:29:12] [I] Batch: Explicit
[07/23/2024-10:29:12] [I] Input inference shapes: model
[07/23/2024-10:29:12] [I] Iterations: 10
[07/23/2024-10:29:12] [I] Duration: 3s (+ 200ms warm up)
[07/23/2024-10:29:12] [I] Sleep time: 0ms
[07/23/2024-10:29:12] [I] Idle time: 0ms
[07/23/2024-10:29:12] [I] Streams: 1
[07/23/2024-10:29:12] [I] ExposeDMA: Disabled
[07/23/2024-10:29:12] [I] Data transfers: Enabled
[07/23/2024-10:29:12] [I] Spin-wait: Disabled
[07/23/2024-10:29:12] [I] Multithreading: Disabled
[07/23/2024-10:29:12] [I] CUDA Graph: Disabled
[07/23/2024-10:29:12] [I] Separate profiling: Disabled
[07/23/2024-10:29:12] [I] Time Deserialize: Disabled
[07/23/2024-10:29:12] [I] Time Refit: Disabled
[07/23/2024-10:29:12] [I] Skip inference: Disabled
[07/23/2024-10:29:12] [I] Inputs:
[07/23/2024-10:29:12] [I] === Reporting Options ===
[07/23/2024-10:29:12] [I] Verbose: Disabled
[07/23/2024-10:29:12] [I] Averages: 10 inferences
[07/23/2024-10:29:12] [I] Percentile: 99
[07/23/2024-10:29:12] [I] Dump refittable layers:Disabled
[07/23/2024-10:29:12] [I] Dump output: Disabled
[07/23/2024-10:29:12] [I] Profile: Disabled
[07/23/2024-10:29:12] [I] Export timing to JSON file:
[07/23/2024-10:29:12] [I] Export output to JSON file:
[07/23/2024-10:29:12] [I] Export profile to JSON file:
[07/23/2024-10:29:12] [I]
[07/23/2024-10:29:12] [I] === Device Information ===
[07/23/2024-10:29:12] [I] Selected Device: Orin
[07/23/2024-10:29:12] [I] Compute Capability: 8.7
[07/23/2024-10:29:12] [I] SMs: 16
[07/23/2024-10:29:12] [I] Compute Clock Rate: 1.275 GHz
[07/23/2024-10:29:12] [I] Device Global Memory: 28902 MiB
[07/23/2024-10:29:12] [I] Shared Memory per SM: 164 KiB
[07/23/2024-10:29:12] [I] Memory Bus Width: 128 bits (ECC disabled)
[07/23/2024-10:29:12] [I] Memory Clock Rate: 1.275 GHz
[07/23/2024-10:29:12] [I]
[07/23/2024-10:29:12] [I] TensorRT version: 8.2.5
[07/23/2024-10:35:58] [I] [TRT] [MemUsageChange] Init CUDA: CPU +366, GPU +0, now: CPU 377, GPU 4944 (MiB)
[07/23/2024-10:35:59] [I] [TRT] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 377 MiB, GPU 4944 MiB
[07/23/2024-10:35:59] [I] [TRT] [MemUsageSnapshot] End constructing builder kernel library: CPU 448 MiB, GPU 4945 MiB
[07/23/2024-10:35:59] [I] Start parsing network model
[07/23/2024-10:35:59] [I] [TRT] ----------------------------------------------------------------
[07/23/2024-10:35:59] [I] [TRT] Input filename: objectDetector_ALPS.onnx
[07/23/2024-10:35:59] [I] [TRT] ONNX IR version: 0.0.6
[07/23/2024-10:35:59] [I] [TRT] Opset version: 11
[07/23/2024-10:35:59] [I] [TRT] Producer name: pytorch
[07/23/2024-10:35:59] [I] [TRT] Producer version: 1.13.1
[07/23/2024-10:35:59] [I] [TRT] Domain:
[07/23/2024-10:35:59] [I] [TRT] Model version: 0
[07/23/2024-10:35:59] [I] [TRT] Doc string:
[07/23/2024-10:35:59] [I] [TRT] ----------------------------------------------------------------
[07/23/2024-10:35:59] [W] [TRT] onnx2trt_utils.cpp:366: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[07/23/2024-10:35:59] [W] [TRT] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[07/23/2024-10:35:59] [W] [TRT] Tensor DataType is determined at build time for tensors not marked as input or output.
[07/23/2024-10:35:59] [I] Finish parsing network model
[07/23/2024-10:36:00] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +964, GPU +435, now: CPU 1571, GPU 5380 (MiB)
[07/23/2024-10:41:24] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +408, GPU +1154, now: CPU 1979, GPU 6534 (MiB)
[07/23/2024-10:41:24] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[07/23/2024-10:41:24] [E] Error[2]: [utils.cpp::checkMemLimit::380] Error Code 2: Internal Error (Assertion upperBound != 0 failed. Unknown embedded device detected. Please update the table with the entry: {{2055, 16, 32}, 23121},)
[07/23/2024-10:41:24] [E] Error[2]: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )
[07/23/2024-10:41:24] [E] Engine could not be created from network
[07/23/2024-10:41:24] [E] Building engine failed
[07/23/2024-10:41:24] [E] Failed to create engine from model.
[07/23/2024-10:41:24] [E] Engine set up failed

Dear @andreas.kloukiniotis,
Could you confirm the DRIVE OS version? Please check the output of cat /etc/nvidia/version-*.txt. DRIVE OS 6.0.8.1 ships with TensorRT 8.6.11, but the log indicates you are using TensorRT 8.2.

Hello,
thanks for the quick reply.

cat /etc/nvidia/version-*.txt
6.0.8.1-34171226

dpkg-query -W tensorrt
tensorrt 8.2.5.1-1+cuda11.4

I installed TensorRT 8.2 because it was the only version compatible with CUDA 11.4.
Should I try to install TensorRT 8.6?

Dear @andreas.kloukiniotis,
TensorRT already comes with DRIVE OS. Please use the TensorRT version that ships with it.
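For reference, a quick way to confirm which TensorRT version the preinstalled binary was built against is its startup banner (a sketch; the trtexec path is taken from your log above, and the libnvinfer package names assume the standard Debian packaging on the target):

# The first banner line of any trtexec run prints the linked TensorRT version,
# e.g. "&&&& RUNNING TensorRT.trtexec [TensorRT v8611] ..."
/usr/src/tensorrt/bin/trtexec --onnx=objectDetector_ALPS.onnx
# List the installed TensorRT runtime packages and their versions
dpkg-query -W 'libnvinfer*'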

The conversion indeed worked with the preinstalled TensorRT version.
However, the TensorRT that comes with DRIVE OS has no include folder, so I cannot find the header files.
Would downloading them from the link below and copying them over be okay?
https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/secure/8.6.1/tars/TensorRT-8.6.1.6.Ubuntu-20.04.aarch64-gnu.cuda-12.0.tar.gz
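If so, once copied I would sanity-check that the header version matches the runtime library, along these lines (a sketch; the include path is wherever that tarball gets extracted, and the library path assumes the standard aarch64 location):

# The copied headers declare their version in NvInferVersion.h
grep -m1 '#define NV_TENSORRT_MAJOR' TensorRT-8.6.1.6/include/NvInferVersion.h
grep -m1 '#define NV_TENSORRT_MINOR' TensorRT-8.6.1.6/include/NvInferVersion.h
# Compare with the runtime library the target actually loads (path assumed)
ls -l /usr/lib/aarch64-linux-gnu/libnvinfer.so*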

Dear @andreas.kloukiniotis,
Please check Tensorrt and cudnn on drive os 6.0.8 - #10 by SivaRamaKrishnaNV and let us know if you have any issues.

Dear @andreas.kloukiniotis,
Is the issue resolved? Could you provide an update?

Yes, the issue was resolved by following your instructions. Thanks!
