TensorRT Conversion Fails On Orin Nano

Hi,
I’m trying to convert an ONNX model to a TensorRT engine with FP16 precision on my Orin Nano 8GB, but the conversion fails. Here’s the last portion of the log from the failed conversion:

[01/01/1970-10:01:52] [V] [TRT] --------------- Timing Runner: {ForeignNode[/Concat_429_output_0.../Concat_431]} (Myelin[0x80000023])
[01/01/1970-10:01:53] [V] [TRT] [MemUsageChange] Subgraph create: CPU +72, GPU +0, now: CPU 4883, GPU 7414 (MiB)
[01/01/1970-10:04:11] [V] [TRT]  (foreignNode) Set user's cuda kernel library
[01/01/1970-10:04:11] [V] [TRT] Subgraph compilation completed in 137.516 seconds.
[01/01/1970-10:04:11] [V] [TRT] [MemUsageChange] Subgraph compilation: CPU +16, GPU -116, now: CPU 4899, GPU 7298 (MiB)
[01/01/1970-10:04:11] [W] [TRT] Tactic Device request: 399MB Available: 323MB. Device memory is insufficient to use tactic.
[01/01/1970-10:04:11] [W] [TRT] UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 419033088 detected for tactic 0x0000000000000000.
[01/01/1970-10:04:11] [V] [TRT] {ForeignNode[/Concat_429_output_0.../Concat_431]} (Myelin[0x80000023]) profiling completed in 138.763 seconds. Fastest Tactic: 0xd15ea5edd15ea5ed Time: inf
[01/01/1970-10:04:11] [V] [TRT] *************** Autotuning format combination: Half(258048,1:8,2016,28), Half(258048,1:8,2016,28), Half(258048,1:8,2016,28), Half(18432,1:8,144,2), Half(18432,1:8,144,2) -> Half(2654208,82944,1:8,648,9) ***************
[01/01/1970-10:04:11] [V] [TRT] --------------- Timing Runner: {ForeignNode[/Concat_429_output_0.../Concat_431]} (Myelin[0x80000023])
[01/01/1970-10:04:12] [V] [TRT] [MemUsageChange] Subgraph create: CPU +71, GPU +3, now: CPU 4954, GPU 7303 (MiB)
[01/01/1970-10:06:12] [V] [TRT]  (foreignNode) Set user's cuda kernel library
[01/01/1970-10:06:12] [V] [TRT] Subgraph compilation completed in 120.025 seconds.
[01/01/1970-10:06:12] [V] [TRT] [MemUsageChange] Subgraph compilation: CPU +16, GPU -68, now: CPU 4970, GPU 7235 (MiB)
[01/01/1970-10:06:13] [W] [TRT] Tactic Device request: 438MB Available: 390MB. Device memory is insufficient to use tactic.
[01/01/1970-10:06:13] [W] [TRT] UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 459730944 detected for tactic 0x0000000000000000.
[01/01/1970-10:06:13] [V] [TRT] {ForeignNode[/Concat_429_output_0.../Concat_431]} (Myelin[0x80000023]) profiling completed in 121.248 seconds. Fastest Tactic: 0xd15ea5edd15ea5ed Time: inf
[01/01/1970-10:06:14] [E] Error[10]: IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[/Concat_429_output_0.../Concat_431]}.)
[01/01/1970-10:06:14] [E] Engine could not be created from network
[01/01/1970-10:06:14] [E] Building engine failed
[01/01/1970-10:06:14] [E] Failed to create engine from model or file.
[01/01/1970-10:06:14] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v100300] # ./trtexec --onnx=/home/nvidia/foundation_stereo/p2.onnx --saveEngine=./test.engine --fp16 --verbose

It seems the failure is related to insufficient memory? I also converted this same ONNX model to TensorRT with FP16 precision on my local PC (with an RTX 4070 Ti Super), and it succeeded. The resulting TensorRT engine uses about 1.8 GB of memory during inference.

I have TensorRT 10.7.0.23 on my local PC and TensorRT 10.3.0.30 on my Jetson Orin Nano (JetPack 6.1).

Any suggestions? Thanks in advance for your help.

Best regards

Hi,

Please increase builderOptimizationLevel so that TensorRT spends more build time and can explore additional optimization options.

Ex.

$ /usr/src/tensorrt/bin/trtexec --builderOptimizationLevel=4 ...

Or

$ /usr/src/tensorrt/bin/trtexec --builderOptimizationLevel=5 ...
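
For reference, combining that flag with the options from the failing run in the log above, the full invocation would look something like this (a sketch only; the ONNX and engine paths are taken from the original trtexec command):

```shell
# Rebuild the engine with a higher builder optimization level (level 5 is the
# maximum); paths are reused from the failing command in the log above.
/usr/src/tensorrt/bin/trtexec \
    --onnx=/home/nvidia/foundation_stereo/p2.onnx \
    --saveEngine=./test.engine \
    --fp16 \
    --builderOptimizationLevel=5 \
    --verbose
```

Note that higher optimization levels increase engine build time, so expect the conversion to take noticeably longer on the Orin Nano.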

Thanks.

Following your suggested approach, I’ve successfully resolved the issue! Really appreciate your help!
