Jetson Orin Nano - Unknown embedded device detected

Hi,
we did some evaluations over the last few weeks using the Orin Devkit and the different emulations of Orin NX and Orin Nano. Our workflow is to build a TensorRT engine from an ONNX model and then benchmark the engine.

This worked fine for:

  • Devkit (AGX 64GB)
  • NX 16GB
  • Nano 8GB

On the Nano 4GB, however, we encountered the following warning when building with trtexec:

[11/03/2022-12:01:57] [W] [TRT] Unknown embedded device detected. Using 2779MiB as the allocation cap for memory on embedded devices.

It seems that the build freezes after some time.

Could you try to reproduce the error, and let me know whether this might be related to the known issues listed in the Orin Nano Emulation Overlay?

Thanks

Hi,

Support for the Orin Nano 4GB variant is added in TensorRT 8.5 (not yet available).
So you will get this warning when running TensorRT 8.4 (JetPack 5.0.2) on the Orin Nano 4GB board.

However, we provide a fallback flow for non-listed devices, and TensorRT is expected to work on these boards.
Could you check the device status with tegrastats to see if the freeze is caused by a memory shortage?

$ sudo tegrastats
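
If the session becomes unresponsive during the build, you can also write the tegrastats output to a file so the memory trace survives. A small sketch (the interval and log path are only examples):

# Sample every 1000 ms and log the readings in the background
$ sudo tegrastats --interval 1000 --logfile /tmp/tegrastats_build.log --start
# Stop the background session once the build finishes
$ sudo tegrastats --stop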

We are also trying to reproduce this issue internally and will share more information with you later.

Thanks.

Hi,
thanks for the quick response.
Regarding the fallback flow you mentioned: can I expect roughly the same performance with TensorRT 8.4.1 plus the fallback flow as with the upcoming 8.5? Or do you expect 8.5 to deliver significantly more performance (>10%)?

Regarding the memory shortage, you are correct. While monitoring tegrastats, I noticed that the swap was running full while building the engine. I was able to build the engine either by increasing the swap to 4GB or by setting the memPoolSize to 2GB (the exact steps are sketched after this paragraph).
This brings me to another question: let’s say I set the memPoolSize to 2GB for the tactics. How much RAM will the engine build need in total? Does that depend entirely on the model I want to build?
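
For reference, this is roughly what I did (a sketch; the swapfile path and the model file name are examples, and the swapfile is added on top of JetPack's default zram swap):

# Grow swap: add a 4GB swapfile
$ sudo fallocate -l 4G /swapfile
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile

# Alternatively, cap the tactic workspace at 2GB (value in MiB)
$ /usr/src/tensorrt/bin/trtexec --onnx=model.onnx --memPoolSize=workspace:2048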

Thanks

Hi,

Yes, the performance should be similar.
And the required memPoolSize depends on the model you use.

The build process is expected to either succeed or raise an OOM error rather than freeze the system.
Could you share which model you want to benchmark?
We want to reproduce this issue in our environment as well.

Thanks.

Hi,

a follow-up question on the memPoolSize: does the

Using 2779MiB as the allocation cap for memory

mean that TensorRT cannot use more memory, even if I try to override the memPoolSize?
I ask because I tried another segmentation model where the memPoolSize wasn’t enough to implement a node, and I couldn’t get TensorRT to use more than ~2800MiB (with 400-500MB of RAM still free).
If that is the case: is there a way to override this memory cap, or do I have to wait until 8.5 becomes available?

Regarding the model which caused the freeze:
Unfortunately, I cannot share the exact model I used, but I exported the same model with the default COCO pre-trained weights: YOLO4 COCO weights ONNX. It is a YOLOv4 with the default architecture.

I tried to build with

/usr/src/tensorrt/bin/trtexec --onnx=yolov4_1_3_704_704_static.onnx

The swap was at the default 2GB.

Thanks

Hi,

Thanks for sharing.
We are going to check the memPoolSize issue and will share more information with you later.

In general, TensorRT tries to allocate all the available GPU memory so it can deploy a faster algorithm (a faster implementation usually requires more memory).
This can also be controlled via the --workspace configuration.
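
For example (a sketch; the model path is a placeholder, and --workspace takes a size in MiB and is the older form of the same limit, superseded by --memPoolSize in recent trtexec versions):

# Legacy flag: limit the builder workspace to 2048 MiB
$ /usr/src/tensorrt/bin/trtexec --onnx=model.onnx --workspace=2048
# TensorRT 8.4+ equivalent
$ /usr/src/tensorrt/bin/trtexec --onnx=model.onnx --memPoolSize=workspace:2048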

Thanks.

Hi,

Did you get the model to work by setting --memPoolSize=workspace:2048?

We tried the model you shared and limited the memPoolSize to 2GB.
TensorRT was killed due to insufficient memory.

Thanks.

Hi,

I repeated the problematic engine builds yesterday to confirm this (some of them with my original model and some with the model I shared). To my surprise, all the engine builds succeeded.

I tried the following swap/workspace/precision combinations (a representative invocation is sketched after the list):

  • Swap: 4GB / FP32 --memPoolSize=workspace:2048
  • Swap: 4GB / FP32
  • Swap: 4GB / FP16 --memPoolSize=workspace:2048
  • Swap: 4GB / FP16
  • Swap: 2GB / FP16 --memPoolSize=workspace:2048
  • Swap: 2GB / FP16
  • Swap: 2GB / FP32 --memPoolSize=workspace:2048
  • Swap: 2GB / FP32
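
As an illustration, the FP16 run with the capped workspace looked roughly like this (--fp16 enables half-precision tactics; the ONNX file is the one shared above):

$ /usr/src/tensorrt/bin/trtexec --onnx=yolov4_1_3_704_704_static.onnx --fp16 --memPoolSize=workspace:2048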

Nothing has changed in the setup since the problem last occurred.
I work over SSH, with a background RAM usage of ~500MB at the start of the engine build.
Last time, the freeze looked like this: after starting the engine build, the terminal became really laggy at some point and I saw the swap running full. After that, both terminals (one running the engine build and one running tegrastats) froze completely. After 2 hours with no response, I reset the Orin.

Is it possible that last time there was just enough RAM to avoid going OOM, but not enough for the SSH service to remain responsive?

Hi,

When we test this via SSH, the system is quite slow due to the heavy workload.
However, it does respond; it just might take minutes.

We have also confirmed that the model works on TensorRT 8.5 without the “Unknown embedded device detected” warning.
Thanks.
