I'm new to TensorRT, and on a Jetson Orin the first inference is very slow. I have two custom models, and it happens with both of them (the first inference takes about 6 minutes for one model and about 2 minutes for the other).
Here are some of the warnings I get when I run a simple inference script:
2023-08-28 03:26:10.501539: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
[the same NUMA warning is repeated several more times]
2023-08-28 03:26:11.132123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1708] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-08-28 03:26:11.132239: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:984] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-08-28 03:26:11.132361: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1621] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 44820 MB memory: → device: 0, name: Orin, pci bus id: 0000:00:00.0, compute capability: 8.7
2023-08-28 03:26:27.488727: I tensorflow/compiler/tf2tensorrt/common/utils.cc:104] Linked TensorRT version: 8.5.2
2023-08-28 03:26:27.488989: I tensorflow/compiler/tf2tensorrt/common/utils.cc:106] Loaded TensorRT version: 8.5.2
2023-08-28 03:26:31.737074: I tensorflow/compiler/tf2tensorrt/convert/convert_nodes.cc:1344] [TF-TRT] Sparse compute capability is enabled.
2023-08-28 03:26:33.351818: W tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:83] TF-TRT Warning: DefaultLogger Unknown embedded device detected. Using 59656MiB as the allocation cap for memory on embedded devices.
2023-08-28 03:26:33.354820: W tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:83] TF-TRT Warning: DefaultLogger Unknown embedded device detected. Using 59656MiB as the allocation cap for memory on embedded devices.
2023-08-28 03:28:40.624897: W tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:83] TF-TRT Warning: DefaultLogger Unknown embedded device detected. Using 59656MiB as the allocation cap for memory on embedded devices.
[the same "Unknown embedded device" warning is repeated several more times at 03:28:40]
2023-08-28 03:32:07.526010: W tensorflow/compiler/tf2tensorrt/convert/convert_nodes.cc:6003] TF-TRT Warning: Validation failed for TensorRTInputPH_0 and input slot 0: Input tensor with shape [1,0,2] is an empty tensor, which is not supported by TRT
2023-08-28 03:32:07.730802: W tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:1103] TF-TRT Warning: Engine creation for PartitionedCall/TRTEngineOp_000_000 failed. The native segment will be used instead. Reason: UNIMPLEMENTED: Validation failed for TensorRTInputPH_0 and input slot 0: Input tensor with shape [1,0,2] is an empty tensor, which is not supported by TRT
2023-08-28 03:32:07.731073: W tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:936] TF-TRT Warning: Engine retrieval for input shapes: [[1,0,2], [1,0,2]] failed. Running native segment for PartitionedCall/TRTEngineOp_000_000
2023-08-28 03:32:07.758491: W tensorflow/compiler/tf2tensorrt/kernels/trt_engine_op.cc:936] TF-TRT Warning: Engine retrieval for input shapes: [[1,0,2], [1,0,2]] failed. Running native segment for PartitionedCall/TRTEngineOp_000_000
{'tf_op_layer_concat_18': <tf.Tensor: shape=(1, 0, 12), dtype=float32, numpy=array([], shape=(1, 0, 12), dtype=float32)>}
Inference time (first run): 0:05:44.262719
Inference time (second run): 0:00:00.021712
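From the log timestamps it looks like the slow first call includes the TF-TRT engine build (note the jump from 03:26:33 to 03:28:40 before the first result appears). To separate that one-time warm-up cost from steady-state latency, I time each call individually. This is a minimal sketch with a hypothetical stand-in model instead of my real TF-TRT model (the expensive first call is simulated with a sleep), so the timing pattern itself is what matters here:

```python
import time
from datetime import timedelta

def time_runs(infer, batch, runs=3):
    """Time each call separately; with TF-TRT the first call typically
    pays the engine-build cost, so it should be reported apart."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(batch)
        timings.append(timedelta(seconds=time.perf_counter() - start))
    return timings

class StandInModel:
    """Hypothetical stand-in for the real model: the first call simulates
    the one-time engine build, later calls return immediately."""
    def __init__(self):
        self.built = False

    def __call__(self, batch):
        if not self.built:
            time.sleep(0.2)  # placeholder for the engine-build cost
            self.built = True
        return batch

timings = time_runs(StandInModel(), [[0.0, 0.0]])
print("warm-up run:", timings[0])
print("steady-state runs:", timings[1:])
```

With the real model, running one throwaway warm-up inference at startup (before serving traffic) hides the build cost from users, but it does not remove it.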
Are these warnings related to the slow first inference?
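Side note on the "empty tensor" warnings: an input with shape [1, 0, 2] contains zero elements (the product of its dimensions is 0), which is why TRT refuses to build an engine for it and TF-TRT falls back to the native segment. A quick sanity check of that element count:

```python
import math

shape = (1, 0, 2)  # shape reported in the TF-TRT warning
num_elements = math.prod(shape)
print(num_elements)  # 0 -> empty tensor, which TRT does not support
```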