Unable to use TensorRTExecution Provider on Jetson AGX Xavier

muhammad.shamim1 · March 14, 2024, 9:02pm

I have Onnxruntime-GPU 1.16.0, l4t-tensort-8.5.2.2, and cuda-11.4

I am on a Jetson AGX Xavier trying to decrease the inference time of an onnx model by using a gpu. However the model performs significantly worse on the GPU when using CUDAExecutionProvider.

nxrun.InferenceSession(onnx_model_path, sess_options=so, providers=[(“TensorrtExecutionProvider”, {“trt_fp16_enable”: True}), (“CUDAExecutionProvider”, {“cudnn_conv_algo_search”: “DEFAULT”})])

I have tried the following:

going from fp32 to fp16 even amp. (https://developer.nvidia.com/blog/end-to-end-ai-for-nvidia-based-pcs-optimizing-ai-by-transitioning-from-fp32-to-fp16/)
Tried different flags with CUDAExecutionProvider.
Tried multiple NVIDIA blogs to try and decrease the inference time by converting to a .plan file.

I am trying to use TensorRTExecutionProvider but am unable to load my model when I try this session option. The error I run into is:

jetsonserver-1 | 2024-03-14 20:39:12.224268543 [I:onnxruntime:Default, tensorrt_execution_provider_utils.h:520 TRTGenerateId] [TensorRT EP] Model name is model.onnx
jetsonserver-1 | 2024-03-14 20:39:14.826604646 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2024-03-14 20:39:14 WARNING] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
jetsonserver-1 | 2024-03-14 20:39:15.086108633 [I:onnxruntime:Default, tensorrt_execution_provider.cc:1392 GetSubGraph] [TensorRT EP] TensorRT subgraph MetaDef name TRTKernel_graph_torch_jit_15612526831647503026_0**
jetsonserver-1 | 2024-03-14 20:39:15.086976803 [I:onnxruntime:Default, tensorrt_execution_provider.cc:1392 GetSubGraph] [TensorRT EP] TensorRT subgraph MetaDef name TRTKernel_graph_torch_jit_15612526831647503026_0**
jetsonserver-1 | 2024-03-14 20:39:15.087127754 [I:onnxruntime:Default, tensorrt_execution_provider.cc:1884 GetCapability] [TensorRT EP] Whole graph will run on TensorRT execution provider**
jetsonserver-1 | 2024-03-14 20:39:16.117593029 [W:onnxruntime:Default, tensorrt_execution_provider.cc:2173 Compile] [TensorRT EP] Builder optimization level can only be used on TRT 8.6 onwards!**

However TR8.6 requires cuda11.8 which my jetson agx xavier doesn’t have/can’t support (?). I am using Jetpack SDK 5.1.2

Guidance for the following would be much appreciated:
1) How can I use TensorrtExecutionProvider on my docker container.
2) Decrease inference time

rajupadhyay59 · March 15, 2024, 1:01am

Maybe try upgrading your cuda 11.8?

during apt-get install cuda, be careful to mention the version as well.

AastaLLL · March 15, 2024, 3:00am

Hi,

It looks like your package is not compatible with the JetPack environment.
How do you install onnxruntime?

You can find the wheel for JetPack users in our eLinux wiki below:
https://elinux.org/Jetson_Zoo#ONNX_Runtime

Thanks.

muhammad.shamim1 · March 17, 2024, 9:25pm

I install ONNXRUNTIME using that Jetson Zoo link you provided.

And was using version 1.16.0 for python 3.8

muhammad.shamim1 · March 22, 2024, 2:26pm

Bump.

My Dockerfile is defined as below. My model fails to load and I am unable to run inference on it.

so = nxrun.SessionOptions()
so.intra_op_num_threads = 4
so.log_severity_level = 3
nxrun.InferenceSession(onnx_model_path, sess_options=so, providers=[“TensorrtExecutionProvider”, (“CUDAExecutionProvider”, {“cudnn_conv_algo_search”: “DEFAULT”})])

ERROR: Gets stuck on 2024-03-22 14:22:43.238667386 [I:onnxruntime:Default, tensorrt_execution_provider.cc:1884 GetCapability] [TensorRT EP] Whole graph will run on TensorRT execution provider

If I use this instead (Added trt_fp_16_enable):

nxrun.InferenceSession(onnx_model_path, sess_options=so, providers=[(“TensorrtExecutionProvider”, {“trt_fp16_enable”: True}), (“CUDAExecutionProvider”, {“cudnn_conv_algo_search”: “DEFAULT”})])

I get stuck where I originally showed.

Any guidance is greatly appreciated.

Dockerfile.txt (3.0 KB)

kayccc · April 10, 2024, 5:09am

Is this still an issue to support? Any result can be shared?

AastaLLL · April 11, 2024, 6:12am

Hi,

It’s more recommended to use TensorRT for inference since it is optimized for both memory and performance.
You can run it with /usr/src/tensorrt/bin/trtexec --onnx=[model].

Please let us know if onnxruntime is more preferred.
Thanks.

muhammad.shamim1 · April 11, 2024, 3:23pm

Hi @kayccc and @AastaLLL thank you for the replies. I did use trtexec but it seemed as though I was only able to do multiple runs of random input data. Is there a way to incorporate it into my flask website that is run on a docker container?

AastaLLL · April 18, 2024, 6:04am

Hi,

You can build your container on the top of l4t-jetpack or l4t-tensorrt:

Thanks.

system · May 8, 2024, 6:19am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Unable to run ONNX runtime with TensorRT execution provider on docker based on NVidia image CUDA Setup and Installation	4	7454	June 22, 2022
Clarity needed on differences between acceleration frameworks/runtimes for AGX Xavier Jetson AGX Xavier tensorrt , cuda , onnx	4	945	October 18, 2021
TensorRT Quick Start Guide Example is not running (JetPack 4.2.2) Jetson AGX Xavier tensorrt , onnx	6	895	January 5, 2022
I am unable to perform inference with the TensorRTExecutionProvider specified in onnxruntime TensorRT tensorrt , cudnn	1	430	November 30, 2023
Could not infer onnx model for TensorrtExecutionProvider provider TensorRT tensorrt , onnx	1	1157	November 11, 2022
Performance DECREASE with tensorRT under onnxruntime, pt2 Jetson AGX Xavier tensorrt	5	2771	May 25, 2022
Build ONNXInference-gpu wheel for Jetpack5 with Cuda and TRT Jetson AGX Orin tensorrt , cuda , onnx	6	2643	August 10, 2022
Trouble building onnxruntime with tensorrt Jetson AGX Xavier tensorrt , jetson-inference	7	1677	February 11, 2022
If CUDA 11.4.19 is installed, is TensorRT 8.4 required? Jetson AGX Orin tensorrt , cuda	6	978	August 31, 2023
[E] Error[1]: [genericReformat.cu::executeMemcpy::1334] Error Code 1: Cuda Runtime (invalid argument) Jetson AGX Xavier tensorrt	10	1218	September 19, 2022

Unable to use TensorRTExecution Provider on Jetson AGX Xavier

Related topics