Trouble building onnxruntime with TensorRT

Hi all, first-time poster~

I’m trying to build onnxruntime with TensorRT support on my Jetson AGX Xavier running JetPack 4.6. I’m following the instructions on this page: Build with different EPs | onnxruntime, but my build fails. The most common error is:

onnxruntime/gsl/gsl-lite.hpp(1959): warning: calling a __host__ function from a __host__ __device__ function is not allowed

I’ve tried with the latest CMake version, 3.22.1, and with version 3.21.1 as mentioned on the website.

See the attachment for the full build log.
jetstonagx_onnxruntime-tensorrt_install.log (168.6 KB)

The end goal of this build is to produce a .whl package that I can then use as part of the installation process of another program in a Docker container. Any help and insight is appreciated, thank you!
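For reference, the sort of build command I’m running looks roughly like this (only a sketch based on the onnxruntime build documentation; the exact CUDA/cuDNN/TensorRT paths for JetPack 4.6 are assumptions and may need adjusting):

$ ./build.sh --config Release --update --build --build_wheel \
    --use_tensorrt \
    --cuda_home /usr/local/cuda \
    --cudnn_home /usr/lib/aarch64-linux-gnu \
    --tensorrt_home /usr/lib/aarch64-linux-gnu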

-Sidney

Hi,

You can find some prebuilt packages for JetPack 4.6 at the link below:

https://elinux.org/Jetson_Zoo#ONNX_Runtime

Does it meet your requirement, or do you want to build it from source?
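For reference, installing one of those wheels on the device is a single pip command (the filename below is illustrative for the Python 3.6 / aarch64 build):

$ pip3 install onnxruntime_gpu-1.10.0-cp36-cp36m-linux_aarch64.whl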

Thanks.

I started off there, and the link I included in my first post is referenced on that page too, under “Build from Source”. The prebuilt wheels work fine, but they do not include the TensorRT execution provider. I’m trying to build onnxruntime so that it includes the TensorRT backend. Has anyone else tried or achieved this? Does anything in the attached build log stand out?
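For what it’s worth, one way to confirm which execution providers a given wheel actually ships with is the standard onnxruntime API; a minimal check looks like this:

$ python3
>>> import onnxruntime as ort
>>> ort.get_available_providers()  # a TensorRT-enabled build should list 'TensorrtExecutionProvider' here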
Thanks.

Hi,

Thanks for your feedback.

Ideally, it should work.
We are going to try to reproduce this issue first and will share more information with you later.

Hi,

We just double-checked the wheel package shared on the eLinux page.
With v1.10.0 on JetPack 4.6, we can run ONNX Runtime with the TensorrtExecutionProvider successfully.

Would you mind giving it a try?

$ python3
Python 3.6.9 (default, Dec  8 2021, 21:08:43)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import onnxruntime as ort
>>> sess = ort.InferenceSession('/usr/src/tensorrt/data/mnist/mnist.onnx', providers=['TensorrtExecutionProvider', 'CUDAExecutionProvider'])
2022-01-25 03:08:29.992372812 [W:onnxruntime:Default, tensorrt_execution_provider.h:53 log] [2022-01-25 08:08:29 WARNING] /home/onnxruntime/onnxruntime-py36/cmake/external/onnx-tensorrt/onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2022-01-25 03:08:31.460957667 [W:onnxruntime:Default, tensorrt_execution_provider.h:53 log] [2022-01-25 08:08:31 WARNING] Detected invalid timing cache, setup a local cache instead
>>> 

Thanks.

Hi AastaLLL,

This helped! I was using the wheel for ORT v1.8.0. The latest v1.10.0 wheel for Jetson appears to include the TensorRT execution provider out of the box. Thank you!

I was expecting a speed-up from using TensorRT with my models. Instead I’m seeing a significant (15-20x) slowdown. What am I missing? (Please let me know if I should open a new topic for this follow-up question.)

The following runs show how many seconds it took to run an inception_v3 and an inception_v4 model on 100 images using the CUDAExecutionProvider and the TensorrtExecutionProvider, respectively. The models were trained and converted to ONNX with PyTorch on a different computer. The runs are executed through Docker on the Jetson AGX device in MAXN mode.
Using jtop I can see that with the CUDAExecutionProvider the GPU stays fully engaged, while with the TensorrtExecutionProvider the GPU is only intermittently engaged, as if it’s sputtering.

      inception_v3  inception_v4
CUDA           11s           16s
TRT           223s          257s

So the best throughput I’m getting is ~9 img/sec. Shouldn’t I be able to crank out more frames per second?
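For context, here is a simplified sketch of the kind of timing loop behind those numbers (the model path, input shape, and single-image batching are placeholder assumptions):

import time
import numpy as np
import onnxruntime as ort

# Placeholder model path; swap in the inception_v3/v4 ONNX files exported from PyTorch.
sess = ort.InferenceSession(
    'inception_v3.onnx',
    providers=['TensorrtExecutionProvider', 'CUDAExecutionProvider'])
input_name = sess.get_inputs()[0].name

# Dummy input; inception_v3 expects 299x299 RGB images.
image = np.random.rand(1, 3, 299, 299).astype(np.float32)

# Note: with the TensorRT provider the first run can include engine build time,
# so a separate warm-up call before timing may be worth adding.
start = time.time()
for _ in range(100):
    sess.run(None, {input_name: image})
print('seconds for 100 images:', time.time() - start)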

Hi,

Yes, it would be good to open a new topic for the performance issue.

Ideally, you should see some acceleration when deploying with TensorRT.
Let’s look into this in depth in the new topic.

Thanks.
