TensorRT model deployment on Triton Inference Server

Hello.
I want to load multiple TensorRT models on Triton to provide inference on my Jetson Xavier NX device. I found that Triton Inference Server for edge computing is the best option for this. My pre-processing and post-processing code is in Python.
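For context, the model repository layout I have in mind looks roughly like this (model names and structure are just placeholders):

```
model_repository/
├── detector_trt/
│   ├── config.pbtxt
│   └── 1/
│       └── model.plan        # TensorRT engine built on the Xavier NX
├── classifier_trt/
│   ├── config.pbtxt
│   └── 1/
│       └── model.plan
└── preprocess/
    ├── config.pbtxt          # backend: "python"
    └── 1/
        └── model.py          # Python pre/post-processing
```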

But in this link (server/jetson.md at main · triton-inference-server/server · GitHub) the following limitations are mentioned:

  1. JetPack 5.0 does not support TensorRT using ONNX Runtime.
  2. CUDA IPC shared memory is not supported.
  3. Python backend does not support GPU tensors.

My questions are:

  1. JetPack 5.0 does not support TensorRT using ONNX Runtime.
    a. Can I use saved TensorRT models (.plan)?
  2. CUDA IPC shared memory is not supported.
    a. What effect does this have on inference?
    b. Would a gRPC request be better in this case?
  3. Python backend does not support GPU tensors.
    a. What does this mean?

Hi,

1. You can use ONNX models as well as TensorRT engines (.plan).
It is TensorRT inference through ONNX Runtime (the TensorRT execution provider) that is not supported.
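For example, a minimal config.pbtxt for serving a .plan engine with the TensorRT backend could look like this (model name, tensor names, data types, and shapes are placeholders; the engine file goes in <model_name>/1/model.plan):

```
name: "detector_trt"
platform: "tensorrt_plan"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

Keep in mind that a TensorRT engine must be built with the same TensorRT/JetPack version and on the same GPU architecture (the Xavier NX itself) that it will run on.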

2. CUDA IPC is used for sharing buffers between processes. Without it, the client cannot register CUDA (GPU) shared memory regions with the server, so input and output data is passed over the normal HTTP/gRPC path or system shared memory instead.
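For 2.a/2.b: inference still works with ordinary HTTP or gRPC requests where the input data is sent in the request body. A minimal gRPC client sketch, assuming a model named detector_trt with FP32 tensors called input/output (all placeholders):

```python
import numpy as np
import tritonclient.grpc as grpcclient

# Connect to Triton's gRPC endpoint (default port 8001)
client = grpcclient.InferenceServerClient(url="localhost:8001")

# Build the request; the input data travels over gRPC
# instead of a CUDA shared memory region
inp = grpcclient.InferInput("input", [1, 3, 224, 224], "FP32")
inp.set_data_from_numpy(np.random.rand(1, 3, 224, 224).astype(np.float32))
out = grpcclient.InferRequestedOutput("output")

result = client.infer(model_name="detector_trt", inputs=[inp], outputs=[out])
print(result.as_numpy("output").shape)
```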

3. This means the Python backend cannot accept or return tensors that live in GPU memory; its input and output tensors are passed through CPU memory.
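In practice, a Python backend model.py on Jetson works with host-side NumPy arrays. A minimal pre-processing sketch, assuming tensor names RAW_IMAGE and PREPROCESSED (placeholders):

```python
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    """Hypothetical pre-processing model; tensor names are placeholders."""

    def execute(self, requests):
        responses = []
        for request in requests:
            # On Jetson the Python backend receives tensors in CPU memory,
            # so as_numpy() returns a host-side array.
            img = pb_utils.get_input_tensor_by_name(request, "RAW_IMAGE").as_numpy()
            norm = img.astype(np.float32) / 255.0
            out_tensor = pb_utils.Tensor("PREPROCESSED", norm)
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[out_tensor])
            )
        return responses
```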

Thanks.
