What is the best way to convert a combined encoder-decoder transformer model to TensorRT? The model is used by calling model.encode() and model.decode() separately, which differs from the single forward() pass that the conversion tooling typically supports. Also, once the model is passed through torch_tensorrt.compile, what is the expected type of the converted model? What should we do if the converted model actually runs slower than the original unconverted model? And is there a Docker container image for Jetson that includes TensorRT, JetPack 6.2, and TensorRT-LLM?
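For context, here is roughly what I am attempting: a minimal sketch that wraps the encode()/decode() calls in separate modules whose forward() dispatches to them, so each half can be compiled on its own. The toy model, the decode(tgt, memory) signature, and all shapes/dtypes below are placeholders for my actual model, not part of any real API:

```python
import torch
import torch_tensorrt

# Toy stand-in for the real model: any module exposing encode()/decode().
class ToyEncoderDecoder(torch.nn.Module):
    def __init__(self, d_model=64):
        super().__init__()
        self.enc = torch.nn.Linear(d_model, d_model)
        self.dec = torch.nn.Linear(d_model * 2, d_model)

    def encode(self, src):
        return self.enc(src)

    def decode(self, tgt, memory):
        return self.dec(torch.cat([tgt, memory], dim=-1))

# Wrappers expose encode()/decode() as forward() so torch_tensorrt.compile
# can trace each half of the model as an ordinary module.
class EncoderWrapper(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, src):
        return self.model.encode(src)

class DecoderWrapper(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, tgt, memory):
        return self.model.decode(tgt, memory)

model = ToyEncoderDecoder().eval().cuda()

# Compile each half separately; shapes here are illustrative only.
enc_trt = torch_tensorrt.compile(
    EncoderWrapper(model).eval().cuda(),
    inputs=[torch_tensorrt.Input((1, 16, 64), dtype=torch.float32)],
    enabled_precisions={torch.float16},
)
dec_trt = torch_tensorrt.compile(
    DecoderWrapper(model).eval().cuda(),
    inputs=[
        torch_tensorrt.Input((1, 16, 64), dtype=torch.float32),
        torch_tensorrt.Input((1, 16, 64), dtype=torch.float32),
    ],
    enabled_precisions={torch.float16},
)
```

Is splitting the model like this the recommended approach, or is there a better pattern for non-forward() entry points?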
A few follow-up questions:
- Are there strategies to reduce memory usage during conversion, such as incremental optimization?
- Do dynamic shapes affect optimization performance? (A sketch of the dynamic-shape spec I have in mind follows this list.)
- What are the best practices for custom operations, such as a specialized patch embedding layer or rotary positional encoding?
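On the dynamic-shapes question, this is how I understand the range specification works: a sketch assuming torch_tensorrt.Input accepts min/opt/max shapes, with sequence-length ranges made up purely for illustration:

```python
import torch
import torch_tensorrt

# Assumed: batch fixed at 1, sequence length varying from 16 to 512 positions
# of a 64-dim embedding; these ranges are illustrative only.
dynamic_input = torch_tensorrt.Input(
    min_shape=(1, 16, 64),   # smallest shape the engine must handle
    opt_shape=(1, 128, 64),  # shape the engine is tuned for
    max_shape=(1, 512, 64),  # largest shape the engine must handle
    dtype=torch.float32,
)

# Reusing EncoderWrapper/model from the sketch above.
enc_trt = torch_tensorrt.compile(
    EncoderWrapper(model).eval().cuda(),
    inputs=[dynamic_input],
    enabled_precisions={torch.float16},
)
```

If specifying wide min/max ranges like this costs noticeable performance compared to fixed shapes, I would like to know before committing to it.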