Hello Community,
TensorRT: 10.3.0
NVIDIA GPU: NVIDIA Jetson Orin Nano 8GB - AVerMedia 131S SuperL4T
Nvidia driver version: L4T R36.4.4
cuDNN Version: 9.3.0.75
Operating System: Ubuntu 22.04.5 LTS
Python Version: Python 3.10.12
I am trying to run a segmentation-based computer vision model on a Jetson Orin Nano industrial kit with 4 RealSense cameras attached. I am using an ONNX model with an input of 1088Hx1920W at FP16. I am hitting a memory constraint: TensorRT fails during the engine build because it cannot find any kernel/tactic that both supports the node and fits in the memory currently available:
2026-03-10 14:10:14.842034592 [W:onnxruntime:Default, tensorrt_execution_provider.h:92 log] [2026-03-10 13:10:14 WARNING] Detected layernorm nodes in FP16.
2026-03-10 14:10:14.842109632 [W:onnxruntime:Default, tensorrt_execution_provider.h:92 log] [2026-03-10 13:10:14 WARNING] Running layernorm after self-attention with FP16 Reduce or Pow may cause overflow. Forcing Reduce or Pow Layers in FP32 precision, or exporting the model to use INormalizationLayer (available with ONNX opset >= 17) can help preserving accuracy.
2026-03-10 14:10:22.426030400 [W:onnxruntime:Default, tensorrt_execution_provider.h:92 log] [2026-03-10 13:10:22 WARNING] Tactic Device request: 161MB Available: 144MB. Device memory is insufficient to use tactic.
2026-03-10 14:10:22.437048032 [W:onnxruntime:Default, tensorrt_execution_provider.h:92 log] [2026-03-10 13:10:22 WARNING] UNSUPPORTED_STATE: Skipping tactic 0 due to insufficient memory on requested size of 169205760 detected for tactic 0x0000000000000000.
This issue does not occur when I use the CUDA execution provider to initialise my model. How can I use the TensorRT execution provider for this computer vision project while keeping the same input resolution (1088Hx1920W)? I have already trained on a lot of images, and it would be difficult to recreate the ONNX weights file from scratch.
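
For context, here is a minimal sketch of how a TensorRT execution provider session with CUDA fallback might be configured through the ONNX Runtime Python API. It is not my exact code: the function names, the model path, the workspace size and the cache directory are all placeholder assumptions. The option keys (trt_fp16_enable, trt_max_workspace_size, trt_engine_cache_enable, trt_engine_cache_path, trt_layer_norm_fp32_fallback) are TensorRT execution provider options as documented by ONNX Runtime, but whether any particular values get the engine build to succeed on an 8GB Orin Nano is untested here.

```python
def build_trt_providers(workspace_mb=1024, cache_dir="./trt_cache"):
    """Return an ONNX Runtime provider list: TensorRT first, then CUDA, then CPU.

    workspace_mb and cache_dir are illustrative placeholders, not tuned values.
    """
    trt_options = {
        # Build the engine in FP16, matching the model precision described above.
        "trt_fp16_enable": True,
        # Upper bound (in bytes) on the scratch memory TensorRT may use for tactics.
        "trt_max_workspace_size": workspace_mb * 1024 * 1024,
        # Cache the built engine so the expensive build step is not repeated
        # on every start-up.
        "trt_engine_cache_enable": True,
        "trt_engine_cache_path": cache_dir,
        # Keep layernorm reductions in FP32, as the layernorm warning suggests.
        "trt_layer_norm_fp32_fallback": True,
    }
    return [
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",  # fallback for nodes TensorRT cannot take
        "CPUExecutionProvider",
    ]


def create_session(model_path, providers):
    """Create the inference session; onnxruntime is imported lazily here."""
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=providers)


# Example usage (hypothetical model path):
# session = create_session("segmentation_fp16.onnx", build_trt_providers(512))
```

Listing CUDAExecutionProvider after TensorRT lets ONNX Runtime assign any nodes TensorRT does not take to CUDA instead of the CPU.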