ONNX TensorRT Engines FP16/32

Hello,

I was exporting some YOLO models to TensorRT and had a question about precision. I am first exporting to ONNX via Ultralytics and then building the TensorRT engine myself on a Jetson Nano 2 GB. This is the code I am using:

yolo export model=yolov8n-seg.pt format=onnx opset=12 imgsz=512

/usr/src/tensorrt/bin/trtexec \
    --onnx=yolov8n-seg.onnx \
    --saveEngine=yolov8n-seg.engine \
    --workspace=512 \
    --fp16

My understanding is that this exports the .pt weights (which don’t carry a fixed precision themselves) to an FP32 ONNX model, and trtexec then converts to FP16 while building the TRT engine. Is there a practical difference between this and exporting FP16 ONNX first and then building an FP16 engine from that?
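Concretely, the alternative path I am asking about would be something like the following (assuming Ultralytics' half=True export flag applies to ONNX; I believe it needs a GPU device):

```shell
# Alternative: export FP16 ONNX directly (half=True is Ultralytics' FP16 flag)
yolo export model=yolov8n-seg.pt format=onnx opset=12 imgsz=512 half=True device=0

# Then build the engine from the already-FP16 ONNX
/usr/src/tensorrt/bin/trtexec \
    --onnx=yolov8n-seg.onnx \
    --saveEngine=yolov8n-seg.engine \
    --workspace=512 \
    --fp16
```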

Thank you!

Hi,

It’s recommended to save the ONNX model in full precision (FP32) and quantize it when converting to TensorRT.
That way the precision reduction is handled by TensorRT’s builder at deployment time, which can pick the best per-layer kernels for the target hardware.
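As a side note on what the FP32-to-FP16 conversion does to individual values, here is a pure-Python sketch using struct’s binary16 ('e') format, which follows the same IEEE 754 half-precision layout as FP16 weights. This is only an illustration of the number format, not TensorRT’s actual conversion code:

```python
import struct

def to_fp16(x: float) -> float:
    # Round-trip a value through IEEE 754 half precision (binary16),
    # the same format used for FP16 weights and activations.
    return struct.unpack('<e', struct.pack('<e', x))[0]

# FP16 keeps roughly 3 significant decimal digits:
print(to_fp16(0.1))      # 0.0999755859375
# Values below ~6e-8 underflow to zero:
print(to_fp16(1e-8))     # 0.0
# The largest representable FP16 value is 65504:
print(to_fp16(65504.0))  # 65504.0
```

The limited dynamic range (underflow below ~6e-8, overflow above 65504) is one reason the builder keeps some layers in FP32 when it detects they would lose too much accuracy in FP16.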

Thanks.

Thank you! By quantization you are referring to FP32 ONNX to FP16 TensorRT, right (since INT8 is not supported on Nano 2 GB, if my understanding is correct)?

Is this still an issue you need support with? Are there any results you can share?

Hi,

Yes, you can find the detailed support matrix in the link below:

The CUDA compute capability of the Jetson Nano is 5.3, so only FP32 and FP16 are available (INT8 requires compute capability 6.1 or higher).

Thanks.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.