Description
I am trying to measure inference time for YOLOv8-seg from a TensorRT engine file, using this code: tensorrtx/yolov8 at master · wang-xinyu/tensorrtx · GitHub.
However, the output binding sizes differ when I instead build the engine by exporting the Ultralytics model (.pt) to ONNX and then converting it with trtexec.
More specifically,
trtexec.exe --onnx=yolov8s-seg_fp32.onnx --saveEngine=yolov8s-seg_fp32.engine --workspace=3000
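For completeness, the ONNX model used in that command comes from an Ultralytics export. The exact export call is not shown above, so the snippet below is only an assumed reconstruction of that step (file names are illustrative):

```python
# Assumed reconstruction of the .pt -> ONNX export step (not part of the
# original post); the resulting file was then passed to trtexec as above.
from ultralytics import YOLO

model = YOLO("yolov8s-seg.pt")     # Ultralytics segmentation checkpoint
model.export(format="onnx")        # writes yolov8s-seg.onnx next to the .pt
```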
The output bindings for each engine are the following (a short script to dump them is included after the list):
- .pt → .wts → .engine (built with the repo linked above)
  binding: output (90001, 1, 1)
  binding: proto (32, 240, 240)
- Ultralytics .pt → ONNX → trtexec → .engine
  binding: output0 (1, 300, 38)
  binding: output1 (1, 32, 240, 240)
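For reference, this is how the bindings of the trtexec-built engine can be listed with the TensorRT 8.x Python API; the sketch below is my own addition (the engine path is taken from the trtexec command above), not part of the tensorrtx code base:

```python
# Minimal sketch: list binding names and shapes of a serialized engine
# (TensorRT 8.x Python API, as shipped in nvcr.io/nvidia/tensorrt:22.12-py3).
import tensorrt as trt

ENGINE_PATH = "yolov8s-seg_fp32.engine"  # output of the trtexec command above

logger = trt.Logger(trt.Logger.WARNING)
with open(ENGINE_PATH, "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

for i in range(engine.num_bindings):
    kind = "input" if engine.binding_is_input(i) else "output"
    print(kind, engine.get_binding_name(i), tuple(engine.get_binding_shape(i)))
```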
Because of this mismatch, I am not able to use the second engine file with the code base above.
Environment
TensorRT Version: 8.5.1.7
GPU Type: RTX 4080 Laptop GPU
Nvidia Driver Version: 561.17
CUDA Version: 12.6
CUDNN Version:
Operating System + Version: TensorRT docker image
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable): NA
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt:22.12-py3
Can someone please help me figure out why this is happening?