Description
I am trying to measure inference time for YOLOv8-seg from a TensorRT engine file, using this code: tensorrtx/yolov8 at master · wang-xinyu/tensorrtx · GitHub.
However, the output binding sizes differ when I instead build the engine by exporting the Ultralytics model (.pt) to ONNX and then converting it with trtexec.
More specifically,
trtexec.exe --onnx=yolov8s-seg_fp32.onnx --saveEngine=yolov8s-seg_fp32.engine --workspace=3000
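For completeness, the ONNX model used in that command comes from an Ultralytics export. The exact export call is not shown above, so the snippet below is only an assumed reconstruction of that step (file names are illustrative):

```python
# Assumed reconstruction of the .pt -> ONNX export step (not part of the
# original post); the resulting file was then passed to trtexec as above.
from ultralytics import YOLO

model = YOLO("yolov8s-seg.pt")     # Ultralytics segmentation checkpoint
model.export(format="onnx")        # writes yolov8s-seg.onnx next to the .pt
```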
The output bindings for each engine are the following (a short script to dump them is included after the list):
- .pt → .wts → .engine (built with the repo linked above)
  binding: output (90001, 1, 1)
  binding: proto (32, 240, 240)
- Ultralytics .pt → ONNX → trtexec → .engine
  binding: output0 (1, 300, 38)
  binding: output1 (1, 32, 240, 240)
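For reference, this is how the bindings of the trtexec-built engine can be listed with the TensorRT 8.x Python API; the sketch below is my own addition (the engine path is taken from the trtexec command above), not part of the tensorrtx code base:

```python
# Minimal sketch: list binding names and shapes of a serialized engine
# (TensorRT 8.x Python API, as shipped in nvcr.io/nvidia/tensorrt:22.12-py3).
import tensorrt as trt

ENGINE_PATH = "yolov8s-seg_fp32.engine"  # output of the trtexec command above

logger = trt.Logger(trt.Logger.WARNING)
with open(ENGINE_PATH, "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

for i in range(engine.num_bindings):
    kind = "input" if engine.binding_is_input(i) else "output"
    print(kind, engine.get_binding_name(i), tuple(engine.get_binding_shape(i)))
```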
Because of this mismatch, I am not able to use the second engine file with the code base above.
Environment
TensorRT Version: 8.5.1.7
GPU Type: RTX 4080 Laptop GPU
Nvidia Driver Version: 561.17
CUDA Version: 12.6
CUDNN Version:
Operating System + Version: TensorRT docker image
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable): NA
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt:22.12-py3
Can someone please help me figure out why this is happening?