Polygraphy failure in onnx vs. TRT for specific Shape layer in YOLO8

Description

We want to use the Polygraphy tool as an integration test that compares YOLO8 in ONNX against its corresponding TensorRT (TRT) model.

  • After converting the YOLO8 model to ONNX and running polygraphy run on it, one specific output fails the comparison: Unnamed Layer* 311. In Netron we can see that it corresponds to a Shape layer.
  • This happens with the original YOLO8 model without any changes to it, as shown in the code below:
import torch
from ultralytics import YOLO

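# Load the stock pretrained YOLOv8 nano weights (no modifications).
original_model = YOLO("yolov8n.pt")
# Note: dummy_input is never passed to export(); the export shape comes from imgsz.
dummy_input = torch.zeros(1, 3, 192, 1440, requires_grad=True)
original_model.export(format='onnx', imgsz=(192, 1440),
                       opset=12, verbose=True)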
  • Polygraphy logs are attached here. Note the lines:
 Comparing Output: '(Unnamed Layer* 311) [ElementWise]_output' (dtype=int32, shape=(3,)) with '311' (dtype=int64, shape=(4,))
[I]         Tolerance: [abs=0.001, rel=0.001] | Checking elemwise error
[E]         Will not compare outputs of different shapes. Note: Output shapes are (3,) and (4,).
Note: Use --no-shape-check or set check_shapes=False to attempt to compare values anyway.
[E]         FAILED | Output: '(Unnamed Layer* 311) [ElementWise]_output' | Difference exceeds tolerance (rel=0.001, abs=0.001)

Environment

TensorRT Version: 8.5.2
GPU Type: Jetson AGX Orin NX
Nvidia Driver Version:
CUDA Version: 11.4
CUDNN Version: 8.6.0
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): Python 3.8.10
PyTorch Version (if applicable): 2.0.1
Baremetal or Container (if container which image + tag): We use the polygraphy installed in the nvcr.io/nvidia/l4t-tensorrt:r8.5.2.2-devel base image
onnxruntime version used for polygraphy: onnxruntime==1.15.1
numpy version used for polygraphy: numpy==1.21.6

Relevant Files

Steps To Reproduce

  • download the onnx file
  • run polygraphy on this model:
    polygraphy run <path_to_onnx_model> --trt --onnxrt --onnx-outputs mark all --trt-outputs mark all --atol 1e-3 --rtol 1e-3
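
Since the goal is an integration test, the command above could be wrapped in a small Python helper. This is only a sketch: the helper names build_polygraphy_cmd and run_comparison are hypothetical, the flags are copied from the command above, and it assumes that polygraphy run exits with a non-zero status when a comparison fails.

```python
import subprocess
from typing import List


def build_polygraphy_cmd(model_path: str, atol: float = 1e-3, rtol: float = 1e-3) -> List[str]:
    """Build the same `polygraphy run` command line used in this post."""
    return [
        "polygraphy", "run", model_path,
        "--trt", "--onnxrt",
        "--onnx-outputs", "mark", "all",
        "--trt-outputs", "mark", "all",
        "--atol", str(atol),
        "--rtol", str(rtol),
    ]


def run_comparison(model_path: str) -> bool:
    """Run the comparison and return True if the process exits with status 0.

    Assumption: a failed comparison produces a non-zero exit status.
    """
    result = subprocess.run(
        build_polygraphy_cmd(model_path),
        capture_output=True,
        text=True,
    )
    # Print the log so a failing test shows which output mismatched.
    print(result.stdout)
    return result.returncode == 0
```

In a test suite, an assertion on run_comparison("<path_to_onnx_model>") would then fail whenever the comparison fails, as it does for the Shape output described above.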

Note: When I raised this in the YOLO8 issue tracker, they said they don’t provide support for third-party tools, so I am asking here instead.

Hi,
Please share the ONNX model and the script, if not shared already, so that we can assist you better.
In the meantime, you can try a few things:

  1. Validate your model with the snippet below:

check_model.py

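import sys
import onnx

# Path to the ONNX model, passed as the first command-line argument.
filename = sys.argv[1]
model = onnx.load(filename)
# Raises an exception if the model is malformed.
onnx.checker.check_model(model)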
  2. Try running your model with the trtexec command.

If you are still facing the issue, please share the trtexec --verbose log for further debugging.
Thanks!

@AakankshaS
Hi, the ONNX model and script are already shared in my post.
trtexec succeeds, but that’s not relevant to the issue: I need polygraphy run to succeed in order to verify full ONNX <-> TRT compatibility. trtexec can succeed while polygraphy run fails.
Anyway, since you asked for the trtexec logs for some reason, here they are.
yolov8n_original_trtexec.txt (3.8 MB)

Hi,

We are able to reproduce the error. Please allow us some time to check in detail.

Thank you.

@spolisetty any update on this? We are facing a similar issue with yolov5. This occurs with TRT version 8.5.3.1.