Accuracy issues in object detection models using Jetson AGX Orin

We are seeking assistance with running YOLOv8 models on the AGX Orin 64GB developer kit, specifically regarding two ways of producing TensorRT .engine files: Ultralytics' .engine export and DeepStream's .engine build.

The Ultralytics .engine file is produced with the following command:

yolo export model='./yolov8n.pt' format='engine' device=0

The DeepStream .engine file is built with the following command sequence:

export PATH=$PATH:/usr/src/tensorrt/bin
trtexec --onnx=./yolov8n.onnx --workspace=10000 --minShapes=input:1x3x640x640 --optShapes=input:16x3x640x640 --maxShapes=input:32x3x640x640 --buildOnly --saveEngine=yolov8n.engine
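
For reference, the input tensor name and shapes assumed by the --minShapes/--optShapes/--maxShapes flags can be verified against the ONNX file with a short check (assuming the onnx Python package is installed):

import onnx

model = onnx.load("yolov8n.onnx")
for inp in model.graph.input:
    dims = [d.dim_value if d.dim_value > 0 else d.dim_param for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)  # should report "input" and a shape compatible with 3x640x640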

While comparing these two paths, we encountered a significant accuracy gap. The .engine files built for DeepStream miss almost 30% of detections, whereas the Ultralytics export consistently achieves a ~99% detection rate across precisions (FP32, FP16, INT8).

The same pattern appears in classification tasks: the DeepStream engine files behave abnormally, returning the same single class for every input, while the Ultralytics export remains highly accurate in this context as well.

Given these discrepancies, we suspect a configuration error during conversion or inference. To aid further investigation, the configuration files for the respective models are attached.

Your expertise and guidance in rectifying these issues would be greatly appreciated. Thank you for your assistance and support in this matter.

Archive 3.zip (267.2 KB)


How did you compare the detection rate of the two model engines? Can you provide the commands and results? Where and how did you get the ONNX model? Why did you think the “yolov8n.pt” and the “yolov8n.onnx” are just the same model?
You have only provided the DeepStream configurations.

  1. How did you compare the detection rate of the two model engines?
    We compared the detection bounding boxes in the video outputs of both engines, run on the same video, and then evaluated the differences in detections between them.

  2. Can you provide the commands and results?
    Command to run inference using Ultralytics:
    yolo predict detect model='./yolov8n.engine' source='video.mp4'

For DeepStream:
We used the deepstream-app command to run inference on the same video.mp4.
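
For reference, deepstream-app is driven by a configuration file, so the invocation has the form below; the config file name here is a placeholder, the actual files are in the attached archive:

deepstream-app -c deepstream_app_config.txt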

  3. Where and how did you get the ONNX model?
    For Ultralytics:
    We directly used the export command mentioned earlier,
    yolo export model='./yolov8n.pt' format='engine' device=0
    which internally uses torch.onnx.export to convert to ONNX first and then build the engine.

  4. What you have provided is only DeepStream configurations
    The shared configuration files are the ones we used to run inference with the models above; they are provided as a reference.
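
For context, the relevant nvinfer settings live in the [property] section of those files. A hypothetical excerpt using standard nvinfer keys (the values here are illustrative only, not copied from the attached configs):

[property]
onnx-file=yolov8n.onnx
model-engine-file=yolov8n.engine
# network-mode: 0=FP32, 1=INT8, 2=FP16
network-mode=2
num-detected-classes=80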

We followed the official Ultralytics documentation to export to the engine format.

Here are the results from DeepStream and Ultralytics: Deepstream and ultralytics videos - Google Drive

Why don't you measure the mAP of the ONNX model with TensorRT first? DeepStream involves extra video processing steps that may not be exactly the same as PyTorch's. If you want to compare DeepStream with Ultralytics, you need to make sure the pipelines are exactly the same.
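
For example, if the engine was exported through Ultralytics (so it carries the metadata the validator expects), the built-in validator gives a TensorRT-only mAP baseline; the dataset and image size below are placeholders:

yolo val model='./yolov8n.engine' data='coco128.yaml' imgsz=640 device=0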

For yolov8 classification:
We followed this repository

We used the utils/export_yoloV8.py file for the ONNX conversion, but it failed to generate the ONNX file, so we manually changed this class from:
class DeepStreamOutput(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        x = x.transpose(1, 2)
        boxes = x[:, :, :4]
        scores, classes = torch.max(x[:, :, 4:], 2, keepdim=True)
        classes = classes.float()
        return boxes, scores, classes

To:
class DeepStreamOutput(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        x = x.transpose(-1, 1)
        boxes = x[:, :4]
        scores, classes = torch.max(boxes, -1, keepdim=True)
        classes = classes.float()
        return boxes, scores, classes

After making these changes we were able to generate the .onnx and .engine files, but when running inference with that engine, it returns only one class as the output label for every input. We are not sure whether the change we made in the ONNX conversion script is causing this problem. Can you please guide us on converting the YOLOv8 classification model?
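
As a DeepStream-independent sanity check, the exported ONNX file can be probed with ONNX Runtime to confirm the graph really produces classification scores (a single [1, num_classes] tensor) rather than detection-style boxes/scores/classes; the file name and the 224x224 input size below are assumptions:

import numpy as np
import onnxruntime as ort

# "model.onnx" is a placeholder for the file produced by the modified export script
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
print("input:", inp.name, inp.shape)
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = sess.run(None, {inp.name: dummy})
for meta, out in zip(sess.get_outputs(), outputs):
    print("output:", meta.name, out.shape)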

Can you consult the author of the model or the author of the repo you refer to?

Hi, since we are having a lot of confusion about converting a YOLO classifier to an engine file, can you share an official NVIDIA DeepStream repository or links that can be used for the classification conversion?

The issue is the failure of converting PyTorch to ONNX; we don't see any NVIDIA components being involved in this.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

Can this help you? Export a PyTorch model to ONNX — PyTorch Tutorials 2.1.1+cu121 documentation
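
Following that tutorial, a minimal export of the classification checkpoint might look like this; loading the module through ultralytics and the 224x224 input size are assumptions, not the only way to do it:

import torch
from ultralytics import YOLO

# grab the underlying nn.Module of the classification checkpoint
model = YOLO("yolov8n-cls.pt").model.eval()
dummy = torch.randn(1, 3, 224, 224)  # YOLOv8 classification models typically take 224x224 inputs
torch.onnx.export(
    model,
    dummy,
    "yolov8n-cls.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=13,
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)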
