Issues with Bounding Box Detection Using ONNX Runtime Model Conversion from YOLOv5

Description

Hello NVIDIA Community,

I am currently working on a project using the YOLOv5 model for object detection. After training my model using the Ultralytics YOLO library, I successfully converted my model (best.pt) to the ONNX format (best.onnx) using the following script:

from ultralytics import YOLO

# Load the YOLO model
model = YOLO("best.pt")  # Load your trained model

# Export the model to ONNX format
model.export(format="onnx")  # Creates 'best.onnx'

# Load the exported ONNX model
onnx_model = YOLO("best.onnx")

After the conversion, I ran inference with ONNX Runtime. Here is my C++ code snippet, based on the Ultralytics example:

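#include <filesystem>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>
// YOLO_V8 and DL_RESULT come from the Ultralytics example sources.

void Detector(YOLO_V8*& p) {
    std::filesystem::path imgs_path = "/home/ubuntu/margsofr/yolov5-opencv-cpp-python/images";
    for (auto& i : std::filesystem::directory_iterator(imgs_path)) {
        std::string img_path = i.path().string();
        cv::Mat img = cv::imread(img_path);
        std::vector<DL_RESULT> res;
        p->RunSession(img, res);

        // Collect boxes and confidences for OpenCV's NMS.
        std::vector<cv::Rect> boxes;
        std::vector<float> confidences;
        for (const auto& re : res) {
            boxes.push_back(re.box);
            confidences.push_back(re.confidence);
        }

        // Arguments: score threshold 0.5, NMS (IoU) threshold 0.4.
        std::vector<int> indices;
        cv::dnn::NMSBoxes(boxes, confidences, 0.5, 0.4, indices);

        // Draw the boxes that survived NMS.
        for (int idx : indices) {
            const auto& re = res[idx];
            cv::rectangle(img, re.box, cv::Scalar(0, 255, 0), 3);
        }

        // Every image is written to the same file, so only the last result is kept.
        cv::imwrite("output.jpg", img);
    }
}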

Environment

ONNX Version: 1.20.0
CMake Version: 3.26
OpenCV Version: 4.10.0
Operating System: Ubuntu 22.04

Output:

These are the input and output dimensions, along with the first three detections:

./Yolov8OnnxRuntimeCPPInference
[YOLO_V8(CUDA)]: Cuda warm-up cost 2322.47 ms.
Input node dimensions: 1 3 640 640
Inference run completed
Output node dimensions: 1 5 8400
Shape: 5 8400
model type: 1
Using FP32 for rawData matrix
rawData shape after transpose: 8400 x 5
NMS results count: 1924
[YOLO_V8(CUDA)]: 18.114ms pre-process, 2355.5ms inference, 94.241ms post-process.
Box: x=1056, y=239, width=1277, height=1347, confidence=658.98
Box: x=920, y=1551, width=1297, height=811, confidence=654.36
Box: x=-107, y=1561, width=1053, height=763, confidence=653.25
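
For reference, a 1 x 5 x 8400 output from a single-class Ultralytics export is usually read as four box values (center x, center y, width, height, in 640 x 640 input pixels) followed by one class score per candidate, stored channel by channel. The sketch below is not the code from the example; it only illustrates decoding under that assumed layout, using plain per-axis scale factors rather than letterbox handling. The names Detection, decodeSingleClass, scaleX and scaleY are hypothetical. If the layout assumption holds, scores should fall between 0 and 1, so values such as 658.98 may indicate that a box coordinate is being read as the confidence.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: top-left corner and size in original-image pixels.
struct Detection {
    float x, y, w, h;
    float score;
};

// Assumed layout: `data` holds 5 * n floats, channel-major:
// [cx_0..cx_{n-1}, cy_0..cy_{n-1}, w_0.., h_0.., score_0..score_{n-1}].
// scaleX / scaleY map model-input pixels back to original-image pixels
// (for example original_width / 640.0f when the image was plainly resized).
std::vector<Detection> decodeSingleClass(const float* data, std::size_t n,
                                         float scaleX, float scaleY,
                                         float scoreThreshold) {
    std::vector<Detection> out;
    for (std::size_t i = 0; i < n; ++i) {
        const float score = data[4 * n + i];
        if (score < scoreThreshold) {
            continue;  // Drop low-confidence candidates before NMS.
        }
        const float cx = data[0 * n + i];
        const float cy = data[1 * n + i];
        const float w  = data[2 * n + i];
        const float h  = data[3 * n + i];
        // Convert center/size to top-left/size and rescale to the original image.
        out.push_back({(cx - 0.5f * w) * scaleX,
                       (cy - 0.5f * h) * scaleY,
                       w * scaleX,
                       h * scaleY,
                       score});
    }
    return out;
}
```

With n = 8400, the surviving detections could then be passed to cv::dnn::NMSBoxes as boxes and confidences.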

Three detected bboxes:

Hi @arnav1 ,
Can you please share your ONNX model?

Thanks