NVIDIA Forum Issue Template: Bounding Boxes Not Generated (YOLOv8 Custom Parser)

Subject: YOLOv8 Custom Parser Not Generating Bounding Boxes (Output Shape: 1x5x8400)


DeepStream Version: 7.0
TensorRT Engine: Working (no error)
Custom Parser: Implemented
Model Output Shape: (1, 5, 8400)
Problem: No bounding boxes rendered
Parser Code: Included below
Expected Behavior: Boxes should be drawn for confident detections
Observed Behavior: No bounding boxes appear, even when confidence is high
Inference Log Output: Shows valid detections and dimensions (see std::cout)


πŸ” Description:

I am using a YOLOv8 engine that outputs shape (1, 5, 8400), i.e., 5 channels: [x_center, y_center, width, height, confidence].
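For reference, the output buffer is channel-major: channel c of candidate box i sits at flat index c * 8400 + i. A tiny illustrative accessor (not part of the parser, just to make the layout explicit):

// Illustrative: read channel c of candidate box i from the channel-major (5 x 8400) buffer.
static inline float channelValue(const float *output, int c, int i, int numBoxes)
{
    return output[c * numBoxes + i];
}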

I implemented the following DeepStream custom parser (nvdsinfer_custom_impl.cpp) to parse the output and populate objectList.

The engine runs without error, the parser logs correct dimensions and boxes, but no bounding boxes are drawn in DeepStream.

I confirmed that objectList.push_back() is called and the objects are populated correctly, but nothing appears visually.
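The per-box verification is nothing special; roughly this kind of logging right before push_back (illustrative snippet, not the exact code):

// Illustrative logging placed just before objectList.push_back(obj):
std::cout << "push_back: classId=" << obj.classId
          << " left=" << obj.left << " top=" << obj.top
          << " width=" << obj.width << " height=" << obj.height
          << " conf=" << obj.detectionConfidence << std::endl;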


📌 Parser Code:

#include "nvdsinfer_custom_impl.h"
#include <iostream>
#include <vector>
#include <algorithm>
#include <cstring>

extern "C" bool NvDsInferParseCustomYoloV8(
    const std::vector<NvDsInferLayerInfo> &outputLayers,
    const NvDsInferNetworkInfo &networkInfo,
    const NvDsInferParseDetectionParams &detectionParams,
    std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
    if (outputLayers.empty()) {
        std::cerr << "No output layers found!" << std::endl;
        return false;
    }

    const NvDsInferLayerInfo &layer = outputLayers[0];
    const float *output = static_cast<const float *>(layer.buffer);
    const NvDsInferDims &dims = layer.inferDims;

    std::cout << "Layer Name: " << layer.layerName << std::endl;
    std::cout << "Dims: numDims=" << dims.numDims << " | ";
    for (unsigned int i = 0; i < dims.numDims; ++i)
        std::cout << "d[" << i << "]=" << dims.d[i] << " ";
    std::cout << std::endl;

    int channels = 0;
    int numBoxes = 0;

    if (dims.numDims == 3 && dims.d[0] == 1 && dims.d[1] == 5) {
        // Shape: (1, 5, 8400)
        channels = dims.d[1];
        numBoxes = dims.d[2];
    }
    else if (dims.numDims == 2 && dims.d[0] == 5) {
        // Shape: (5, 8400)
        channels = dims.d[0];
        numBoxes = dims.d[1];
    }
    else {
        std::cerr << "Unexpected output dims. Got: ";
        for (unsigned int i = 0; i < dims.numDims; ++i)
            std::cerr << dims.d[i] << (i == dims.numDims - 1 ? "" : "x");
        std::cerr << std::endl;
        return false;
    }

    for (int i = 0; i < numBoxes; ++i) {
        float x_center = output[0 * numBoxes + i];
        float y_center = output[1 * numBoxes + i];
        float width    = output[2 * numBoxes + i];
        float height   = output[3 * numBoxes + i];
        float conf     = output[4 * numBoxes + i];

        if (conf < detectionParams.perClassThreshold[0])  // pre-cluster threshold for class 0 (set to 0.1 in [class-attrs-all])
            continue;

        float x = x_center - width / 2.0f;
        float y = y_center - height / 2.0f;

        NvDsInferObjectDetectionInfo obj;
        obj.classId = 0;
        obj.left = x * networkInfo.width;
        obj.top = y * networkInfo.height;
        obj.width = width * networkInfo.width;
        obj.height = height * networkInfo.height;
        obj.detectionConfidence = conf;

        objectList.push_back(obj);
    }

    return true;
}

extern "C" bool NvDsInferParseCustomYoloV8Cuda(
    const std::vector<NvDsInferLayerInfo> &outputLayers,
    const NvDsInferNetworkInfo &networkInfo,
    const NvDsInferParseDetectionParams &detectionParams,
    std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
    return NvDsInferParseCustomYoloV8(outputLayers, networkInfo, detectionParams, objectList);
}

❓ Questions:

  1. Is there any known issue with shape (1, 5, 8400)?
  2. Do bounding boxes require a minimum size to appear visually?
  3. Could nvosd or downstream elements be filtering objects unintentionally?
  4. Is there a different expected format for YOLOv8 output in DeepStream?

My config file is:
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
[tiled-display]
enable=0
[source0]
enable=1
type=3
uri=file:///opt/nvidia/deepstream/deepstream-7.0/sources/deepstream_python_apps/apps/deepstream-test3/data/videp>
num-sources=1
[sink0]
enable=1
type=5
sync=0
gpu-id=0
[streammux]
batch-size=1
width=640
height=640
batched-push-timeout=40000
[property]
process-mode=1
enable=1
nvbuf-memory-type=0
network-input-shape=3;640;640
gpu-id=0
batch-size=1
interval=0
gie-unique-id=1
net-scale-factor=1.0
network-type=1

num-detected-classes=1

infer-dims=3;640;640
maintain-aspect-ratio=1
custom-lib-path=./libnvdsinfer_custom_yolov3.so
parse-bbox-func-name=NvDsInferParseCustomYoloV8
output-blob-names=output0
onnx-file=models/yolov8n-face-lindevs.onnx
model-engine-file=models/yolov8n-face-lindevs3.engine
labelfile-path=labels.txt
[class-attrs-all]
pre-cluster-threshold=0.1
topk=100
nms-iou-threshold=0.5

Any help is appreciated. Thank you.

Please refer to the following repository.

This repository transposes the YOLOv8 output for easier parsing. In addition, the model in that repository outputs 80 classes, while your model outputs only 1 class, so please modify the code according to your model.
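As a rough sketch only (assuming the output has been transposed to (8400, 5) with one row per candidate box, and that the coordinates are already in network-input pixels; adjust if your export produces normalized values), the parsing loop for a 1-class model could look like this:

// Sketch: parse a transposed (numBoxes x 5) output for a single-class model.
// Row layout assumed: [x_center, y_center, width, height, confidence].
const int rowSize = 5;
for (int i = 0; i < numBoxes; ++i) {
    const float *row = output + i * rowSize;
    float conf = row[4];
    if (conf < detectionParams.perClassPreclusterThreshold[0])
        continue;

    NvDsInferObjectDetectionInfo obj;
    obj.classId = 0;                      // single-class model
    obj.left   = row[0] - row[2] / 2.0f;  // x_center -> left
    obj.top    = row[1] - row[3] / 2.0f;  // y_center -> top
    obj.width  = row[2];
    obj.height = row[3];
    obj.detectionConfidence = conf;
    objectList.push_back(obj);
}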

nvdsosd does not filter any bboxes. When a bbox is not displayed, it is usually because the bbox in the object meta is incorrect.
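One quick way to check this is to validate each box against the network resolution before pushing it; boxes whose coordinates land far outside the frame (for example, pixel coordinates that are multiplied by the network width/height a second time) are simply not visible. Illustrative check only:

// Sketch: reject boxes that cannot be drawn inside the network resolution.
static bool isPlausibleBox(const NvDsInferObjectDetectionInfo &obj,
                           const NvDsInferNetworkInfo &net)
{
    return obj.width > 0 && obj.height > 0 &&
           obj.left >= 0 && obj.top >= 0 &&
           obj.left + obj.width  <= net.width &&
           obj.top  + obj.height <= net.height;
}

If every detection fails a check like this, the parsed coordinates are in the wrong scale for the object meta.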

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.