YOLOv8 model detection output printing is confusing

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
• DeepStream Version
```
hbrain@ubuntu:~/DeepStream-Yolo$ deepstream-app --version-all
deepstream-app version 6.3.0
DeepStreamSDK 6.3.0
CUDA Driver Version: 11.4
CUDA Runtime Version: 11.4
TensorRT Version: 8.5
cuDNN Version: 8.6
libNVWarp360 Version: 2.0.1d3

hbrain@ubuntu:~/DeepStream-Yolo$
```
• JetPack Version (5.1.3)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (ERROR from src_elem: Internal data stream error.)

I am trying to do object detection using YOLOv8 in DeepStream 6.3.0.

Hi, I am working on deploying a YOLOv8 model to an NVIDIA Jetson Xavier.

I am using the GitHub repo: marcoslucianops/DeepStream-Yolo (NVIDIA DeepStream SDK 7.1 / 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models).

I could not understand the logic in the nvdsparsebbox_Yolo.cpp file.

What I want is to print the bbox results to the terminal and send them via MQTT. I have added print statements to see the bbox information:

```cpp
#include "nvdsinfer_custom_impl.h"
#include "utils.h"
#include <iomanip>

extern "C" bool
NvDsInferParseYolo(std::vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams, std::vector<NvDsInferParseObjectInfo>& objectList);

static NvDsInferParseObjectInfo
convertBBox(const float& bx1, const float& by1, const float& bx2, const float& by2, const uint& netW, const uint& netH)
{
  NvDsInferParseObjectInfo b;

  float x1 = bx1;
  float y1 = by1;
  float x2 = bx2;
  float y2 = by2;
  
  // 🔍 DEBUG: Print raw YOLO output coordinates
  std::cout << "🔍 RAW YOLO OUTPUT:" << std::endl;
  std::cout << "   → Network Resolution: " << netW << "x" << netH << std::endl;
  std::cout << "   → Raw coordinates: (" << bx1 << ", " << by1 << ") to (" << bx2 << ", " << by2 << ")" << std::endl;
  std::cout << "   → Raw size: " << std::fixed << std::setprecision(1) << (bx2 - bx1) << "x" << (by2 - by1) << " pixels" << std::endl;

  x1 = clamp(x1, 0, netW);
  y1 = clamp(y1, 0, netH);
  x2 = clamp(x2, 0, netW);
  y2 = clamp(y2, 0, netH);

  b.left = x1;
  b.width = clamp(x2 - x1, 0, netW);
  b.top = y1;
  b.height = clamp(y2 - y1, 0, netH);
  
  // 🔍 DEBUG: Print processed coordinates
  std::cout << "🔍 PROCESSED BBOX (" << netW << "x" << netH << "):" << std::endl;
  std::cout << "   → Final BBox: left=" << (int)b.left << ", top=" << (int)b.top << ", width=" << (int)b.width << ", height=" << (int)b.height << std::endl;
  std::cout << "   → Top-Left: (" << (int)b.left << ", " << (int)b.top << ")" << std::endl;
  std::cout << "   → Bottom-Right: (" << (int)(b.left + b.width) << ", " << (int)(b.top + b.height) << ")" << std::endl;
  
  // 🔍 DEBUG: Show what this would look like scaled to 1920x1080
  if (netW == 640 && netH == 640) {
    float scale_x = 1920.0f / 640.0f;  // 3.0
    float scale_y = 1080.0f / 640.0f;  // 1.6875
    
    std::cout << "🔍 SCALED TO 1920x1080:" << std::endl;
    std::cout << "   → Scaled BBox: left=" << (int)(b.left * scale_x) << ", top=" << (int)(b.top * scale_y) 
              << ", width=" << (int)(b.width * scale_x) << ", height=" << (int)(b.height * scale_y) << std::endl;
    std::cout << "   → Scaled Top-Left: (" << (int)(b.left * scale_x) << ", " << (int)(b.top * scale_y) << ")" << std::endl;
    std::cout << "   → Scaled Bottom-Right: (" << (int)((b.left + b.width) * scale_x) << ", " << (int)((b.top + b.height) * scale_y) << ")" << std::endl;
  }
  std::cout << "   ═══════════════════════════════════════════════════════════════" << std::endl;

  return b;
}

static void
addBBoxProposal(const float bx1, const float by1, const float bx2, const float by2, const uint& netW, const uint& netH,
    const int maxIndex, const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
{
  std::cout << "\n📦 BBOX PROPOSAL #" << (binfo.size() + 1) << ":" << std::endl;
  std::cout << "   → Resolution Context: " << netW << "x" << netH << std::endl;
  std::cout << "   → Class: " << maxIndex << " | Confidence: " << std::fixed << std::setprecision(3) << maxProb << std::endl;

  NvDsInferParseObjectInfo bbi = convertBBox(bx1, by1, bx2, by2, netW, netH);

  if (bbi.width < 1 || bbi.height < 1) {
    std::cout << "❌ REJECTED: BBox too small (width=" << bbi.width << ", height=" << bbi.height << ")" << std::endl;
    std::cout << "───────────────────────────────────────────────────────────────" << std::endl;
    return;
  }

  bbi.detectionConfidence = maxProb;
  bbi.classId = maxIndex;
  binfo.push_back(bbi);
  
  std::cout << "✅ ADDED TO DEEPSTREAM QUEUE" << std::endl;
  std::cout << "───────────────────────────────────────────────────────────────" << std::endl;
}

static std::vector<NvDsInferParseObjectInfo>
decodeTensorYolo(const float* output, const uint& outputSize, const uint& netW, const uint& netH,
    const std::vector<float>& preclusterThreshold)
{
  std::vector<NvDsInferParseObjectInfo> binfo;
  
  std::cout << "\n🔍 YOLO TENSOR DECODING:" << std::endl;
  std::cout << "   → Output size: " << outputSize << " detections" << std::endl;
  std::cout << "   → Network resolution: " << netW << "x" << netH << std::endl;
  std::cout << "   → Confidence threshold: " << preclusterThreshold[0] << std::endl;
  std::cout << "═══════════════════════════════════════════════════════════════════════════════════════" << std::endl;

  int valid_detections = 0;
  
  for (uint b = 0; b < outputSize; ++b) {
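    // The exported model's output packs each detection as 6 floats:
    // [x1, y1, x2, y2, confidence, class_id], with coordinates in
    // network (netW x netH) pixel space.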
    float maxProb = output[b * 6 + 4];
    int maxIndex = (int) output[b * 6 + 5];

    if (maxProb < preclusterThreshold[maxIndex]) {
      continue;
    }
    
    if (maxIndex != 0) {  // Only process person class
      continue;
    }
    
    valid_detections++;

    float bx1 = output[b * 6 + 0];
    float by1 = output[b * 6 + 1];
    float bx2 = output[b * 6 + 2];
    float by2 = output[b * 6 + 3];

    std::cout << "\n🎯 VALID DETECTION #" << valid_detections << " FOUND:" << std::endl;
    addBBoxProposal(bx1, by1, bx2, by2, netW, netH, maxIndex, maxProb, binfo);
  }
  
  std::cout << "\n📊 YOLO DECODING SUMMARY:" << std::endl;
  std::cout << "   → Total raw outputs: " << outputSize << std::endl;
  std::cout << "   → Valid detections: " << valid_detections << std::endl;
  std::cout << "   → Final proposals: " << binfo.size() << std::endl;
  std::cout << "═══════════════════════════════════════════════════════════════════════════════════════" << std::endl;

  return binfo;
}

static bool
NvDsInferParseCustomYolo(std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo, NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList)
{
  if (outputLayersInfo.empty()) {
    std::cerr << "ERROR: Could not find output layer in bbox parsing" << std::endl;
    return false;
  }

  std::cout << "\n🚀 YOLO PARSING STARTED:" << std::endl;
  std::cout << "╔══════════════════════════════════════════════════════════════════════════════╗" << std::endl;
  std::cout << "║                         YOLO PARSING DEBUG                                  ║" << std::endl;
  std::cout << "╠══════════════════════════════════════════════════════════════════════════════╣" << std::endl;
  std::cout << "║ Network Input Resolution : " << std::setw(4) << networkInfo.width << "x" << std::setw(4) << networkInfo.height << "                                ║" << std::endl;
  std::cout << "║ Model processes in       : " << std::setw(4) << networkInfo.width << "x" << std::setw(4) << networkInfo.height << " coordinate space              ║" << std::endl;
  std::cout << "║ DeepStream will display  : scaled to output resolution                      ║" << std::endl;
  std::cout << "╚══════════════════════════════════════════════════════════════════════════════╝" << std::endl;

  std::vector<NvDsInferParseObjectInfo> objects;

  const NvDsInferLayerInfo& output = outputLayersInfo[0];
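  // Note: this assumes the output tensor has shape [num_detections, 6]
  // (matching the b * 6 + k indexing in decodeTensorYolo above), so d[0]
  // is the number of candidate detections.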
  const uint outputSize = output.inferDims.d[0];

  std::vector<NvDsInferParseObjectInfo> outObjs = decodeTensorYolo((const float*) (output.buffer), outputSize,
      networkInfo.width, networkInfo.height, detectionParams.perClassPreclusterThreshold);

  objects.insert(objects.end(), outObjs.begin(), outObjs.end());

  // 🚀 FINAL DEBUG: What's actually being sent to DeepStream
  std::cout << "\n🚀 FINAL OUTPUT TO DEEPSTREAM:" << std::endl;
  std::cout << "╔══════════════════════════════════════════════════════════════════════════════╗" << std::endl;
  std::cout << "║                    EXACT DATA SENT TO DEEPSTREAM OSD                        ║" << std::endl;
  std::cout << "╚══════════════════════════════════════════════════════════════════════════════╝" << std::endl;
  
  if (objects.size() == 0) {
    std::cout << "⚠️  NO OBJECTS TO SEND TO DEEPSTREAM - Nothing will be drawn" << std::endl;
  } else {
    for (size_t i = 0; i < objects.size(); i++) {
      const auto& obj = objects[i];
      std::cout << "\n🎯 DeepStream Object [" << (i+1) << "/" << objects.size() << "]:" << std::endl;
      std::cout << "   → ClassID: " << obj.classId << " (Person)" << std::endl;
      std::cout << "   → Confidence: " << std::fixed << std::setprecision(3) << obj.detectionConfidence << std::endl;
      std::cout << "   → EXACT OSD Coordinates (" << networkInfo.width << "x" << networkInfo.height << "):" << std::endl;
      std::cout << "     left=" << (int)obj.left << ", top=" << (int)obj.top << ", width=" << (int)obj.width << ", height=" << (int)obj.height << std::endl;
      std::cout << "   → Top-Left: (" << (int)obj.left << ", " << (int)obj.top << ")" << std::endl;
      std::cout << "   → Bottom-Right: (" << (int)(obj.left + obj.width) << ", " << (int)(obj.top + obj.height) << ")" << std::endl;
      
      // Show expected display size if using 640x640 model
      if (networkInfo.width == 640 && networkInfo.height == 640) {
        float scale_x = 1920.0f / 640.0f;
        float scale_y = 1080.0f / 640.0f;
        std::cout << "   → Expected Display Size (if 1920x1080 output):" << std::endl;
        std::cout << "     left=" << (int)(obj.left * scale_x) << ", top=" << (int)(obj.top * scale_y) 
                  << ", width=" << (int)(obj.width * scale_x) << ", height=" << (int)(obj.height * scale_y) << std::endl;
      }
      std::cout << "   ═══════════════════════════════════════════════════════════════" << std::endl;
    }
  }
  
  std::cout << "\n📺 NEXT STEP: DeepStream OSD will draw " << objects.size() << " bounding boxes on video" << std::endl;
  std::cout << "   → If display shows DIFFERENT sizes than above coordinates," << std::endl;
  std::cout << "     then tracker/DeepStream is modifying them post-processing." << std::endl;
  std::cout << "   → If display shows SAME sizes as above coordinates," << std::endl;
  std::cout << "     then this is exactly what YOLO model detected." << std::endl;
  std::cout << "═══════════════════════════════════════════════════════════════════════════════════════\n" << std::endl;

  objectList = objects;

  return true;
}

extern "C" bool
NvDsInferParseYolo(std::vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams, std::vector<NvDsInferParseObjectInfo>& objectList)
{
  return NvDsInferParseCustomYolo(outputLayersInfo, networkInfo, detectionParams, objectList);
}

CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseYolo);
```

and the result is:

```
🎯 DeepStream Object [1/4]:
   → ClassID: 0 (Person)
   → Confidence: 0.471
   → EXACT OSD Coordinates (640x640):
     left=473, top=317, width=14, height=47
   → Top-Left: (473, 317)
   → Bottom-Right: (487, 364)
   → Expected Display Size (if 1920x1080 output):
     left=1419, top=535, width=42, height=79
```

I then plotted the bbox info onto an image, but the box is drawn over only half of the object,

whereas deepstream-app draws the box around the whole object.

My image size is 1920x1080, so the detected object's bbox should be around left=1419, top=535, width=42, height=150.

There is no issue with the left, top, and width values, but the height comes out roughly halved (79 instead of ~150).
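For reference, the scaling in my debug code stretches x and y independently (scale_y = 1080/640 = 1.6875). If nvinfer letterboxes the input instead (maintain-aspect-ratio=1 with symmetric-padding=1, as in the DeepStream-Yolo sample configs), both axes use one scale factor plus a vertical offset. Here is a sketch of that mapping (Box and netToFrame are illustrative names of mine, not a DeepStream API):

```cpp
// Sketch only: mapping a network-space box back to the frame when the input
// was letterboxed. For 1920x1080 -> 640x640: scale = 640/1920 = 1/3, and the
// vertical pad is (640 - 1080/3) / 2 = 140 with symmetric padding.
#include <algorithm>

struct Box { float left, top, width, height; };

static Box netToFrame(const Box& b, float netW, float netH, float frameW, float frameH)
{
  float scale = std::min(netW / frameW, netH / frameH);  // ONE scale for both axes
  float padX = (netW - frameW * scale) / 2.0f;           // 0 for 1920x1080
  float padY = (netH - frameH * scale) / 2.0f;           // 140 for 1920x1080
  return { (b.left - padX) / scale, (b.top - padY) / scale,
           b.width / scale, b.height / scale };
}

// The detection above, {473, 317, 14, 47}, maps to roughly
// left=1419, top=531, width=42, height=141, close to the expected height
// of ~150, whereas scaling y by 1080/640 = 1.6875 gives the height=79
// printed above.
```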

Can anyone help me?

First things first, please use ``` when copy-pasting stack traces or code (I’ve edited your post to use that); otherwise it’s not really readable.

How do you plot it? Does your program use DeepStream? If you use nvinfer + nvdsosd, this problem will not occur, because deepstream-app works properly. You can verify what is actually drawn by reading the object metadata in a pad probe on the nvdsosd sink pad, as in the sketch below.
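A minimal sketch of such a probe, following the deepstream-test1 sample pattern (the element variable osd and the attach point are illustrative):

```cpp
#include <gst/gst.h>
#include "gstnvdsmeta.h"

static GstPadProbeReturn
osd_sink_pad_probe (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
{
  GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);
  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
  if (!batch_meta)
    return GST_PAD_PROBE_OK;

  for (NvDsMetaList *l_frame = batch_meta->frame_meta_list; l_frame; l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
    for (NvDsMetaList *l_obj = frame_meta->obj_meta_list; l_obj; l_obj = l_obj->next) {
      NvDsObjectMeta *obj = (NvDsObjectMeta *) l_obj->data;
      // rect_params is already in the stream's resolution (e.g. 1920x1080),
      // i.e. exactly what nvdsosd will draw, not the 640x640 network space
      // used inside the custom bbox parser.
      g_print ("class=%d conf=%.3f left=%.0f top=%.0f width=%.0f height=%.0f\n",
          obj->class_id, obj->confidence,
          obj->rect_params.left, obj->rect_params.top,
          obj->rect_params.width, obj->rect_params.height);
    }
  }
  return GST_PAD_PROBE_OK;
}

/* Attach after building the pipeline:
 *   GstPad *pad = gst_element_get_static_pad (osd, "sink");
 *   gst_pad_add_probe (pad, GST_PAD_PROBE_TYPE_BUFFER, osd_sink_pad_probe, NULL, NULL);
 *   gst_object_unref (pad);
 */
```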

This is unnecessary: you can specify filter-out-class-ids=x in the nvinfer configuration file to filter out the classes you don't need. For example:
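A sketch of the relevant [property] key; the IDs shown are illustrative (for an 80-class COCO model you would list every ID except 0):

```
[property]
# ... existing nvinfer settings ...
# Semicolon-separated list of class IDs to discard, e.g. to keep only
# class 0 (person):
filter-out-class-ids=1;2;3
```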

If you want to send bboxes over MQTT, you can also use nvmsgconv and nvmsgbroker. For example:
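In deepstream-app this is a sink group with type=6 (message broker). A minimal sketch, assuming the MQTT protocol adapter shipped with your DeepStream install; the converter config path, broker address, and topic are placeholders:

```
[sink1]
enable=1
# type=6 selects the message-broker sink (nvmsgconv + nvmsgbroker)
type=6
# payload-type 0 = DeepStream JSON schema
msg-conv-payload-type=0
msg-conv-config=msgconv_config.txt
msg-broker-proto-lib=/opt/nvidia/deepstream/deepstream/lib/libnvds_mqtt_proto.so
msg-broker-conn-str=localhost;1883
topic=detections
```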

I don’t understand what you mean; please describe in detail how you plot the bbox.