YoloV4 BBox confidence values are wrong

yousef.hesham1 · August 4, 2022, 1:29am

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) → NVIDIA GeForce GTX 1650
• DeepStream Version → 6.1
• JetPack Version (valid for Jetson only) → NA
• TensorRT Version → TensorRT 8.2.5.1
• NVIDIA GPU Driver Version (valid for GPU only) → NVIDIA driver 515
• Issue Type (questions, new requirements, bugs) → bugs

Hello,
I’ve followed this updated YoloV4 guide: YoloV4 Manual

I also used this repo: Following Repo

The application runs, but objectList.size() reports 0 objects, which means I’m not getting any bounding boxes, as seen here:

After some debugging, it turns out the confidence values the network produces are always less than 0.1.

After printing each object’s confidence against the configured threshold, here’s the DeepStream output:

maxProb=0.00624847 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00129032 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.71065e-05 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.72853e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000273466 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00196075 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00395584 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00308037 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00318336 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00321007 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00319672 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.0032177 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00320625 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00320625 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00319481 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00319672 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00317001 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00313568 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00318336 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00424194 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00452423 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000881195 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.46031e-05 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.36442e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=4.13656e-05 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000129342 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000457287 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000349283 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000365019 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000364304 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000365973 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000362635 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000362635 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000363111 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000366449 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000365496 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000365496 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000365973 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000362158 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000420094 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.000361919 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0.00010705 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=4.17233e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.96046e-08 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.96046e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=3.8743e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.78165e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.48363e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.36442e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.36442e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.36442e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.30481e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.30481e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.36442e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.36442e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.42402e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.42402e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.42402e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.36442e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=7.09295e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=8.46386e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.2517e-06 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.78814e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=0 detectionParams.perClassPreclusterThreshold[maxIndex]=0
maxProb=0 detectionParams.perClassPreclusterThreshold[maxIndex]=0
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.78814e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.78814e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=1.19209e-07 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
maxProb=5.96046e-08 detectionParams.perClassPreclusterThreshold[maxIndex]=0.4
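With one line per box, a log like the one above is hard to scan. A per-frame summary can show at a glance whether any score comes close to the threshold. The following is a minimal sketch, assuming the same flat [num_boxes x num_classes] score layout the parser reads; `ScoreSummary` and `summarizeScores` are hypothetical names and are not part of DeepStream.

```cpp
#include <cstddef>

// Hypothetical helper type: aggregate view of one frame's score tensor.
struct ScoreSummary {
    float maxScore = 0.0f;               // highest class score in the frame
    std::size_t bestBox = 0;             // box index holding maxScore
    std::size_t bestClass = 0;           // class index holding maxScore
    std::size_t boxesAboveThreshold = 0; // boxes whose best score passes the threshold
};

// Scans a flat score buffer laid out as [numBoxes x numClasses] (assumed layout).
ScoreSummary summarizeScores(const float* scores, std::size_t numBoxes,
                             std::size_t numClasses, float threshold)
{
    ScoreSummary summary;
    for (std::size_t b = 0; b < numBoxes; ++b) {
        float boxMax = 0.0f;
        for (std::size_t c = 0; c < numClasses; ++c) {
            const float v = scores[b * numClasses + c];
            if (v > boxMax) {
                boxMax = v;
            }
            if (v > summary.maxScore) {
                summary.maxScore = v;
                summary.bestBox = b;
                summary.bestClass = c;
            }
        }
        if (boxMax > threshold) {
            ++summary.boxesAboveThreshold;
        }
    }
    return summary;
}
```

Called once per frame with the score buffer and the configured threshold (0.4 here), this would report a single maximum instead of thousands of lines. If that maximum never rises above roughly 0.01, as in the log, the scores themselves are suspect rather than the threshold.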

Some files that may help with your inspection are below:

nvdsinfer_yolov4parser.cpp file:

/*
 * Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstring>
#include <fstream>
#include <iostream>
#include <unordered_map>
#include <vector>
#include "nvdsinfer_custom_impl.h"

static const int NUM_CLASSES_YOLO = 80;

float clamp(const float val, const float minVal, const float maxVal)
{
    assert(minVal <= maxVal);
    return std::min(maxVal, std::max(minVal, val));
}

extern "C" bool NvDsInferParseCustomYoloV4(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList);


/* YOLOv4 implementations */
static NvDsInferParseObjectInfo convertBBoxYoloV4(const float& bx1, const float& by1, const float& bx2,
                                     const float& by2, const uint& netW, const uint& netH)
{
    NvDsInferParseObjectInfo b;
    // Restore coordinates to network input resolution

    float x1 = bx1 * netW;
    float y1 = by1 * netH;
    float x2 = bx2 * netW;
    float y2 = by2 * netH;

    x1 = clamp(x1, 0, netW);
    y1 = clamp(y1, 0, netH);
    x2 = clamp(x2, 0, netW);
    y2 = clamp(y2, 0, netH);

    b.left = x1;
    b.width = clamp(x2 - x1, 0, netW);
    b.top = y1;
    b.height = clamp(y2 - y1, 0, netH);

    return b;
}

static void addBBoxProposalYoloV4(const float bx, const float by, const float bw, const float bh,
                     const uint& netW, const uint& netH, const int maxIndex,
                     const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
{
    NvDsInferParseObjectInfo bbi = convertBBoxYoloV4(bx, by, bw, bh, netW, netH);

    std::cerr << "b.left="<<bbi.left<< " b.width=" << bbi.width << " b.top=" << bbi.top  << " b.height=" << bbi.height;

    if (bbi.width < 1 || bbi.height < 1) return;

    bbi.detectionConfidence = maxProb;


    bbi.classId = maxIndex;
    // Separate the fields and end the line so the debug output stays readable.
    std::cerr << " maxProb=" << maxProb << " maxIndex=" << maxIndex << std::endl;

    binfo.push_back(bbi);
}

static std::vector<NvDsInferParseObjectInfo>
decodeYoloV4Tensor(
    const float* boxes, const float* scores,
    const uint num_bboxes, NvDsInferParseDetectionParams const& detectionParams,
    const uint& netW, const uint& netH)
{
    std::vector<NvDsInferParseObjectInfo> binfo;

    uint bbox_location = 0;
    uint score_location = 0;
    for (uint b = 0; b < num_bboxes; ++b)
    {
        float bx1 = boxes[bbox_location];
        float by1 = boxes[bbox_location + 1];
        float bx2 = boxes[bbox_location + 2];
        float by2 = boxes[bbox_location + 3];

        float maxProb = 0.0f;
        int maxIndex = -1;

        for (uint c = 0; c < detectionParams.numClassesConfigured; ++c)
        {
            float prob = scores[score_location + c];
            if (prob > maxProb)
            {
                maxProb = prob;
                maxIndex = c;
            }
        }

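        // maxIndex stays -1 when no class score is above zero; indexing
        // perClassPreclusterThreshold with -1 reads out of bounds, so skip such boxes.
        if (maxIndex >= 0)
        {
            std::cerr << "maxProb=" << maxProb
                      << " detectionParams.perClassPreclusterThreshold[maxIndex]="
                      << detectionParams.perClassPreclusterThreshold[maxIndex] << std::endl;

            if (maxProb > detectionParams.perClassPreclusterThreshold[maxIndex])
            {
                addBBoxProposalYoloV4(bx1, by1, bx2, by2, netW, netH, maxIndex, maxProb, binfo);
            }
        }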

        bbox_location += 4;
        score_location += detectionParams.numClassesConfigured;
    }

    return binfo;
}

extern "C" bool NvDsInferParseCustomYoloV4(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList)
{
    if (NUM_CLASSES_YOLO != detectionParams.numClassesConfigured)
    {
        std::cerr << "WARNING: Num classes mismatch. Configured:"
                  << detectionParams.numClassesConfigured
                  << ", detected by network: " << NUM_CLASSES_YOLO << std::endl;
    }

    std::vector<NvDsInferParseObjectInfo> objects;

    const NvDsInferLayerInfo &boxes = outputLayersInfo[0]; // num_boxes x 4
    const NvDsInferLayerInfo &scores = outputLayersInfo[1]; // num_boxes x num_classes

    // 3 dimensional: [num_boxes, 1, 4]
    assert(boxes.inferDims.numDims == 3);
    // 2 dimensional: [num_boxes, num_classes]
    assert(scores.inferDims.numDims == 2);

    // The second dimension should be num_classes
    assert(detectionParams.numClassesConfigured == scores.inferDims.d[1]);
    
    uint num_bboxes = boxes.inferDims.d[0];

    // std::cout << "Network Info: " << networkInfo.height << "  " << networkInfo.width << std::endl;

    std::vector<NvDsInferParseObjectInfo> outObjs =
        decodeYoloV4Tensor(
            (const float*)(boxes.buffer), (const float*)(scores.buffer), num_bboxes, detectionParams,
            networkInfo.width, networkInfo.height);

    objects.insert(objects.end(), outObjs.begin(), outObjs.end());

    objectList = objects;
    
    // std::cerr << "After postprocessing objects.size()" << objects.size() << std::endl;
    // std::cerr << "After postprocessing objectList.size()" << objectList.size() << std::endl;

    return true;
}
/* YOLOv4 implementations end*/


/* Check that the custom function has been defined correctly */
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomYoloV4);
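
The parser above checks the output shapes with assert, which is compiled out when the library is built with NDEBUG defined, so a mismatched export could go unnoticed in a release build. A runtime check that makes the parse function return false is one alternative. This is only a sketch: `LayerShape` is a hypothetical stand-in for the dimension info of an output layer, and `shapesLookValid` is not part of DeepStream.

```cpp
#include <vector>

// Hypothetical stand-in for the dimensions reported for one output layer.
struct LayerShape {
    std::vector<unsigned int> dims;
};

// Returns true when the layers match what the parser expects:
// boxes [num_boxes, 1, 4] and scores [num_boxes, num_classes].
bool shapesLookValid(const LayerShape& boxes, const LayerShape& scores,
                     unsigned int numClasses)
{
    if (boxes.dims.size() != 3 || scores.dims.size() != 2) {
        return false;
    }
    if (boxes.dims[1] != 1 || boxes.dims[2] != 4) {
        return false;
    }
    if (scores.dims[1] != numClasses) {
        return false;
    }
    // Both tensors must describe the same number of candidate boxes.
    return boxes.dims[0] == scores.dims[0];
}
```

A check like this also catches the case where the two output layers arrive in the opposite order from the one the parser assumes.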

config_YoloV4.txt file:

################################################################################
#
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
################################################################################

# Following properties are mandatory when engine files are not specified:
#   int8-calib-file(Only in INT8), model-file-format
#   Caffemodel mandatory properties: model-file, proto-file, output-blob-names
#   UFF: uff-file, input-dims, uff-input-blob-name, output-blob-names
#   ONNX: onnx-file
#
# Mandatory properties for detectors:
#   num-detected-classes
#
# Optional properties for detectors:
#   cluster-mode(Default=Group Rectangles), interval(Primary mode only, Default=0)
#   custom-lib-path
#   parse-bbox-func-name
#
# Mandatory properties for classifiers:
#   classifier-threshold, is-classifier
#
# Optional properties for classifiers:
#   classifier-async-mode(Secondary mode only, Default=false)
#
# Optional properties in secondary mode:
#   operate-on-gie-id(Default=0), operate-on-class-ids(Defaults to all classes),
#   input-object-min-width, input-object-min-height, input-object-max-width,
#   input-object-max-height
#
# Following properties are always recommended:
#   batch-size(Default=1)
#
# Other optional properties:
#   net-scale-factor(Default=1), network-mode(Default=0 i.e FP32),
#   model-color-format(Default=0 i.e. RGB) model-engine-file, labelfile-path,
#   mean-file, gie-unique-id(Default=0), offsets, process-mode (Default=1 i.e. primary),
#   custom-lib-path, network-mode(Default=0 i.e FP32)
#
# The values in the config file are overridden by values set through GObject
# properties.

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
#0=RGB, 1=BGR
model-color-format=0
model-engine-file=../../data/models/YoloV4/yolov4_-1_3_640_640_dynamic.engine
labelfile-path=../../data/models/YoloV4/labels.txt
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=80
gie-unique-id=1
network-type=0
is-classifier=0
## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=2
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV4
custom-lib-path=../nvdsinfer_yolov4parser/libnvds_YoloV4Parser.so
#scaling-filter=0
#scaling-compute-hw=0

[class-attrs-all]
nms-iou-threshold=0.6
pre-cluster-threshold=0.4

yuweiw · August 4, 2022, 9:24am

Hi @yousef.hesham1, which application are you using with your model in DeepStream? Could you share your stream? Thanks

yousef.hesham1 · August 4, 2022, 9:27am

Hello @yuweiw, Thanks for the quick response!
I’m using the apps/deepstream-imagedata-multistream sample application.
If by stream you mean the video I’m testing with, I’m using the following 2 files as part of a 6-stream test:

file:///opt/nvidia/deepstream/deepstream-6.1/samples/streams/sample_1080p_h264.mp4
file:///opt/nvidia/deepstream/deepstream-6.1/samples/streams/sample_1080p_h265.mp4

And they produce no bounding boxes. Could it be an ONNX conversion issue?

yuweiw · August 4, 2022, 11:29am

====>Could it be an ONNX conversion issue?
It might be. You can try testing with our yolov4 TRT model and post-process function. Thanks
Please refer to the link below to download our yolov4 model.
https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps

yousef.hesham1 · August 4, 2022, 1:39pm

@yuweiw I tried the TAO deployable yolov4 model (yolov4_resnet18_395.etlt) and here’s the output I’m getting:

Can you assist with this, please? I can also upload the converted ONNX file I used when I first encountered this issue, if it needs inspection.

yuweiw · August 5, 2022, 7:17am

From the video you attached, we can see that the labels are displayed. So maybe the way you draw the bbox is wrong. You can refer to the draw_bounding_boxes function in the deepstream-imagedata-multistream file. You can also draw anything on the picture as a test.

yousef.hesham1 · August 5, 2022, 7:25pm

Hello @yuweiw
I understand that this function is only used to draw on image data extracted from the object meta, which is then saved to disk.

The ObjectList bounding boxes themselves are drawn by NVIDIA plugins in the pipeline. Please correct me if I’m wrong.

Anyway, I’ve double-checked the function and it’s exactly the same as in the original sample.

Could it be a conversion issue?

yuweiw · August 8, 2022, 11:13am

====>Please correct me if I’m wrong.
You are right. We use the osd plugin to draw.
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvdsosd.html
====>Could it be a conversion issue?
I think you used the TAO deployable yolov4 model, so there is no conversion.
Could you share your changes to deepstream-imagedata-multistream? Thanks

yousef.hesham1 · August 8, 2022, 8:54pm

I did use the TAO deployable yolov4 model, and the results are shown above in the comments.
Not many changes were made other than adding the tracker config.

yuweiw · August 12, 2022, 9:43am

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

The reason may be your post-process algorithm. I used the yolov4 model from our TAO app to run the deepstream-imagedata-multistream demo. With your post-process, it cannot draw the bbox and the number of objects recognized is 0. But with the post-process in the TAO app, it works well. You can try this from the link below:
https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps/blob/master/post_processor/nvdsinfer_custombboxparser_tao.cpp

yousef.hesham1 · August 31, 2022, 1:59pm

It turns out the conversion from the original PyTorch weights to ONNX was the issue, not the post-process algorithm. I had to use a specific PyTorch version in order to generate a functioning ONNX file.

Many thanks
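
A conversion problem of this kind can often be caught before deployment by running the same input through the original PyTorch model and through the exported ONNX/TensorRT path, then comparing the two score tensors. The following is a minimal sketch, assuming both tensors have already been dumped to flat float arrays of equal length (how they are dumped is not covered in this thread); `maxAbsDiff` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Largest element-wise absolute difference between two flat tensors.
// Throws if the tensors do not have the same number of elements.
float maxAbsDiff(const std::vector<float>& reference,
                 const std::vector<float>& candidate)
{
    if (reference.size() != candidate.size()) {
        throw std::invalid_argument("tensor sizes differ");
    }
    float worst = 0.0f;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        worst = std::max(worst, std::fabs(reference[i] - candidate[i]));
    }
    return worst;
}
```

A small difference is expected from FP16 rounding, but a reference score near 0.9 against an exported score near 0.003 would point at the export step rather than the parser.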

system · September 14, 2022, 1:59pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
