Get wrong infer results while testing yolov4 on deepstream 5.0

@ersheng
I did configure the batch size of both [streammux] and [primary-gie]. I will try again and upload the error info later. Thanks

Hi, @pinktree3
Post #12 is useful; I just referred to it.
For the error you met, check your versions of PyTorch and TensorRT:

Darknet2ONNX

PyTorch 1.4.0 for TensorRT 7.0 and higher
PyTorch 1.5.0 and 1.6.0 for TensorRT 7.1.2 and higher

ONNX2TensorRT

Recommended TensorRT versions: 7.0, 7.1

thanks @ersheng @jiejing_ma

@ersheng I followed your yolov4 repo to make a TRT engine for yolov5, which built successfully. I compared the output with the PyTorch model and they are the same. But when I hook it up in DeepStream I am not getting any boxes. I have uploaded the code and relevant files here. Let me know if you have any pointers!

Thanks

Hi y14uc339,

Please open a new topic for your issue. Thanks

@kayccc I have opened a new topic; it's been 2 days! Do you mind having a look? Yolov5 giving unexpected outputs

Thank you for the detailed steps. I followed all of them.
I'm using a custom yolov4 model. When I try to run deepstream-app, I get the following error:

nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initResource() <nvdsinfer_context_impl.cpp:667> [UID = 1]: Detect-postprocessor failed to init resource because dlsym failed to get func NvDsInferParseCustomYoloV4 pointer

Solved:

@y14uc339 Thank you, I found the problem.
@ersheng In config_infer_primary_yoloV4.txt I changed
parse-bbox-func-name=NvDsInferParseCustomYoloV4 —> parse-bbox-func-name=NvDsInferParseYoloV4

Then it works.
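For context, after that change the relevant lines in config_infer_primary_yoloV4.txt look roughly like this (the custom-lib-path below follows the objectDetector_Yolo sample layout and may differ in your setup):

```ini
[property]
parse-bbox-func-name=NvDsInferParseYoloV4
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
```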

@karthick Declare the bounding-box parsing function just like the other functions are declared in the custom impl cpp file and this error will go away. This works:

#include <algorithm>
#include <cassert>
#include <iostream>
#include <vector>
#include "nvdsinfer_custom_impl.h"

// NUM_CLASSES_YOLO and clamp() are defined in the sample nvdsparsebbox_Yolo.cpp;
// they are repeated here so the snippet is self-contained.
static const int NUM_CLASSES_YOLO = 80;

static float clamp(const float val, const float minVal, const float maxVal)
{
    assert(minVal <= maxVal);
    return std::min(maxVal, std::max(minVal, val));
}

extern "C" bool NvDsInferParseYoloV4(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList);



static NvDsInferParseObjectInfo convertBBoxYoloV4(const float& bx, const float& by, const float& bw,
                                     const float& bh, const uint& netW, const uint& netH)
{
    NvDsInferParseObjectInfo b;
    // Restore coordinates to network input resolution
    float xCenter = bx * netW;
    float yCenter = by * netH;

    float w = bw * netW;
    float h = bh * netH;

    float x0 = xCenter - w * 0.5;
    float y0 = yCenter - h * 0.5;
    float x1 = x0 + w;
    float y1 = y0 + h;

    x0 = clamp(x0, 0, netW);
    y0 = clamp(y0, 0, netH);
    x1 = clamp(x1, 0, netW);
    y1 = clamp(y1, 0, netH);

    b.left = x0;
    b.width = clamp(x1 - x0, 0, netW);
    b.top = y0;
    b.height = clamp(y1 - y0, 0, netH);

    return b;
}

static void addBBoxProposalYoloV4(const float bx, const float by, const float bw, const float bh,
                     const uint& netW, const uint& netH, const int maxIndex,
                     const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
{
    NvDsInferParseObjectInfo bbi = convertBBoxYoloV4(bx, by, bw, bh, netW, netH);
    if (bbi.width < 1 || bbi.height < 1) return;

    bbi.detectionConfidence = maxProb;
    bbi.classId = maxIndex;
    binfo.push_back(bbi);
}

static std::vector<NvDsInferParseObjectInfo>
decodeYoloV4Tensor(
    const float* detections, const uint num_bboxes,
    NvDsInferParseDetectionParams const& detectionParams,
    const uint& netW, const uint& netH)
{
    std::vector<NvDsInferParseObjectInfo> binfo;

    uint bbox_location = 0;
    for (uint b = 0; b < num_bboxes; ++b)
    {
        float bx = detections[bbox_location];
        float by = detections[bbox_location + 1];
        float bw = detections[bbox_location + 2];
        float bh = detections[bbox_location + 3];

        float maxProb = 0.0f;
        int maxIndex = -1;

        uint cls_location = bbox_location + 4;
        for (uint c = 0; c < detectionParams.numClassesConfigured; ++c)
        {
            float prob = detections[cls_location + c];
            if (prob > maxProb)
            {
                maxProb = prob;
                maxIndex = c;
            }
        }

        // Guard against maxIndex == -1 (all class scores zero) before indexing
        if (maxIndex >= 0 && maxProb > detectionParams.perClassPreclusterThreshold[maxIndex])
        {
            addBBoxProposalYoloV4(bx, by, bw, bh, netW, netH, maxIndex, maxProb, binfo);
        }

        bbox_location += 4 + detectionParams.numClassesConfigured;
    }

    return binfo;
}

extern "C"  bool NvDsInferParseYoloV4(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList)
{
    if (NUM_CLASSES_YOLO != detectionParams.numClassesConfigured)
    {
        std::cerr << "WARNING: Num classes mismatch. Configured:"
                  << detectionParams.numClassesConfigured
                  << ", detected by network: " << NUM_CLASSES_YOLO << std::endl;
    }

    std::vector<NvDsInferParseObjectInfo> objects;

    const NvDsInferLayerInfo &layer = outputLayersInfo[0]; // num_boxes x (4 + num_classes)

    // 2-dimensional: [num_boxes, 4 + num_classes]
    assert(layer.inferDims.numDims == 2);
    // The second dimension should be 4 + num_classes
    assert(detectionParams.numClassesConfigured == layer.inferDims.d[1] - 4);

    uint num_bboxes = layer.inferDims.d[0];

    // std::cout << "Network Info: " << networkInfo.height << "  " << networkInfo.width << std::endl;

    std::vector<NvDsInferParseObjectInfo> outObjs =
        decodeYoloV4Tensor(
            (const float*)(layer.buffer), num_bboxes, detectionParams,
            networkInfo.width, networkInfo.height);

    objects.insert(objects.end(), outObjs.begin(), outObjs.end());

    objectList = objects;

    return true;
}

/* This is a sample bounding box parsing function for the sample YoloV3 detector model */
static NvDsInferParseObjectInfo convertBBox(const float& bx, const float& by, const float& bw,
                                     const float& bh, const int& stride, const uint& netW,
                                     const uint& netH)
{
    NvDsInferParseObjectInfo b;
    // Restore coordinates to network input resolution
    float xCenter = bx * stride;
    float yCenter = by * stride;
    float x0 = xCenter - bw / 2;
    float y0 = yCenter - bh / 2;
    float x1 = x0 + bw;
    float y1 = y0 + bh;

    x0 = clamp(x0, 0, netW);
    y0 = clamp(y0, 0, netH);
    x1 = clamp(x1, 0, netW);
    y1 = clamp(y1, 0, netH);

    b.left = x0;
    b.width = clamp(x1 - x0, 0, netW);
    b.top = y0;
    b.height = clamp(y1 - y0, 0, netH);

    return b;
}

static void addBBoxProposal(const float bx, const float by, const float bw, const float bh,
                     const uint stride, const uint& netW, const uint& netH, const int maxIndex,
                     const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
{
    NvDsInferParseObjectInfo bbi = convertBBox(bx, by, bw, bh, stride, netW, netH);
    if (bbi.width < 1 || bbi.height < 1) return;

    bbi.detectionConfidence = maxProb;
    bbi.classId = maxIndex;
    binfo.push_back(bbi);
}

/* Check that the custom function has been defined correctly */
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseYoloV4);

@y14uc339 Thank you. The problem was in the config_infer_primary_yoloV4.txt file:
parse-bbox-func-name=NvDsInferParseYoloV4

I followed this solution and it worked like a charm.
However, I can't seem to correctly modify the functions to work with YOLOv4-tiny.
Do you have any tips on what I can do?
Thanks

@y14uc339 @pinktree3 @jiejing_ma @gaylord @hymanzhu1983

Output format on https://github.com/Tianxiaomo/pytorch-YOLOv4 is now split from a single output
[batch_size, num_boxes, 4 + num_classes]
into
[batch_size, num_boxes, 1, 4] and [batch_size, num_boxes, num_classes].

For each bounding box, the [x_center, y_center, H, W] format is changed to [x1, y1, x2, y2].

So, there are corresponding updates in objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo.cpp.

Follow this updated guide if you want to use the latest updates of YoloV4: YoloV4 Manual
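The split-output layout above can be decoded roughly as sketched below. This is a minimal, self-contained illustration: the `Box` struct and `decodeSplitOutputs` function are stand-ins of my own, not the types or names from the repo or from nvdsparsebbox_Yolo.cpp, and it assumes normalized [x1, y1, x2, y2] corner coordinates as described above.

```cpp
#include <cassert>
#include <vector>

// Illustrative stand-in for NvDsInferParseObjectInfo (an assumption, not the real API).
struct Box { float left, top, width, height, conf; int classId; };

// boxes: numBoxes * 4 floats in normalized [x1, y1, x2, y2]
// confs: numBoxes * numClasses class scores
static std::vector<Box> decodeSplitOutputs(const float* boxes, const float* confs,
                                           int numBoxes, int numClasses,
                                           int netW, int netH, float thresh)
{
    std::vector<Box> out;
    for (int i = 0; i < numBoxes; ++i) {
        // Pick the highest-scoring class for this box.
        float maxProb = 0.0f;
        int maxIndex = -1;
        for (int c = 0; c < numClasses; ++c) {
            float p = confs[i * numClasses + c];
            if (p > maxProb) { maxProb = p; maxIndex = c; }
        }
        if (maxIndex < 0 || maxProb < thresh) continue;

        // Coordinates are already corners, so no center-to-corner conversion is needed;
        // just scale back to network input resolution.
        float x1 = boxes[i * 4 + 0] * netW;
        float y1 = boxes[i * 4 + 1] * netH;
        float x2 = boxes[i * 4 + 2] * netW;
        float y2 = boxes[i * 4 + 3] * netH;
        out.push_back({x1, y1, x2 - x1, y2 - y1, maxProb, maxIndex});
    }
    return out;
}
```

The real parser additionally clamps coordinates to the network resolution and drops degenerate boxes, as in the code posted earlier in this thread.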


@ersheng So, I used the pytorch-YOLOv4 repo. Created an ONNX model with batch size 2:

python demo_darknet2onnx.py cfg/yolov4-tiny.cfg yolov4-tiny.weights data/dog.jpg 2

Then created a TensorRT engine:

trtexec --onnx=yolov4_2_3_416_416_fp16.onnx --explicitBatch --saveEngine=yolov4_2_3_416_416_fp16.engine --workspace=2048 --fp16 

Then I made these changes in the deepstream-app config file:

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file:/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4/sixth_video.mp4

#uri=file:/opt/nvidia/deepstream/deepstream-5.0/samples/streams/sample_720p.h264
num-sources=2

[streammux]
batch-size=2

In the detector config file:

[property]
batch-size=2

The error that I am getting is:

root@eca7bf78ed85:/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4# deepstream-app -c deepstream_app_config_yoloV4.txt 
WARNING: ../nvdsinfer/nvdsinfer_func_utils.cpp:34 [TRT]: Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
0:00:02.490156301   128 0x5599102016d0 INFO                 nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1577> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4/yolov4_2_3_416_416_fp16.engine
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:685 [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT input           3x416x416       
1   OUTPUT kFLOAT boxes           2535x1x4        
2   OUTPUT kFLOAT confs           2535x80         

0:00:02.490251454   128 0x5599102016d0 WARN                 nvinfer gstnvinfer.cpp:599:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:1518> [UID = 1]: Backend has maxBatchSize 1 whereas 2 has been requested
0:00:02.490269099   128 0x5599102016d0 WARN                 nvinfer gstnvinfer.cpp:599:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1689> [UID = 1]: deserialized backend context :/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4/yolov4_2_3_416_416_fp16.engine failed to match config params, trying rebuild
0:00:02.493503318   128 0x5599102016d0 INFO                 nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1591> [UID = 1]: Trying to create engine from model files
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:934 failed to build network since there is no model file matched.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:872 failed to build network.
0:00:02.493790285   128 0x5599102016d0 ERROR                nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1611> [UID = 1]: build engine file failed
0:00:02.493815136   128 0x5599102016d0 ERROR                nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1697> [UID = 1]: build backend context failed
0:00:02.493829151   128 0x5599102016d0 ERROR                nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1024> [UID = 1]: generate backend failed, check config file settings
0:00:02.494012819   128 0x5599102016d0 WARN                 nvinfer gstnvinfer.cpp:781:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:02.494030430   128 0x5599102016d0 WARN                 nvinfer gstnvinfer.cpp:781:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4/config_infer_primary_yoloV4.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
** ERROR: <main:651>: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to create NvDsInferContext instance
Debug info: gstnvinfer.cpp(781): gst_nvinfer_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie:
Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4/config_infer_primary_yoloV4.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
App run failed

So, I want to ask how to run yolov4 with batch-size > 1 and process multiple streams in parallel. Thanks

I got the same error. Have you found a solution?

@hymanzhu1983 No not yet!

@hymanzhu1983 @y14uc339

The problem has been reproduced.
We are looking into the issue with inference via DeepStream when explicit_batch_size > 1.

I'm testing with karthick's blog implementation. I'm able to run with explicit_batch_size 1 with a 416x416 fp16 TensorRT model; however, I get very bad results compared to the original darknet yolov4 model. I tried an RTSP CCTV camera as real-time input. What I figured out is that when a lot of classes exist in a frame, for example mouse, keyboard, monitor and person, it is able to predict the correct result; however, when only one class exists in a frame, for example only "person", it can't predict anything. Could this be related to the yolov3 kernels.cu code?

Hi, I followed the instructions in the DeepStream SDK FAQ
and it seems to run somehow.
But I don't get any visual output.

Unknown or legacy key specified 'is-classifier' for group [property]
Opening in BLOCKING MODE 
0:00:03.976461743   698     0x26425b60 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1701> [UID = 1]: deserialized trt engine from :/xavier_ssd/deepstream/deepstream-5.0/sources/objectDetector_Yolo/yolov4_1_3_608_608_fp16.engine
INFO: [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT input           3x608x608       
1   OUTPUT kFLOAT boxes           22743x1x4       
2   OUTPUT kFLOAT confs           22743x80        

0:00:03.976671001   698     0x26425b60 INFO                 nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1805> [UID = 1]: Use deserialized engine model: /xavier_ssd/deepstream/deepstream-5.0/sources/objectDetector_Yolo/yolov4_1_3_608_608_fp16.engine
0:00:03.997042392   698     0x26425b60 INFO                 nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/xavier_ssd/deepstream/deepstream-5.0/sources/objectDetector_Yolo/config_infer_primary_yoloV4.txt sucessfully

Runtime commands:
	h: Print this help
	q: Quit

	p: Pause
	r: Resume

NOTE: To expand a source in the 2D tiled display and view object details, left-click on the source.
      To go back to the tiled display, right-click anywhere on the window.

** INFO: <bus_callback:181>: Pipeline ready

Opening in BLOCKING MODE 
NvMMLiteOpen : Block : BlockType = 261 
NVMEDIA: Reading vendor.tegra.display-size : status: 6 
NvMMLiteBlockCreate : Block : BlockType = 261 
** INFO: <bus_callback:167>: Pipeline running

NvMMLiteOpen : Block : BlockType = 4 
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4 
H264: Profile = 66, Level = 0 
avg bitrate=0 for CBR, force to CQP mode

**PERF:  FPS 0 (Avg)	
**PERF:  28.07 (26.81)	
**PERF:  28.44 (28.37)	
**PERF:  28.54 (28.38)	
**PERF:  28.36 (28.39)	
**PERF:  28.41 (28.39)	
**PERF:  28.52 (28.43)	
**PERF:  28.35 (28.42)	
**PERF:  28.40 (28.42)	
**PERF:  28.45 (28.41)	
**PERF:  28.39 (28.41)	

Any suggestions on what I'm doing wrong?

Hi driver05,

Please open a new topic for your issue.
Thanks

You should check your [sink] section.
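For reference, a minimal on-screen sink group in the deepstream-app config looks something like this (values are illustrative and follow the sample deepstream-app configs, where type=2 is EglSink):

```ini
[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming
type=2
sync=0
source-id=0
```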

Well, I use the 608_608_fp16 model but I can't reproduce your problem.
The problem I have is with explicit_batch_size > 1, as long as I don't convert with the right batch size.
I see this only with the python test 3 file, not with the deepstream-app test file; there it works…
