Getting wrong inference results while testing YoloV4 on DeepStream 5.0

@y14uc339

The highest Yolo version that the CUDA kernel in /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/ can support is YoloV3.

We are trying to embed the Yolo layer into the TensorRT engine before deploying to DeepStream, so the Yolo CUDA kernel in DeepStream is no longer used. You can have a look at my previous post here: YoloV4 Solution.

YoloV5 may have a similar problem, and we will work on it by applying the same solution. But you can also imitate this YoloV4 solution to solve your YoloV5 problem yourself.


@ersheng this might be a dumb question! I understand that the highest Yolo version the CUDA kernel in /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/ can support is YoloV3. But I am not using that kernel to implement yolov5; I am using a different kernel. So are you saying that even a different CUDA kernel implementation that works for yolov5 in TRT would not work in DeepStream?

@y14uc339 @CJR

Sorry for the misunderstanding.
CJR is providing you a solution that suits the YoloV5 from https://github.com/wang-xinyu/tensorrtx in this thread, and you can continue to follow that approach.

However, I can give you my suggestion, which follows a different workflow:
Pytorch --> ONNX --> TRT
Converting to ONNX first is the more standardized way to handle YoloV5 from the official page: https://github.com/ultralytics/yolov5. A minimal export sketch is shown below.
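This is a generic sketch of the Pytorch --> ONNX step, not the repo's own export script; the function name, tensor names, and resolution are illustrative assumptions:

import torch

def export_to_onnx(model: torch.nn.Module, onnx_path: str,
                   batch: int = 1, height: int = 608, width: int = 608) -> None:
    # Export a loaded Pytorch detector to ONNX with a fixed (explicit) batch size
    # and input resolution baked into the graph.
    model.eval()
    dummy_input = torch.randn(batch, 3, height, width)
    torch.onnx.export(
        model,
        dummy_input,
        onnx_path,
        input_names=["input"],
        output_names=["output"],
        opset_version=11,
    )

The resulting .onnx file can then be passed to trtexec, as in the commands later in this thread.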

You can choose either way to solve your problem and I hope they do not clash with each other.

@ersheng Thanks!

@ersheng I’ll try both ways since @CJR is busy/unavailable at the moment. I’ll go with the Pytorch -> ONNX -> TRT approach. It would be great if you could help out with the custom parsing functions and config files for a smooth implementation of yolov5 in TRT!
Thanks

@ersheng Thanks a lot. I tried this way and it works!
But there seems to be something wrong with the results.

And it prints the following warning:

WARNING: …/nvdsinfer/nvdsinfer_func_utils.cpp:34 [TRT]: Explicit batch network detected and batch size specified, use enqueue without batch size instead.


Implementation

I changed the input size to width=320, height=512.

I got the ONNX from Darknet rather than Pytorch, and set batchsize=1 using this command:

python demo_darknet2onnx.py yolov4.cfg yolov4.weights ./data/dog.jpg 1

ONNX2TensorRT

trtexec --onnx=yolov4_1_3_512_320.onnx --explicitBatch --saveEngine=yolov4_1_3_320_512_fp16.engine --workspace=4096 --fp16

Questions

When I set batchsize=4, it gives errors and quits. Does the batchsize have to be 1 and the input size 320*512? Must I use the Pytorch model? Can the workflow be Darknet -> ONNX -> TensorRT?

@jiejing_ma

For the warning

I agree that this warning is annoying, but you can simply ignore it for now.
It is a legacy issue caused by backward compatibility with Caffe and UFF models.
It will be removed in later TensorRT versions.

For the error

At which step does the program quit with an error? As far as I know, the batch size should be consistent throughout the workflow: ONNX -> TRT -> DS pipeline:

           batchsize=4  batchsize=4     batchsize=4
Darknet  ->   ONNX   ->   TensorRT  ->  DS pipeline
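For example, keeping batchsize=4 through the Darknet route would look like this (mirroring the batchsize=1 commands above; the ONNX/engine file names are illustrative, use whatever the converter actually produces):

python demo_darknet2onnx.py yolov4.cfg yolov4.weights ./data/dog.jpg 4
trtexec --onnx=yolov4_4_3_608_608.onnx --explicitBatch --saveEngine=yolov4_4_3_608_608_fp16.engine --workspace=4096 --fp16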

Have you configured the batch size of both [streammux] and [primary-gie]?
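Concretely, the batch size appears in both the deepstream-app config and the nvinfer config. A minimal sketch (standard DeepStream key names; values and file names are illustrative):

[streammux]
batch-size=4

[primary-gie]
batch-size=4
config-file=config_infer_primary_yoloV4.txt

and in config_infer_primary_yoloV4.txt, under [property]:

batch-size=4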

For the input ratio

I think the model input ratio should agree with the original image ratio, or at least be close to it.
For example, if your image input is 1080 * 1920, then 320 * 512 or 320 * 608 may be a good ratio;
if your image input is 1280 * 1280, then 416 * 416, 512 * 512, or 608 * 608 may be recommended for the model.

There is a parameter named maintain-aspect-ratio in config_infer_primary_yoloV4.txt.
If maintain-aspect-ratio=1, the image will be padded to make its ratio consistent with the model input; otherwise, the image will be stretched vertically or horizontally if the image ratio does not match the model input.
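For example, in the [property] section of config_infer_primary_yoloV4.txt (only the relevant key shown):

[property]
maintain-aspect-ratio=1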

DarkNet or Pytorch

Convert from Darknet to ONNX if you just want to use the official pretrained YoloV4 model.
Convert from Pytorch to ONNX if you want to use a model trained with Pytorch.


Hi @jiejing_ma @ersheng
I have implemented Yolov3 with DeepStream, but I had a failed attempt with Yolov4.
Can you please share your workflow and the links you referred to?
I wish to reproduce the results you obtained in the screenshot you shared. Please help me with a summary, a workflow, or reference links.
Thanks

@pinktree3

Follow this guideline:
YoloV4: DarkNet or Pytorch -> ONNX -> TensorRT -> DeepStream


Hi @ersheng
I followed this link: https://github.com/Tianxiaomo/pytorch-YOLOv4
I generated the ONNX from Darknet.
After that, I went to the NVIDIA TensorRT container and executed the command:

trtexec --onnx=yolov4_1_3_608_608.onnx --explicitBatch --saveEngine=yolov4_1_3_608_608_fp16.engine --workspace=4096 --fp16

I get the following error at the end:
.
.
.
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1: Invalid control characters encountered in text.
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:12: Invalid control characters encountered in text.
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:14: Message type “onnx2trt_onnx.ModelProto” has no field named “pytorch”.
Failed to parse ONNX model from fileyolov4_1_3_608_608.onnx
[07/09/2020-10:16:02] [E] [TRT] Network must have at least one output
[07/09/2020-10:16:02] [E] [TRT] Network validation failed.
[07/09/2020-10:16:02] [E] Engine creation failed
[07/09/2020-10:16:02] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # trtexec --onnx=yolov4_1_3_608_608.onnx --explicitBatch --saveEngine=yolov4_1_3_608_608_fp16.engine --workspace=4096 --fp16

any suggestions?

@pinktree3

What versions of Pytorch and TensorRT are you using?

@ersheng
I did configure the batch size of both [streammux] and [primary-gie]. I will try again and upload the error info later. Thanks

Hi, @pinktree3
post # 12 is useful. I just referred to it.
For the error you met, you can check your versions of Pytorch and TensorRT.

Darknet2ONNX

Pytorch 1.4.0 for TensorRT 7.0 and higher
Pytorch 1.5.0 and 1.6.0 for TensorRT 7.1.2 and higher

ONNX2TensorRT

TensorRT version Recommended: 7.0, 7.1

thanks @ersheng @jiejing_ma

@ersheng I followed your yolov4 repo to build a TRT engine for yolov5, and it was built successfully. I compared the output with the Pytorch model and they are the same. But when I hook it up in DeepStream I am not getting any boxes. I have uploaded the code and relevant files here. Let me know if you have any pointers!

Thanks

Hi y14uc339,

Please help to open a new topic for your issue. Thanks

@kayccc I have opened a new topic; it’s been 2 days! Do you mind having a look: Yolov5 giving unexpected outputs

Thank you for the detailed steps. I followed all of them.
I’m using a custom yolov4 model. When I try to run deepstream-app, I get the following error:

nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initResource() <nvdsinfer_context_impl.cpp:667> [UID = 1]: Detect-postprocessor failed to init resource because dlsym failed to get func NvDsInferParseCustomYoloV4 pointer

Solved:

@y14uc339 Thank you, I found the problem.
@ersheng In config_infer_primary_yoloV4.txt I changed
parse-bbox-func-name=NvDsInferParseCustomYoloV4 --> parse-bbox-func-name=NvDsInferParseYoloV4

Then it works.
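In other words, the relevant [property] lines end up looking like this (the custom-lib-path below is the standard sample library path; adjust it to your build):

parse-bbox-func-name=NvDsInferParseYoloV4
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so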

@karthick declare the bbox parsing function just like the other functions are declared in the custom impl cpp file and this error will go away. This works:
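// Note: clamp() and NUM_CLASSES_YOLO used below are assumed to be the helper function
// and constant already defined in the sample nvdsparsebbox_Yolo.cpp that this code is added to.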

extern "C" bool NvDsInferParseYoloV4(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList);



static NvDsInferParseObjectInfo convertBBoxYoloV4(const float& bx, const float& by, const float& bw,
                                     const float& bh, const uint& netW, const uint& netH)
{
    NvDsInferParseObjectInfo b;
    // Restore coordinates to network input resolution
    float xCenter = bx * netW;
    float yCenter = by * netH;

    float w = bw * netW;
    float h = bh * netH;

    float x0 = xCenter - w * 0.5;
    float y0 = yCenter - h * 0.5;
    float x1 = x0 + w;
    float y1 = y0 + h;

    x0 = clamp(x0, 0, netW);
    y0 = clamp(y0, 0, netH);
    x1 = clamp(x1, 0, netW);
    y1 = clamp(y1, 0, netH);

    b.left = x0;
    b.width = clamp(x1 - x0, 0, netW);
    b.top = y0;
    b.height = clamp(y1 - y0, 0, netH);

    return b;
}

static void addBBoxProposalYoloV4(const float bx, const float by, const float bw, const float bh,
                     const uint& netW, const uint& netH, const int maxIndex,
                     const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
{
    NvDsInferParseObjectInfo bbi = convertBBoxYoloV4(bx, by, bw, bh, netW, netH);
    if (bbi.width < 1 || bbi.height < 1) return;

    bbi.detectionConfidence = maxProb;
    bbi.classId = maxIndex;
    binfo.push_back(bbi);
}

static std::vector<NvDsInferParseObjectInfo>
decodeYoloV4Tensor(
    const float* detections, const uint num_bboxes,
    NvDsInferParseDetectionParams const& detectionParams,
    const uint& netW, const uint& netH)
{
    std::vector<NvDsInferParseObjectInfo> binfo;

    uint bbox_location = 0;
    for (uint b = 0; b < num_bboxes; ++b)
    {
        float bx = detections[bbox_location];
        float by = detections[bbox_location + 1];
        float bw = detections[bbox_location + 2];
        float bh = detections[bbox_location + 3];

        float maxProb = 0.0f;
        int maxIndex = -1;

        uint cls_location = bbox_location + 4;
        for (uint c = 0; c < detectionParams.numClassesConfigured; ++c)
        {
            float prob = detections[cls_location + c];
            if (prob > maxProb)
            {
                maxProb = prob;
                maxIndex = c;
            }
        }

        // Guard against maxIndex == -1 when every class score is zero or negative
        if (maxIndex >= 0 && maxProb > detectionParams.perClassPreclusterThreshold[maxIndex])
        {
            addBBoxProposalYoloV4(bx, by, bw, bh, netW, netH, maxIndex, maxProb, binfo);
        }

        bbox_location += 4 + detectionParams.numClassesConfigured;
    }

    return binfo;
}

extern "C"  bool NvDsInferParseYoloV4(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList)
{
    if (NUM_CLASSES_YOLO != detectionParams.numClassesConfigured)
    {
        std::cerr << "WARNING: Num classes mismatch. Configured:"
                  << detectionParams.numClassesConfigured
                  << ", detected by network: " << NUM_CLASSES_YOLO << std::endl;
    }

    std::vector<NvDsInferParseObjectInfo> objects;

    // Single output tensor, 2 dimensional: [num_boxes, 4 + num_classes]
    const NvDsInferLayerInfo &layer = outputLayersInfo[0];
    assert(layer.inferDims.numDims == 2);
    // The second dimension should be 4 + num_classes
    assert(detectionParams.numClassesConfigured == layer.inferDims.d[1] - 4);

    uint num_bboxes = layer.inferDims.d[0];

    // std::cout << "Network Info: " << networkInfo.height << "  " << networkInfo.width << std::endl;

    std::vector<NvDsInferParseObjectInfo> outObjs =
        decodeYoloV4Tensor(
            (const float*)(layer.buffer), num_bboxes, detectionParams,
            networkInfo.width, networkInfo.height);

    objects.insert(objects.end(), outObjs.begin(), outObjs.end());

    objectList = objects;

    return true;
}

/* This is a sample bounding box parsing function for the sample YoloV3 detector model */
static NvDsInferParseObjectInfo convertBBox(const float& bx, const float& by, const float& bw,
                                     const float& bh, const int& stride, const uint& netW,
                                     const uint& netH)
{
    NvDsInferParseObjectInfo b;
    // Restore coordinates to network input resolution
    float xCenter = bx * stride;
    float yCenter = by * stride;
    float x0 = xCenter - bw / 2;
    float y0 = yCenter - bh / 2;
    float x1 = x0 + bw;
    float y1 = y0 + bh;

    x0 = clamp(x0, 0, netW);
    y0 = clamp(y0, 0, netH);
    x1 = clamp(x1, 0, netW);
    y1 = clamp(y1, 0, netH);

    b.left = x0;
    b.width = clamp(x1 - x0, 0, netW);
    b.top = y0;
    b.height = clamp(y1 - y0, 0, netH);

    return b;
}

static void addBBoxProposal(const float bx, const float by, const float bw, const float bh,
                     const uint stride, const uint& netW, const uint& netH, const int maxIndex,
                     const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
{
    NvDsInferParseObjectInfo bbi = convertBBox(bx, by, bw, bh, stride, netW, netH);
    if (bbi.width < 1 || bbi.height < 1) return;

    bbi.detectionConfidence = maxProb;
    bbi.classId = maxIndex;
    binfo.push_back(bbi);
}

/* Check that the custom function has been defined correctly */
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseYoloV4);
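After adding the function, rebuild the custom parser library (e.g. libnvdsinfer_custom_impl_Yolo.so via the Makefile in nvdsinfer_custom_impl_Yolo) so that dlsym can find the new symbol at runtime.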

@y14uc339 Thank you. The problem was in the config_infer_primary_yoloV4.txt file:
parse-bbox-func-name=NvDsInferParseYoloV4