Get wrong infer results while testing yolov4 on deepstream 5.0

jiejing_ma · May 30, 2020, 11:29am

Envs

• Hardware Platform (Jetson / GPU) GeForce GTX 1070
• DeepStream Version 5.0
• JetPack Version (valid for Jetson only)
• TensorRT Version 7.0.0.11
• NVIDIA GPU Driver Version (valid for GPU only) 440.33.01

Problem Description

I used tensorrtx to generate a yolov4 engine file. It runs well when tested with the tensorrtx yolov4 sample.

Then I added the engine file to deepstream 5.0, following deepstream-app. I changed the config files and rewrote nvdsparsebbox_Yolo.cpp, nvdsinfer_yolo_engine.cpp, etc. I can get the inference results via std::vector<NvDsInferLayerInfo> const &outputLayersInfo, but the results differ from those of tensorrtx and seem wrong.

Errors Print

I printed the results (before NMS) obtained from deepstream, as follows:

...
x: 72 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 80 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 88 y: 272 w: inf h: inf det Confidence: 1 id: 3 class Confidence: 1
x: 96 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 136 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 144 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 152 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 160 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 200 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 208 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 216 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 224 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 264 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 272 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 280 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 288 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 104 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 112 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 120 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 128 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 168 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
x: 176 y: 272 w: inf h: inf det Confidence: 1 id: 1 class Confidence: 1
...
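
Rows like the ones above (infinite width and height, every confidence exactly 1) can be flagged automatically before NMS. The following is a minimal sketch, assuming the raw detections have been copied into a plain struct; `RawBox`, `isPlausibleBox`, and `countBadBoxes` are hypothetical names for illustration and not part of DeepStream or tensorrtx.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical plain box type for illustration; not a DeepStream struct.
struct RawBox {
    float x, y, w, h;
    float confidence;
};

// Returns true only if every field is finite and the box has a positive size.
inline bool isPlausibleBox(const RawBox& b) {
    return std::isfinite(b.x) && std::isfinite(b.y) &&
           std::isfinite(b.w) && std::isfinite(b.h) &&
           std::isfinite(b.confidence) && b.w > 0.0f && b.h > 0.0f;
}

// Counts boxes that fail the check, such as rows with "w: inf h: inf".
inline std::size_t countBadBoxes(const std::vector<RawBox>& boxes) {
    std::size_t bad = 0;
    for (const RawBox& b : boxes) {
        if (!isPlausibleBox(b)) ++bad;
    }
    return bad;
}
```

If nearly every box fails the check, the problem is more likely in what the engine receives or how the output buffer is interpreted than in NMS or thresholding.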

Implements

I suspect that the inference pipeline in deepstream produces the wrong results, because the tensorrtx inference results are correct. This is the tensorrtx doInference function:

void doInference(IExecutionContext &context, float *input, float *output, int batchSize)
{
    const ICudaEngine &engine = context.getEngine();

    // Pointers to input and output device buffers to pass to engine.
    // Engine requires exactly IEngine::getNbBindings() number of buffers.
    assert(engine.getNbBindings() == 2);
    void *buffers[2];

    // In order to bind the buffers, we need to know the names of the input and output tensors.
    // Note that indices are guaranteed to be less than IEngine::getNbBindings()
    const int inputIndex = engine.getBindingIndex(INPUT_BLOB_NAME);
    const int outputIndex = engine.getBindingIndex(OUTPUT_BLOB_NAME);

    // Create GPU buffers on device
    CHECK(cudaMalloc(&buffers[inputIndex], batchSize * 3 * INPUT_H * INPUT_W * sizeof(float)));
    CHECK(cudaMalloc(&buffers[outputIndex], batchSize * OUTPUT_SIZE * sizeof(float)));

    // Create stream
    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));

    // DMA input batch data to device, infer on the batch asynchronously, and DMA output back to host
    CHECK(cudaMemcpyAsync(buffers[inputIndex], input, batchSize * 3 * INPUT_H * INPUT_W * sizeof(float), cudaMemcpyHostToDevice, stream));
    context.enqueue(batchSize, buffers, stream, nullptr);
    CHECK(cudaMemcpyAsync(output, buffers[outputIndex], batchSize * OUTPUT_SIZE * sizeof(float), cudaMemcpyDeviceToHost, stream));
    cudaStreamSynchronize(stream);

    // Release stream and buffers
    cudaStreamDestroy(stream);
    CHECK(cudaFree(buffers[inputIndex]));
    CHECK(cudaFree(buffers[outputIndex]));
}

Additional

I also followed "Iplugin tensorrt engine error for ds5.0" #5, but I can’t get an engine file when I run $ sudo /usr/local/TensorRT-7.0.0.11/bin/trtexec --onnx=yolov4_4_3_608_608.onnx --workspace=4096 --saveEngine=yolov4.engine --fp16 --explicitBatch. The errors are:

(yolov4) dreamdeck@mjj:~/Documents/code/test/yolov4/pytorch-YOLOv4$ sudo /usr/local/TensorRT-7.0.0.11/bin/trtexec --onnx=yolov4_4_3_608_608.onnx --workspace=4096 --saveEngine=yolov4.engine --fp16 --explicitBatch
&&&& RUNNING TensorRT.trtexec # /usr/local/TensorRT-7.0.0.11/bin/trtexec --onnx=yolov4_4_3_608_608.onnx --workspace=4096 --saveEngine=yolov4.engine --fp16 --explicitBatch
[05/30/2020-18:16:23] [I] === Model Options ===
[05/30/2020-18:16:23] [I] Format: ONNX
[05/30/2020-18:16:23] [I] Model: yolov4_4_3_608_608.onnx
[05/30/2020-18:16:23] [I] Output:
[05/30/2020-18:16:23] [I] === Build Options ===
[05/30/2020-18:16:23] [I] Max batch: explicit
[05/30/2020-18:16:23] [I] Workspace: 4096 MB
[05/30/2020-18:16:23] [I] minTiming: 1
[05/30/2020-18:16:23] [I] avgTiming: 8
[05/30/2020-18:16:23] [I] Precision: FP16
[05/30/2020-18:16:23] [I] Calibration: 
[05/30/2020-18:16:23] [I] Safe mode: Disabled
[05/30/2020-18:16:23] [I] Save engine: yolov4.engine
[05/30/2020-18:16:23] [I] Load engine: 
[05/30/2020-18:16:23] [I] Inputs format: fp32:CHW
[05/30/2020-18:16:23] [I] Outputs format: fp32:CHW
[05/30/2020-18:16:23] [I] Input build shapes: model
[05/30/2020-18:16:23] [I] === System Options ===
[05/30/2020-18:16:23] [I] Device: 0
[05/30/2020-18:16:23] [I] DLACore: 
[05/30/2020-18:16:23] [I] Plugins:
[05/30/2020-18:16:23] [I] === Inference Options ===
[05/30/2020-18:16:23] [I] Batch: Explicit
[05/30/2020-18:16:23] [I] Iterations: 10
[05/30/2020-18:16:23] [I] Duration: 3s (+ 200ms warm up)
[05/30/2020-18:16:23] [I] Sleep time: 0ms
[05/30/2020-18:16:23] [I] Streams: 1
[05/30/2020-18:16:23] [I] ExposeDMA: Disabled
[05/30/2020-18:16:23] [I] Spin-wait: Disabled
[05/30/2020-18:16:23] [I] Multithreading: Disabled
[05/30/2020-18:16:23] [I] CUDA Graph: Disabled
[05/30/2020-18:16:23] [I] Skip inference: Disabled
[05/30/2020-18:16:23] [I] Inputs:
[05/30/2020-18:16:23] [I] === Reporting Options ===
[05/30/2020-18:16:23] [I] Verbose: Disabled
[05/30/2020-18:16:23] [I] Averages: 10 inferences
[05/30/2020-18:16:23] [I] Percentile: 99
[05/30/2020-18:16:23] [I] Dump output: Disabled
[05/30/2020-18:16:23] [I] Profile: Disabled
[05/30/2020-18:16:23] [I] Export timing to JSON file: 
[05/30/2020-18:16:23] [I] Export output to JSON file: 
[05/30/2020-18:16:23] [I] Export profile to JSON file: 
[05/30/2020-18:16:23] [I] 
----------------------------------------------------------------
Input filename:   yolov4_4_3_608_608.onnx
ONNX IR version:  0.0.6
Opset version:    11
Producer name:    pytorch
Producer version: 1.5
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/30/2020-18:16:24] [W] [TRT] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
[05/30/2020-18:16:24] [W] [TRT] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
[05/30/2020-18:16:24] [E] [TRT] Layer: (Unnamed Layer* 426)[Select]'s output can not be used as shape tensor.
[05/30/2020-18:16:24] [E] [TRT] Network validation failed.
[05/30/2020-18:16:24] [E] Engine creation failed
[05/30/2020-18:16:24] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # /usr/local/TensorRT-7.0.0.11/bin/trtexec --onnx=yolov4_4_3_608_608.onnx --workspace=4096 --saveEngine=yolov4.engine --fp16 --explicitBatch

I have no idea how to solve it.

Thanks.

ersheng · June 1, 2020, 6:18am

DeepStream issue

The DS YOLO pipeline was designed for YOLOv3.
There is no suitable DS pipeline for YOLOv4 yet.
The new pipeline for YOLOv4 is still under development.

YOLOv4 onnx file parsing issue

The ONNX module of pytorch 1.5 seems to behave differently from earlier pytorch versions when dealing with constant parameters for expand operations.
Try to generate the onnx file with pytorch 1.4 or pytorch 1.3.

Please see the compatible pytorch versions in the TensorRT 7 release notes: Release Notes :: NVIDIA Deep Learning TensorRT Documentation

Pytorch and ONNX are evolving quickly and we are trying our best to catch up.
Let me know if TensorRT reports an error again.

jiejing_ma · June 2, 2020, 3:56am

Thanks. I will try to generate YOLOv4 onnx file with pytorch1.4.

But I don’t understand this statement:

There is no suitable DS pipeline for YOLOv4 yet.

I have already implemented the yolo layer definition and generated the engine file, and it runs well. All deepstream needs to do is run the engine and infer, is that right? Or does it mean that DS also does something else, like preprocessing, etc.?

ersheng · June 17, 2020, 6:26am

@jiejing_ma

Yes, preprocessing of images is included in DS.
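
Because DS does its own preprocessing, one thing worth checking is whether the values reaching the engine are in the range it was built for. The Gst-nvinfer documentation describes the per-pixel normalization as y = net-scale-factor * (x - mean), with net-scale-factor set in the [property] group of the inference config. The sketch below only illustrates that formula; `normalizePixel` is a hypothetical helper, and it assumes an engine that expects inputs scaled to [0, 1], which is not confirmed anywhere in this thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Per-pixel normalization in the form documented for Gst-nvinfer:
//   y = netScaleFactor * (x - mean)
// Hypothetical helper for illustration; DeepStream does this on the GPU.
inline float normalizePixel(std::uint8_t x, float netScaleFactor, float mean) {
    return netScaleFactor * (static_cast<float>(x) - mean);
}
```

With a scale factor of 1/255 a white pixel maps to about 1.0, while with a scale factor of 1.0 it stays at 255. If an engine built for [0, 1] inputs is fed values up to 255, exponentials inside a YOLO decode layer could overflow, which would be consistent with (but does not prove the cause of) the inf widths and heights reported earlier.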

We recommend focusing on the ONNX standard to convert models from other DL frameworks into ONNX first, and then convert into TensorRT engine.

Please pull the latest source from https://github.com/Tianxiaomo/pytorch-YOLOv4 and try to follow sections 2, 3, 4, and 5 of its README.

I am now looking into the DS pipeline to check the compatibility of post-processing.

gaylord · June 23, 2020, 3:14pm

Hey there. Any news on this? We would really like to try YOLO-4 with our DS application.
Benchmarks for YOLO-4 look impressive…

cheers,
Gaylord

ersheng · June 30, 2020, 5:17am

@gaylord

An integration solution for YoloV4 and DS is now under development.
Manuals and a new code release will be available in the near future.

hymanzhu1983 · July 6, 2020, 9:34am

Hi ersheng,
any news about the integration of YoloV4 and DS? When will it be released?
Thanks

y14uc339 · July 7, 2020, 9:41am

@ersheng so does this mean that yolov5 is also not working because of DeepStream compatibility? @CJR says the reason for the incorrect results is wrong execution of the cuda kernels. Do you mind shedding some light on what the main issue is? Thanks

ersheng · July 7, 2020, 10:40am

@gaylord @hymanzhu1983 @y14uc339 @jiejing_ma

The current Yolo implementation via CUDA kernel in DeepStream is based on old Yolo models (v2, v3), so it may not suit new Yolo models like YoloV4. Location: /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/kernels.cu

We are trying to embed the Yolo layer into the TensorRT engine before deploying to DeepStream, so that the Yolo CUDA kernel in DeepStream is no longer used.

We have not officially released a YoloV4 solution for DeepStream yet, but you can try the following steps:

1. Go to https://github.com/Tianxiaomo/pytorch-YOLOv4 and generate a TensorRT engine according to this workflow: DarkNet or Pytorch → ONNX → TensorRT.
2. Add the following C++ functions to objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo.cpp and rebuild libnvdsinfer_custom_impl_Yolo.so.
3. Use these configuration files as references (you will have to adjust them a little to suit your environment):
   config_infer_primary_yoloV4.txt (3.4 KB)
   deepstream_app_config_yoloV4.txt (3.8 KB)

static NvDsInferParseObjectInfo convertBBoxYoloV4(const float& bx1, const float& by1, const float& bx2,
                                     const float& by2, const uint& netW, const uint& netH)
{
    NvDsInferParseObjectInfo b;
    // Restore coordinates to network input resolution

    float x1 = bx1 * netW;
    float y1 = by1 * netH;
    float x2 = bx2 * netW;
    float y2 = by2 * netH;

    x1 = clamp(x1, 0, netW);
    y1 = clamp(y1, 0, netH);
    x2 = clamp(x2, 0, netW);
    y2 = clamp(y2, 0, netH);

    b.left = x1;
    b.width = clamp(x2 - x1, 0, netW);
    b.top = y1;
    b.height = clamp(y2 - y1, 0, netH);

    return b;
}

static void addBBoxProposalYoloV4(const float bx1, const float by1, const float bx2, const float by2,
                     const uint& netW, const uint& netH, const int maxIndex,
                     const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
{
    // The arguments are normalized corner coordinates (x1, y1, x2, y2), not center and size.
    NvDsInferParseObjectInfo bbi = convertBBoxYoloV4(bx1, by1, bx2, by2, netW, netH);
    if (bbi.width < 1 || bbi.height < 1) return;

    bbi.detectionConfidence = maxProb;
    bbi.classId = maxIndex;
    binfo.push_back(bbi);
}

static std::vector<NvDsInferParseObjectInfo>
decodeYoloV4Tensor(
    const float* boxes, const float* scores,
    const uint num_bboxes, NvDsInferParseDetectionParams const& detectionParams,
    const uint& netW, const uint& netH)
{
    std::vector<NvDsInferParseObjectInfo> binfo;

    uint bbox_location = 0;
    uint score_location = 0;
    for (uint b = 0; b < num_bboxes; ++b)
    {
        float bx1 = boxes[bbox_location];
        float by1 = boxes[bbox_location + 1];
        float bx2 = boxes[bbox_location + 2];
        float by2 = boxes[bbox_location + 3];

        float maxProb = 0.0f;
        int maxIndex = -1;

        for (uint c = 0; c < detectionParams.numClassesConfigured; ++c)
        {
            float prob = scores[score_location + c];
            if (prob > maxProb)
            {
                maxProb = prob;
                maxIndex = c;
            }
        }

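        // maxIndex stays -1 when no class score is above 0; skip to avoid indexing with -1.
        if (maxIndex >= 0 && maxProb > detectionParams.perClassPreclusterThreshold[maxIndex])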
        {
            addBBoxProposalYoloV4(bx1, by1, bx2, by2, netW, netH, maxIndex, maxProb, binfo);
        }

        bbox_location += 4;
        score_location += detectionParams.numClassesConfigured;
    }

    return binfo;
}

extern "C" bool NvDsInferParseCustomYoloV4(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList)
{
    if (NUM_CLASSES_YOLO != detectionParams.numClassesConfigured)
    {
        std::cerr << "WARNING: Num classes mismatch. Configured:"
                  << detectionParams.numClassesConfigured
                  << ", detected by network: " << NUM_CLASSES_YOLO << std::endl;
    }

    std::vector<NvDsInferParseObjectInfo> objects;

    const NvDsInferLayerInfo &boxes = outputLayersInfo[0]; // num_boxes x 4
    const NvDsInferLayerInfo &scores = outputLayersInfo[1]; // num_boxes x num_classes

    // 3 dimensional: [num_boxes, 1, 4]
    assert(boxes.inferDims.numDims == 3);
    // 2 dimensional: [num_boxes, num_classes]
    assert(scores.inferDims.numDims == 2);

    // The second dimension should be num_classes
    assert(detectionParams.numClassesConfigured == scores.inferDims.d[1]);
    
    uint num_bboxes = boxes.inferDims.d[0];

    // std::cout << "Network Info: " << networkInfo.height << "  " << networkInfo.width << std::endl;

    std::vector<NvDsInferParseObjectInfo> outObjs =
        decodeYoloV4Tensor(
            (const float*)(boxes.buffer), (const float*)(scores.buffer), num_bboxes, detectionParams,
            networkInfo.width, networkInfo.height);

    objects.insert(objects.end(), outObjs.begin(), outObjs.end());

    objectList = objects;

    return true;
}
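
The buffer layout that the parser above relies on can be exercised without DeepStream. The following is a minimal sketch of the same idea (argmax over class scores, threshold, scale normalized corners to network resolution, drop tiny boxes), assuming the engine emits boxes as normalized (x1, y1, x2, y2) and one row of class scores per box. `Detection`, `clampTo`, and `decodeBoxes` are hypothetical stand-ins for illustration, not DeepStream types or functions, and a single threshold replaces the per-class thresholds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for NvDsInferParseObjectInfo, for illustration only.
struct Detection {
    float left, top, width, height;
    int classId;
    float confidence;
};

inline float clampTo(float v, float lo, float hi) {
    return std::max(lo, std::min(v, hi));
}

// boxes:  numBoxes * 4 floats, each box as normalized (x1, y1, x2, y2)
// scores: numBoxes * numClasses floats, one row of class scores per box
inline std::vector<Detection> decodeBoxes(const std::vector<float>& boxes,
                                          const std::vector<float>& scores,
                                          std::size_t numClasses,
                                          float netW, float netH,
                                          float threshold) {
    assert(numClasses > 0);
    assert(boxes.size() % 4 == 0);
    const std::size_t numBoxes = boxes.size() / 4;
    assert(scores.size() == numBoxes * numClasses);

    std::vector<Detection> out;
    for (std::size_t b = 0; b < numBoxes; ++b) {
        // Pick the class with the highest score for this box.
        const float* row = &scores[b * numClasses];
        std::size_t best = 0;
        for (std::size_t c = 1; c < numClasses; ++c) {
            if (row[c] > row[best]) best = c;
        }
        if (row[best] <= threshold) continue;

        // Scale normalized corners to network resolution and clamp.
        const float x1 = clampTo(boxes[b * 4 + 0] * netW, 0.0f, netW);
        const float y1 = clampTo(boxes[b * 4 + 1] * netH, 0.0f, netH);
        const float x2 = clampTo(boxes[b * 4 + 2] * netW, 0.0f, netW);
        const float y2 = clampTo(boxes[b * 4 + 3] * netH, 0.0f, netH);
        if (x2 - x1 < 1.0f || y2 - y1 < 1.0f) continue;  // drop degenerate boxes

        out.push_back({x1, y1, x2 - x1, y2 - y1,
                       static_cast<int>(best), row[best]});
    }
    return out;
}
```

Feeding this a few hand-written boxes is a quick way to confirm the expected layout before wiring the real parser into the pipeline.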

y14uc339 · July 7, 2020, 10:58am

@ersheng thanks! But my question was mainly regarding yolov5 compatibility which was released recently!

ersheng · July 7, 2020, 11:08am

@y14uc339

YoloV5 may have similar problems too.
However, we have not thoroughly studied compatibilities of YoloV5 yet.
We may add YoloV5 into our agenda soon.

y14uc339 · July 8, 2020, 5:46am

Hi @ersheng. DeepStream supports TensorRT, and we implemented the cuda kernel for yolov5, which works fine in TensorRT. Why is that cuda kernel not working in DeepStream when DS is using the same TRT? I mean, what exactly is causing the problem, because here @CJR says that it should work in DS. Any thoughts on this?
Thanks!!

ersheng · July 8, 2020, 7:53am

@y14uc339

The highest Yolo version that the cuda kernel in /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/ can support is YoloV3.

We are trying to embed the Yolo layer into the TensorRT engine before deploying to DeepStream, so that the Yolo cuda kernel in DeepStream is no longer used. You can have a look at my previous post here: YoloV4 Solution.

YoloV5 may have a similar problem, and we will work on it by applying the same solution. But you can also adapt this YoloV4 solution to solve your YoloV5 problem yourself.

y14uc339 · July 8, 2020, 8:02am

@ersheng this might be a dumb question!! I understand that the highest Yolo version the cuda kernel in /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/ can support is YoloV3. But I am not using that kernel to implement yolov5; I am using a different kernel. So are you saying that even a different cuda kernel implementation that works for yolov5 in TRT would not work in DeepStream?

ersheng · July 8, 2020, 8:36am

@y14uc339 @CJR

Sorry for the misunderstanding.
CJR is providing you a solution to suit the YoloV5 from https://github.com/wang-xinyu/tensorrtx in this stream, and you can continue to follow this stream.

However, I can give you my suggestion, which follows a different workflow:
Pytorch → ONNX → TRT
Converting to ONNX first is a more standardized way to handle YoloV5 from the official page: https://github.com/ultralytics/yolov5.

You can choose either way to solve your problem and I hope they do not clash with each other.

y14uc339 · July 8, 2020, 8:42am

@ersheng Thanks!

y14uc339 · July 8, 2020, 9:09am

@ersheng I’ll try it both ways since @CJR is busy/unavailable currently. I’ll go with Pytorch → Onnx → TRT approach. It would be great if you can help out with the custom parsing functions and config files for smooth implementation of yolov5 in TRT!
Thanks

jiejing_ma · July 8, 2020, 1:17pm

@ersheng Thanks a lot. I tried this way and it works!
But something still seems wrong with the results.

It also prints a warning.

WARNING: …/nvdsinfer/nvdsinfer_func_utils.cpp:34 [TRT]: Explicit batch network detected and batch size specified, use enqueue without batch size instead.

Implements

I changed the input size to width=320 height=512.

I generated the onnx file from Darknet rather than from pytorch, and set batchsize=1 using this command:

python demo_darknet2onnx.py yolov4.cfg yolov4.weights ./data/dog.jpg 1

onnx2tensorrt

trtexec --onnx=yolov4_1_3_512_320.onnx --explicitBatch --saveEngine=yolov4_1_3_320_512_fp16.engine --workspace=4096 --fp16

Questions

When I set batchsize=4, it gives errors and quits. Does the batchsize have to be 1 and the input size 320*512? Must I use the Pytorch model? Can the workflow be darknet → ONNX → TensorRT?

ersheng · July 9, 2020, 5:17am

@jiejing_ma

For the warning

I agree that this warning is annoying, but for now you can simply ignore it.
It is a legacy issue caused by backward compatibility with Caffe and Uff models.
It will be removed in later TensorRT versions.

For the error

In which step does the program quit with an error? As far as I know, the batch size should be consistent throughout the workflow: ONNX → TRT → DS pipeline:

           batchsize=4  batchsize=4     batchsize=4
Darknet  ->   ONNX   ->   TensorRT  ->  DS pipeline

Have you configured batch size of both [streammux] and [primary-gie]?

For ratio of input

I think the model input ratio should agree with the original image ratio, or at least be close to it.
For example, if your image input is 1080 * 1920, 320 * 512 or 320 * 608 may be a good ratio;
if your image input is 1280 * 1280, then 416 * 416 or 512 * 512 or 608 * 608 may be recommended for the model.

There is an argument named maintain-aspect-ratio in config_infer_primary_yoloV4.txt.
If maintain-aspect-ratio=1, the image will get padded to make its ratio consistent with model input, otherwise, the image will get stretched vertically or horizontally if image ratio does not meet model input.
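
The arithmetic behind the padded case can be sketched as follows. This is an illustration only, assuming the frame is scaled by the largest factor that still fits inside the network input and that the remainder is padding; `ScaledSize` and `fitKeepingAspect` are hypothetical names, and where DeepStream places the padding is not modeled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Result of fitting an image into the network input while keeping its ratio.
struct ScaledSize {
    int width;   // scaled image width in network pixels
    int height;  // scaled image height in network pixels
    int padX;    // total horizontal padding
    int padY;    // total vertical padding
};

// Hypothetical helper: largest uniform scale that fits (imgW x imgH)
// inside (netW x netH); whatever is left over becomes padding.
inline ScaledSize fitKeepingAspect(int imgW, int imgH, int netW, int netH) {
    assert(imgW > 0 && imgH > 0 && netW > 0 && netH > 0);
    const double scale = std::min(static_cast<double>(netW) / imgW,
                                  static_cast<double>(netH) / imgH);
    ScaledSize s;
    s.width = std::min(netW, static_cast<int>(std::lround(imgW * scale)));
    s.height = std::min(netH, static_cast<int>(std::lround(imgH * scale)));
    s.padX = netW - s.width;
    s.padY = netH - s.height;
    return s;
}
```

Reading the sizes above as height * width, a 1920 wide by 1080 high frame fitted into a 512 wide by 320 high network input scales to 512 by 288 and leaves 32 rows of padding, while a 1280 by 1280 frame fits a 416 by 416 input with no padding at all.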

DarkNet or Pytorch

Convert from darknet to onnx if you just want to use the YoloV4 official pretrained model.
Convert from pytorch to onnx if you want to use the model trained by Pytorch.

pinktree3 · July 9, 2020, 6:17am

Hi @jiejing_ma @ersheng
I have implemented Yolov3 with deepstream, but my attempt with Yolov4 failed.
Can you please share your workflow and some of the links you referred to?
I would like to reproduce the results you obtained in the screenshot you shared.

Please help me with a summary, a workflow, or reference links.
Thanks
