YoloV4 with OpenCV

Hello experts,

Need your opinion. I am testing YOLOv4 with OpenCV 4.4 compiled with CUDA and cuDNN on JetPack 4.4. With tiny YOLO I am getting close to 2 fps when inferring every frame on the Nano. It's pretty straightforward to implement/integrate in C++ if you want to use YOLO with OpenCV. The other option is to use TensorRT, as NVIDIA recommends; however, the TensorRT implementation of YOLO is not as straightforward as OpenCV's. So, my question is: what benefits can I expect if I choose the TensorRT path instead of OpenCV with Darknet?

Hi,

We do have an example for YOLOv4 that runs inference with TensorRT.
You can check it directly and decide whether you want to use TensorRT or not.

By the way, you can also use TensorRT to replace the Darknet inference while keeping OpenCV for camera capture.
An example can be found here:

Thanks.

Have a look at this blog:

We are getting about 12 fps on the Nano with a custom Tiny-YOLOv4 model after following the tutorials in that post.

I am trying the yolov4_deepstream that you mentioned in your post and am currently using darknet2onnx to convert YOLOv4 to ONNX. However, I am not able to install onnxruntime with `pip install onnxruntime`. It says:

$ pip install onnxruntime
Collecting onnxruntime
Could not find a version that satisfies the requirement onnxruntime (from versions: )
No matching distribution found for onnxruntime

I am using JP4.4. Also, what I expect this step to produce is a yolov4.onnx file from the Darknet YOLOv4 weights, so the .py that I need to run is the one in the tool directory (darknet2onnx.py). Please correct me if I am wrong.

It seems that if you want to use onnxruntime on the Jetson, you will need to get a Docker image or install it from NVIDIA's servers, not the default PyPI package index:

It might also be worth just doing the conversion on a different machine.

Thank you for your reply.

I am having difficulty following the steps described in the link below and would appreciate it if somebody could clarify.

My target is to run YoloV4 using TensorRT on Nano using C++ (not Python). I have compiled TensorRT OSS Plugin (libnvinfer_plugin.so.7.1.3) on Nano and replaced the original one in “/usr/lib/aarch64-linux-gnu/”.

My next step is to generate yolov4.onnx. Since I could not install onnxruntime, I downloaded a readymade yolov4.onnx file from https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/yolov4.

As the downloaded yolov4.onnx does not include the BatchedNMSPlugin (I assume), my next step, as I understand it, is Step 2 of section 2.3 described in https://github.com/NVIDIA-AI-IOT/yolov4_deepstream/blob/master/tensorrt_yolov4/README.md. However, I probably need to do this on a different PC, as I can't install onnxruntime on the Nano. Alternatively, if anybody could send me a link from which I can download a yolov4.onnx with the BatchedNMSPlugin, that would also work for me.

Now, section 3 of https://github.com/NVIDIA-AI-IOT/yolov4_deepstream/blob/master/tensorrt_yolov4/README.md confuses me totally.
To compile and build, it says to go to
cd <dir_on_your_machine>/yolov4_sample/yolo_cpp_standalone/source_gpu_nms

Where is this directory?

What I will ultimately need for my purpose, as I see it, is the SampleYolo class, plus my own wrapper to integrate it into my program so it runs in real time with a camera. For that, as I understand it, I need libnvinfer_plugin.so.7.1.3 (for YOLOv4), which I have already built, and yolov4.onnx, which I downloaded (without the BatchedNMSPlugin; I think I need to include that, and need help on how to do it), and that's probably it.

Please let me know if anything is missing in my understanding. Thanks.

UPDATE:
The downloaded yolov4.onnx does not seem to work; it is saying:

[11/05/2020-13:27:41] [W] [TRT] “onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.”
[11/05/2020-13:27:41] [I] Building TensorRT engine…/data/yolov4.engine
[11/05/2020-13:27:46] [E] [TRT] Network has dynamic or shape inputs, but no optimization profile has been defined.
[11/05/2020-13:27:46] [E] [TRT] Network validation failed.

UPDATE:

I tried to build and run standalone tensorrt_yolov4 using the following link:

Everything went smoothly. I did the following steps:

  1. Download and install TensorRT 7.1.3.4
  2. Download TensorRT OSS, compiled and replaced libnvinfer_plugin.so.7.1.3
  3. Generate yolov4 ONNX using https://github.com/Tianxiaomo/pytorch-YOLOv4 (Step 1)
  4. Added BatchedNMSPlugin into yolov4 ONNX model using https://github.com/Tianxiaomo/pytorch-YOLOv4 (Step 2)

Next, I built the YOLOv4 standalone program as described in https://github.com/NVIDIA-AI-IOT/yolov4_deepstream/blob/master/tensorrt_yolov4/README.md (sections 3.1 to 3.4).

Now I am trying to execute yolov4 (section 3.5) and am receiving the following error:

&&&& RUNNING TensorRT.sample_yolo # ./yolov4 --fp16
There are 0 coco images to process
[11/09/2020-14:02:36] [I] Building and running a GPU inference engine for Yolo
[11/09/2020-14:02:37] [I] Parsing ONNX file: …/data/yolov4_1_3_416_416.onnx.nms.onnx
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [I] [TRT] ModelImporter.cpp:135: No importer registered for op: BatchedNMS_TRT. Attempting to import as plugin.
[11/09/2020-14:02:37] [I] [TRT] builtin_op_importers.cpp:3659: Searching for plugin: BatchedNMS_TRT, plugin_version: 1, plugin_namespace:
[11/09/2020-14:02:37] [I] [TRT] builtin_op_importers.cpp:3676: Successfully created plugin: BatchedNMS_TRT
[11/09/2020-14:02:37] [W] [TRT] Output type must be INT32 for shape outputs
[11/09/2020-14:02:37] [W] [TRT] Output type must be INT32 for shape outputs
[11/09/2020-14:02:37] [W] [TRT] Output type must be INT32 for shape outputs
[11/09/2020-14:02:37] [W] [TRT] Output type must be INT32 for shape outputs
[11/09/2020-14:02:37] [I] Building TensorRT engine…/data/yolov4_1_3_416_416.engine
[11/09/2020-14:02:37] [W] [TRT] Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
[11/09/2020-14:02:38] [E] [TRT] …/rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[11/09/2020-14:02:38] [E] [TRT] …/rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
&&&& FAILED TensorRT.sample_yolo # ./yolov4 --fp16

Any ideas?
BTW, I am using GTX 1050.

Solved the issue by reducing the TensorRT workspace size from 4 GB (the default) to 1 GB for the Nano in SampleYolo.cpp, as follows:

config->setMaxWorkspaceSize(4096_MiB);

I had to change the above line to

config->setMaxWorkspaceSize(1024_MiB);

Thanks to everyone for your feedback.