Need your opinion. I am testing YOLOv4 with OpenCV 4.4 compiled with CUDA and cuDNN on JetPack 4.4. With tiny YOLO I am getting close to 2 FPS when inferring every frame on the Nano. It is pretty straightforward to implement/integrate in C++ if you want to use YOLO with OpenCV. The other option is to use TensorRT, as NVIDIA recommends; however, the YOLO implementation with TensorRT is not as straightforward as with OpenCV. So my question is: what benefits can I expect if I choose the TensorRT path instead of OpenCV with the Darknet model?
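For reference, this is roughly what I mean by "straightforward with OpenCV": a minimal sketch using OpenCV 4.4's CUDA DNN backend. The tiny-YOLO file names, the test image and the 416x416 input size are just my settings, not requirements.

```cpp
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>

int main() {
    // Load the Darknet cfg/weights directly and run them on the CUDA backend.
    cv::dnn::Net net = cv::dnn::readNetFromDarknet("yolov4-tiny.cfg", "yolov4-tiny.weights");
    net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
    net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA_FP16);  // or DNN_TARGET_CUDA

    cv::Mat frame = cv::imread("test.jpg");
    // Scale to [0,1], resize to the network input, swap BGR->RGB.
    cv::Mat blob = cv::dnn::blobFromImage(frame, 1.0 / 255.0, cv::Size(416, 416),
                                          cv::Scalar(), true, false);
    net.setInput(blob);

    std::vector<cv::Mat> outs;
    net.forward(outs, net.getUnconnectedOutLayersNames());
    // outs holds the raw YOLO outputs; box decoding and NMS are done on the CPU afterwards.
    return 0;
}
```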
I am trying the yolov4_deepstream repo that you mentioned in your post and am currently at the darknet2onnx step to convert YOLOv4 to ONNX. However, I am not able to install onnxruntime using $ pip install onnxruntime. It says:
$ pip install onnxruntime
Collecting onnxruntime
Could not find a version that satisfies the requirement onnxruntime (from versions: )
No matching distribution found for onnxruntime
I am using JP4.4. Also, what I expect this step to produce is a yolov4.onnx file from the Darknet YOLOv4 weights, so the .py that I need to run is the one in the tool directory (darknet2onnx.py). Please correct me if I am wrong.
I am having difficulty following the steps described in the link below and would appreciate it if somebody could clarify.
My target is to run YOLOv4 with TensorRT on the Nano using C++ (not Python). I have compiled the TensorRT OSS plugin (libnvinfer_plugin.so.7.1.3) on the Nano and replaced the original one in “/usr/lib/aarch64-linux-gnu/”.
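A small check I find useful after swapping in the OSS libnvinfer_plugin.so.7.1.3: list the registered plugin creators and confirm that BatchedNMS_TRT shows up. A minimal sketch (the Logger class is just the usual TensorRT boilerplate):

```cpp
#include <NvInfer.h>
#include <NvInferPlugin.h>
#include <iostream>

// Minimal logger required by the TensorRT API.
class Logger : public nvinfer1::ILogger {
public:
    void log(Severity severity, const char* msg) override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;
    // Registers the creators from whichever libnvinfer_plugin.so the loader picked up,
    // i.e. the OSS build copied into /usr/lib/aarch64-linux-gnu/.
    initLibNvInferPlugins(&logger, "");

    int n = 0;
    auto creators = getPluginRegistry()->getPluginCreatorList(&n);
    for (int i = 0; i < n; ++i) {
        std::cout << creators[i]->getPluginName()
                  << " (version " << creators[i]->getPluginVersion() << ")" << std::endl;
    }
    // BatchedNMS_TRT should appear in this list if the right plugin library is loaded.
    return 0;
}
```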
As the downloaded yolov4.onnx does not include the BatchedNMSPlugin (I assume), my next step, as I understand it, is Step 2 of section 2.3 described in yolov4_deepstream/README.md at master · NVIDIA-AI-IOT/yolov4_deepstream · GitHub. However, I will probably need to do this on a different PC, as I can’t install onnxruntime on the Nano. Alternatively, if anybody could send me a link where I can download a yolov4.onnx that already includes the BatchedNMSPlugin, that would also work for me.
What I will ultimately need for my purpose, as I see it, is the SampleYolo class, which I will integrate into my program through my own wrapper so it runs in real time with a camera (a rough sketch of what I mean follows). For that, as I understand it, I need libnvinfer_plugin.so.7.1.3 (for YOLOv4), which I have already built, and yolov4.onnx, which I downloaded without the BatchedNMSPlugin; I think I need to include that and need help on how to do it. And that is probably it.
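The kind of wrapper I have in mind is roughly the following. This is only a minimal sketch: the engine file name is my assumption, and buffer allocation plus pre/post-processing per camera frame are left out.

```cpp
#include <NvInfer.h>
#include <NvInferPlugin.h>
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

// Minimal logger required by the TensorRT API.
class Logger : public nvinfer1::ILogger {
public:
    void log(Severity severity, const char* msg) override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;
    // BatchedNMS_TRT must be registered before the engine can be deserialized.
    initLibNvInferPlugins(&logger, "");

    // Load the serialized engine produced by the sample (file name is my assumption).
    std::ifstream file("yolov4.engine", std::ios::binary);
    if (!file) { std::cerr << "engine file not found" << std::endl; return 1; }
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                            std::istreambuf_iterator<char>());

    auto* runtime = nvinfer1::createInferRuntime(logger);
    auto* engine  = runtime->deserializeCudaEngine(blob.data(), blob.size(), nullptr);
    if (!engine) { std::cerr << "deserialization failed" << std::endl; return 1; }
    auto* context = engine->createExecutionContext();

    // Dump the bindings so I know what to allocate and copy per frame.
    for (int i = 0; i < engine->getNbBindings(); ++i) {
        std::cout << (engine->bindingIsInput(i) ? "input : " : "output: ")
                  << engine->getBindingName(i) << std::endl;
    }
    // Per frame (not shown): preprocess the camera image into the input binding,
    // cudaMalloc/cudaMemcpy the device buffers, run context->enqueueV2(...), and
    // read back the NMS outputs (boxes, scores, classes, count).
    (void)context;
    return 0;
}
```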
Please let me know if anything is missing in my understanding. Thanks
UPDATE:
The downloaded yolov4.onnx does not seem to work; it says:
[11/05/2020-13:27:41] [W] [TRT] “onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.”
[11/05/2020-13:27:41] [I] Building TensorRT engine../data/yolov4.engine
[11/05/2020-13:27:46] [E] [TRT] Network has dynamic or shape inputs, but no optimization profile has been defined.
[11/05/2020-13:27:46] [E] [TRT] Network validation failed.
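If I read this right, the downloaded yolov4.onnx has a dynamic input, so the builder refuses to build until an optimization profile is defined. A minimal sketch of the build step with a profile added follows; the 1x3x416x416 shape is my assumption for this model, the input name is taken from the parsed network, and the ../data path just mirrors the sample layout from my log.

```cpp
#include <NvInfer.h>
#include <NvInferPlugin.h>
#include <NvOnnxParser.h>
#include <cstdint>
#include <iostream>

// Minimal logger required by the TensorRT API.
class Logger : public nvinfer1::ILogger {
public:
    void log(Severity severity, const char* msg) override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;
    initLibNvInferPlugins(&logger, "");

    auto* builder = nvinfer1::createInferBuilder(logger);
    auto* network = builder->createNetworkV2(
        1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));
    auto* parser = nvonnxparser::createParser(*network, logger);
    if (!parser->parseFromFile("../data/yolov4.onnx",
                               static_cast<int>(nvinfer1::ILogger::Severity::kWARNING))) {
        std::cerr << "ONNX parse failed" << std::endl;
        return 1;
    }

    auto* config = builder->createBuilderConfig();
    // The dynamic input needs min/opt/max shapes; here everything is pinned to 1x3x416x416.
    auto* profile = builder->createOptimizationProfile();
    const char* inputName = network->getInput(0)->getName();
    const nvinfer1::Dims4 shape{1, 3, 416, 416};
    profile->setDimensions(inputName, nvinfer1::OptProfileSelector::kMIN, shape);
    profile->setDimensions(inputName, nvinfer1::OptProfileSelector::kOPT, shape);
    profile->setDimensions(inputName, nvinfer1::OptProfileSelector::kMAX, shape);
    config->addOptimizationProfile(profile);

    auto* engine = builder->buildEngineWithConfig(*network, *config);
    std::cout << (engine ? "engine built" : "build failed") << std::endl;
    return 0;
}
```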
Now I am trying to execute yolov4 (section 3.5), and I am receiving the following error:
&&&& RUNNING TensorRT.sample_yolo # ./yolov4 --fp16
There are 0 coco images to process
[11/09/2020-14:02:36] [I] Building and running a GPU inference engine for Yolo
[11/09/2020-14:02:37] [I] Parsing ONNX file: ../data/yolov4_1_3_416_416.onnx.nms.onnx
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[11/09/2020-14:02:37] [I] [TRT] ModelImporter.cpp:135: No importer registered for op: BatchedNMS_TRT. Attempting to import as plugin.
[11/09/2020-14:02:37] [I] [TRT] builtin_op_importers.cpp:3659: Searching for plugin: BatchedNMS_TRT, plugin_version: 1, plugin_namespace:
[11/09/2020-14:02:37] [I] [TRT] builtin_op_importers.cpp:3676: Successfully created plugin: BatchedNMS_TRT
[11/09/2020-14:02:37] [W] [TRT] Output type must be INT32 for shape outputs
[11/09/2020-14:02:37] [W] [TRT] Output type must be INT32 for shape outputs
[11/09/2020-14:02:37] [W] [TRT] Output type must be INT32 for shape outputs
[11/09/2020-14:02:37] [W] [TRT] Output type must be INT32 for shape outputs
[11/09/2020-14:02:37] [I] Building TensorRT engine../data/yolov4_1_3_416_416.engine
[11/09/2020-14:02:37] [W] [TRT] Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
[11/09/2020-14:02:38] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[11/09/2020-14:02:38] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
&&&& FAILED TensorRT.sample_yolo # ./yolov4 --fp16
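Since the build dies with a CUDA out-of-memory error, my next try is to keep the builder workspace small (the Nano shares its 4 GB between CPU and GPU) and only request FP16 if the platform reports it. A minimal sketch of the builder settings I mean; the 256 MB workspace is just a starting value to tune, and the rest of the build is as in the sample.

```cpp
#include <NvInfer.h>
#include <iostream>

// Minimal logger required by the TensorRT API.
class Logger : public nvinfer1::ILogger {
public:
    void log(Severity severity, const char* msg) override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;
    auto* builder = nvinfer1::createInferBuilder(logger);
    auto* config = builder->createBuilderConfig();

    // Cap the scratch memory the builder may use while timing tactics.
    config->setMaxWorkspaceSize(256ULL << 20);  // 256 MB; tune up/down for the Nano

    // Only request FP16 if the platform actually reports fast FP16 support
    // (relates to the "Half2 support requested..." warning above).
    std::cout << "platformHasFastFp16: " << builder->platformHasFastFp16() << std::endl;
    if (builder->platformHasFastFp16()) {
        config->setFlag(nvinfer1::BuilderFlag::kFP16);
    }
    // ... parse the ONNX, add the optimization profile if needed, and call
    //     builder->buildEngineWithConfig(*network, *config) as in the sample ...
    return 0;
}
```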
Hi @caruofc,
Did you have this problem with BatchedNMS_TRT?
I followed your suggestion and built the TensorRT OSS plugin successfully.
But when I run ../bin/yolov4 to convert the model from .onnx to .engine, I get the following error:
&&&& RUNNING TensorRT.sample_yolo # ../bin/yolov4 --fp16
There are 0 coco images to process
[12/04/2020-15:59:06] [I] Building and running a GPU inference engine for Yolo
[12/04/2020-15:59:08] [I] Parsing ONNX file: ../data/yolov4.onnx
[12/04/2020-15:59:09] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[12/04/2020-15:59:09] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[12/04/2020-15:59:09] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[12/04/2020-15:59:09] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[12/04/2020-15:59:09] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[12/04/2020-15:59:09] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[12/04/2020-15:59:09] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[12/04/2020-15:59:10] [I] [TRT] ModelImporter.cpp:135: No importer registered for op: BatchedNMS_TRT. Attempting to import as plugin.
[12/04/2020-15:59:10] [I] [TRT] builtin_op_importers.cpp:3659: Searching for plugin: BatchedNMS_TRT, plugin_version: 1, plugin_namespace:
[12/04/2020-15:59:10] [I] [TRT] builtin_op_importers.cpp:3676: Successfully created plugin: BatchedNMS_TRT
[12/04/2020-15:59:10] [E] [TRT] (Unnamed Layer* 3429) [PluginV2Ext]: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
[12/04/2020-15:59:10] [E] [TRT] (Unnamed Layer* 3429) [PluginV2Ext]: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
[12/04/2020-15:59:10] [E] [TRT] (Unnamed Layer* 3429) [PluginV2Ext]: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
[12/04/2020-15:59:10] [E] [TRT] (Unnamed Layer* 3429) [PluginV2Ext]: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
[12/04/2020-15:59:10] [I] Building TensorRT engine../data/yolov4.engine
[12/04/2020-15:59:10] [E] [TRT] (Unnamed Layer* 3429) [PluginV2Ext]: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
[12/04/2020-15:59:10] [E] [TRT] (Unnamed Layer* 3429) [PluginV2Ext]: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
[12/04/2020-15:59:10] [E] [TRT] (Unnamed Layer* 3429) [PluginV2Ext]: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
[12/04/2020-15:59:10] [E] [TRT] (Unnamed Layer* 3429) [PluginV2Ext]: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
[12/04/2020-15:59:10] [E] [TRT] (Unnamed Layer* 3429) [PluginV2Ext]: PluginV2Layer must be V2DynamicExt when there are runtime input dimensions.
[12/04/2020-15:59:10] [E] [TRT] Layer (Unnamed Layer* 3429) [PluginV2Ext] failed validation
[12/04/2020-15:59:10] [E] [TRT] Network validation failed.
&&&& FAILED TensorRT.sample_yolo # ../bin/yolov4 --fp16
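My guess from "PluginV2Layer must be V2DynamicExt when there are runtime input dimensions" is that my ../data/yolov4.onnx still has a dynamic (-1) dimension on its input, which the BatchedNMS_TRT plugin apparently cannot handle as a plain IPluginV2Ext layer. A small check like the following should confirm it (minimal sketch; the path is the one from my log). If the batch dimension prints -1, I probably need an ONNX exported with a fixed shape (like the yolov4_1_3_416_416 one mentioned above) instead.

```cpp
#include <NvInfer.h>
#include <NvInferPlugin.h>
#include <NvOnnxParser.h>
#include <cstdint>
#include <iostream>

// Minimal logger required by the TensorRT API.
class Logger : public nvinfer1::ILogger {
public:
    void log(Severity severity, const char* msg) override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;
    initLibNvInferPlugins(&logger, "");  // so BatchedNMS_TRT can at least be created

    auto* builder = nvinfer1::createInferBuilder(logger);
    auto* network = builder->createNetworkV2(
        1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));
    auto* parser = nvonnxparser::createParser(*network, logger);
    parser->parseFromFile("../data/yolov4.onnx",
                          static_cast<int>(nvinfer1::ILogger::Severity::kWARNING));

    // Print the input shape; a -1 means a runtime (dynamic) dimension.
    const auto dims = network->getInput(0)->getDimensions();
    for (int i = 0; i < dims.nbDims; ++i) {
        std::cout << "input dim " << i << " = " << dims.d[i]
                  << (dims.d[i] == -1 ? "  <- dynamic" : "") << std::endl;
    }
    return 0;
}
```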
Do you know how to solve this?
Any suggestion is appreciated, thanks.