Follow this guideline:
YoloV4: DarkNet or Pytorch → ONNX → TensorRT → DeepStream
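In command form the pipeline boils down to roughly the following (file names, sizes, and config names are the examples that appear later in this thread):
python demo_darknet2onnx.py <cfg> <weights> <sample_image> <batch_size>
trtexec --onnx=<model>.onnx --explicitBatch --saveEngine=<model>_fp16.engine --workspace=4096 --fp16
deepstream-app -c deepstream_app_config_yoloV4.txt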
Hi @ersheng
I followed this link [GitHub - Tianxiaomo/pytorch-YOLOv4: PyTorch ,ONNX and TensorRT implementation of YOLOv4]
I generated the ONNX model from Darknet.
After that, I went into the NVIDIA TensorRT container and executed the command:
trtexec --onnx=yolov4_1_3_608_608.onnx --explicitBatch --saveEngine=yolov4_1_3_608_608_fp16.engine --workspace=4096 --fp16
I get the following error at the end:
...
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1: Invalid control characters encountered in text.
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:12: Invalid control characters encountered in text.
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:14: Message type "onnx2trt_onnx.ModelProto" has no field named "pytorch".
Failed to parse ONNX model from fileyolov4_1_3_608_608.onnx
[07/09/2020-10:16:02] [E] [TRT] Network must have at least one output
[07/09/2020-10:16:02] [E] [TRT] Network validation failed.
[07/09/2020-10:16:02] [E] Engine creation failed
[07/09/2020-10:16:02] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # trtexec --onnx=yolov4_1_3_608_608.onnx --explicitBatch --saveEngine=yolov4_1_3_608_608_fp16.engine --workspace=4096 --fp16
Any suggestions?
What versions of PyTorch and TensorRT are you using?
@ersheng
I did configure the batch size for both [streammux] and [primary-gie]. I will try again and upload the error info later. Thanks
Hi, @pinktree3
Post #12 is useful; I just referred to it.
For the error you met, check your versions of PyTorch and TensorRT against the notes below (version-check commands follow them).
Darknet2ONNX:
PyTorch 1.4.0 for TensorRT 7.0 and higher
PyTorch 1.5.0 and 1.6.0 for TensorRT 7.1.2 and higher
ONNX2TensorRT:
Recommended TensorRT versions: 7.0, 7.1
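To check the installed versions (this assumes the TensorRT Python bindings are installed alongside PyTorch), you can run:
python -c "import torch; print(torch.__version__)"
python -c "import tensorrt; print(tensorrt.__version__)"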
thanks @ersheng @jiejing_ma
@ersheng Followed your YOLOv4 repo to make a TRT engine for YOLOv5, which built successfully. I compared the output with the PyTorch model and they are the same. But when I hook it up in DeepStream I am not getting any boxes. I have uploaded the code and relevant files here. Let me know if you have any pointers!
Thanks
Hi y14uc339,
Please open a new topic for your issue. Thanks
@kayccc I have opened a new topic, it’s been 2 days! Do you mind having a look: Yolov5 giving unexpected outputs
Thank you for the detailed steps. I followed all of them.
I’m using a custom YOLOv4 model. When I try to run deepstream-app, I get the following error:
nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initResource() <nvdsinfer_context_impl.cpp:667> [UID = 1]: Detect-postprocessor failed to init resource because dlsym failed to get func NvDsInferParseCustomYoloV4 pointer
Solved:
@y14uc339 Thank you, I found the problem.
@ersheng In config_infer_primary_yoloV4.txt I changed
parse-bbox-func-name=NvDsInferParseCustomYoloV4 —> parse-bbox-func-name=NvDsInferParseYoloV4
Then it works.
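For reference, the relevant keys in config_infer_primary_yoloV4.txt end up looking like this (other keys omitted; the custom-lib-path shown is an assumption based on the default objectDetector_Yolo build output):
[property]
parse-bbox-func-name=NvDsInferParseYoloV4
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so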
@karthick Define the bbox parsing function just like the other functions are declared in the custom impl cpp file and this error will go away. This works:
// This goes into nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo.cpp, which already
// provides the clamp() helper and NUM_CLASSES_YOLO used below.
extern "C" bool NvDsInferParseYoloV4(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList);
static NvDsInferParseObjectInfo convertBBoxYoloV4(const float& bx, const float& by, const float& bw,
                                                  const float& bh, const uint& netW, const uint& netH)
{
    NvDsInferParseObjectInfo b;
    // Restore coordinates to network input resolution
    float xCenter = bx * netW;
    float yCenter = by * netH;
    float w = bw * netW;
    float h = bh * netH;
    float x0 = xCenter - w * 0.5;
    float y0 = yCenter - h * 0.5;
    float x1 = x0 + w;
    float y1 = y0 + h;
    x0 = clamp(x0, 0, netW);
    y0 = clamp(y0, 0, netH);
    x1 = clamp(x1, 0, netW);
    y1 = clamp(y1, 0, netH);
    b.left = x0;
    b.width = clamp(x1 - x0, 0, netW);
    b.top = y0;
    b.height = clamp(y1 - y0, 0, netH);
    return b;
}
static void addBBoxProposalYoloV4(const float bx, const float by, const float bw, const float bh,
                                  const uint& netW, const uint& netH, const int maxIndex,
                                  const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
{
    NvDsInferParseObjectInfo bbi = convertBBoxYoloV4(bx, by, bw, bh, netW, netH);
    if (bbi.width < 1 || bbi.height < 1) return;
    bbi.detectionConfidence = maxProb;
    bbi.classId = maxIndex;
    binfo.push_back(bbi);
}
static std::vector<NvDsInferParseObjectInfo>
decodeYoloV4Tensor(
    const float* detections, const uint num_bboxes,
    NvDsInferParseDetectionParams const& detectionParams,
    const uint& netW, const uint& netH)
{
    std::vector<NvDsInferParseObjectInfo> binfo;
    uint bbox_location = 0;
    for (uint b = 0; b < num_bboxes; ++b)
    {
        float bx = detections[bbox_location];
        float by = detections[bbox_location + 1];
        float bw = detections[bbox_location + 2];
        float bh = detections[bbox_location + 3];
        float maxProb = 0.0f;
        int maxIndex = -1;
        uint cls_location = bbox_location + 4;
        for (uint c = 0; c < detectionParams.numClassesConfigured; ++c)
        {
            float prob = detections[cls_location + c];
            if (prob > maxProb)
            {
                maxProb = prob;
                maxIndex = c;
            }
        }
        // Guard against maxIndex staying -1 (all scores zero) before indexing the threshold array
        if (maxIndex >= 0 && maxProb > detectionParams.perClassPreclusterThreshold[maxIndex])
        {
            addBBoxProposalYoloV4(bx, by, bw, bh, netW, netH, maxIndex, maxProb, binfo);
        }
        bbox_location += 4 + detectionParams.numClassesConfigured;
    }
    return binfo;
}
extern "C" bool NvDsInferParseYoloV4(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList)
{
    if (NUM_CLASSES_YOLO != detectionParams.numClassesConfigured)
    {
        std::cerr << "WARNING: Num classes mismatch. Configured:"
                  << detectionParams.numClassesConfigured
                  << ", detected by network: " << NUM_CLASSES_YOLO << std::endl;
    }
    std::vector<NvDsInferParseObjectInfo> objects;
    const NvDsInferLayerInfo &layer = outputLayersInfo[0]; // num_boxes x (4 + num_classes)
    // 2 dimensional: [num_boxes, 4 + num_classes]
    assert(layer.inferDims.numDims == 2);
    // The second dimension should be 4 + num_classes
    assert(detectionParams.numClassesConfigured == layer.inferDims.d[1] - 4);
    uint num_bboxes = layer.inferDims.d[0];
    // std::cout << "Network Info: " << networkInfo.height << " " << networkInfo.width << std::endl;
    std::vector<NvDsInferParseObjectInfo> outObjs =
        decodeYoloV4Tensor(
            (const float*)(layer.buffer), num_bboxes, detectionParams,
            networkInfo.width, networkInfo.height);
    objects.insert(objects.end(), outObjs.begin(), outObjs.end());
    objectList = objects;
    return true;
}
/* This is a sample bounding box parsing function for the sample YoloV3 detector model */
static NvDsInferParseObjectInfo convertBBox(const float& bx, const float& by, const float& bw,
                                            const float& bh, const int& stride, const uint& netW,
                                            const uint& netH)
{
    NvDsInferParseObjectInfo b;
    // Restore coordinates to network input resolution
    float xCenter = bx * stride;
    float yCenter = by * stride;
    float x0 = xCenter - bw / 2;
    float y0 = yCenter - bh / 2;
    float x1 = x0 + bw;
    float y1 = y0 + bh;
    x0 = clamp(x0, 0, netW);
    y0 = clamp(y0, 0, netH);
    x1 = clamp(x1, 0, netW);
    y1 = clamp(y1, 0, netH);
    b.left = x0;
    b.width = clamp(x1 - x0, 0, netW);
    b.top = y0;
    b.height = clamp(y1 - y0, 0, netH);
    return b;
}
static void addBBoxProposal(const float bx, const float by, const float bw, const float bh,
                            const uint stride, const uint& netW, const uint& netH, const int maxIndex,
                            const float maxProb, std::vector<NvDsInferParseObjectInfo>& binfo)
{
    NvDsInferParseObjectInfo bbi = convertBBox(bx, by, bw, bh, stride, netW, netH);
    if (bbi.width < 1 || bbi.height < 1) return;
    bbi.detectionConfidence = maxProb;
    bbi.classId = maxIndex;
    binfo.push_back(bbi);
}
/* Check that the custom function has been defined correctly */
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseYoloV4);
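After adding the function, rebuild the custom parser library so deepstream-app can resolve the new symbol via dlsym (the CUDA version below is an assumption; set it to match your platform):
export CUDA_VER=10.2
make -C nvdsinfer_custom_impl_Yolo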
@y14uc339 Thank you. The problem was in the config_infer_primary_yoloV4.txt file:
parse-bbox-func-name=NvDsInferParseYoloV4
I followed this solution and it worked like a charm.
However, I can’t seem to correctly modify the functions to work with YOLOv4-tiny.
Do you have any tips on what I can do?
Thanks
@y14uc339 @pinktree3 @jiejing_ma @gaylord @hymanzhu1983
The output format on https://github.com/Tianxiaomo/pytorch-YOLOv4 is now split from a single output
[batch_size, num_boxes, 4 + num_classes]
into
[batch_size, num_boxes, 1, 4] and [batch_size, num_boxes, num_classes].
For each bounding box, the [x_center, y_center, H, W] encoding is changed to [x1, y1, x2, y2].
So, there are corresponding updates in objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo.cpp (a rough sketch follows after this post).
Follow this updated guide if you want to use the latest updates of YoloV4: YoloV4 Manual
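For anyone updating the parser by hand, here is a minimal sketch of decoding the split outputs. This is not the literal contents of the updated nvdsparsebbox_Yolo.cpp: it assumes the boxes tensor holds normalized [x1, y1, x2, y2] corners per box, it reuses clamp() and the NvDsInfer types from the parser code above, and the name decodeYoloV4SplitTensors is illustrative.
static std::vector<NvDsInferParseObjectInfo>
decodeYoloV4SplitTensors(
    const float* boxes, const float* confs, const uint num_bboxes,
    NvDsInferParseDetectionParams const& detectionParams,
    const uint& netW, const uint& netH)
{
    std::vector<NvDsInferParseObjectInfo> binfo;
    for (uint b = 0; b < num_bboxes; ++b)
    {
        // boxes: [num_bboxes, 1, 4], normalized [x1, y1, x2, y2] corners
        const float* box = boxes + b * 4;
        // confs: [num_bboxes, num_classes], one score per class
        const float* conf = confs + b * detectionParams.numClassesConfigured;
        float maxProb = 0.0f;
        int maxIndex = -1;
        for (uint c = 0; c < detectionParams.numClassesConfigured; ++c)
        {
            if (conf[c] > maxProb)
            {
                maxProb = conf[c];
                maxIndex = (int)c;
            }
        }
        if (maxIndex < 0 || maxProb <= detectionParams.perClassPreclusterThreshold[maxIndex])
            continue;
        NvDsInferParseObjectInfo obj;
        // Corners are already [x1, y1, x2, y2]; just scale to network input resolution
        float x1 = clamp(box[0] * netW, 0, netW);
        float y1 = clamp(box[1] * netH, 0, netH);
        float x2 = clamp(box[2] * netW, 0, netW);
        float y2 = clamp(box[3] * netH, 0, netH);
        obj.left = x1;
        obj.top = y1;
        obj.width = clamp(x2 - x1, 0, netW);
        obj.height = clamp(y2 - y1, 0, netH);
        if (obj.width < 1 || obj.height < 1) continue;
        obj.detectionConfidence = maxProb;
        obj.classId = maxIndex;
        binfo.push_back(obj);
    }
    return binfo;
}
In the accompanying NvDsInferParseYoloV4 you would then look up both output layers (e.g. by the names boxes and confs shown in the engine info later in this thread) instead of reading only outputLayersInfo[0].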
@ersheng So, I used the pytorch-YOLOv4 repo and created an ONNX model with batch size 2:
python demo_darknet2onnx.py cfg/yolov4-tiny.cfg yolov4-tiny.weights data/dog.jpg 2
Then I created a TensorRT engine:
trtexec --onnx=yolov4_2_3_416_416_fp16.onnx --explicitBatch --saveEngine=yolov4_2_3_416_416_fp16.engine --workspace=2048 --fp16
Then I made the following changes in the deepstream-app config file:
[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file:/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4/sixth_video.mp4
uri=file:/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4/sixth_video.mp4
#uri=file:/opt/nvidia/deepstream/deepstream-5.0/samples/streams/sample_720p.h264
num-sources=2
[streammux]
batch-size=2
In the detector config file:
[property]
batch-size=2
The error that I am getting is:
root@eca7bf78ed85:/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4# deepstream-app -c deepstream_app_config_yoloV4.txt
WARNING: ../nvdsinfer/nvdsinfer_func_utils.cpp:34 [TRT]: Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
0:00:02.490156301 128 0x5599102016d0 INFO nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1577> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4/yolov4_2_3_416_416_fp16.engine
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:685 [Implicit Engine Info]: layers num: 3
0 INPUT kFLOAT input 3x416x416
1 OUTPUT kFLOAT boxes 2535x1x4
2 OUTPUT kFLOAT confs 2535x80
0:00:02.490251454 128 0x5599102016d0 WARN nvinfer gstnvinfer.cpp:599:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:1518> [UID = 1]: Backend has maxBatchSize 1 whereas 2 has been requested
0:00:02.490269099 128 0x5599102016d0 WARN nvinfer gstnvinfer.cpp:599:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1689> [UID = 1]: deserialized backend context :/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4/yolov4_2_3_416_416_fp16.engine failed to match config params, trying rebuild
0:00:02.493503318 128 0x5599102016d0 INFO nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1591> [UID = 1]: Trying to create engine from model files
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:934 failed to build network since there is no model file matched.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:872 failed to build network.
0:00:02.493790285 128 0x5599102016d0 ERROR nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1611> [UID = 1]: build engine file failed
0:00:02.493815136 128 0x5599102016d0 ERROR nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1697> [UID = 1]: build backend context failed
0:00:02.493829151 128 0x5599102016d0 ERROR nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1024> [UID = 1]: generate backend failed, check config file settings
0:00:02.494012819 128 0x5599102016d0 WARN nvinfer gstnvinfer.cpp:781:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:02.494030430 128 0x5599102016d0 WARN nvinfer gstnvinfer.cpp:781:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4/config_infer_primary_yoloV4.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
** ERROR: <main:651>: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to create NvDsInferContext instance
Debug info: gstnvinfer.cpp(781): gst_nvinfer_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie:
Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_YoloV4/config_infer_primary_yoloV4.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
App run failed
So, I want to ask how to run YOLOv4 with batch-size > 1 and process multiple streams in parallel. Thanks
I got the same error, have you found the solution?
@hymanzhu1983 No not yet!
Problem has been reproduced.
We are looking into the issue with inference via DeepStream when explicit_batch_size > 1
I’m testing with karthick’s blog implementation. I’m able to run with explicit_batch_size 1 and a 416x416 FP16 TensorRT model; however, I get very bad results compared to the original Darknet YOLOv4 model. I tried an RTSP CCTV camera as real-time input. What I figured out is that when many classes are present in a frame (for example mouse, keyboard, monitor and person), it predicts the correct result; however, when only one class is present in a frame (for example only “person”), it can’t predict anything. Is this a problem with the YOLOv3 kernels.cu things?
Hi, I followed the instructions in DeepStream SDK FAQ - #7 by bcao
and it seems to run somehow.
But I don’t get any visual output.
Unknown or legacy key specified 'is-classifier' for group [property]
Opening in BLOCKING MODE
0:00:03.976461743 698 0x26425b60 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1701> [UID = 1]: deserialized trt engine from :/xavier_ssd/deepstream/deepstream-5.0/sources/objectDetector_Yolo/yolov4_1_3_608_608_fp16.engine
INFO: [Implicit Engine Info]: layers num: 3
0 INPUT kFLOAT input 3x608x608
1 OUTPUT kFLOAT boxes 22743x1x4
2 OUTPUT kFLOAT confs 22743x80
0:00:03.976671001 698 0x26425b60 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1805> [UID = 1]: Use deserialized engine model: /xavier_ssd/deepstream/deepstream-5.0/sources/objectDetector_Yolo/yolov4_1_3_608_608_fp16.engine
0:00:03.997042392 698 0x26425b60 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/xavier_ssd/deepstream/deepstream-5.0/sources/objectDetector_Yolo/config_infer_primary_yoloV4.txt sucessfully
Runtime commands:
h: Print this help
q: Quit
p: Pause
r: Resume
NOTE: To expand a source in the 2D tiled display and view object details, left-click on the source.
To go back to the tiled display, right-click anywhere on the window.
** INFO: <bus_callback:181>: Pipeline ready
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
** INFO: <bus_callback:167>: Pipeline running
NvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
H264: Profile = 66, Level = 0
avg bitrate=0 for CBR, force to CQP mode
**PERF: FPS 0 (Avg)
**PERF: 28.07 (26.81)
**PERF: 28.44 (28.37)
**PERF: 28.54 (28.38)
**PERF: 28.36 (28.39)
**PERF: 28.41 (28.39)
**PERF: 28.52 (28.43)
**PERF: 28.35 (28.42)
**PERF: 28.40 (28.42)
**PERF: 28.45 (28.41)
**PERF: 28.39 (28.41)
Any suggestions what I’m doing wrong?