Object Detector YOLO with Darknet CSPNet CFG/Weights?

I’m currently trying to run the csresnext50-panet-spp-original-optimal cfg/weights made available by the AlexeyAB/darknet repository (https://github.com/AlexeyAB/darknet).

My first impression is that it’s not enough to simply create new deepstream_app_config.txt and config_infer_primary.txt files; it seems additions to trt_utils.cpp are also needed?

Here are my configs and the output from trying them (I renamed the cfg and weights to yolov3-cs.cfg/.weights because of the filename checks).

config_infer_primary_cs.txt:

[property]
gpu-id=0
net-scale-factor=1
#0=RGB, 1=BGR
model-color-format=0
custom-network-config=yolov3-cs.cfg
model-file=yolov3-cs.weights
#model-engine-file=/opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/cs_model_b1_fp32.engine
labelfile-path=labels.txt
int8-calib-file=yolov3-calibration.table.trt5.1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=80
gie-unique-id=1
is-classifier=0
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV3
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so

deepstream_app_config_cs.txt:

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[tiled-display]
enable=1
rows=1
columns=1
width=1280
height=720
gpu-id=0
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory, applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory, applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory, applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=1
camera-width=1280
camera-height=720
camera-fps-n=30
#camera-fps-d=1
#camera-csi-sensor-id=0
camera-v4l2-dev-node=6
#uri=file://../../samples/streams/sample_1080p_h264.mp4
#num-sources=1
gpu-id=0
# (0): memtype_device   - Memory type Device
# (1): memtype_pinned   - Memory type Host Pinned
# (2): memtype_unified  - Memory type Unified
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0

[osd]
enable=1
gpu-id=0
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=1280
height=720
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=0
#model-engine-file=/opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/cs_model_b1_fp32.engine
labelfile-path=labels.txt
batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
#interval=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_cs.txt

[tests]
file-loop=0

Running “deepstream-app -c deepstream_app_config_cs.txt”:

(deepstream-app:2104): GStreamer-CRITICAL **: 15:35:43.108: passed '0' as denominator for `GstFraction'

(deepstream-app:2104): GStreamer-WARNING **: 15:35:43.109: Name 'src_cap_filter' is not unique in bin 'src_sub_bin0', not adding
Creating LL OSD context new
0:00:00.259336218  2104 0x7f75240022c0 INFO                 nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:initialize(): Trying to create engine from model files
Loading pre-trained weights...
Loading complete!
Total Number of weights read : 57030845
      layer               inp_size            out_size       weightPtr
(1)   conv-bn-leaky     3 x 608 x 608      64 x 304 x 304    9664  
(2)   maxpool          64 x 304 x 304      64 x 152 x 152    9664  
(3)   conv-bn-leaky    64 x 152 x 152     128 x 152 x 152    18368 
(4)   route                  -             64 x 152 x 152    18368 
(5)   conv-bn-leaky    64 x 152 x 152      64 x 152 x 152    22720 
(6)   conv-bn-leaky    64 x 152 x 152     128 x 152 x 152    31424 
(7)   conv-bn-leaky   128 x 152 x 152     128 x 152 x 152    179392
deepstream-app: trt_utils.cpp:235: nvinfer1::ILayer* netAddConvBNLeaky(int, std::map<std::__cxx11::basic_string<char>, std::__cxx11::basic_string<char> >&, std::vector<float>&, std::vector<nvinfer1::Weights>&, int&, int&, nvinfer1::ITensor*, nvinfer1::INetworkDefinition*): Assertion `block.at("activation") == "leaky"' failed.
Aborted (core dumped)

There is a new property, “groups”, in [convolutional], so the weight size is probably different from what the current netAddConvBNLeaky() implementation expects (see the sketch after the snippet below):

[convolutional]
batch_normalize=1
filters=128
size=3
groups=32
stride=1
pad=1
activation=leaky
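That hypothesis matches the log: layer (7) advanced the weight pointer by 179392 − 31424 = 147,968 values (128·128·3·3 convolution weights plus 4·128 batch-norm parameters, i.e. a plain convolution), whereas with groups=32 darknet only stores 128·(128/32)·3·3 + 4·128 = 5,120 values, so the pointer drifts until a later block fails the activation assertion. Here is a minimal, untested sketch of how netAddConvBNLeaky() in trt_utils.cpp might be extended; the variable names (inputChannels, convWt, convBias) are placeholders approximating the sample’s locals, not verbatim code:

int groups = 1;
if (block.find("groups") != block.end())
    groups = std::stoi(block.at("groups"));

int filters    = std::stoi(block.at("filters"));
int kernelSize = std::stoi(block.at("size"));

// darknet stores fewer weights for a grouped convolution:
//   plain  : filters * inputChannels * k * k
//   grouped: filters * (inputChannels / groups) * k * k
size_t convWtCount
    = (size_t) filters * (inputChannels / groups) * kernelSize * kernelSize;

// ... read the 4*filters batch-norm parameters and then convWtCount
//     convolution weights from the weight vector at weightPtr, exactly as
//     the existing code does, just with the reduced count ...

nvinfer1::IConvolutionLayer* conv = network->addConvolution(
    *input, filters, nvinfer1::DimsHW{kernelSize, kernelSize}, convWt, convBias);
assert(conv != nullptr);
conv->setNbGroups(groups);  // run it as a grouped convolution in TensorRT

TensorRT 5 (as used by DeepStream 4.0) supports grouped convolutions via IConvolutionLayer::setNbGroups(), so the engine side should not be the blocker; the weight bookkeeping is.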

I will keep checking.

We only support the four standard YOLO models for now, so please try to modify the custom YOLO lib (the engine builder) to build your custom network.

Oh ok. Thanks for checking!

I wanted to see whether I could run the state-of-the-art model in DeepStream and how much faster it would be than Darknet. I currently can’t spare much time to extend the engine builder, especially as I’m not familiar with DeepStream (yet). Maybe later.

If anyone finds a solution in the meantime, I would appreciate it! :)

I would also like to import my custom YOLO libraries into DeepStream; can anyone help us? Thank you

I wanted to see whether I could run the state-of-the-art model in DeepStream and how much faster it would be than Darknet.

Can you use YOLOv3 to compare?

I would also like to import my custom YOLO libraries into deepstream

Does it use TensorRT layer by layer? Our YOLO library source code is open. What’s the difference in your custom YOLO library?

Yes I can, and DeepStream was 30-40% faster IIRC (though I only tested on my Quadro M1200).
But my actual goal is to run a custom yolov3-tiny-prn model as well as a CSPNet model. (Nobody really uses vanilla YOLO versions anymore.) I also tried switching to vanilla yolov3-tiny and vanilla yolov3 (instead of CSPNet), and they perform much worse for our use cases. So for now I will keep using the Darknet library for our project, as I currently can’t estimate the effort needed to get these models working in DeepStream, and I’m also unsure how I would (for example) get the bounding boxes out of the DeepStream implementation.

Not trying to talk it down; I think it’s nice that NVIDIA provides an example implementation for the vanilla YOLO models.

get the bounding boxes out of the DeepStream implementation

Bounding boxes are saved in the metadata. Here is the documentation:
https://docs.nvidia.com/metropolis/deepstream/plugin-manual/index.html#page/DeepStream_Plugin_Manual%2Fdeepstream_plugin_metadata.03.2.html
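For example, a pad probe attached downstream of the gie (e.g. on the OSD element’s sink pad) can walk the batch metadata. This is a minimal sketch against the DeepStream 4.0 C API; the function name bbox_probe is illustrative:

#include <gst/gst.h>
#include "gstnvdsmeta.h"

static GstPadProbeReturn
bbox_probe (GstPad * pad, GstPadProbeInfo * info, gpointer user_data)
{
  GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);
  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
  if (!batch_meta)
    return GST_PAD_PROBE_OK;

  /* batch -> frames -> objects: every detection is one NvDsObjectMeta */
  for (NvDsMetaList * l_frame = batch_meta->frame_meta_list; l_frame;
      l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
    for (NvDsMetaList * l_obj = frame_meta->obj_meta_list; l_obj;
        l_obj = l_obj->next) {
      NvDsObjectMeta *obj_meta = (NvDsObjectMeta *) l_obj->data;
      g_print ("frame %d: class %d conf %.2f bbox l=%d t=%d w=%d h=%d\n",
          frame_meta->frame_num, obj_meta->class_id, obj_meta->confidence,
          (int) obj_meta->rect_params.left, (int) obj_meta->rect_params.top,
          (int) obj_meta->rect_params.width, (int) obj_meta->rect_params.height);
    }
  }
  return GST_PAD_PROBE_OK;
}

Attach it once during pipeline setup with gst_pad_add_probe (osd_sink_pad, GST_PAD_PROBE_TYPE_BUFFER, bbox_probe, NULL, NULL).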

Hi Kriegera,
Can you provide the cfg/weights or a link?

Sure

CFG: https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/csresnext50-panet-spp-original-optimal.cfg
Weights: https://drive.google.com/open?id=1_NnfVgj0EDtb_WLNoXV8Mo7WKgwdYZCc

Thanks for sharing.
The YOLO sample provided in DeepStream supports only the standard YOLO models, so you will have to make changes to the engine builder based on the custom network architecture, for example, making netAddConvBNLeaky() support group convolution.