I recently used this solution to implement YOLOv4 in DeepStream 5.0 on a Jetson Nano.
The problem is that the FPS is pretty low, so I was thinking of switching to the YOLOv4-tiny model.
My question is: do I have to change the functions in the .cpp file to match the model, or only the .txt config files?
I think there are at least 3 steps for you to do with YoloV4 tiny.
Step 1: Study the differences between YoloV4 and YoloV4 tiny in terms of how the yolo layers are implemented. As far as I know, YoloV4 tiny has only 2 yolo layers, so if each yolo layer uses 3 anchor boxes, there are only 6 anchor boxes in total, whereas YoloV4 has 3 yolo layers and 9 anchor boxes.
Step 2: Download the YoloV4 tiny cfg file and create a YoloV4 tiny PyTorch model. You can use darknet2pytorch or a pure PyTorch implementation of YoloV4 as a reference. You can either load the pre-trained darknet weights of YoloV4 tiny or train it on your own dataset.
Step 3: Deploy your trained YoloV4 tiny to DeepStream. If you strictly follow the instructions and the output shapes of your YoloV4 tiny obey these formats: [batch, num_boxes, 1, 4] and [batch, num_boxes, num_classes], very little needs to change in the C++ parsing functions; see the sketch after these steps.
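To make Step 3 concrete, here is a minimal sketch of such a parsing function for the DeepStream 5.0 custom-parser API. It assumes the exported model has exactly two output layers, with normalized [x1, y1, x2, y2] corners in the 3-D output and per-class scores in the 2-D output, and it leaves clustering/NMS to nvinfer's cluster-mode setting. The layout is an assumption, so check it against your own engine; treat this as a starting point, not a drop-in:

```cpp
#include <algorithm>
#include <vector>

#include "nvdsinfer_custom_impl.h"

/* Minimal sketch of a custom bbox parser, assuming two outputs:
 *   boxes: [batch, num_boxes, 1, 4]        (normalized x1, y1, x2, y2)
 *   confs: [batch, num_boxes, num_classes] (per-class scores)
 * The batch dimension is already stripped from inferDims. */
extern "C" bool NvDsInferParseCustomYoloV4Tiny(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo,
    NvDsInferParseDetectionParams const& detectionParams,
    std::vector<NvDsInferParseObjectInfo>& objectList)
{
    // Identify the two outputs by rank rather than by order.
    const NvDsInferLayerInfo* boxLayer = nullptr;  // [num_boxes, 1, 4]
    const NvDsInferLayerInfo* confLayer = nullptr; // [num_boxes, num_classes]
    for (const auto& layer : outputLayersInfo) {
        if (layer.inferDims.numDims == 3) boxLayer = &layer;
        else if (layer.inferDims.numDims == 2) confLayer = &layer;
    }
    if (!boxLayer || !confLayer) return false;

    const int numBoxes   = boxLayer->inferDims.d[0];
    const int numClasses = confLayer->inferDims.d[1];
    const float* boxes = static_cast<const float*>(boxLayer->buffer);
    const float* confs = static_cast<const float*>(confLayer->buffer);

    for (int i = 0; i < numBoxes; ++i) {
        // Pick the highest-scoring class for this box.
        const float* scores = confs + i * numClasses;
        const int c = (int)(std::max_element(scores, scores + numClasses) - scores);
        const float score = scores[c];

        // Per-class threshold from the config; fall back if not configured.
        const float thresh =
            c < (int)detectionParams.perClassPreclusterThreshold.size()
                ? detectionParams.perClassPreclusterThreshold[c]
                : 0.25f;
        if (score < thresh) continue;

        // Scale normalized corners up to network resolution.
        const float* b = boxes + i * 4;
        NvDsInferParseObjectInfo obj;
        obj.classId = c;
        obj.detectionConfidence = score;
        obj.left   = b[0] * networkInfo.width;
        obj.top    = b[1] * networkInfo.height;
        obj.width  = (b[2] - b[0]) * networkInfo.width;
        obj.height = (b[3] - b[1]) * networkInfo.height;
        objectList.push_back(obj);
    }
    return true;
}

/* Sanity-check the function signature at compile time. */
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomYoloV4Tiny);
```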
I have been able to produce the TensorRT engine of yolov4-tiny on my Jetson Nano (I did it by commenting out only one line in yolo.cpp). Having the TensorRT engine at hand, all I need is the modifications to the nvdsparsebbox_Yolo.cpp file to produce a “parse-bbox-func-name=NvDsInferParseCustomYoloV4Tiny” to be used by config_infer_primary_yoloV4_tiny.txt. I have found a sample for yolov4 on the net (the one you mentioned in your post above), but it is for yolov4, not yolov4-tiny, and I cannot figure out how to embed the anchors and masks in it; the values I mean are listed below.
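For reference, these are the values I would need to embed, as they appear in the stock yolov4-tiny.cfg (worth double-checking against the cfg you actually trained with):

```cpp
// Anchors from the stock yolov4-tiny.cfg: 6 (w, h) pairs shared by 2 heads.
static const float kAnchors[12] = {
    10, 14,  23, 27,   37, 58,
    81, 82,  135, 169, 344, 319
};
// Masks per [yolo] head: the first head uses anchors 3,4,5 and the
// second uses 1,2,3 (anchor 0 goes unused in the stock cfg).
static const int kMasks[2][3] = { {3, 4, 5}, {1, 2, 3} };
```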
Do you have any such code available for yolov4-tiny?
Did you modify nvdsparsebbox_Yolo.cpp (for the yolov4-tiny format)? I am trying to use yolov4-tiny on a Jetson Nano, but I don't know the code format for that file.
If you convert the yolov4-tiny model into a TensorRT engine using darknet2pytorch, you can load it directly into DeepStream without worrying about the anchors and masks. Just set model-engine-file to the TensorRT engine in the pgie configuration and update nvdsparsebbox_Yolo.cpp as already described in the instructions; an example of the relevant config entries is below. I've tested that this works.
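For example, the relevant entries in config_infer_primary_yoloV4_tiny.txt would look roughly like this; the engine file name, class count, and library path are placeholders for whatever your conversion and build actually produce:

```
[property]
# Serialized TensorRT engine produced from the converted model
model-engine-file=yolov4-tiny-fp16.engine
# 0=FP32, 1=INT8, 2=FP16
network-mode=2
batch-size=1
num-detected-classes=80
# Custom parser built from the modified nvdsparsebbox_Yolo.cpp
parse-bbox-func-name=NvDsInferParseCustomYoloV4Tiny
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
```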
I gave up on DeepStream and tried to find another solution, tkDNN.
My env was Ubuntu 18, CUDA 10.2, cuDNN 8.0.0, and TensorRT 7.1.3.
Using a custom YOLOv4-tiny model with four classes, I got an average of 38 FPS in FP16 with batch sizes 1 and 4, on a Jetson Nano.
The only problem I had was with the build process, where I specifically had to install version 3.17.4 of CMake; the rest was smooth.
Hope it helps