Yolov4 tiny in DeepStream 5.0

• Hardware Platform (Jetson Nano)
• DeepStream 5.0
• TensorRT

I recently used this solution to implement YOLOv4 in DeepStream 5.0 on a Jetson Nano.
The problem is that the FPS is pretty low, so I was thinking of switching to the YOLOv4-tiny model.
My question is: do I have to change the functions in the .cpp files to match the model, or only the .txt config files?



I think there are at least 3 steps for you to do for YoloV4 tiny.

  • Step 1: Study the differences between YoloV4 and YoloV4 tiny with respect to the implementation of the yolo layers. As far as I know there are only 2 yolo layers in YoloV4 tiny, so if each yolo layer uses 3 anchor boxes, there are only 6 anchor boxes in total. YoloV4, by contrast, has 3 yolo layers and 9 anchor boxes.

  • Step 2: Download the YoloV4 tiny cfg file and create a YoloV4 tiny pytorch model. You can use darknet2pytorch or a pure pytorch implementation of YoloV4 as a reference. You can start from the pre-trained darknet weights of YoloV4 tiny or train it on your own dataset.

  • Step 3: Deploy your trained YoloV4 tiny to DeepStream. If you strictly follow the instructions and the output shapes of your YoloV4 tiny obey these formats: [batch, num_boxes, 1, 4] and [batch, num_boxes, num_classes], there would be little change needed in the C++ parsing functions.
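To illustrate step 3, here is a minimal, hedged sketch of what parsing those two output tensors amounts to. The struct and function names are illustrative placeholders, not the actual NvDsInfer types from the DeepStream SDK, and the flat memory layout is an assumption that must match your engine:

```cpp
#include <cstddef>
#include <vector>

// Illustrative detection record (not the actual NvDsInfer struct).
struct Detection { float x, y, w, h, conf; int classId; };

// Sketch of parsing flattened [num_boxes, 4] boxes and
// [num_boxes, num_classes] scores: keep the best class per box
// if its score clears the threshold.
std::vector<Detection> parseOutputs(const float* boxes,   // num_boxes * 4 floats
                                    const float* scores,  // num_boxes * num_classes floats
                                    std::size_t numBoxes,
                                    std::size_t numClasses,
                                    float threshold) {
    std::vector<Detection> out;
    for (std::size_t i = 0; i < numBoxes; ++i) {
        // Find the highest-scoring class for this box.
        std::size_t best = 0;
        for (std::size_t c = 1; c < numClasses; ++c)
            if (scores[i * numClasses + c] > scores[i * numClasses + best])
                best = c;
        float conf = scores[i * numClasses + best];
        if (conf < threshold) continue;
        const float* b = &boxes[i * 4];
        out.push_back({b[0], b[1], b[2], b[3], conf, static_cast<int>(best)});
    }
    return out;
}
```

If your engine's outputs follow the shapes above, the real parser differs mainly in filling DeepStream's object metadata instead of a local vector.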


Hi there,

I have been able to produce the TensorRT engine of yolov4-tiny on my Jetson Nano (I did it by commenting out only one line in yolo.cpp). Having the TensorRT engine at hand, all I need is the modifications to the nvdsparsebbox_Yolo.cpp file to produce a “parse-bbox-func-name=NvDsInferParseCustomYoloV4Tiny” that can be used by config_infer_primary_yoloV4_tiny.txt. I have found a sample for yolov4 on the net (the one you mentioned in your post above), but that is for yolov4 and not yolov4-tiny, and I cannot figure out how to embed the anchors and masks in it.

Do you have any such code available for yolov4-tiny?
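For anyone needing the anchors and masks: in the reference yolov4-tiny.cfg there are six anchor pairs shared by the two yolo heads, with mask 3,4,5 on the 13x13 head and mask 1,2,3 on the 26x26 head. The standard YOLO decode that consumes them can be sketched like this (a hedged illustration, not the actual DeepStream parser code):

```cpp
#include <cmath>

// Anchors and masks as they appear in the reference yolov4-tiny.cfg.
static const float kAnchors[6][2] = {
    {10, 14}, {23, 27}, {37, 58}, {81, 82}, {135, 169}, {344, 319}};
static const int kMask13[3] = {3, 4, 5};  // coarse 13x13 head
static const int kMask26[3] = {1, 2, 3};  // fine 26x26 head

struct Box { float x, y, w, h; };  // normalized to [0,1]

static float sigmoidf(float v) { return 1.0f / (1.0f + std::exp(-v)); }

// Standard YOLO decode for raw (tx, ty, tw, th) at grid cell (cx, cy).
// grid is the head size (13 or 26); netSize is the network input
// resolution (416 for the stock cfg).
Box decodeBox(float tx, float ty, float tw, float th,
              int cx, int cy, int grid, int anchorIdx, int netSize) {
    return Box{
        (sigmoidf(tx) + cx) / grid,                        // box center x
        (sigmoidf(ty) + cy) / grid,                        // box center y
        kAnchors[anchorIdx][0] * std::exp(tw) / netSize,   // box width
        kAnchors[anchorIdx][1] * std::exp(th) / netSize};  // box height
}
```

Embedding the anchors in the parser means calling a decode like this per cell and per masked anchor of each head, then thresholding and running NMS as usual.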

Hi Barzanhayati,

Did you manage to modify nvdsparsebbox_Yolo.cpp for the yolov4-tiny format? I am trying to use yolov4-tiny on a Jetson Nano, but I don't know what the code in that file should look like.

Hi Linh89kr,

That is exactly the problem I have as well: the bounding-box parser function with the anchors and masks set in it.

Hey @barzanhayati and @linh89kr ,

If you convert the yolov4-tiny into a TensorRT model using darknet2pytorch, you can load the model directly into DeepStream without worrying about the anchors and masks. Just set the model engine to the TensorRT model in the pgie configuration and update nvdsparsebbox_Yolo.cpp as already described in the instructions. I've tested this and it works.
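For context, the pgie change described above amounts to something like the following fragment of config_infer_primary_yoloV4_tiny.txt. The engine filename, library path, and class count are placeholders for your own setup:

```
[property]
# Pre-built TensorRT engine exported via darknet2pytorch (placeholder path)
model-engine-file=yolov4-tiny.engine
# Placeholder: set to the number of classes your model was trained on
num-detected-classes=80
# Custom parser compiled from the modified nvdsparsebbox_Yolo.cpp
parse-bbox-func-name=NvDsInferParseCustomYoloV4Tiny
custom-lib-path=libnvdsinfer_custom_impl_Yolo.so
```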


I gave up on DeepStream and tried to find another solution, tkDNN.
My env was Ubuntu 18, CUDA 10.2, CUDNN 8.0.0, and TensorRT 7.1.3.
Using a custom YOLOv4-tiny model with four classes, I got an average of 38 FPS in FP16 with batch sizes 1 and 4 on a Jetson Nano.
The only problem I had was with the build process, where I specifically had to install CMake version 3.17.4, but the rest was smooth.
Hope it helps.

Thanks @ralfs @saifullah3396

I'll check them out.