Reproducing darknet YOLO4 results in TAO

We are currently trying to reproduce, with the TAO Toolkit, a training run we originally did with YOLOv4 in the darknet repository (AlexeyAB/darknet on GitHub). We already did some hyperparameter tuning last week, but the training always topped out at around 83%. With the darknet repository, we were able to achieve 92%.

In my opinion, the gap looks too large to be attributed to hyperparameters alone, and I am not sure where it comes from. So far, we have tried the following hyperparameter changes:

  • Adapt the anchors to our custom dataset
  • Change the optimizer (SGD, Adam)
  • Change the learning rate
  • Change the batch size
  • Change the number of training epochs
  • Change the regularizer weighting
  • Activate random input sizing (the mAP curve fluctuated strongly, but also topped out at 83%)
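
For reference, adapting anchors to a custom dataset is usually done with k-means over the ground-truth box sizes, using 1 - IoU as the distance. The following is only a minimal sketch of that idea, not the implementation used by darknet or TAO. The function names and the sample boxes are hypothetical, and it assumes the box sizes are already scaled to the network input resolution.

```python
import random


def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), aligned at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors using 1 - IoU as distance."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # initialise from k distinct dataset entries
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the anchor it overlaps most.
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if not members:
                new_anchors.append(anchors[i])  # keep an empty cluster's anchor
                continue
            w = sum(m[0] for m in members) / len(members)
            h = sum(m[1] for m in members) / len(members)
            new_anchors.append((w, h))
        if new_anchors == anchors:
            break  # assignments no longer change the anchors
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])  # small to large


def mean_best_iou(boxes, anchors):
    """Average IoU between each box and its best-matching anchor."""
    return sum(max(iou_wh(b, a) for a in anchors) for b in boxes) / len(boxes)


if __name__ == "__main__":
    # Hypothetical (width, height) pairs in pixels at the training input size.
    sample = [(10, 12), (11, 13), (12, 11), (50, 60), (52, 58),
              (48, 61), (200, 180), (210, 190), (190, 185)]
    found = kmeans_anchors(sample, k=3)
    print(found, mean_best_iou(sample, found))
```

Comparing `mean_best_iou` for the default anchors and the recomputed ones gives a quick check of whether the new anchors actually fit the dataset better at the resolution used for training.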

We took the spec file from the CV samples 1.4.1 as a base for our training.
yolo_v4_train_cspdarknet53 - Copy.txt (2.3 KB)

I am aware that we are using a backbone pre-trained on the OpenImages dataset, and that it would be better to train the backbone on ImageNet, but I am not sure whether this can explain the roughly 10% difference. Maybe you have some insights on this.

Maybe you have some hints about what else we can try.

Also: Is there a roadmap or a timeline for TAO?
We are very much looking forward to the BYOM feature being extended to object detection.

Thanks in advance.

• Hardware: V100
• Network Type: Yolo_v4 with cspdarknet_53
• TLT Version: tao-toolkit-tf:v3.22.05-tf1.15.5-py3

To reach SOTA accuracy, please refer to the recipes in tao_toolkit_recipes/tao_object_dection/yolov4 (main branch of NVIDIA-AI-IOT/tao_toolkit_recipes on GitHub).

I understand that pretraining on these large datasets is an important factor in reaching SOTA performance. Do you have any data on how large the mAP difference is when using ImageNet/COCO pretrained weights compared to OpenImages?

Also, could you give me a rough estimate of when we can expect the BYOM functionality for object detection?


There has been no update from you for a while, so we assume this is no longer an issue.
We are therefore closing this topic. If you need further support, please open a new one.

Pretrained weights trained on the ImageNet dataset tend to provide good accuracy for object detection. We do not have results for OpenImages pretrained weights, since we focus on using ImageNet pretrained weights for training. We wrote up the recipes in tao_toolkit_recipes/tao_object_dection/yolov4 (main branch of NVIDIA-AI-IOT/tao_toolkit_recipes on GitHub).

BYOM functionality for object detection is on the roadmap, but we are not yet sure when it will be available.
