Training speed is too low while training

TLT Version → docker_tag: v3.21.08-py3
Network Type → Yolov4
Config File → spec.txt (2.5 KB)

Hi,

After running the training command, I observe that the training speed is very low. I ran training for 120 epochs and it took 5 hours to complete. I am not able to understand why it takes that much time to complete 120 epochs.

I have also attached the configuration file for your reference.

Please use the latest 3.21.11 docker.
In 3.21.08, the settings below are not valid; they are only compatible with the 3.21.11 docker.
loss_loc_weight: 1.0
loss_neg_obj_weights: 1.0
loss_class_weights: 1.0
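
For reference, a minimal sketch of where these keys sit in a 3.21.11 YOLOv4 spec (the enclosing yolov4_config block follows the standard YOLOv4 spec layout; the 1.0 values are the ones from your spec, not recommendations):

  yolov4_config {
    loss_loc_weight: 1.0
    loss_neg_obj_weights: 1.0
    loss_class_weights: 1.0
  }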

To further improve training speed, please consider:
• Use AMP if your GPU supports it (see the command sketch after this list). See more in Optimizing the Training Pipeline — TAO Toolkit 3.21.11 documentation.
• Try the TFRecord data loader; in that case, please disable mosaic augmentation (see the spec sketch after this list). See more in YOLOv4 — TAO Toolkit 3.21.11 documentation.
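
For AMP, a minimal sketch of the train invocation (the --use_amp flag is described in the Optimizing the Training Pipeline page; the paths and key below are placeholders, adjust them to your setup):

  tao yolo_v4 train -e /workspace/spec.txt \
                    -r /workspace/results \
                    -k <your_ngc_key> \
                    --use_amp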
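
For the TFRecord loader, a hedged sketch of the relevant spec sections (field names follow the TFRecord-style data source used elsewhere in TAO; please verify them against the YOLOv4 dataset_config schema in the 3.21.11 docs; paths are placeholders and only the fields relevant here are shown):

  dataset_config {
    data_sources {
      tfrecords_path: "/workspace/tfrecords/train*"   # assumed path, adjust to yours
      image_directory_path: "/workspace/data/train"   # assumed path, adjust to yours
    }
  }
  augmentation_config {
    mosaic_prob: 0.0   # disable mosaic, which the TFRecord loader does not support
  }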
