Hi, I'm training my models (DetectNet and YOLOv4) on an RTX 3090 and an A100, i.e. on two different machines.
I found that on the COCO dataset, training is much slower than I expected (3 hours per epoch).
When I use another training script, it is 3x to 5x faster than that.
Do you have any benchmarks for training time on particular datasets or GPU machines?
Thanks in advance.
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one.
No, there are no benchmarks for training time. A few things to try for faster training:
• Try AMP (automatic mixed precision), since your GPUs support it. See https://docs.nvidia.com/tao/tao-toolkit/text/qat_and_amp_for_training.html#automatic-mixed-precision
• Try the tfrecord data loader. If you use it, disable mosaic augmentation. See the YOLOv4 page of the TAO Toolkit 3.22.05 documentation.
• Set randomize_input_shape_period to 0.
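
As a rough sketch, the last two suggestions might look like this in the YOLOv4 training spec. This is my reading of the TAO Toolkit 3.22.05 spec format, not a tested configuration: the field names `tfrecords_path`, `image_directory_path` and `mosaic_prob` are assumptions to check against the YOLOv4 documentation page, the paths are placeholders, and all other required fields are omitted.

```
# Fragment of a YOLOv4 training spec (illustrative, not a complete file)
dataset_config {
  data_sources: {
    # Pointing at tfrecords selects the tfrecord data loader (placeholder paths)
    tfrecords_path: "/path/to/tfrecords/train*"
    image_directory_path: "/path/to/images"
  }
}
augmentation_config {
  mosaic_prob: 0.0                 # mosaic disabled when using tfrecords
  randomize_input_shape_period: 0  # keep the input shape fixed during training
}
```

AMP is not set in the spec file. According to the AMP page linked above, it is enabled when launching training, so check that page for the exact option for your TAO version.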