Training speed issue

Hi, I'm training my models (DetectNet and YOLOv4) on an RTX 3090 and an A100, so I have tried two different machines.

I found that on the COCO dataset, training is much slower than I expected (3 hours per epoch).
When I use another training script, it is 3x to 5x faster than that.

Do you have any benchmarks for training time on particular datasets or GPU machines?

Thanks in advance.

No, there is no benchmark for training time. A few things to try:
• Try enabling AMP (automatic mixed precision), since your GPUs support it. See more in
• Try the TFRecord data loader. If you use it, please disable mosaic augmentation. See more in the YOLOv4 section of the TAO Toolkit 3.22.05 documentation.
• Set randomize_input_shape_period to 0.
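
For YOLOv4, the spec-file side of these suggestions might look roughly like the fragment below. This is only a sketch: the paths are hypothetical placeholders, all other required fields are left out, and the field names (`tfrecords_path`, `image_directory_path`, `mosaic_prob`) are as I recall them from the YOLOv4 spec, so please check them against the documentation for your TAO version. AMP is not set in the spec file; as far as I know it is enabled with the `--use_amp` flag on the train command.

```
# Sketch only: merge into your existing YOLOv4 spec, do not use as-is.
dataset_config {
  data_sources: {
    # Hypothetical paths; point these at your own TFRecords and images.
    tfrecords_path: "/path/to/tfrecords/train*"
    image_directory_path: "/path/to/images"
  }
}
augmentation_config {
  # 0 keeps the input shape fixed during training.
  randomize_input_shape_period: 0
  # 0 disables mosaic, as suggested for the TFRecord loader.
  mosaic_prob: 0.0
}
```

DetectNet uses a different spec layout, so the fragment above does not carry over to it directly.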