Training error

I get the following error when running this command:

!tao yolo_v4 train -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                   -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                   -k $KEY \
                   --gpus 1

Epoch 1/80
2022-04-19 10:56:17,282 [ERROR] iva.common.utils: Ran out of GPU memory, please lower the batch size, use a smaller input resolution, use a smaller backbone, or enable model parallelism for supported TLT architectures (see TLT documentation).
2022-04-19 16:26:19,244 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

How much GPU memory is required to train a YOLOv4 model on the KITTI dataset?

According to another topic you filed, your GPU has about 2 GB of memory.
For TAO, the expected minimum is 4 GB of GPU RAM. See: IVA Getting Started Guide :: Metropolis Documentation
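
To compare your own card against that 4 GB figure, here is a minimal Python sketch that reads the total memory reported by `nvidia-smi`. The helper names (`parse_memory_totals`, `query_gpu_memory_mib`, `meets_minimum`) are made up for this example and are not part of TAO. It assumes `nvidia-smi` is on the PATH. The reported total is often slightly below the nominal capacity of the card, so treat the comparison as a rough guide only.

```python
import subprocess
from typing import List

# 4 GB minimum quoted in the reply above, expressed in MiB.
MIN_GPU_MEMORY_MIB = 4 * 1024


def parse_memory_totals(nvidia_smi_output: str) -> List[int]:
    """Parse one integer (MiB) per non-empty line of nvidia-smi CSV output."""
    return [
        int(line.strip())
        for line in nvidia_smi_output.splitlines()
        if line.strip()
    ]


def query_gpu_memory_mib() -> List[int]:
    """Ask nvidia-smi for the total memory of each visible GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_totals(result.stdout)


def meets_minimum(total_mib: int, minimum_mib: int = MIN_GPU_MEMORY_MIB) -> bool:
    """Return True if the reported total is at least the given minimum."""
    return total_mib >= minimum_mib
```

In a notebook cell, `query_gpu_memory_mib()` returns one value per GPU, and `meets_minimum(value)` tells you whether that value reaches the threshold. A card reporting about 2048 MiB would not.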

As mentioned in the log, you can try to lower the batch size, use a smaller input resolution, use a smaller backbone, etc.
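
For the first suggestion, the batch size is set in the spec file passed with `-e`. Below is a rough Python sketch for lowering it without editing the file by hand. It assumes the spec contains a `batch_size_per_gpu: N` field, as TAO YOLOv4 specs typically do, so check the field name in your own `yolo_v4_train_resnet18_kitti.txt` first. The function name `lower_batch_size` and the sample spec text are made up for illustration.

```python
import re


def lower_batch_size(spec_text: str, new_batch_size: int) -> str:
    """Return spec_text with every `batch_size_per_gpu: N` set to new_batch_size.

    Assumption: the spec uses a field named `batch_size_per_gpu`.
    """
    if new_batch_size < 1:
        raise ValueError("batch size must be at least 1")
    pattern = re.compile(r"(\bbatch_size_per_gpu\s*:\s*)\d+")
    # Keep the original field name and spacing, replace only the number.
    updated, count = pattern.subn(
        lambda match: f"{match.group(1)}{new_batch_size}", spec_text
    )
    if count == 0:
        raise KeyError("no batch_size_per_gpu field found in spec")
    return updated


# Illustrative fragment only, not the contents of the real spec file.
sample_spec = "training_config {\n  batch_size_per_gpu: 8\n  num_epochs: 80\n}\n"
print(lower_batch_size(sample_spec, 2))
```

To apply it to a real spec, read the file, pass its text through `lower_batch_size`, and write the result to a copy so the original stays intact. Note that with about 2 GB of GPU memory, which is below the stated minimum, a smaller batch size alone may not be enough.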

Okay. I will upgrade the server and restart the process. This time I won’t be working on WSL :)

