Training error

ronakkhandelwal · April 19, 2022, 11:02am

I get the following error when running this command:

!tao yolo_v4 train -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                   -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                   -k $KEY \
                   --gpus 1

Epoch 1/80
2022-04-19 10:56:17,282 [ERROR] iva.common.utils: Ran out of GPU memory, please lower the batch size, use a smaller input resolution, use a smaller backbone, or enable model parallelism for supported TLT architectures (see TLT documentation).
2022-04-19 16:26:19,244 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

ronakkhandelwal · April 19, 2022, 11:07am

How much GPU memory is required to train a YOLO model on the KITTI dataset?

Morganh · April 19, 2022, 2:01pm

According to another topic you filed, your GPU has about 2 GB of memory.
For TAO, the expected minimum is 4 GB of GPU RAM. See: IVA Getting Started Guide :: Metropolis Documentation

As mentioned in the log, you can try to lower the batch size, use a smaller input resolution, use a smaller backbone, etc.
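
Before changing the spec, it can help to confirm how much memory the GPU actually reports. The following is a minimal sketch, not an official TAO check: it assumes `nvidia-smi` is on the PATH, and the `MIN_MIB` threshold and the `meets_minimum` helper are illustrative names derived from the 4 GB figure above (a card sold as "4 GB" may report slightly less, and under WSL the value may not be reported at all).

```shell
# Minimum GPU memory in MiB, taken from the 4 GB figure quoted in this thread.
# This is an approximation, not a value read from TAO itself.
MIN_MIB=4096

# Hypothetical helper: succeeds (exit 0) if the given total memory in MiB
# is at least MIN_MIB, fails (exit 1) otherwise.
meets_minimum() {
    [ "$1" -ge "$MIN_MIB" ]
}

# Query the total memory of each GPU as a plain number in MiB, one per line.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits |
    while read -r total_mib; do
        if meets_minimum "$total_mib"; then
            echo "GPU with ${total_mib} MiB: meets the ${MIN_MIB} MiB minimum"
        else
            echo "GPU with ${total_mib} MiB: below the ${MIN_MIB} MiB minimum"
        fi
    done
else
    echo "nvidia-smi not found; cannot query GPU memory"
fi
```

If the GPU is at or just above the minimum, the remaining options are the ones listed in the log. The batch size and input resolution are set in the spec file passed with `-e`; the exact field names depend on the TAO version, so check the spec itself rather than relying on names quoted from memory.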

ronakkhandelwal · April 19, 2022, 2:55pm

Okay. I will upgrade the server and restart the process. This time I won’t be working on WSL :)

