Jetson Nano freezes when I run train.py (YOLOv7)

Hi, I'm trying to train YOLOv7, and when I run the train.py script with the following command:
python train.py --workers 1 --device 0 --batch-size 8 --img 640 640 --weights yolov7.pt --cfg cfg/training/yolov7-custom.yaml --data data/dataset.yaml --epochs 100 --name try
the terminal emulator freezes at 0% progress and the Jetson heats up too much.
I'm using torch 1.8 and torchvision 0.9.0 with CUDA support, on Python 3.6.9.
The dataset is around 240 images with labels, of normal quality. I don't know what could be causing the issue. If you need more information, just ask.
Thanks in advance

@jcanrua I haven't used YOLO training scripts on Jetson (although I have trained other models, such as ResNet-18 and SSD-Mobilenet, on the Nano with PyTorch), so I'm not sure whether training YOLO will fit in the Nano's memory with those settings. I suspect the board is running out of memory or swapping excessively.

In another terminal, try running tegrastats and keep an eye on the memory usage. Also, reduce the batch size to 1 and minimize memory usage in any other ways, like here. Note that PyTorch can take a long time to start up and complete the first training epoch on platforms with limited memory.
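For reference, tegrastats prints a field like "RAM 3500/3964MB" in each line of its output. If you'd rather watch the numbers programmatically than eyeball the terminal, a small helper along these lines could extract the used/total figures (the sample line below is made up, but matches the format tegrastats prints on a Nano):

```python
import re

def parse_ram(tegrastats_line):
    """Extract (used_mb, total_mb) from the 'RAM used/totalMB' field
    of a tegrastats output line; return None if the field is absent."""
    m = re.search(r"RAM (\d+)/(\d+)MB", tegrastats_line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

# Example line in tegrastats' format (values invented for illustration):
line = "RAM 3500/3964MB (lfb 4x2MB) SWAP 900/1978MB CPU [55%@1479]"
used, total = parse_ram(line)
print(f"{used}/{total} MB used ({100 * used / total:.0f}%)")  # 3500/3964 MB used (88%)
```

You could pipe tegrastats into a script like this and log a warning whenever usage stays near the total, which is usually the point just before the desktop freezes.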

Thanks! I tried everything you said and it works now. It's a little slow, but I'll play with the input size until I find a decent training time.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.