While running train.py in the section “Re-training on the Cat/Dog Dataset”, the machine hung and then restarted suddenly. I tried changing the batch size from 4 to 1, but the result was the same. May I ask what I should do to diagnose this? Thank you.
The command I use is:
“python3 train.py --model-dir=models/cat_dog --batch-size=1 --workers=1 --epochs=1 data/cat_dog”
Thank you so much, @AastaLLL. I allocated swap space and that seemed to solve the issue. However, when I tried to train my own model, following the ‘tools’ model in the video, the issue came back. I therefore ran your command to capture the information. Please see the attachment ‘tegrastats.txt’ for details.
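In case it helps when reading the log, here is a minimal sketch for pulling the peak RAM and swap figures out of ‘tegrastats.txt’. It assumes each line contains fields of the form `RAM used/totalMB` and `SWAP used/totalMB`; the exact layout may differ between releases, and `peak_usage` is only a hypothetical helper name, not part of train.py or tegrastats.

```python
import re

# Assumed line shape (may vary by release):
#   RAM 3541/3964MB (lfb 1x1MB) SWAP 1020/1982MB (cached 3MB) CPU [...]
USAGE = re.compile(r"RAM (\d+)/(\d+)MB.*?SWAP (\d+)/(\d+)MB")


def peak_usage(lines):
    """Return (peak RAM, RAM total, peak swap, swap total) in MB.

    Returns None if no line matches the assumed format.
    """
    peak = None
    for line in lines:
        match = USAGE.search(line)
        if match is None:
            continue  # skip lines that do not look like tegrastats output
        ram, ram_total, swap, swap_total = map(int, match.groups())
        if peak is None:
            peak = [ram, ram_total, swap, swap_total]
        else:
            peak[0] = max(peak[0], ram)
            peak[2] = max(peak[2], swap)
    return tuple(peak) if peak else None


# Hypothetical usage, assuming the log is in the current directory:
#   with open("tegrastats.txt") as f:
#       print(peak_usage(f))
```

If peak RAM and peak swap both sit close to their totals just before the log ends, that would suggest the board ran out of memory, although the log alone cannot confirm it.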
The command I ran is:
“python3 train.py --model-dir=models/anthony_data1 --batch-size=1 --workers=1 --epochs=1 data/anthony_data1”