Training custom model on Jetson Nano doesn't work

Hello, I am running a Jetson Nano with JetPack 4.6 and Ubuntu 18.04. I installed jetson-inference and ran the Docker container. Everything works fine except for training the detection model. At first I changed the shm-size to 2 GB, and then I got the error OSError: [Errno 12] cannot allocate memory. Then I tried mounting swap as described in the Jetson Hello AI World instructions:
sudo systemctl disable nvzramconfig
sudo fallocate -l 4G /mnt/4GB.swap
sudo mkswap /mnt/4GB.swap
sudo swapon /mnt/4GB.swap
Now the Jetson Nano just freezes when training with a batch size of 2, workers = 1, and epochs = 1.

Worst case, I can train on my Windows PC, but I am unsure how to do that, and if possible I would rather train on the Jetson Nano. Thanks!

Hi,

Please try it with --batch-size=1 and --workers=0.
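For reference, a minimal sketch of what the full command might look like with these settings. The dataset and model directories are hypothetical placeholders, and --dataset-type=voc is an assumption (use whatever type matches your dataset). The script only prints the command so you can check it before running it from the directory that contains train_ssd.py.

```shell
#!/bin/sh
# Hypothetical placeholders: substitute your own dataset and model directories.
DATA_DIR="data/my-dataset"
MODEL_DIR="models/my-model"

# Build the training command with the lowest-memory settings suggested above:
# one image per batch and no separate data-loader worker processes.
# --dataset-type=voc is an assumption about the dataset format.
TRAIN_CMD="python3 train_ssd.py --dataset-type=voc --data=$DATA_DIR --model-dir=$MODEL_DIR --batch-size=1 --workers=0 --epochs=1"

# Print the command for review instead of running it.
echo "$TRAIN_CMD"
```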
Below is a related topic for your reference:

Thanks.

Hello, I tried that, and I also tried sudo init 3 (which brings me to a dark screen where I am unable to do anything). The screen still freezes. However, I noticed that the time at the top updates once at the 10-minute mark, and then the process says Killed.

Hi,

Killed is usually caused by running out of memory.
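One way to confirm this is to check how much RAM and swap are free before training, and to look in the kernel log after the process is killed. The following is a rough sketch, not an official tool: the function names mem_summary and oom_events are made up for illustration, and reading dmesg may require sudo on some systems.

```shell
#!/bin/sh
# Print available RAM and free swap in MiB.
# Reads /proc/meminfo by default; an alternative file can be passed as $1.
mem_summary() {
    awk '/^MemAvailable:/ {printf "MemAvailable: %d MiB\n", $2/1024}
         /^SwapFree:/     {printf "SwapFree: %d MiB\n", $2/1024}' "${1:-/proc/meminfo}"
}

# Search the kernel log for out-of-memory killer messages.
oom_events() {
    dmesg 2>/dev/null | grep -i -E 'out of memory|killed process' \
        || echo "no OOM messages found"
}
```

Run mem_summary before starting training to see whether the 4 GB swap file is active (SwapFree should be close to 4096 MiB), and run oom_events after the process is killed to see whether the kernel reports an out-of-memory kill.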
Thanks.

Hi @yjain0102, I believe if you press Ctrl+Alt+F1, Ctrl+Alt+F2, etc., it will bring you to a login prompt.

I’m not sure what this setting is, or whether you might want to change it back to the original (I had not needed to set this).

What is the command-line you are using to launch train_ssd.py, and what dataset are you using for training?