Training custom model on Jetson Nano doesn't work

yjain0102 · January 14, 2024, 4:10am

Hello, I am running a Jetson Nano with JetPack 4.6 and Ubuntu 18.04. I installed jetson-inference and ran the Docker container. Everything works fine except for training the detection model. At first I changed the shm-size to 2 GB and got the error OSError: [Errno 12] Cannot allocate memory. Then I tried mounting swap as described in the Jetson Hello AI World tutorial:
sudo systemctl disable nvzramconfig
sudo fallocate -l 4G /mnt/4GB.swap
sudo mkswap /mnt/4GB.swap
sudo swapon /mnt/4GB.swap
Now the Jetson Nano just freezes when training with a batch size of 2, workers = 1 and epochs = 1.

Worst case, I can train on my Windows PC, but I am unsure how to do that, and if possible I would prefer to train on the Jetson Nano. Thanks!
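One way to confirm that the swap file from the commands above is actually active before training is to read the memory totals the kernel reports. The following is a minimal sketch, assuming a Linux system that exposes /proc/meminfo; the mem_summary helper name is hypothetical and not part of jetson-inference.

```shell
# Print selected RAM and swap figures in MiB.
# /proc/meminfo reports values in kB; an alternate file can be passed as $1.
mem_summary() {
    awk '/^(MemTotal|MemAvailable|SwapTotal|SwapFree):/ {printf "%s %d MiB\n", $1, $2/1024}' "${1:-/proc/meminfo}"
}

# Show the live values when /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    mem_summary
fi
```

If the 4 GB swap file is mounted, SwapTotal should be roughly 4096 MiB. A value of 0 would suggest that swapon did not take effect, for example after a reboot.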

AastaLLL · January 15, 2024, 3:37am

Hi,

Please try it with --batch-size=1 and --workers=0.
Below is a related topic for your reference:

Thanks.
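The suggested settings could be passed to train_ssd.py roughly as sketched here. This is an illustration only: the actual command line is not given in this thread, so the voc dataset type and the data/my_dataset and models/my_model paths are hypothetical placeholders, and train_cmd is a made-up helper that just prints the command for review.

```shell
# Build a low-memory train_ssd.py command line.
# $1 = dataset directory, $2 = model output directory (both placeholders).
# --dataset-type=voc is an assumption; use the type that matches your data.
train_cmd() {
    echo python3 train_ssd.py --dataset-type=voc --data="$1" --model-dir="$2" \
        --batch-size=1 --workers=0 --epochs=1
}

# Print the command so it can be checked before running it.
train_cmd data/my_dataset models/my_model
```

With --workers=0 the data should be loaded in the main process rather than in extra worker processes, and --batch-size=1 keeps only one image per training step in memory, so both settings reduce peak RAM use at the cost of speed.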

yjain0102 · January 17, 2024, 2:49am

Hello, I tried that and also tried sudo init 3, which brings me to a dark screen where I am unable to do anything. The screen still freezes during training. However, I noticed that the time at the top updates once at about the 10-minute mark, and then the process says Killed.

AastaLLL · January 22, 2024, 8:18am

Hi,

A "Killed" message is usually caused by running out of memory.
Thanks.
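To check whether the kernel's out-of-memory killer ended the training process, the kernel log can be searched after the run. The following is a rough sketch, assuming dmesg is available and readable (it may need sudo); the oom_lines helper is hypothetical, and the exact message wording varies between kernel versions.

```shell
# Keep only lines on stdin that look like OOM-killer messages.
oom_lines() {
    grep -i -E 'out of memory|oom-killer|killed process'
}

# Search the kernel log; print a note if nothing matches or dmesg is unreadable.
dmesg 2>/dev/null | oom_lines || echo "no OOM messages found (or dmesg not readable)"
```

A matching line that names python3 around the time training stopped would support the out-of-memory explanation.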

dusty_nv · January 22, 2024, 2:50pm

Hi @yjain0102, I believe if you press Ctrl+Alt+F1, Ctrl+Alt+F2, etc., it will bring you to a login prompt.

I’m not sure what this setting is or if you might want to change it back to the original (I had not needed to set this)

What is the command-line you are using to launch train_ssd.py, and what dataset are you using for training?
