Train VGG16 on Nano

I recently purchased the Jetson Nano, and this is my first real use of a GPU system. I am trying to train a VGG16 image recognition model and am getting an error about insufficient memory on the board. From googling this, it appears that the Nano has very limited memory and (maybe?) is meant just for inference and not for training. Does this align with how everyone here uses it? Should I be training on my laptop (CPU) or an AWS instance, and use the Nano only for inference?

Alternatively, what are the best ways (links to tutorials?) to optimize code so that it makes good use of the limited on-board memory, perhaps with frequent transfers from the SD card to the chip's BRAM? Would the memory transfer rate be so slow that doing this would generate more overhead than just using a CPU?

You can try mounting additional swap to see if this allows you to do the training. VGG16 is a large network (roughly 138 million parameters), so if you wish to train onboard the Nano, you might want to try a smaller model like ResNet-18 or ResNet-50. I have also found training with PyTorch to be more memory-efficient.
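If you do add swap, a quick way to confirm it is active before starting a training run is to read /proc/meminfo. This is only a sketch: the helper names parse_meminfo and swap_report are made up for illustration (they are not from any library), and it assumes the usual Linux /proc/meminfo layout.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain
    counts, which this simple parser does not distinguish.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_report(text):
    """Summarise RAM and swap sizes (in GiB) from /proc/meminfo text."""
    info = parse_meminfo(text)
    kib_per_gib = 1024 ** 2
    swap_kib = info.get("SwapTotal", 0)
    return {
        "ram_gib": info.get("MemTotal", 0) / kib_per_gib,
        "swap_gib": swap_kib / kib_per_gib,
        "swap_enabled": swap_kib > 0,
    }


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        print(swap_report(meminfo.read_text()))
    else:
        print("No /proc/meminfo found; run this on the Nano itself.")
```

If swap_enabled comes back False, the swap file was not mounted. Keep in mind that swap on an SD card is far slower than RAM, so it may let training run without making it fast.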

Here is a tutorial on transfer learning with PyTorch onboard Nano:

Regarding ML/DL, Jetson is intended for inferencing, but you may be able to train some networks onboard. Larger models and datasets would be better suited to training on a PC with discrete GPU(s) or in the cloud.
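To give a rough feel for why model size matters here, the sketch below estimates the memory needed just for the weights, gradients, and optimizer state in float32. Treat it as a lower bound under stated assumptions: the parameter counts are the standard ImageNet (1000-class) versions of each model, the function name training_state_gib is made up for this example, and activations (which depend on batch size and image size and often dominate) plus CUDA overhead are not counted.

```python
# Parameter counts for the standard 1000-class ImageNet versions of each model.
PARAM_COUNTS = {
    "vgg16": 138_357_544,
    "resnet18": 11_689_512,
    "resnet50": 25_557_032,
}

BYTES_PER_FLOAT32 = 4


def training_state_gib(n_params, optimizer_slots=1):
    """Estimate GiB needed for weights + gradients + optimizer state.

    optimizer_slots is the number of extra per-parameter buffers the
    optimizer keeps: 0 for plain SGD, 1 for SGD with momentum, 2 for Adam.
    Activations are NOT included, so real usage will be higher.
    """
    copies = 2 + optimizer_slots  # weights, gradients, optimizer buffers
    return n_params * BYTES_PER_FLOAT32 * copies / 1024 ** 3


if __name__ == "__main__":
    for name, count in PARAM_COUNTS.items():
        sgd = training_state_gib(count, optimizer_slots=1)
        adam = training_state_gib(count, optimizer_slots=2)
        print(f"{name:9s} SGD+momentum: {sgd:.2f} GiB   Adam: {adam:.2f} GiB")
```

On the Nano the CPU and GPU share the same RAM, so this state competes with the OS and the data loader for the same memory. By this estimate VGG16 needs over 1.5 GiB before a single activation is stored, while ResNet-18 needs well under 0.2 GiB.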