I am trying to run the Isaac SDK tutorial “Training Object Detection from Simulation in Docker” (Training Object Detection from Simulation in Docker — ISAAC 2021.1 documentation) on my laptop running Ubuntu 18.04 with an NVIDIA GeForce GTX 1650 with 4 GB of GPU memory, and I am getting an “OOM when allocating tensor with shape” error. When running nvidia-smi I can see that it is running out of GPU memory. I had a couple of questions:
Is it possible to tune the training parameters so that it uses less memory? Speed of execution is not important at all in this case.
I also have a Jetson AGX Xavier developer kit. Would it be better to run the training on that? I am not sure whether all the packages are available for ARM or just x86.
My end goal is to be able to do pose estimation on the Jetson for new objects that are trained via 3D models and simulation.
If anyone has any hints or tips that would be great.
4 GB of memory may be too little to fit the entire training set into GPU memory. You could try reducing the number of training examples (it is likely trying to load all of them at once) or downscaling the training images so that they fit. We do not recommend training on Jetson.
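If it helps, here is a minimal sketch of batch-downscaling the training images to half resolution before training. The directory names, file pattern, and target size are assumptions for illustration, not part of the tutorial; adjust them to match your dataset layout.

```python
# Hypothetical sketch: downscale all training images to half resolution
# so the dataset fits in 4 GB of GPU memory. Paths and sizes are assumptions.
from pathlib import Path
from PIL import Image

SRC_DIR = Path("training_images")        # assumed location of 640x368 originals
DST_DIR = Path("training_images_small")  # downscaled copies go here
TARGET_SIZE = (320, 184)                 # PIL expects (width, height)

DST_DIR.mkdir(exist_ok=True)
for src in SRC_DIR.glob("*.png"):
    with Image.open(src) as im:
        # Bilinear filtering is a reasonable default for downscaling
        small = im.resize(TARGET_SIZE, Image.BILINEAR)
        small.save(DST_DIR / src.name)
```

Remember that any labels or bounding boxes expressed in pixel coordinates will also need to be scaled by the same factor.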
I tried reducing the number of training examples, but that didn’t seem to work for me, so I took your second suggestion and downsized the training images from 368x640 to 184x320, which seems to have worked, at least for the training. However, when I try to run the section “A. Inference on SIM images” I get the error “ERROR packages/ml/ColorCameraEncoderCuda.cpp@50: only downsampling is supported currently”, which I am struggling to track down. I have already changed the rows and cols parameters to match my new image size.
You mentioned that the Jetson should not be used for training. Could you please provide a bit more information on why that is, given that it has a powerful GPU? Is it related to the architecture (ARM vs. x86)?
Thanks again for your help,
Check that the “rows” and “cols” parameters of ColorCameraEncoderCuda have been correctly updated as well for your new image size. That error is raised when the encoder’s requested output size is larger than the incoming image, so “rows” and “cols” must be less than or equal to the resolution of the images being fed in.
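For reference, the change would go in the application’s JSON config following the usual Isaac SDK node/component layout. The node name below is a placeholder, and the values assume your downscaled 184x320 images; use the actual node name from your app file:

```json
{
  "config": {
    "detection_inference": {
      "ColorCameraEncoderCuda": {
        "rows": 184,
        "cols": 320
      }
    }
  }
}
```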
Jetsons are great for inference but can run into memory issues with large training sets, since the CPU and GPU share the same physical memory on those devices.