Memory Usage in PyTorch

Hi, I’ve been trying to run ResNet-50 through PyTorch and feed my IMX219 camera into it as a little hello-world project, but it seems that on the 2GB Nano, PyTorch is effectively unusable with CUDA. As soon as the CUDA context is initialized (something as simple as `x = torch.ones((1,)).cuda()`), memory usage is maxed out and the swap is ~30% full. This causes issues with the camera: it starts timing out, and then I can’t retrieve images from it because system performance degrades so much. I also tried initializing the camera after CUDA, but it still ended up timing out.

Has there been any solution for this so far? I don’t plan on training on the Jetson, but without being able to at least do inference on the GPU, I might as well be using a Raspberry Pi and an auxiliary server.

If there’s a pipeline to turn PyTorch models into something with a small memory footprint, that’d work too. I’ve been trying to get TensorRT working, but I’m assuming that’s going to end up with the same problem?

TensorRT has more optimized memory usage than PyTorch, and you can run the camera + TensorRT model on the Nano 2GB.

You can check out the jetson-inference project to deploy your PyTorch classification model to TensorRT and run the camera on it. Namely, see this part of the tutorial, where the classifier is exported to ONNX and then run with TensorRT:

https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-collect.md#training-your-model

You may need to tweak the `onnx_export.py` script, which is currently set up to use checkpoints that were trained with the code included with the tutorial. You would just want to remove/replace the code that uses the additional variables saved into those checkpoints (like `checkpoint['resolution']`).

After some tinkering I can confirm TensorRT reduces the memory footprint by a lot! Looks like ResNet-50, plus whatever overhead comes from nvargus, fits just barely inside the 2GB limit.

The only weird thing left is that importing torchvision seems to initialize all of PyTorch’s CUDA state, so I have to do the image transformations manually with OpenCV and numpy, but that’s doable at least. Thanks!
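For reference, the usual torchvision preprocessing (`ToTensor()` + `Normalize()` with ImageNet statistics) can be reproduced in plain NumPy, so neither torchvision nor PyTorch needs to be imported on the camera side. A sketch (the mean/std constants are the standard ImageNet ones that pretrained torchvision classifiers expect):

```python
import numpy as np

# Standard ImageNet normalization constants (what torchvision's
# pretrained classifiers are trained with).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image_hwc_uint8):
    """Replicate ToTensor() + Normalize() without torchvision:
    uint8 HxWx3 RGB -> float32 1x3xHxW, scaled to [0, 1] then normalized."""
    x = image_hwc_uint8.astype(np.float32) / 255.0
    x = (x - IMAGENET_MEAN) / IMAGENET_STD
    x = np.transpose(x, (2, 0, 1))   # HWC -> CHW
    return x[np.newaxis, ...]        # add batch dimension

# Example: a fake 224x224 RGB frame. In practice this would be a camera
# frame resized with cv2.resize() and converted BGR -> RGB first.
frame = np.zeros((224, 224, 3), dtype=np.uint8)
batch = preprocess(frame)
```

The output array can be handed straight to a TensorRT input binding (after making it contiguous with `np.ascontiguousarray` if needed).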


OK cool, glad you got your model working with TensorRT and the camera!

I have some basic image manipulation routines implemented in CUDA here that are exposed in Python through jetson.utils: https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-image.md

Otherwise, you are correct to do the image operations with OpenCV or numpy.

