Hi! I have a large NN-inference program that runs on Jetson. Unfortunately there’s not enough memory for it to operate and I need to optimize it somehow. Are there well-known techniques and advice to reduce memory consumption on a Jetson device? Thanks!
More info: TensorRT, Cuda 10.2, L4T 32.4.4, Tegra194
How much memory are you missing? Which Jetson device? (They each have different amounts of RAM.)
If you’re missing a dozen megabytes, then turning off some daemons you might not need (Wi-Fi or Bluetooth or whatever) might help, but if you’re off by gigabytes, then chances are there isn’t enough left to remove – the Ubuntu overhead really isn’t all that large, even once you include all the things you DO need that it provides.
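For example (assuming a stock JetPack/L4T image with the desktop enabled – exact service names and savings vary by image), booting to a text console instead of the GUI is a common way to claw back a couple hundred megabytes:

```shell
# Free RAM by not starting the desktop; revert with graphical.target.
sudo systemctl set-default multi-user.target
sudo reboot

# Smaller wins: disable individual services you don't need, e.g.
sudo systemctl disable bluetooth
```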
If you’re asking how to make a model smaller, then reducing the layer sizes and the number of classes the model detects is typically how one does that. You can also look into the approaches MobileNet took to create a smaller network, and maybe apply something like that to your own model.
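As a rough illustration of why the MobileNet approach helps (the layer sizes here are made up), factoring a standard convolution into a depthwise plus a pointwise step cuts the weight count dramatically:

```python
# Weight count of a standard 3x3 convolution vs. the depthwise-separable
# factorization used by MobileNet (weights only, biases ignored).
# The channel counts below are hypothetical, just for illustration.
k, c_in, c_out = 3, 64, 128

standard = k * k * c_in * c_out   # one dense k x k convolution
depthwise = k * k * c_in          # one k x k filter per input channel
pointwise = c_in * c_out          # 1x1 convolution to mix channels
separable = depthwise + pointwise

print(standard)                        # 73728
print(separable)                       # 8768
print(round(standard / separable, 1))  # 8.4x fewer weights
```

Activations dominate memory for some networks, so the saving at runtime is smaller than the weight ratio suggests, but it goes in the right direction.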
Thank you for the answer!
I’m missing around 200–400 MB, out of 8 GB total on the device. The OS has already been slimmed down.
The application was written for the regular CPU + discrete GPU scheme and thus incurs a lot of copying back and forth. Maybe I should dig in that direction…
That sounds quite fruitful! It might also make for a snappier application overall. Fewer copies are almost always better :-)
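One reason this pays off on Jetson in particular: the CPU and GPU share the same physical DRAM, so explicit host↔device copies cost time *and* double the buffer footprint. Mapped "zero-copy" allocations (in CUDA, `cudaHostAlloc` with `cudaHostAllocMapped`, or unified memory) let both sides use one buffer. A toy sketch of the pattern change, with plain numpy arrays standing in for host and device memory:

```python
import numpy as np

def infer(x):
    # Stand-in for the GPU inference kernel.
    return x * 2.0

# Discrete-GPU pattern: separate host and device buffers, two copies per frame.
host_in = np.ones(4, dtype=np.float32)
dev_in = host_in.copy()     # cudaMemcpy host -> device
dev_out = infer(dev_in)
host_out = dev_out.copy()   # cudaMemcpy device -> host

# Jetson zero-copy pattern: one shared buffer, no copies, no duplicate
# allocation (in real code this buffer would come from cudaHostAlloc).
shared = np.ones(4, dtype=np.float32)
out = infer(shared)         # GPU reads the same physical memory

assert np.array_equal(host_out, out)  # same result, half the buffers
```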
Which model do you use?
In our latest TensorRT v8.0, we provide an option to run inference without loading cuDNN.
This can save a lot of memory, but only a limited set of layers is currently supported.
It’s recommended to give it a try. The package is integrated in JetPack 4.6.
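Presumably the option being described is TensorRT's tactic-sources setting, which controls which tactic libraries the builder may use. A minimal sketch, assuming the TensorRT 8.x Python API (engine/network construction omitted):

```python
import tensorrt as trt  # requires TensorRT >= 8.0 (JetPack 4.6)

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Enable only the cuBLAS/cuBLASLt tactic sources. With cuDNN excluded,
# the runtime never loads libcudnn, saving its memory footprint; layers
# that only have a cuDNN implementation will fail to build.
config.set_tactic_sources(
    (1 << int(trt.TacticSource.CUBLAS)) | (1 << int(trt.TacticSource.CUBLAS_LT))
)
```

With `trtexec` the equivalent should be the `--tacticSources=-CUDNN` flag.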
We use TX2 and Nano, and optimization is needed for both. cuDNN is used for inference. Thanks!
Do you use TensorRT for inference or other frameworks (e.g. TensorFlow)?
If TensorRT is used, you can try running the model with that cuDNN-free option.
The memory usage of MNIST decreases from 832.742 MB to 527.469 MB on JetPack 4.6.
Thank you! Does it reduce performance?
We don’t see a performance regression on the MNIST model.
Please note that this feature was only introduced in TensorRT v8.0, so the set of supported layers is still limited.