Using TensorRT acceleration is recommended but not mandatory.
You can run inference with your model directly through the TensorFlow framework on the Nano; it is almost identical to the desktop version.
Here is a Docker image with TensorFlow pre-installed for your reference: https://ngc.nvidia.com/catalog/containers/nvidia:l4t-tensorflow
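If you want a quick sanity check that the container's TensorFlow build can see the Nano's integrated GPU, something like this works (a minimal sketch, assuming a TF 2.1+ build inside the container):

```python
import tensorflow as tf

# Quick sanity check inside the l4t-tensorflow container:
# the Nano's integrated GPU should show up as a visible device.
print(tf.__version__)
print(tf.config.list_physical_devices('GPU'))
```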
TF-TRT will automatically convert the supported portions of your model into TensorRT engines and fall back to the TensorFlow implementation for any unsupported layers.
So you will always get a runnable graph, but the amount of TensorRT acceleration will vary with the model architecture.
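For reference, a minimal TF-TRT conversion sketch might look like the following. The model paths are placeholders, and the exact keyword arguments vary slightly across TensorFlow releases, so treat this as an outline rather than a drop-in script:

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert a SavedModel with TF-TRT; unsupported ops stay in TensorFlow.
# 'my_saved_model' and 'my_saved_model_trt' are placeholder paths.
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode='FP16',                 # FP16 suits the Nano's GPU well
    max_workspace_size_bytes=1 << 28)      # modest 256 MB workspace for a 4 GB board
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='my_saved_model',
    conversion_params=params)
converter.convert()                        # supported subgraphs become TensorRT engine ops
converter.save('my_saved_model_trt')       # reload with tf.saved_model.load() for inference
```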
The main limitation is memory.
The Nano has only 4 GB of RAM, which is shared with the operating system.
So it won't be able to run inference on a very large or complicated model.
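As a rough pre-check, you can read how much memory is actually free on the Nano before loading a model; this sketch just parses /proc/meminfo, which is standard on Linux/L4T:

```python
# Rough check of free memory before loading a model (Linux-only).
def available_mem_mb():
    with open('/proc/meminfo') as f:
        for line in f:
            if line.startswith('MemAvailable:'):
                return int(line.split()[1]) // 1024  # value is in kB; convert to MB
    return None

print(f"Available memory: {available_mem_mb()} MB")
```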
Below are some workable models and benchmark results for your reference:
Thank you for your response. I have a few follow-up questions.
What is the difference in how supported and unsupported layers are handled? Are supported layers optimized for execution on the GPU? How exactly are unsupported layers executed: on the ARM processor?
Is there a way to estimate the memory requirement for my model? How much RAM does the official OS consume? My model is just under 3 MB in size, so I'm guessing the Nano won't have much trouble handling that.
Is there a way to estimate the power a model will need at runtime? I'm trying to decide whether the 2GB version's 5V/3A supply is enough or if I'll need the 4GB version's 5V/4A.