Is that all I need to do before loading it in the jnano?
If my model was successfully converted into a TensorRT inference graph does that mean I should be able to run it on jnano without any issues?
To load and infer using the model on jnano do I just boot jnano, open a jupyter notebook, start a session, load the graph and run it?
If I have unsupported layers in my model, at what point am I notified about it: when creating the inference graph, or only when running inference on jnano?
If I do have unsupported layers, is there a way to split execution between the NVIDIA GPU and the ARM processor?
Using TensorRT acceleration is recommended but not mandatory.
You can run inference on your model directly with the TensorFlow framework on the Nano. It is almost identical to the desktop version.
Here are some Docker images with TensorFlow pre-installed for your reference:
TF-TRT will automatically convert your model into TensorRT and fall back unsupported layers to the TensorFlow implementation.
So you will always get a runnable graph, but how much of it TensorRT accelerates depends on the model architecture.
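For reference, a minimal TF-TRT conversion in TensorFlow 2.x looks roughly like the sketch below. The SavedModel paths are placeholders, and the exact converter options vary a little between TensorFlow releases:

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# 'my_saved_model' is a placeholder for your exported SavedModel directory.
converter = trt.TrtGraphConverterV2(input_saved_model_dir='my_saved_model')

# Supported subgraphs are replaced with TRTEngineOp nodes;
# anything TensorRT cannot handle stays as regular TensorFlow ops.
converter.convert()
converter.save('my_saved_model_trt')
```

The output is still an ordinary SavedModel, so on the Nano you can load it with `tf.saved_model.load('my_saved_model_trt')` and call its `serving_default` signature from a Jupyter notebook or a plain Python script.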
The only limitation is the memory size.
The Nano only has 4 GB of memory, which has to be shared with the operating system.
So it won't be able to run inference on a large or complicated model.
Below are some workable models and benchmark results for your reference:
Thank you for your response. I have a few follow-up questions.
What's the difference between how supported and unsupported layers are handled? Are supported layers optimized for execution on the GPU? How exactly are unsupported layers executed? On the ARM processor?
Is there a way to estimate the memory requirement for my model? How much RAM does the official OS consume? My model is just under 3 MB in size, so I'm guessing the Nano won't have much trouble handling that.
Is there a way to estimate the power a model will need to run? I’m trying to decide if the 2GB version’s 5V-3A is enough or if I’ll need the 4GB version’s 5V-4A.
1
You can find the TensorRT support matrix below:
For TF-TRT, unsupported layers are handled by the TensorFlow GPU implementation.
For pure TensorRT, you will need to write the unsupported layers as plugins.
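If you want a rough idea of how much of your graph actually ended up in TensorRT, one way (a sketch, assuming the hypothetical converted SavedModel from earlier) is to count the `TRTEngineOp` nodes against the remaining TensorFlow ops:

```python
import collections

import tensorflow as tf

# Placeholder path to a TF-TRT converted SavedModel.
loaded = tf.saved_model.load('my_saved_model_trt')
graph_def = loaded.signatures['serving_default'].graph.as_graph_def()

# Collect nodes from the top-level graph and from nested functions.
nodes = list(graph_def.node)
for func in graph_def.library.function:
    nodes.extend(func.node_def)

op_counts = collections.Counter(node.op for node in nodes)
print('TRTEngineOp segments:', op_counts['TRTEngineOp'])
print('Ops left to TensorFlow:',
      sum(count for op, count in op_counts.items() if op != 'TRTEngineOp'))
```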
2
It’s recommended to monitor the required memory in a desktop environment first.
The model file size often doesn’t reflect the memory actually required at runtime.
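As a rough sketch of that kind of check on a desktop GPU (placeholder model path and dummy input shape; `get_memory_info` needs TensorFlow 2.5 or newer):

```python
import os

import psutil
import tensorflow as tf

process = psutil.Process(os.getpid())

# Placeholder path; load the model you plan to deploy.
model = tf.saved_model.load('my_saved_model')
infer = model.signatures['serving_default']

# Run one inference with a dummy input shaped like your real data.
dummy = tf.random.uniform([1, 224, 224, 3])
_ = infer(dummy)

print(f'Host RSS: {process.memory_info().rss / 2**20:.0f} MiB')
# Peak device memory in bytes. Note the Nano's GPU and CPU share the same 4 GB pool.
print(tf.config.experimental.get_memory_info('GPU:0'))
```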
3
For TensorRT users, the 4GB Nano is recommended.
Since it takes at least 600 MB just to load the required libraries, the 2GB version will limit you to a much smaller DNN model.
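If you want to check how much of the Nano's RAM is already taken by the OS and desktop, a quick sketch with psutil (equivalent to running free -m in a terminal) is:

```python
import psutil

mem = psutil.virtual_memory()
print(f'Total     : {mem.total / 2**20:.0f} MiB')
print(f'Used      : {mem.used / 2**20:.0f} MiB')
print(f'Available : {mem.available / 2**20:.0f} MiB')
```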
When you say it takes at least 600 MB to load the required libraries, do you mean that’s what the OS takes? If you do have a Jetson Nano, is it possible for you to check the RAM the OS occupies?