I have been training two models with TAO Toolkit, one with efficientnetB4 and one with efficientnetB5.
After following the steps and pruning both models, I generated a TensorRT engine for each one with DeepStream and loaded it on the Jetson Nano.
For the B4 model the speed is 14 fps and for the B5 model it is 9 fps. This seems very slow considering that the models have been pruned.
Do you know if there is any benchmark to validate these values? If these are the expected values, is there any way to optimize the models and improve the frame rate?
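In case it's useful, this is roughly how I could double-check the raw engine throughput outside of DeepStream with trtexec. It's only a sketch: the engine paths below are placeholders from my setup, and it assumes trtexec is at the usual JetPack location.

```python
# Sketch: time the serialized TensorRT engines outside DeepStream with trtexec.
# Engine paths are placeholders for my setup; adjust as needed.
import subprocess

ENGINES = {
    "efficientnet_b4": "/opt/models/effnet_b4.engine",  # hypothetical path
    "efficientnet_b5": "/opt/models/effnet_b5.engine",  # hypothetical path
}

for name, engine in ENGINES.items():
    # trtexec prints throughput and latency figures at the end of its log.
    result = subprocess.run(
        ["/usr/src/tensorrt/bin/trtexec",
         f"--loadEngine={engine}",
         "--iterations=200",
         "--warmUp=500"],
        capture_output=True, text=True, check=True,
    )
    print(f"=== {name} ===")
    # Keep only the summary lines with the timing numbers.
    for line in result.stdout.splitlines():
        if "Throughput" in line or "Latency" in line:
            print(line)
```

That would at least tell me whether the bottleneck is the engine itself or the rest of the DeepStream pipeline.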
Yes, I’m using both networks in classification mode.
We’re considering several options and we thought efficientnet would be the best for us. Right now the model I’ve used at 30 fps is B1, but I’d like to know whether it would be possible to use more complex architectures like B4 or B5 at a similar fps on the Nano.
Yes. I pruned both models with a threshold of 0.6 and then retrained them following your tutorials, with a few fewer epochs. As I said, with B1 I’m getting 30 fps and the pruning threshold is the same.
You can try several thresholds to prune more aggressively and then run retraining.
Usually the threshold is not the same for different backbones. After pruning, you can find the pruning ratio in the log. You can also find the new number of trainable parameters in the retraining log.
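If it helps, a loop like the one below could automate the threshold sweep. This is only a sketch: it assumes the tao launcher is installed and the classification prune entry point is used as in the TAO docs; the model path, output directory, key and threshold values are placeholders for your setup.

```python
# Sketch: prune the same .tlt model at several thresholds, then retrain each
# result before comparing accuracy and fps. Paths and key are placeholders.
import subprocess

MODEL = "/workspace/output/weights/efficientnet_b5.tlt"  # hypothetical path
OUTDIR = "/workspace/output_pruned"                      # hypothetical path
KEY = "nvidia_tlt"                                       # your encoding key

for pth in (0.5, 0.6, 0.68, 0.7):
    out = f"{OUTDIR}/efficientnet_b5_pruned_{pth}.tlt"
    subprocess.run(
        ["tao", "classification", "prune",
         "-m", MODEL,
         "-o", out,
         "-k", KEY,
         "-pth", str(pth),
         "-eq", "union"],
        check=True,
    )
    # The pruning ratio is reported in the command's log; each pruned model
    # still needs retraining before its accuracy/fps are meaningful.
    print(f"pruned with threshold {pth} -> {out}")
```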
Hi again and sorry for the delay. I’ve tried several thresholds on the B5 model, and with a 0.7 threshold the size of the .etlt file is about 6 MB. When I put it into the DeepStream application the frame rate is 15 fps, which is much better than the 9 fps we had with the 0.6 threshold. My question now is:
with a file of only 6 MB, is it normal to get only 15 fps? Isn’t that a really small file?