My query is: what pruning threshold and other parameters were used for the hosted pruned model? Depending on parameters like the pruning threshold, the architecture changes, which might affect accuracy.
Does anyone have any idea? Please share the pruning parameters used, or is this information protected by Nvidia?
It is not protected. The blog https://developer.nvidia.com/blog/training-custom-pretrained-models-using-tlt/ actually shares some pruning parameters.
Different backbones (architectures) may need different pruning parameters.
End users should experiment with this hyperparameter to find the sweet spot between pruning and model accuracy.
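For reference, the threshold is passed to the tlt-prune tool via -pth. A minimal sketch of how such an experiment might look (TLT 2.x-style syntax; the model paths and key are placeholders, and the exact flags can be verified with tlt-prune -h):

```
# Prune a trained model; -pth controls how aggressively channels are removed.
# Lower values prune less (larger, usually more accurate model);
# higher values prune more (smaller, faster model).
tlt-prune -m $USER_EXPERIMENT_DIR/unpruned/model.tlt \
          -o $USER_EXPERIMENT_DIR/pruned/model_pruned.tlt \
          -eq union \
          -pth 0.005 \
          -k $KEY
```

After pruning, the model is typically retrained to recover accuracy, and the pth value is swept until the speed/accuracy trade-off is acceptable.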
I understand it's a hyperparameter that needs to be chosen by the user. Let me put it this way: the accuracy and performance of the model depend on its number of operations, which is directly determined by the architecture of the pruned model.
Nvidia is able to achieve 10 fps on a Jetson Nano with some chosen pruning threshold, and 83% accuracy on Nvidia's internal dataset.
I am trying to match Nvidia's performance and accuracy. In one of the earlier questions (Accelerating Peoplnet with tlt for jetson nano - #13 by Morganh), you gave instructions to achieve 10 fps on the Nano with pth=0.005. I tried to reproduce the same architecture, but got only 7.5 fps (with jetson_clocks and performance mode enabled).
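For completeness, this is roughly how I set up and measure on the Nano (the engine filename is a placeholder for the engine built from the pruned model):

```
# Max out the Nano's clocks before measuring
sudo nvpmodel -m 0      # MAXN performance mode
sudo jetson_clocks

# Measure raw inference throughput of the TensorRT engine
/usr/src/tensorrt/bin/trtexec --loadEngine=peoplenet_pruned.engine
```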
I have one more query: is it possible to transfer the pruned weights to another framework like Keras for training?