A new version of PeopleNet (v2.1) was recently released.
Q1- What is the difference between pruned_v2.1 and quantized_v2.1? I know that quantized_v2.1 is the INT8 deployment model and pruned_v2.1 is FP32, but is the quantized_v2.1 model quantized to INT8 only by tlt-converter?
Q2- What is the main difference between pruned_v2.0 and pruned_v2.1? Is the only difference the training dataset (random rotation)?
Refer to the model description on NVIDIA NGC:
- unpruned_v2.1 - ResNet34 based pre-trained model. Intended for training
- pruned_v2.1 - ResNet34 floating point deployment model.
- quantized_v2.1 - ResNet34 INT8 deployment model. Contains calibration caches for GPU and DLA. The DLA cache is required when running inference on the DLA of a Jetson AGX Xavier or Xavier NX.
For more details, please see the PeopleNet — Transfer Learning Toolkit 3.0 documentation.
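To make the difference between the deployment variants concrete, here is a minimal Python sketch that assembles a tlt-converter command line for each one. It only builds the argument list and does not run the tool. The file names in `MODEL_FILES` are hypothetical placeholders, and the key, input dimensions, and output node names are assumptions to verify against the NGC model card. The flags used (`-k`, `-d`, `-o`, `-t`, `-c`, `-e`) are the usual tlt-converter options.

```python
from typing import Dict, List

# Hypothetical file names: check the files actually shipped with each NGC version.
MODEL_FILES: Dict[str, Dict[str, str]] = {
    "pruned": {"etlt": "peoplenet_pruned.etlt", "precision": "fp32"},
    "quantized": {
        "etlt": "peoplenet_pruned_int8.etlt",
        "precision": "int8",
        "cache_gpu": "peoplenet_int8_gpu.txt",
        "cache_dla": "peoplenet_int8_dla.txt",
    },
}


def build_converter_args(
    variant: str,
    engine_path: str,
    key: str = "tlt_encode",        # assumed model key, verify on the model card
    input_dims: str = "3,544,960",  # assumed C,H,W input size
    use_dla_cache: bool = False,
) -> List[str]:
    """Return a tlt-converter argument list for a deployment variant."""
    if variant not in MODEL_FILES:
        # unpruned_v2.1 is intended for training, not for deployment
        raise ValueError(f"{variant!r} is not a deployment variant")
    files = MODEL_FILES[variant]
    args = [
        "tlt-converter",
        "-k", key,
        "-d", input_dims,
        # assumed DetectNet_v2 output node names
        "-o", "output_cov/Sigmoid,output_bbox/BiasAdd",
        "-t", files["precision"],
        "-e", engine_path,
    ]
    if files["precision"] == "int8":
        # INT8 engines need the calibration cache shipped with the model
        cache = files["cache_dla"] if use_dla_cache else files["cache_gpu"]
        args += ["-c", cache]
    args.append(files["etlt"])
    return args


if __name__ == "__main__":
    print(" ".join(build_converter_args("quantized", "peoplenet_int8.engine")))
```

In this sketch the visible difference is that the quantized variant passes `-t int8` together with a calibration cache, while the pruned variant is converted in floating point with no cache.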
To run the pruned, quantized INT8 model on the GPU (MX130), changes have to be made in the source code.