Using *_int8.etlt for fp32/fp16 Inference

I’m referring to PeopleNet released on NGC.

In v2.5, only ‘resnet34_peoplenet_int8.etlt’ is released, but in v2.3, both ‘resnet34_peoplenet.etlt’ and ‘resnet34_peoplenet_int8.etlt’ are released.

Does this mean that if I want to run inference in fp32/fp16 precision, I cannot use the ‘resnet34_peoplenet_int8.etlt’ file from v2.5?

The “resnet34_peoplenet_int8.etlt” is a quantized model.
See PeopleNet | NVIDIA NGC: for the quantized INT8 model, a third training phase, quantization-aware training (QAT), is carried out. Regularization is not included in the second and third phases. The quantized models share the same structure as the pruned model, but they have been trained with Quantization Aware Training and are intended for INT8 deployment.

So, “resnet34_peoplenet_int8.etlt” is intended for INT8 deployment.

Thank you for highlighting this.

I have also seen two tao export commands for exporting an etlt file in detectnet_v2.ipynb, section 10 (Model Export):

tao detectnet_v2 export \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet50_detector_pruned.tlt \
    -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet50_detector.etlt \
    -k $KEY

tao detectnet_v2 export \
    -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet50_detector_pruned.tlt \
    -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet50_detector.etlt \
    -k $KEY \
    --cal_data_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor \
    --data_type int8

I understand that the second command will generate the etlt file along with an int8 calibration file. But as for the etlt file itself, are the etlt files generated by these two commands the same? In other words, can the etlt file generated by the second command be used for fp32/fp16 deployment, and vice versa?

Yes, the etlt files generated by these two commands are the same.
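
For anyone comparing the two export commands quoted earlier in the thread, here is a minimal sketch that prints them side by side so the difference is easy to see. It only echoes the commands and does not run TAO. The helper name print_export_cmd is hypothetical (it is not part of TAO); the paths and flags are copied from the notebook commands above.

```shell
#!/bin/sh
# Sketch only: prints the export command for a given precision, does not run it.
# print_export_cmd is a hypothetical helper, not a TAO command.
# Single quotes keep $USER_EXPERIMENT_DIR and $KEY unexpanded in the printed text.

print_export_cmd() {
    precision="$1"
    cmd='tao detectnet_v2 export'
    cmd="$cmd"' -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet50_detector_pruned.tlt'
    cmd="$cmd"' -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet50_detector.etlt'
    cmd="$cmd"' -k $KEY'
    # The int8 variant only appends calibration-related flags;
    # the -m, -o and -k arguments are identical in both commands.
    if [ "$precision" = "int8" ]; then
        cmd="$cmd"' --cal_data_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor'
        cmd="$cmd"' --data_type int8'
    fi
    printf '%s\n' "$cmd"
}

print_export_cmd fp32
print_export_cmd int8
```

The int8 command is the first command plus the two calibration flags, which fits the answer above: the extra flags concern the calibration file, while the -o target is the same etlt in both cases.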

