Export model with QAT enabled

Hi
When exporting a model trained with QAT enabled, do I need to set cal_image_dir in the export task command?

Make sure you have trained a QAT model. If so, cal_image_dir is not needed.

When exporting a model that was trained with QAT enabled, the tensor scale factors to calibrate the activations are peeled out of the model and serialized to a TensorRT-readable cache file defined by the cal_cache_file argument.
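For illustration, exporting a QAT-trained model to INT8 might look roughly like the command below. This is a sketch only: the detectnet_v2 network name, the paths, and $KEY are placeholders for your own setup, not values from this thread.

    # Sketch only: export a QAT-trained model to INT8; the in-model scale
    # factors are written to the TensorRT-readable cache given by --cal_cache_file
    tlt-export detectnet_v2 \
        -m /workspace/experiments/qat_model.tlt \
        -k $KEY \
        -o /workspace/export/qat_model.etlt \
        --data_type int8 \
        --cal_cache_file /workspace/export/cal.bin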


Sorry, I read the documentation but I did not understand the difference between --cal_cache_file and --cal_data_file.
Also I have 2 more questions:

1. When we use --engine_file, does it mean that the TensorRT engine is generated during export? I ask because tlt-converter also generates a TRT engine file.

2. When we set --force_ptq, do we also need --cal_image_dir, --cal_cache_file and --cal_data_file?

--cal_cache_file is the calibration cache file, i.e. cal.bin. This is the output we need to use for deployment.
--cal_data_file is the calibration tensorfile generated from the training data. It usually looks like calibration.tensor, and it is usually not needed during deployment.

Some networks can also generate a TensorRT engine when running "export". For all networks, tlt-converter can generate the TRT engine.
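For example, generating an INT8 engine from the exported .etlt and the cal.bin with tlt-converter might look roughly like this. The input dimensions and output node names shown are detectnet_v2-style placeholders; substitute the ones for your own network.

    # Sketch only: build an INT8 TensorRT engine from the exported model,
    # using cal.bin as the calibration cache
    tlt-converter /workspace/export/qat_model.etlt \
        -k $KEY \
        -d 3,384,1248 \
        -o output_cov/Sigmoid,output_bbox/BiasAdd \
        -c /workspace/export/cal.bin \
        -t int8 \
        -e /workspace/export/qat_model.engine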

See Improving INT8 Accuracy Using Quantization Aware Training and the NVIDIA TAO Toolkit | NVIDIA Technical Blog

The model generated using the tlt-export command used earlier is not compatible for deployment in INT8 mode on the DLA. To deploy this model with the DLA, you must generate the calibration cache file using PTQ on the QAT-trained .tlt model file. You can do this by setting the force_ptq flag on the command line when running tlt-export.

You need to set --cal_cache_file; this produces the cal.bin that we need.
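As a rough sketch, re-running export with PTQ forced could look like the following. Values are illustrative; any additional calibration options (--cal_image_dir, --cal_data_file, --batches) follow your usual PTQ setup and are not prescribed by this thread.

    # Sketch only: force TensorRT PTQ on the QAT-trained model so the
    # resulting cal.bin can be used for DLA deployment
    tlt-export detectnet_v2 \
        -m /workspace/experiments/qat_model.tlt \
        -k $KEY \
        -o /workspace/export/qat_model.etlt \
        --data_type int8 \
        --force_ptq \
        --cal_cache_file /workspace/export/cal.bin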


Thank you so much

Hi @Morganh,
I have 1 more question about this topic.

1. When exporting a model that was trained without QAT, and we set --cal_image_dir, --cal_data_file and --cal_cache_file, and also generate the TRT engine, is post-training quantization actually done?
What is the difference compared to setting --force_ptq in export and generating the engine?

Thank you.

  • Yes.
  • force_ptq is a flag to force post-training quantization using TensorRT for a QAT-trained model. This is required if the inference platform is a Jetson with a DLA. See the sketch below.
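For comparison, a non-QAT INT8 export, where PTQ calibration happens during export itself, might look roughly like this. The network, paths, and batch counts are placeholders, not values from this thread.

    # Sketch only: non-QAT model, so TensorRT PTQ calibration runs during export;
    # images from --cal_image_dir are used to produce calibration.tensor and cal.bin,
    # and --engine_file also writes out a TensorRT engine
    tlt-export detectnet_v2 \
        -m /workspace/experiments/model.tlt \
        -k $KEY \
        -o /workspace/export/model.etlt \
        --data_type int8 \
        --cal_image_dir /workspace/data/calibration_images \
        --batches 10 \
        --batch_size 8 \
        --cal_data_file /workspace/export/calibration.tensor \
        --cal_cache_file /workspace/export/cal.bin \
        --engine_file /workspace/export/model.engine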
