Hi
When exporting a model trained with QAT enabled, do I need to set cal_image_dir to export task command?
Make sure you have trained a QAT model. If you have, then cal_image_dir is not needed.
When exporting a model that was trained with QAT enabled, the tensor scale factors used to calibrate the activations are peeled out of the model and serialized into a TensorRT-readable cache file, whose path is set by the cal_cache_file argument.
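For concreteness, a QAT export might look like the sketch below. The paths, the $KEY encryption key, and the detectnet_v2 network name are placeholders; adjust them for your own model. Note that no calibration image directory is passed, since the QAT scales already live in the model:

```shell
# Sketch: export a QAT-trained DetectNet_v2 model to INT8.
# The per-tensor scales learned during QAT are serialized into the
# cache file named by --cal_cache_file; no calibration images needed.
tlt-export detectnet_v2 \
  -m /workspace/models/model_qat.tlt \
  -k "$KEY" \
  -o /workspace/export/model.etlt \
  --data_type int8 \
  --cal_cache_file /workspace/export/cal.bin
```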
Sorry, I read the documentation but I did not understand the difference between --cal_cache_file and --cal_data_file.
Also I have 2 more questions:
1- When we use --engine_file, does that mean a TensorRT engine is generated during export? I ask because tlt-converter also generates a TRT engine file.
2- When we set --force_ptq, do we also need --cal_image_dir, --cal_cache_file, and --cal_data_file?
--cal_cache_file is the calibration cache file, i.e. cal.bin. This is the output we need to use for deployment.
--cal_data_file is the calibration tensorfile generated from the training data; it usually looks like calibration.tensor. This file is usually not needed during deployment.
Some networks can also generate a TRT engine when running “export”. For all networks, tlt-converter can generate the TRT engine.
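As a sketch of the converter path: the calibration cache produced at export time is what tlt-converter consumes to build an INT8 engine. The input dimensions and output node names below are assumptions for a DetectNet_v2 model; check the documentation for your network:

```shell
# Sketch: build a TensorRT INT8 engine from the exported .etlt model
# using the calibration cache (cal.bin) produced during export.
tlt-converter /workspace/export/model.etlt \
  -k "$KEY" \
  -d 3,384,1248 \
  -o output_bbox/BiasAdd,output_cov/Sigmoid \
  -c /workspace/export/cal.bin \
  -t int8 \
  -e /workspace/export/trt.engine
```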
The model generated using the tlt-export command used earlier is not compatible with INT8-mode deployment on the DLA. To deploy this model with the DLA, you must generate the calibration cache file using PTQ on the QAT-trained .tlt model file. You can do this by setting the force_ptq flag on the command line when running tlt-export.
You still need to set --cal_cache_file; this produces the cal.bin that we need.
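A hedged sketch of that DLA-oriented export follows. Again the paths, key, batch count, and network name are placeholders; with --force_ptq set, TensorRT recalibrates from sample images instead of relying solely on the learned QAT scales:

```shell
# Sketch: export a QAT-trained model for a Jetson DLA by forcing
# TensorRT post-training quantization. Calibration images are read
# from --cal_image_dir to build the cal.bin cache.
tlt-export detectnet_v2 \
  -m /workspace/models/model_qat.tlt \
  -k "$KEY" \
  -o /workspace/export/model.etlt \
  --data_type int8 \
  --force_ptq \
  --cal_image_dir /workspace/data/calibration_images \
  --cal_data_file /workspace/export/calibration.tensor \
  --cal_cache_file /workspace/export/cal.bin \
  --batches 10
```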
Thank you so much
Hi Morganh. @Morganh
I have 1 more question about this topic.
1- When we export models that were trained without QAT, and we set --cal_image_dir, --cal_data_file, and --cal_cache_file, and also generate trt.engine, is post-training quantization actually performed?
What is the difference compared with setting --force_ptq during export and then generating the engine?
Thank you.
- Yes.
- force_ptq is a flag that forces post-training quantization, using TensorRT, on a QAT-trained model. It is required when the inference platform is a Jetson with a DLA.