Export model with QAT enabled

--cal_cache_file specifies the calibration cache file, e.g. cal.bin. This is the output needed for deployment.
--cal_data_file specifies the calibration tensorfile generated from the training data, usually named calibration.tensor. This file is usually not needed during deployment.
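As a rough sketch, an INT8 export with these two flags might look like the following. The network name, file paths, and $KEY value are placeholders, not from this post; check the TLT export documentation for your network's exact options.

```shell
# Hypothetical example: export a QAT-trained model to INT8.
# detectnet_v2, the paths, and $KEY are placeholders.
tlt-export detectnet_v2 \
  -m /workspace/model_qat.tlt \
  -k $KEY \
  -o /workspace/model.etlt \
  --data_type int8 \
  --cal_cache_file /workspace/cal.bin \
  --cal_data_file /workspace/calibration.tensor
```

After this runs, cal.bin is the file to carry forward to deployment; calibration.tensor normally stays on the training side.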

Some networks can also generate a TensorRT engine when running “export”. For all networks, the “tlt-converter” can generate the TensorRT engine.

See Improving INT8 Accuracy Using Quantization Aware Training and the NVIDIA TAO Toolkit | NVIDIA Technical Blog

The model generated using the tlt-export command used earlier is not compatible with INT8 deployment on the DLA. To deploy this model on the DLA, you must generate the calibration cache file using PTQ on the QAT-trained .tlt model file. You can do this by setting the force_ptq flag on the command line when running tlt-export.

You still need to set --cal_cache_file; this produces the cal.bin file required for deployment.
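For the DLA case, a sketch would be the same export command with force_ptq added. Again, the network name, paths, and $KEY are placeholders; only the force_ptq and --cal_cache_file usage comes from the steps above.

```shell
# Hypothetical DLA-targeted export: PTQ calibration is forced
# on the QAT-trained .tlt model so the resulting cal.bin works on the DLA.
tlt-export detectnet_v2 \
  -m /workspace/model_qat.tlt \
  -k $KEY \
  -o /workspace/model.etlt \
  --data_type int8 \
  --force_ptq \
  --cal_cache_file /workspace/cal.bin
```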
