Hello, first of all, thanks to the staff for answering my questions.
I am puzzled about int8 QAT quantization.
I have already figured out how to quantize my model directly after training, using a calibration file (post-training quantization).
However, the accuracy dropped a lot, by about 20% mAP.
While researching how to improve quantization quality,
I learned about quantization-aware training (QAT).
There are a few things about it that are still not clear to me:
- Can I quantize the model directly after QAT retraining?
- Can I run QAT training in PyTorch, export the model to ONNX, and then build a TensorRT engine from it?
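To make my second question concrete: my understanding is that QAT inserts fake-quantize ops that simulate int8 rounding and clipping during training, so the exported ONNX carries QuantizeLinear/DequantizeLinear pairs that TensorRT can consume. Here is a minimal pure-Python sketch of that fake-quantize step (the symmetric per-tensor scheme and the scale values are just illustrative assumptions, not taken from any specific library):

```python
def fake_quant_int8(x: float, scale: float) -> float:
    """Simulate int8 quantize -> dequantize (symmetric, per-tensor).

    This is what a FakeQuantize node does during QAT: the value is
    rounded to an int8 grid and clipped to [-128, 127], then mapped
    back to float, so the network trains against quantization error.
    """
    q = round(x / scale)            # round to the nearest int8 step
    q = max(-128, min(127, q))      # clip to the int8 range
    return q * scale                # dequantize back to float


def calib_scale(max_abs: float) -> float:
    """Illustrative max-calibration: map the largest |value| to 127."""
    return max_abs / 127.0
```

So after QAT the weights are already "aware" of this rounding, which is why the int8 engine built from the exported model should lose much less accuracy than plain post-training calibration.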