I have a UFF-formatted model exported from TensorFlow and want to parse it and run inference on Xavier using TensorRT. Can I use builder->setFp16Mode(true) to perform post-training quantization?
Hi,
Yes. That call tells the builder to compile the FP32 model with FP16 precision enabled.
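A minimal sketch of the build flow, assuming the TensorRT 5/6-era C++ API available on Xavier at the time (setFp16Mode was later deprecated in favor of BuilderConfig's BuilderFlag::kFP16). The tensor names, dimensions, and file path below are placeholders; match them to your own model.

```cpp
#include <cstdio>
#include <NvInfer.h>
#include <NvUffParser.h>

using namespace nvinfer1;
using namespace nvuffparser;

// Simple logger required by the TensorRT builder.
class Logger : public ILogger {
    void log(Severity severity, const char* msg) override {
        if (severity <= Severity::kWARNING) std::printf("%s\n", msg);
    }
} gLogger;

ICudaEngine* buildFp16Engine(const char* uffFile) {
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();
    IUffParser* parser = createUffParser();

    // Placeholder input/output tensor names and shape -- use yours.
    parser->registerInput("input", Dims3(3, 224, 224), UffInputOrder::kNCHW);
    parser->registerOutput("output");
    parser->parse(uffFile, *network, DataType::kFLOAT);

    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1 << 28);
    builder->setFp16Mode(true);  // build the FP32 network with FP16 kernels

    ICudaEngine* engine = builder->buildCudaEngine(*network);
    parser->destroy();
    network->destroy();
    builder->destroy();
    return engine;
}
```

Note that FP16 mode is reduced-precision inference rather than quantization in the strict sense; true post-training quantization to INT8 requires a calibrator in addition to the INT8 build flag.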
Details of the reduced-precision and quantization process can be found here:
https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9659-inference-at-reduced-precision-on-gpus.pdf
Thanks.