I have a UFF-formatted model exported from TensorFlow and want to parse it and run inference on Xavier using TensorRT. Can I use builder->setFp16Mode(true) to perform post-training quantization?
Hi,
Yes. That call tells the builder to compile the FP32 model with FP16 precision enabled.
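A minimal sketch of the build flow, assuming the TensorRT 5/6-era C++ API available on Xavier at the time (setFp16Mode was later deprecated in favor of BuilderConfig's BuilderFlag::kFP16). The tensor names, dimensions, and file path below are placeholders; match them to your own model.

```cpp
#include <cstdio>
#include <NvInfer.h>
#include <NvUffParser.h>

using namespace nvinfer1;
using namespace nvuffparser;

// Simple logger required by the TensorRT builder.
class Logger : public ILogger {
    void log(Severity severity, const char* msg) override {
        if (severity <= Severity::kWARNING) std::printf("%s\n", msg);
    }
} gLogger;

ICudaEngine* buildFp16Engine(const char* uffFile) {
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();
    IUffParser* parser = createUffParser();

    // Placeholder input/output tensor names and shape -- use yours.
    parser->registerInput("input", Dims3(3, 224, 224), UffInputOrder::kNCHW);
    parser->registerOutput("output");
    parser->parse(uffFile, *network, DataType::kFLOAT);

    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1 << 28);
    builder->setFp16Mode(true);  // build the FP32 network with FP16 kernels

    ICudaEngine* engine = builder->buildCudaEngine(*network);
    parser->destroy();
    network->destroy();
    builder->destroy();
    return engine;
}
```

Note that FP16 mode is reduced-precision inference rather than quantization in the strict sense; true post-training quantization to INT8 requires a calibrator in addition to the INT8 build flag.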
Details of the reduced-precision and quantization process can be found here:
https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9659-inference-at-reduced-precision-on-gpus.pdf
Thanks.