How to use mixed-precision when converting PyTorch model to TRT model by TensorRT?

Description

Hi.
I want to use mixed precision when converting a PyTorch model to a TRT model with TensorRT. My conversion process is PyTorch->ONNX->TRT. I can select a quantization mode by setting builder.int8_mode = True or builder.fp16_mode = True. Is there any way to set mixed precision in this process? If mixed precision cannot be set in this process, is there any other way to set it? I would be very grateful for any help.

Environment

TensorRT Version: 7.1.3.4
GPU Type: 1080Ti
Nvidia Driver Version: 440.40
CUDA Version: 10.2
CUDNN Version: 8.0.2
Operating System + Version: Ubuntu16.04
Python Version (if applicable): 3.6
PyTorch Version (if applicable): 1.5

Hi,
Please share the ONNX model and the script, if not shared already, so that we can assist you better.
Meanwhile, you can try a few things:

1. Validate your model with the snippet below.

check_model.py

import sys
import onnx

# Path to the ONNX model, passed as the first command-line argument
filename = sys.argv[1]
model = onnx.load(filename)
# Raises an exception if the model is invalid
onnx.checker.check_model(model)

2. Try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
If you are still facing issues, please share the trtexec "--verbose" log for further debugging.
Thanks!

Hi @xingxing-123,

Following may help you. Please refer

Hi,
Thanks for your reply. Here is an ONNX file of ResNet50. What I want to do is, for example, set the first layer to FP16 mode and the other layers to INT8 mode. Is it possible to achieve this mixed precision with trtexec? I notice that one option of trtexec is "--best", which enables all precisions. Could this option achieve mixed precision automatically? Maybe my understanding of mixed precision is inaccurate; could you provide more detailed help?

Hi,
Thanks for your reply. From the provided link, I found the statement "you can specify the layer precision using the precision flag: layer.precision = trt.int8" in "Mixed Precision Using The Python API". When I further followed the guide in int8_caffe_mnist, I didn't find layer.precision. I'm not quite sure what layer is or how to get it programmatically. Could you please provide more information, such as other sample Python code?

@xingxing-123,

Sorry for the delayed response.
Yes, trtexec --best enables all supported precisions (mixed) for inference; the builder then picks the precision for each layer itself.
https://docs.nvidia.com/deeplearning/tensorrt/best-practices/index.html#mixed-precision
Currently we do not have a Python sample that demonstrates setting layer precision. It looks like the sample given in the doc uses the builder config calibrator.
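
For reference, here is a minimal sketch of how layer.precision could be set from Python, assuming the TensorRT 7.x Python API (network.num_layers, network.get_layer(i), layer.precision, BuilderFlag.STRICT_TYPES). The helper names assign_layer_precisions and build_mixed_precision_engine are hypothetical, as is the calibrator argument, which stands for whatever INT8 calibrator you already use. Treat it as an illustration only, not an official sample: some layers may not support INT8, and the builder may warn or fail for those.

```python
def assign_layer_precisions(network, fp16_dtype, int8_dtype, fp16_layer_indices=(0,)):
    """Request FP16 for the chosen layer indices and INT8 for all other layers.

    `network` is expected to behave like trt.INetworkDefinition
    (num_layers, get_layer(i)). Returns {layer name: requested precision}.
    """
    requested = {}
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        layer.precision = fp16_dtype if i in fp16_layer_indices else int8_dtype
        requested[layer.name] = layer.precision
    return requested


def build_mixed_precision_engine(onnx_path, calibrator):
    """Hypothetical end-to-end helper; needs TensorRT 7.x and a GPU."""
    import tensorrt as trt  # imported lazily so the helper above works standalone

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flag)

    # Populate the network from the ONNX file.
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(str(parser.get_error(0)))

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30
    # Both precisions must be enabled before they can be requested per layer.
    config.set_flag(trt.BuilderFlag.FP16)
    config.set_flag(trt.BuilderFlag.INT8)
    # Ask the builder to honor layer.precision instead of choosing freely.
    config.set_flag(trt.BuilderFlag.STRICT_TYPES)
    config.int8_calibrator = calibrator

    # First layer in FP16, everything else in INT8 (assumption for this example).
    assign_layer_precisions(network, trt.float16, trt.int8)
    return builder.build_engine(network, config)
```

Here "layer" is simply the object returned by network.get_layer(i) after the ONNX parser has populated the network, so the loop can also filter on layer.name or layer.type rather than on the index.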

Thank you.