I want to use mixed precision when converting a PyTorch model to a TensorRT engine. My conversion process is PyTorch -> ONNX -> TRT. I can select a quantization mode by setting builder.int8_mode = True or builder.fp16_mode = True. Is there any way to set mixed precision in this process? If mixed precision cannot be set in this process, is there any other way to set it? I would be very grateful for any help.
TensorRT Version: 188.8.131.52
GPU Type: 1080Ti
Nvidia Driver Version: 440.40
CUDA Version: 10.2
CUDNN Version: 8.0.2
Operating System + Version: Ubuntu 16.04
Python Version (if applicable): 3.6
PyTorch Version (if applicable): 1.5
Request you to share the ONNX model and the script if not shared already so that we can assist you better.
Alongside, you can try a few things:
1) Validate your model with the below snippet:

import onnx
filename = yourONNXmodel
model = onnx.load(filename)
onnx.checker.check_model(model)

2) Try running your model with the trtexec command.
In case you are still facing the issue, request you to share the trtexec --verbose log for further debugging.
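For step 2, a typical invocation looks like the following (model.onnx is a placeholder for your actual file name):

```shell
# Parse the ONNX model, build a TensorRT engine, and print detailed build/inference logs
trtexec --onnx=model.onnx --verbose
```

The --verbose log shows, among other things, which layers were parsed and how they were fused, which is usually enough to locate a failing conversion step.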
Thanks for your reply. Here is one ONNX file of ResNet50. What I want to do is, for example, set the first layer to FP16 mode and the other layers to INT8 mode. Is it possible to realize mixed precision with trtexec? I notice that one option of trtexec is --best, which enables all precisions. Could this option realize mixed precision automatically? Maybe my understanding of mixed precision is inaccurate; could you provide more detailed help?
Thanks for your reply. From the provided link, I got the information that “you can specify the layer precision using the precision flag: layer.precision = trt.int8” in “Mixed Precision Using The Python API”. When I further followed the guide in int8_caffe_mnist, I didn’t find layer.precision. I’m not quite sure what layer is or how to get it programmatically. Could you please provide more information, such as other sample Python code?
Sorry for the delayed response.
trtexec --best would enable all supported precisions (mixed) for inference.
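For example (the ONNX file name is a placeholder):

```shell
# Let TensorRT choose the fastest precision (FP32/FP16/INT8) for each layer
trtexec --onnx=resnet50.onnx --best
```

Note that --best lets the builder pick precisions by speed; it does not give you manual control over which layer runs in which precision.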
Currently we do not have a Python sample that demonstrates per-layer precision. The sample given in the doc uses a builder config with a calibrator instead.
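That said, the layers in question are the ILayer objects of the parsed network, and you can set their precision before building the engine. Below is a minimal, untested sketch against a TensorRT 7-style Python API; the ONNX file name is a placeholder, and the INT8 calibrator is an assumption you must supply yourself:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)

# Parse the ONNX model into a TensorRT network definition
with open("resnet50.onnx", "rb") as f:  # placeholder file name
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)
config.set_flag(trt.BuilderFlag.INT8)
# STRICT_TYPES asks the builder to respect the per-layer precisions below
config.set_flag(trt.BuilderFlag.STRICT_TYPES)
config.int8_calibrator = my_calibrator  # assumption: your IInt8Calibrator

# First layer in FP16, remaining layers in INT8
for i in range(network.num_layers):
    layer = network.get_layer(i)
    layer.precision = trt.float16 if i == 0 else trt.int8

engine = builder.build_engine(network, config)
```

Without STRICT_TYPES the builder treats layer.precision only as a hint and may still fall back to a faster precision; also note that some layers have no INT8 implementation and cannot honor the request.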