Converting an ONNX model to a TensorRT engine with FP16

Dear all,

I was converting an ONNX model to a TensorRT model. The FP32 conversion succeeds, and TensorRT inference works.

However, when I use FP16, I get an error.
I am using the onnx2trt tool.

The error output:

----------------------------------------------------------------
Input filename:   model.onnx
ONNX IR version:  0.0.4
Opset version:    10
Producer name:    pytorch
Producer version: 1.3
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
Parsing model
[2020-04-30 05:44:47 WARNING] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
Building TensorRT engine, FP16 available:1
    Max batch size:     1
    Max workspace size: 1024 MiB
 [2020-04-30 06:31:54   ERROR] ../builder/cudnnBuilderWeightConverters.cpp (482) - Misc Error in operator(): 1 (Weights are outside of fp16 range. A possible fix is to retrain the model with regularization to bring the magnitude of the weights down.)
[2020-04-30 06:31:54   ERROR] ../builder/cudnnBuilderWeightConverters.cpp (482) - Misc Error in operator(): 1 (Weights are outside of fp16 range. A possible fix is to retrain the model with regularization to bring the magnitude of the weights down.)
terminate called after throwing an instance of 'std::runtime_error'
  what():  Failed to create object
Aborted (core dumped)
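For reference, the builder's complaint is about magnitude: IEEE 754 half precision cannot represent values beyond 65504, so any larger weight overflows to infinity during the FP32 → FP16 conversion. A quick numpy sketch (illustrative helper name, not part of the TensorRT API) of checking whether a weight tensor fits in fp16:

```python
import numpy as np

# Largest finite value representable in IEEE half precision.
FP16_MAX = float(np.finfo(np.float16).max)  # 65504.0

def fits_in_fp16(weights):
    """True if every value survives an fp32 -> fp16 cast without
    overflowing to infinity (the condition the builder checks)."""
    w = np.asarray(weights, dtype=np.float32)
    return bool(np.all(np.isfinite(w.astype(np.float16))))

print(fits_in_fp16([1.0, -3.5, 65000.0]))  # True
print(fits_in_fp16([70000.0]))             # False: 70000 -> inf in fp16
```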

If I use the build_engine function with FP32, I get this error message instead:

  File "embedmask_engine_2.py", line 81, in <module>
    batch_size=1)
  File "embedmask_engine_2.py", line 18, in build_engine
    if trt_engine_datatype == trt.float16: # mrcnn add chieh
TypeError: __eq__(): incompatible function arguments. The following argument types are supported:
    1. (self: tensorrt.tensorrt.DataType, arg0: tensorrt.tensorrt.DataType) -> bool

Invoked with: DataType.HALF, 32
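For what it's worth, the traceback shows a tensorrt.DataType enum being compared against a plain integer (32); the pybind11-generated bindings only accept another DataType on the right-hand side of `==`. A minimal stand-in sketch of that pattern (the `DataType` class and `check_datatype` helper here are illustrative stand-ins, not the TensorRT API):

```python
from enum import Enum

# Stand-in for tensorrt.DataType: the real pybind11 enum raises
# TypeError when compared against a bare int such as 32.
class DataType(Enum):
    FLOAT = 0
    HALF = 1

def check_datatype(trt_engine_datatype):
    """Reject anything that is not a DataType member, mirroring
    the TypeError shown in the traceback above."""
    if not isinstance(trt_engine_datatype, DataType):
        raise TypeError("expected a DataType member, got %r"
                        % (trt_engine_datatype,))
    return trt_engine_datatype is DataType.HALF

print(check_datatype(DataType.HALF))   # True
try:
    check_datatype(32)                 # the bug: a bit width, not an enum
except TypeError as e:
    print("TypeError:", e)
```

The fix, under this reading, is to pass the enum member (e.g. the DataType for FP32) rather than the numeric bit width.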

I am sure about my script: it successfully generates the TRT engine on desktop Ubuntu 18.04 Linux.
But it fails on the TX2 (with the same ONNX model).

[TX2] Environment settings
ubuntu version: 18.04
python3 version: 3.6.9
Tensorflow version: 1.15
TensorRT version: 6.0.1.10
CUDA version: 10.0.326
nvcc: NVIDIA (R) CUDA compiler driver, Cuda compilation tools, release 10.0, V10.0.326 (built Mon_Mar_11_22:13:24_CDT_2019)
cuDNN version: 7.6.3
docker version: Docker version 18.09.7, build 2d0083d

How can I fix it?

Thank you!!

Hi,

This is a known issue and you can find some information here:

There is a workaround: cast the model weights to half precision in the PyTorch code, and then convert the model into a TRT FP16 engine.
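In case it helps, one reading of "cast the weights to half" is a round trip through fp16 before export (roughly `model.half().float()` in PyTorch). A numpy sketch of the effect on the weights (the helper name and the clipping step are illustrative assumptions, not the TensorRT API):

```python
import numpy as np

def round_trip_fp16(weights):
    """Clip fp32 weights into the fp16 range and round-trip them
    through half precision, so every value is exactly representable
    when the builder later casts the engine to fp16."""
    fp16_max = float(np.finfo(np.float16).max)  # 65504.0
    w = np.clip(np.asarray(weights, dtype=np.float32),
                -fp16_max, fp16_max)
    return w.astype(np.float16).astype(np.float32)

w = round_trip_fp16([0.5, 70000.0, -1e9])
# After the round trip, casting to fp16 no longer overflows:
print(np.all(np.isfinite(w.astype(np.float16))))  # True
```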
Would you mind checking whether this also works for you?

Thanks.

Hi @AastaLLL,

Thanks for your information!

After checking the link, I have a simple question: do I need to apply this before saving the ONNX model, or can I set the parameters of a particular layer while converting the ONNX model to the TRT model?

On the other hand, what about my second issue?
I set FP32 and it still fails; however, if I use onnx2trt with FP32, it works.

Hi,

To give a further suggestion, could you share your ONNX model with us?
Thanks.

I am sorry, but after some discussion my boss decided that we won’t be able to share our ONNX model.

However, would you mind providing us with a list of which layer operations “onnx2trt” or the ONNX parser supports at which floating-point precisions?

The operations that we use in the onnx model are:

  • Conv2d
  • Interpolate
  • Scale
  • GroupNorm (customized from BatchNorm2d; it works in FP32 with TensorRT)
  • ReLU

We were wondering whether any of these operations goes wrong when converting the ONNX model to a TRT model with FP16.

Thank you for your help!

Hi,

You can find our support matrix here:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-support-matrix/index.html#layers-precision-matrix

Thanks.

Got it
Thank you!