Converting an ONNX model to a TensorRT engine with FP16

Dear all,

I was converting an ONNX model to a TensorRT model. The FP32 conversion succeeds, and TensorRT inference works.

However, when I use FP16, I get an error.
I am using the onnx2trt tool.

The error output:

----------------------------------------------------------------
Input filename:   model.onnx
ONNX IR version:  0.0.4
Opset version:    10
Producer name:    pytorch
Producer version: 1.3
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
Parsing model
[2020-04-30 05:44:47 WARNING] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
Building TensorRT engine, FP16 available:1
    Max batch size:     1
    Max workspace size: 1024 MiB
 [2020-04-30 06:31:54   ERROR] ../builder/cudnnBuilderWeightConverters.cpp (482) - Misc Error in operator(): 1 (Weights are outside of fp16 range. A possible fix is to retrain the model with regularization to bring the magnitude of the weights down.)
[2020-04-30 06:31:54   ERROR] ../builder/cudnnBuilderWeightConverters.cpp (482) - Misc Error in operator(): 1 (Weights are outside of fp16 range. A possible fix is to retrain the model with regularization to bring the magnitude of the weights down.)
terminate called after throwing an instance of 'std::runtime_error'
  what():  Failed to create object
Aborted (core dumped)
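For reference, the builder's complaint is about magnitude: IEEE 754 half precision cannot represent values beyond 65504, so any larger weight overflows to infinity during the FP32 → FP16 conversion. A quick numpy sketch (illustrative helper name, not part of the TensorRT API) of checking whether a weight tensor fits in fp16:

```python
import numpy as np

# Largest finite value representable in IEEE half precision.
FP16_MAX = float(np.finfo(np.float16).max)  # 65504.0

def fits_in_fp16(weights):
    """True if every value survives an fp32 -> fp16 cast without
    overflowing to infinity (the condition the builder checks)."""
    w = np.asarray(weights, dtype=np.float32)
    return bool(np.all(np.isfinite(w.astype(np.float16))))

print(fits_in_fp16([1.0, -3.5, 65000.0]))  # True
print(fits_in_fp16([70000.0]))             # False: 70000 -> inf in fp16
```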

If I use the build_engine function with FP32, I get this error message instead:

  File "embedmask_engine_2.py", line 81, in <module>
    batch_size=1)
  File "embedmask_engine_2.py", line 18, in build_engine
    if trt_engine_datatype == trt.float16: # mrcnn add chieh
TypeError: __eq__(): incompatible function arguments. The following argument types are supported:
    1. (self: tensorrt.tensorrt.DataType, arg0: tensorrt.tensorrt.DataType) -> bool

Invoked with: DataType.HALF, 32
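For what it's worth, the traceback shows a tensorrt.DataType enum being compared against a plain integer (32); the pybind11-generated bindings only accept another DataType on the right-hand side of `==`. A minimal stand-in sketch of that pattern (the `DataType` class and `check_datatype` helper here are illustrative stand-ins, not the TensorRT API):

```python
from enum import Enum

# Stand-in for tensorrt.DataType: the real pybind11 enum raises
# TypeError when compared against a bare int such as 32.
class DataType(Enum):
    FLOAT = 0
    HALF = 1

def check_datatype(trt_engine_datatype):
    """Reject anything that is not a DataType member, mirroring
    the TypeError shown in the traceback above."""
    if not isinstance(trt_engine_datatype, DataType):
        raise TypeError("expected a DataType member, got %r"
                        % (trt_engine_datatype,))
    return trt_engine_datatype is DataType.HALF

print(check_datatype(DataType.HALF))   # True
try:
    check_datatype(32)                 # the bug: a bit width, not an enum
except TypeError as e:
    print("TypeError:", e)
```

The fix, under this reading, is to pass the enum member (e.g. the DataType for FP32) rather than the numeric bit width.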

I am sure about my script: it successfully generates the TRT engine on desktop Ubuntu 18.04 Linux.
But it fails on the TX2 (with the same ONNX model).

[TX2] Environment settings
ubuntu version: 18.04
python3 version: 3.6.9
Tensorflow version: 1.15
TensorRT version: 6.0.1.10
CUDA version: 10.0.326
nvcc: NVIDIA (R) CUDA compiler driver, Cuda compilation tools, release 10.0, V10.0.326 (built Mon_Mar_11_22:13:24_CDT_2019)
cuDNN version: 7.6.3
docker version: Docker version 18.09.7, build 2d0083d

How can I fix it?

Thank you!!

Hi,

This is a known issue and you can find some information here:

There is a workaround: cast the model weights to half precision in the PyTorch code, and then convert the model into a TRT FP16 engine.
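In case it helps, one reading of "cast the weights to half" is a round trip through fp16 before export (roughly `model.half().float()` in PyTorch). A numpy sketch of the effect on the weights (the helper name and the clipping step are illustrative assumptions, not the TensorRT API):

```python
import numpy as np

def round_trip_fp16(weights):
    """Clip fp32 weights into the fp16 range and round-trip them
    through half precision, so every value is exactly representable
    when the builder later casts the engine to fp16."""
    fp16_max = float(np.finfo(np.float16).max)  # 65504.0
    w = np.clip(np.asarray(weights, dtype=np.float32),
                -fp16_max, fp16_max)
    return w.astype(np.float16).astype(np.float32)

w = round_trip_fp16([0.5, 70000.0, -1e9])
# After the round trip, casting to fp16 no longer overflows:
print(np.all(np.isfinite(w.astype(np.float16))))  # True
```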
Would you mind checking whether this also works for you?

Thanks.

Hi @AastaLLL,

Thanks for your information!

After checking the link, I have a simple question: do I need to apply this before saving the ONNX model, or can I set the parameters of a particular layer while converting the ONNX model to the TRT model?

On the other hand, what about my second issue?
I set FP32 and it still fails; however, if I use onnx2trt with FP32, it works.

Hi,

To give a further suggestion, could you share your ONNX model with us?
Thanks.

I am sorry, but after some discussion my boss decided that we won’t be able to share our ONNX model.

However, would you mind providing us with a list of which layer operations “onnx2trt” or the ONNX parser supports at which floating-point precisions?

The operations that we use in the onnx model are:

  • Conv2d
  • Interpolate
  • Scale
  • GroupNorm (customized from BatchNorm2d; it works in FP32 with TensorRT)
  • ReLU

We were wondering whether any of these operations goes wrong when converting the ONNX model to a TRT model with FP16.

Thank you for your help!

Hi,

You can find our support matrix here:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-support-matrix/index.html#layers-precision-matrix

Thanks.

Got it
Thank you!