TensorRT with FP16 returns NaN for all outputs


The ONNX model converts to a TensorRT engine and runs correctly with FP32, but with FP16 it returns NaN for all outputs.


TensorRT Version: 7.2.2
GPU Type: 1650 super
Nvidia Driver Version: 451.82
CUDA Version: 11.0
CUDNN Version: 8.0.5
Operating System + Version: win10

Relevant Files

You can download the model and the output JSON files from the links below:
onnx model: https://drive.google.com/file/d/1xajwNe-SDlwrgSBWDObQ3-lWEgTp8WA6/view?usp=sharing
fp32 output: https://drive.google.com/file/d/1sZFDB7h6Y11eS6BovvEJu5HBHHijeh1o/view?usp=sharing
fp16 output: https://drive.google.com/file/d/1MPOJkf1MXcNRicVw81oQ9KKu2zeAYo8N/view?usp=sharing

Steps To Reproduce

  • Run trtexec with the following command:
    trtexec.exe --onnx=onnx_model.onnx --fp16 --saveEngine=trt_fp16.engine --workspace=2048 --exportOutput=output_fp16.json

Hi @keivan.moazami,

It seems no input is fed to TRT, so the output could be any value (including NaN).

Thank you.

Is this my mistake? How can I fix it?
If it is a bug in TRT, approximately how long do you think it will take to fix?

Hi @keivan.moazami,

Please check the trtexec --loadInputs argument to pass input data.

Load input values from files (default = generate random inputs). Input names can be wrapped with single quotes (ex: 'Input:0')
                              Input values spec ::= Ival[","spec]
                                           Ival ::= name":"file
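
As a rough sketch of how such an input file could be prepared: my understanding is that trtexec reads each file as the raw bytes of the tensor in the binding's data type, so a float32 array dumped with NumPy should work. The input name and shape below are hypothetical placeholders, not taken from the model in this thread, so replace them with the real ones.

```python
import numpy as np

# Hypothetical input name and shape: replace with the model's real binding.
INPUT_NAME = "input_1:0"
INPUT_SHAPE = (1, 3, 224, 224)


def write_trtexec_input(path, shape, seed=0):
    """Write a float32 tensor to `path` as raw bytes (no header)."""
    rng = np.random.default_rng(seed)
    data = rng.random(shape, dtype=np.float32)  # values in [0, 1)
    data.tofile(path)  # raw dump, assumed to match what --loadInputs expects
    return data


data = write_trtexec_input("input_0.bin", INPUT_SHAPE)

# Print a command line that feeds the file to trtexec.
print(
    "trtexec.exe --onnx=onnx_model.onnx --fp16 "
    f"--loadInputs='{INPUT_NAME}':input_0.bin"
)
```

In practice you would dump a real preprocessed image instead of random values, so the FP32 and FP16 outputs can be compared on the same input.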

Thank you.

I have already integrated TensorRT into my software. Buffer allocation and all preprocessing steps are the same as for FP32, but it returns NaN with half precision. I do not think the problem is related to whether the inputs are random or real values.
I need to know whether Nvidia has a plan for more and faster support, something like VIP users?
It is important for us that this problem is resolved as soon as possible.

Hi @keivan.moazami,

We see NaN output even with ONNX Runtime in FP16.
The problem may be with the model itself.

Looks like it’s because of this Conv layer:
[I] onnxrt-runner-N0-02/05/21-12:46:31 | Validating output: model_2/res3a_branch2a/Conv2D__161:0 (check_finite=True, check_nan=True)
[I] Stats: mean=359.25, min=0 at (0, 0, 0, 0), max=9184 at (0, 62, 34, 76)
[D] PASSED | Output: model_2/res3a_branch2a/Conv2D__161:0 is valid
[I] onnxrt-runner-N0-02/05/21-12:46:31 | Validating output: model_2/res3a_branch2a/Conv2D:0 (check_finite=True, check_nan=True)
[I] Stats: mean=-inf, min=-inf at (0, 48, 4, 61), max=65248 at (0, 46, 2, 63)
[E] Encountered one or more non-finite values
[E] Note: Use -vv or set logging verbosity to EXTRA_VERBOSE to display non-finite values
[E] FAILED | Errors detected in output: model_2/res3a_branch2a/Conv2D:0
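
The stats above look consistent with FP16 overflow: the largest finite half-precision value is 65504, the reported max of 65248 sits just under it, and the min is already -inf. A minimal NumPy sketch of that effect follows. The last two activation values are made up for illustration and are not taken from the model.

```python
import numpy as np

# Largest finite value representable in half precision.
FP16_MAX = float(np.finfo(np.float16).max)  # 65504.0

# First two values appear in the log; the last two are hypothetical
# FP32 activations chosen to fall outside the FP16 range.
acts_fp32 = np.array([9184.0, 65248.0, 70000.0, -70000.0], dtype=np.float32)

with np.errstate(over="ignore", invalid="ignore"):
    # Casting to FP16 turns out-of-range values into +inf / -inf.
    acts_fp16 = acts_fp32.astype(np.float16)
    # Combining +inf and -inf in a later layer produces NaN.
    total = acts_fp16[2] + acts_fp16[3]

print("FP16 max:", FP16_MAX)
print("after cast:", acts_fp16)
print("inf + (-inf):", total)
```

Once a single inf appears in an intermediate tensor, later additions or normalizations can turn it into NaN, which then spreads to every output.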

Please check.

Thank you.