My ONNX model converts to a TensorRT engine correctly with FP32, but with FP16 the engine returns NaN for the outputs.
TensorRT Version: 7.2.2
GPU Type: 1650 super
Nvidia Driver Version: 451.82
CUDA Version: 11.0
CUDNN Version: 8.0.5
Operating System + Version: win10
You can download the model and the output JSON files from the links below:
onnx model: https://drive.google.com/file/d/1xajwNe-SDlwrgSBWDObQ3-lWEgTp8WA6/view?usp=sharing
fp32 output: https://drive.google.com/file/d/1sZFDB7h6Y11eS6BovvEJu5HBHHijeh1o/view?usp=sharing
fp16 output: https://drive.google.com/file/d/1MPOJkf1MXcNRicVw81oQ9KKu2zeAYo8N/view?usp=sharing
Steps To Reproduce
- Run the following command with trtexec:
trtexec.exe --onnx=onnx_model.onnx --fp16 --saveEngine=trt_fp16.engine --workspace=2048 --exportOutput=output_fp16.json
It seems no input is fed to TensorRT, so the output could be any value (including NaN).
Is it my mistake? How can I fix it?
If it is a bug in TensorRT, approximately how long do you think it will take to fix?
Please check the trtexec --loadInputs argument to pass input:
Load input values from files (default = generate random inputs). Input names can be wrapped with single quotes (ex: 'Input:0')
Input values spec ::= Ival[","spec]
Ival ::= name":"file
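As a rough sketch of how such an input file could be prepared: the snippet below writes a flat binary file of float32 values and builds the matching --loadInputs argument. The input name `input_1:0` and the shape 1x3x224x224 are hypothetical placeholders, not taken from the model in this thread, and the assumption that trtexec expects raw little-endian float32 data with no header should be checked against your trtexec version.

```python
import os
import random
from array import array

# Hypothetical values for illustration only; use your model's real input name and shape.
INPUT_NAME = "input_1:0"
INPUT_SHAPE = (1, 3, 224, 224)


def write_input_file(path, shape, seed=0):
    """Write prod(shape) float32 values to `path` as raw bytes and return the file size."""
    rng = random.Random(seed)
    count = 1
    for dim in shape:
        count *= dim
    # array("f") stores C floats (4 bytes each on common platforms) in native byte order.
    values = array("f", (rng.random() for _ in range(count)))
    with open(path, "wb") as f:
        values.tofile(f)
    return os.path.getsize(path)


n_bytes = write_input_file("input.bin", INPUT_SHAPE)

# The name is wrapped in single quotes because it contains a colon,
# as described in the trtexec help text above.
load_inputs_arg = f"--loadInputs='{INPUT_NAME}':input.bin"
print(n_bytes, load_inputs_arg)
```

In practice you would replace the random values with a real preprocessed sample, so that the FP32 and FP16 engines are compared on the same meaningful input.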
I have already integrated TensorRT into my software. Buffer allocation and all preprocessing steps are the same as for FP32, but the FP16 engine returns NaN. I don't think the problem is related to whether the inputs are random or real values.
I need to know whether NVIDIA has a plan for more extensive and faster support, something like priority support for VIP users, because it is important for us that this problem is resolved as soon as possible.
We see NaN output even with ONNX Runtime in FP16, so this may be a problem with the model itself.
Looks like it’s because of this Conv layer:
[I] onnxrt-runner-N0-02/05/21-12:46:31 | Validating output: model_2/res3a_branch2a/Conv2D__161:0 (check_finite=True, check_nan=True)
[I] Stats: mean=359.25, min=0 at (0, 0, 0, 0), max=9184 at (0, 62, 34, 76)
[D] PASSED | Output: model_2/res3a_branch2a/Conv2D__161:0 is valid
[I] onnxrt-runner-N0-02/05/21-12:46:31 | Validating output: model_2/res3a_branch2a/Conv2D:0 (check_finite=True, check_nan=True)
[I] Stats: mean=-inf, min=-inf at (0, 48, 4, 61), max=65248 at (0, 46, 2, 63)
[E] Encountered one or more non-finite values
[E] Note: Use -vv or set logging verbosity to EXTRA_VERBOSE to display non-finite values
[E] FAILED | Errors detected in output: model_2/res3a_branch2a/Conv2D:0
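One plausible reading of these stats is FP16 overflow: the largest finite half-precision value is 65504, the reported max of 65248 is already close to it, and the min has gone to -inf. The sketch below is only an illustration of that mechanism using the Python standard library, not a reproduction of the model. The helper `to_fp16` is hypothetical, and the value 70000 is an arbitrary example of an out-of-range activation.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x):
    """Round a Python float to the nearest FP16 value, giving +/-inf on overflow."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct rejects values that do not fit in FP16; an FP16 conversion
        # on the GPU would produce infinity instead, so we mimic that here.
        return math.copysign(math.inf, x)


# The max from the log is still representable in FP16.
near_limit = to_fp16(65248.0)

# Slightly larger magnitudes are not and become infinities.
overflowed = to_fp16(70000.0)
neg_overflowed = to_fp16(-70000.0)

# Once +inf and -inf meet in a later operation, the result is NaN,
# which then propagates to the network outputs.
poisoned = overflowed + neg_overflowed

print(near_limit, overflowed, neg_overflowed, poisoned)
```

If overflow is indeed the cause, the usual directions are to keep the affected layer in FP32 or to rescale the weights or activations so intermediate values stay within the FP16 range. Which one applies here depends on the model.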