FP16 TRT model outputs NaN values

JetPack 4.6, TensorRT 8.2
Part of my model's output becomes NaN when I convert it with --fp16, while the output is normal with FP32. Any solution?

ONNX model:
pt_nvidia_trt8.2.onnx (3.8 MB)

./trtexec --onnx=pt_nvidia_trt8.2.onnx --saveEngine=pt_nvidia_trt8.2.trt --minShapes=input:1x1x512x512 --optShapes=input:4x1x512x512 --maxShapes=input:4x1x512x512 --workspace=4096 --verbose --fp16

./trtexec --onnx=pt_nvidia_trt8.2.onnx --saveEngine=pt_nvidia_trt8.2.trt --minShapes=input:1x1x512x512 --optShapes=input:4x1x512x512 --maxShapes=input:4x1x512x512 --workspace=4096 --verbose 


We can get the output with trtexec.

$ /usr/src/tensorrt/bin/trtexec --onnx=pt_nvidia_trt8.2.onnx --minShapes=input:1x1x512x512 --optShapes=input:4x1x512x512 --maxShapes=input:4x1x512x512 --fp16 --dumpOutput

Is the NaN value generated by a custom implementation?
If so, please check whether the source handles FP16 mode correctly.
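One common reason a model that is fine in FP32 breaks in FP16 is dynamic range rather than precision: FP16's largest finite value is 65504, so any intermediate activation above that overflows to inf, and follow-up arithmetic (e.g. inside a softmax or normalization) turns inf into NaN. A minimal numpy illustration of the mechanism (not tied to this specific model):

```python
import numpy as np

# FP16 has a much smaller dynamic range than FP32: max finite value is 65504.
print(np.finfo(np.float16).max)   # 65504.0

# An FP32 activation beyond that range overflows to inf when cast down.
x = np.float32(1e5).astype(np.float16)
print(x)                          # inf

# Subsequent arithmetic on inf easily produces NaN.
print(x - x)                      # nan
```

If a custom plugin or layer produces large intermediate values, this overflow-then-NaN pattern is the first thing to rule out.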


I have checked my TRT inference code; it works as expected with other FP16 models. I also tested with several random inputs, and it still outputs NaN values. Any solutions?
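When testing with random inputs, it helps to count exactly how many NaN/Inf entries each output buffer contains rather than just eyeballing values; a sketch of such a check on a buffer copied back from the engine (the buffer here is a hypothetical stand-in):

```python
import numpy as np

def report_invalid(name, out):
    """Count NaN/Inf entries in an output buffer so the failure can be localized."""
    out = np.asarray(out)
    n_nan = int(np.isnan(out).sum())
    n_inf = int(np.isinf(out).sum())
    print(f"{name}: shape={out.shape}, NaN={n_nan}, Inf={n_inf}")
    return n_nan, n_inf

# Hypothetical output buffer copied back from a TRT engine after inference.
out = np.array([0.5, float("nan"), 2.0, float("inf")], dtype=np.float16)
report_invalid("output", out)  # output: shape=(4,), NaN=1, Inf=1
```

Whether the whole tensor or only a slice of it is invalid is a useful clue about which layer is responsible.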


Would you mind sharing the source so we can check it further?

nvidia_test.cpp (10.8 KB)

Since it’s company property, I had to remove most of the data-processing part and keep only the basic TRT engine part.

The inference code in our project works properly with other FP16 models, except this one. Would you mind helping me check whether a specific layer inside the model generates NaN values after conversion to a TRT engine? Or would it work on other versions of JetPack/TensorRT?
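Localizing the first offending layer usually means running the network layer by layer in FP16 and checking where NaN/Inf first appears (this is what tools like Polygraphy automate by marking all layer outputs and comparing against an FP32 or ONNX-Runtime baseline). A toy numpy sketch of the idea, with made-up layers standing in for the real model:

```python
import numpy as np

# Toy "network": a list of named layer functions applied sequentially.
# The "square" layer can overflow FP16's 65504 max for large inputs.
layers = [
    ("scale",  lambda x: x * 4.0),
    ("square", lambda x: x * x),
    ("norm",   lambda x: x / (x + 1.0)),
]

def first_bad_layer(x32):
    """Run the toy network in FP16 and return the name of the first layer
    whose output contains NaN/Inf, mimicking a per-layer precision check."""
    x16 = x32.astype(np.float16)
    for name, f in layers:
        x16 = f(x16).astype(np.float16)
        if not np.isfinite(x16).all():
            return name
    return None

x = np.full((4,), 100.0, dtype=np.float32)
print(first_bad_layer(x))  # square  (400**2 = 160000 > 65504 -> inf)
```

Once the offending layer is identified, a typical fix is to pin just that layer to FP32 via the TensorRT layer-precision API while keeping the rest of the network in FP16.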


Could you set the batch sizes in minShapes, optShapes, and maxShapes to identical values and try again?

For example, set all of them to 1x1x512x512.