Jetson AGX Xavier fp16 inference nan values

I am using a Jetson AGX Xavier (32 GB) to run inference on a CRNN model. The model can be found here.

I am using TensorRT 7 to generate an engine file from model.onnx and PyCUDA to run inference with that engine file. The model works perfectly in FP32 mode and gives accurate results when compared with PyTorch inference. After switching to FP16 mode, the majority of batches return NaN values as output. Why is this happening? Is there any workaround? I am using the following code to export to ONNX:

dummy_input = torch.randn(32, 1, 32, 100)
input_names = ["input_1"]
output_names = ["output_1"]
torch.onnx.export(model, dummy_input, "crnn.onnx", verbose=True,
                  input_names=input_names, output_names=output_names)
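As background on why FP16 can produce NaNs where FP32 does not: float16 can only represent magnitudes up to about 65504, so large intermediate activations (for example in an exponential or softmax) overflow to inf, and a subsequent inf/inf normalization yields NaN. A minimal NumPy sketch of this failure mode (illustrative values, not taken from the CRNN model):

```python
import numpy as np

# float16 overflows above ~65504; exp(50) and exp(200) both exceed that
logits = np.array([10.0, 50.0, 200.0], dtype=np.float16)
exps = np.exp(logits)

# softmax-style normalization: inf / inf produces nan
probs = exps / exps.sum()

print(np.isinf(exps).any())   # overflow occurred in float16
print(np.isnan(probs).any())  # nan propagated into the output
```

The same computation in float32 stays finite for exp(50), which is one reason an FP32 engine can be accurate while the FP16 engine of the same network emits NaNs.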

Hi,

Have you re-created the engine file for fp16 mode first?
Thanks.

Yes. I used the flag builder.fp16_mode = True while creating the FP16 engine file. Please find attached my ONNX model, TRT engine, and the image folder that I am running inference on.
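For reference, a typical TensorRT 7 FP16 engine build from an ONNX file looks like the sketch below. This is a generic outline under the assumption that the model was parsed with an explicit-batch network (required for the ONNX parser in TRT 7); the path and workspace size are placeholders, not taken from the attached files.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_fp16_engine(onnx_path):
    # The ONNX parser in TensorRT 7 requires an explicit-batch network
    flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network(flag) as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser:
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                return None
        builder.max_workspace_size = 256 << 20  # 256 MB, as in the trtexec run
        builder.fp16_mode = True                # allow FP16 kernels
        return builder.build_cuda_engine(network)

# engine = build_fp16_engine("crnn.onnx")  # hypothetical usage
```

If the two builds differ only in the fp16_mode flag and the FP32 engine is correct, the NaNs point at reduced-precision behavior rather than a parsing problem.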

demo2.zip (991.0 KB)
crnn.trt
crnn.onnx

Hi,

We tried your model with the trtexec binary.
The output looks normal and no NaN values were found.

$ /usr/src/tensorrt/bin/trtexec --onnx=crnn.onnx --dumpOutput --fp16 --workspace=256

Would you mind checking it again?
If the issue persists, please share your source code so we can reproduce the NaN output.

Thanks.