ONNX Runtime Error: fp16 precision has been set for a layer or layer output, but fp16 is not configured in the builder

I’m trying to run a YOLOv5 model (yolov5s.pt) on a Jetson Nano.
Initially, I converted the PyTorch model to ONNX with fp32 and ran it on the Nano with a CSI camera, using code similar to https://developer.nvidia.com/blog/announcing-onnx-runtime-for-jetson . This worked fine, but the frame rate was low (4 FPS), so I wanted to try fp16.

I converted the model to ONNX fp16 using the built-in YOLOv5 export script (TFLite, ONNX, CoreML, TensorRT Export · Issue #251 · ultralytics/yolov5 · GitHub). The conversion was successful (it shrank the model from 28 MB to 14 MB), but when I try to run it on the Nano, I get the error below:

2022-01-10 17:24:14.792809142 [W:onnxruntime:Default, tensorrt_execution_provider.h:53 log] [2022-01-10 11:54:14 WARNING] /home/onnxruntime/onnxruntime-py36/cmake/external/onnx-tensorrt/onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2022-01-10 17:24:15.842234019 [E:onnxruntime:Default, tensorrt_execution_provider.h:51 log] [2022-01-10 11:54:15 ERROR] 4: [network.cpp::validate::2555] Error Code 4: Internal Error (fp16 precision has been set for a layer or layer output, but fp16 is not configured in the builder)
Traceback (most recent call last):
  File "detection.py", line 87, in <module>
    detector = ObjectDetector(model_path )
  File "detection.py", line 26, in __init__
    self.sess = rt.InferenceSession(onnx_model_path, providers=providers)
  File "/home/niraj/.local/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 335, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/niraj/.local/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 379, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.EPFail: [ONNXRuntimeError] : 11 : EP_FAIL : TensorRT EP could not build engine for fused node: TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_13938907072320197802_7_0

My goal is to use ONNX Runtime to run the model on top of TensorRT (if I’ve understood it correctly). I’m not 100% sure whether this is a Nano-specific issue, but I would really appreciate any guidance or help.

Update: The model works with ONNX Runtime when CPUExecutionProvider is used. Only TensorrtExecutionProvider causes this error.

Do I have to set any flags before loading an fp16 model with ONNX Runtime + TensorRT? Am I missing something here?
I’m also attaching the code and model for reference.
yolov5s_fp16.onnx (14.2 MB)

detection.py (3.7 KB)


Please note that the TensorRT engine is not portable.
Have you tried to convert the model directly on the Nano?
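
On the question of flags: the error text says fp16 was not configured in the TensorRT builder, and the TensorRT execution provider in ONNX Runtime has a `trt_fp16_enable` provider option for that. Below is a minimal sketch, assuming the installed ONNX Runtime build accepts per-provider options in the `providers` list. The helper names `build_providers` and `create_session` are hypothetical and not taken from the attached detection.py, and this has not been verified against the attached model, so it may or may not resolve the error.

```python
def build_providers(fp16=True):
    """Return a providers list asking the TensorRT EP to allow fp16 kernels.

    Hypothetical helper for illustration; only the provider names and the
    `trt_fp16_enable` option come from ONNX Runtime itself.
    """
    trt_options = {"trt_fp16_enable": fp16}
    return [
        ("TensorrtExecutionProvider", trt_options),  # tried first
        "CUDAExecutionProvider",                     # fallback for unsupported nodes
        "CPUExecutionProvider",                      # last-resort fallback
    ]


def create_session(onnx_model_path, fp16=True):
    """Create an InferenceSession with the providers above (hypothetical helper)."""
    # Imported here so the providers list can be built and inspected
    # on a machine that does not have onnxruntime installed.
    import onnxruntime as rt

    return rt.InferenceSession(onnx_model_path, providers=build_providers(fp16))
```

On the Nano this would be called as `create_session("yolov5s_fp16.onnx")`. Setting the environment variable `ORT_TENSORRT_FP16_ENABLE=1` before creating the session is another way to request the same behavior.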