Inference on FP16 segmentation model

Hi,
I have a pretrained torchvision segmentation model that I exported to ONNX and then to TensorRT (using trtexec) to use in DeepStream. The model ran successfully and as expected. This was with FP32 precision.
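For reference, the FP32 path was roughly the following sketch (the exact model, input shape, and file names below are placeholders, not my real setup):

```python
import torch
import torchvision

# Pretrained torchvision segmentation model (placeholder choice; any of the
# torchvision.models.segmentation models follows the same export path)
seg = torchvision.models.segmentation.deeplabv3_resnet50(pretrained=True).eval()

# Torchvision segmentation models return a dict, so wrap them to expose a
# single tensor output for ONNX export
class Wrapper(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        return self.model(x)["out"]

model = Wrapper(seg).eval()

# Dummy input; the shape is an assumption, adjust to the resolution used in DeepStream
dummy = torch.randn(1, 3, 520, 520)

torch.onnx.export(
    model, dummy, "seg_fp32.onnx",
    opset_version=11,
    input_names=["input"],
    output_names=["output"],
)

# Then build the FP32 engine on the Xavier with trtexec:
#   trtexec --onnx=seg_fp32.onnx --saveEngine=seg_fp32.engine
```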

I tried doing the same for FP16: I exported the model from PyTorch (after converting it to half precision) to ONNX and then to TensorRT (using trtexec with the --fp16 flag).

The FP16 engine ran inference successfully in Python (without DeepStream), but when I tried it in DeepStream after changing network-mode to 2, it gave me a Segmentation fault (core dumped) error.
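The standalone Python check was along these lines; a minimal sketch using the TensorRT 7 Python API with pycuda, where the engine file name and the binding order (input first, output second) are assumptions:

```python
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # noqa: F401  creates and activates a CUDA context
import pycuda.driver as cuda

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize the engine built by trtexec (file name is a placeholder)
with open("seg_fp16.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()

# Allocate host and device buffers for every binding
bindings, host_bufs, dev_bufs = [], [], []
for i in range(engine.num_bindings):
    dtype = trt.nptype(engine.get_binding_dtype(i))
    size = trt.volume(engine.get_binding_shape(i))
    host = np.zeros(size, dtype=dtype)
    dev = cuda.mem_alloc(host.nbytes)
    host_bufs.append(host)
    dev_bufs.append(dev)
    bindings.append(int(dev))

# Assuming binding 0 is the input and binding 1 the output
host_bufs[0][:] = np.random.rand(host_bufs[0].size).astype(host_bufs[0].dtype)
cuda.memcpy_htod(dev_bufs[0], host_bufs[0])
context.execute_v2(bindings)
cuda.memcpy_dtoh(host_bufs[1], dev_bufs[1])
print("output min/max:", host_bufs[1].min(), host_bufs[1].max())
```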

Any advice on how to get it up and running?

• Hardware Platform (Jetson / GPU) Jetson Xavier AGX
• DeepStream Version 5.1
• JetPack Version (valid for Jetson only) JetPack 4.5.1
• TensorRT Version 7.1.3

Hi,

This will apply the FP32 → FP16 conversion twice: once when the model is converted to half precision in PyTorch before the ONNX export, and again when trtexec builds the engine with --fp16.
Please use an ONNX model exported from the original FP32 weights and convert it to FP16 with trtexec (--fp16) directly.
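If it helps, the same single-step conversion can also be done with the TensorRT Python API instead of trtexec; a rough sketch for TensorRT 7, where the file names and workspace size are placeholders:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

# Parse the FP32 ONNX model
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(EXPLICIT_BATCH)
parser = trt.OnnxParser(network, TRT_LOGGER)
with open("seg_fp32.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parsing failed")

# Request FP16 at build time; the ONNX weights stay FP32
config = builder.create_builder_config()
config.max_workspace_size = 1 << 30
config.set_flag(trt.BuilderFlag.FP16)

engine = builder.build_engine(network, config)
with open("seg_fp16.engine", "wb") as f:
    f.write(engine.serialize())

# The trtexec equivalent is simply:
#   trtexec --onnx=seg_fp32.onnx --fp16 --saveEngine=seg_fp16.engine
```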

Also, do you run DeepStream with the serialized TensorRT engine or with the ONNX file?

Thanks.

Thank you, that was indeed the problem.

I run DeepStream with the serialized TensorRT engine directly.
