Inference on FP16 segmentation model

Hi,

Your current workflow applies the FP32 → FP16 conversion twice.
Please start from the original FP32 ONNX model and build the FP16 engine with trtexec (the --fp16 flag) directly.
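A minimal sketch of the trtexec invocation; the file names here are placeholders, so substitute your own model and output paths:

```shell
# Build an FP16 TensorRT engine directly from the original FP32 ONNX model.
# model_fp32.onnx and model_fp16.engine are example paths.
trtexec --onnx=model_fp32.onnx \
        --fp16 \
        --saveEngine=model_fp16.engine
```

With --fp16, TensorRT keeps the FP32 weights from the ONNX file and selects FP16 kernels where they are faster, so no prior FP16 conversion of the model is needed.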

Also, do you run DeepStream with the serialized TensorRT engine or with the ONNX file?

Thanks.