Hi,
This will apply FP32 → FP16 conversion twice.
Please use the original FP32 ONNX model and convert it to FP16 with trtexec (--fp16) directly.
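For example, a typical trtexec invocation looks like the one below (the file names `model.onnx` and `model_fp16.engine` are placeholders for your own paths):

```shell
# Build an FP16 TensorRT engine directly from the original FP32 ONNX model.
# trtexec handles the FP32 -> FP16 conversion internally, so the ONNX file
# should NOT be pre-converted to FP16.
trtexec --onnx=model.onnx \
        --fp16 \
        --saveEngine=model_fp16.engine
```

You can then point DeepStream at the serialized engine file so the conversion only happens once, at build time.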
Also, do you run DeepStream with the serialized TensorRT engine or with the ONNX file?
Thanks.