Hi! Is it possible to force a specific layer to run in a particular precision?
In my case, I want to convert an ONNX model to an int8 TensorRT engine, but I get errors on the NMS layer (which, as far as I know, is not supported in int8). Could I run that particular layer in fp16 precision, or even on the CPU? (See the sketch after this post for roughly what I mean.)
Also, maybe I’m wrong and there is a TensorRT plugin that supports int8 for the NMS layer?
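To make this concrete, here is roughly what I'm trying to do with the TensorRT Python API: build the engine in int8 but pin the NMS-related layers to fp16. This is only a sketch; the layer-name matching is model-specific, and the calibrator line is a placeholder for my own int8 calibrator.

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
config.set_flag(trt.BuilderFlag.FP16)  # allow fp16 kernels as a fallback
# Honor the per-layer precisions set below (older TRT releases used STRICT_TYPES)
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
# config.int8_calibrator = my_calibrator  # placeholder: supply an IInt8EntropyCalibrator2

# Pin layers belonging to the NMS subgraph to fp16 (name match depends on the ONNX graph)
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if "nms" in layer.name.lower() or "NonMaxSuppression" in layer.name:
        layer.precision = trt.float16
        for j in range(layer.num_outputs):
            layer.set_output_type(j, trt.float16)

engine_bytes = builder.build_serialized_network(network, config)
```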
Hi @dara.vinogradova,
Would you mind trying out the example?
Thanks
But that plugin only supports fp16/fp32, am I wrong?
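If the plugin route doesn't work out, my fallback idea (from the original question) would be to export the model without the NMS node and do NMS on the CPU as post-processing over the raw boxes/scores the engine outputs. A minimal NumPy sketch of that post-processing step:

```python
import numpy as np

def nms_cpu(boxes, scores, iou_threshold=0.5):
    """Greedy NMS over [x1, y1, x2, y2] boxes; returns indices of kept boxes."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # Intersection of the current top box with the remaining candidates
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_rest = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_rest - inter)
        # Drop candidates that overlap the kept box too much
        order = rest[iou <= iou_threshold]
    return keep
```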
Hi @dara.vinogradova,
Apologies for the delayed response. Can you please share the error logs with us?
Thanks