Infer a layer with a specific precision

Hi! Is it possible to force a specific layer to run in a specific precision?
In my case, I want to convert an ONNX model to an INT8 TensorRT engine, but I get errors on the NMS layer (which, as far as I know, is not supported in INT8). Could I run this particular layer in FP16 precision, or even on the CPU?
Also, maybe I'm wrong and there is a TRT plugin for INT8 quantization of the NMS layer?
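
Roughly, this is what I have in mind; a minimal sketch with the TensorRT Python API, assuming a TensorRT 8.x build and that the NMS layer can be located by name (the `model.onnx` path and the `"NonMaxSuppression"` name match are placeholders):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:          # placeholder path
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)        # INT8 for the bulk of the network
config.set_flag(trt.BuilderFlag.FP16)        # allow FP16 where we pin it
# Ask TensorRT to honor per-layer precision requests
# (older releases use trt.BuilderFlag.STRICT_TYPES instead)
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
# NOTE: a real INT8 build also needs config.int8_calibrator or Q/DQ scales

for i in range(network.num_layers):
    layer = network.get_layer(i)
    # Placeholder match: pick out the NMS layer(s) by name and pin them to FP16
    if "NonMaxSuppression" in layer.name:
        layer.precision = trt.float16
        layer.set_output_type(0, trt.float16)

engine_bytes = builder.build_serialized_network(network, config)
```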

Hi @dara.vinogradova,
Would you mind trying out the example?


But this plugin only supports FP16/FP32, am I wrong?
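
If the plugin really is FP16/FP32 only, the CPU fallback I mentioned would look roughly like this: build the engine without the NMS node so it outputs raw boxes and scores, then run NMS on the host. A minimal NumPy sketch (greedy NMS, boxes in `[x1, y1, x2, y2]` format assumed):

```python
import numpy as np

def nms_cpu(boxes, scores, iou_threshold=0.5):
    """Greedy NMS over [x1, y1, x2, y2] boxes; returns indices of kept boxes."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]           # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the top box with the remaining candidates
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop candidates that overlap the kept box too much
        order = order[1:][iou < iou_threshold]
    return np.array(keep)
```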