How to quantize a TensorRT plugin

Description

How do I quantize a TensorRT plugin to INT8 or FP16?
I want to improve inference speed, and a custom plugin is the efficiency bottleneck. How can I quantize the plugin to INT8/FP16?

Environment

TensorRT Version:
GPU Type:
Nvidia Driver Version:
CUDA Version:
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi,
Please refer to the sample below for a custom plugin implementation:
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/sampleOnnxMnistCoordConvAC
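
In case it helps while you read the sample: whether a plugin can run in FP16/INT8 is determined by the type/format combinations the plugin itself advertises to the builder. Below is a minimal sketch (the MyPlugin class name is hypothetical) of how a plugin derived from IPluginV2DynamicExt can declare FP16 and INT8 support in supportsFormatCombination, so the builder is free to select those precisions:

```cpp
#include <NvInfer.h>

// Hypothetical plugin class derived from nvinfer1::IPluginV2DynamicExt.
// The builder calls supportsFormatCombination() to discover which
// precisions the plugin can run in; only types accepted here can be
// chosen when the kFP16/kINT8 builder flags are set.
bool MyPlugin::supportsFormatCombination(
    int32_t pos, nvinfer1::PluginTensorDesc const* inOut,
    int32_t nbInputs, int32_t nbOutputs) noexcept
{
    nvinfer1::PluginTensorDesc const& desc = inOut[pos];

    // Accept FP32, FP16, or INT8 tensors in linear layout, and require
    // all inputs/outputs to use the same type as input 0.
    bool const typeOk = desc.type == nvinfer1::DataType::kFLOAT
        || desc.type == nvinfer1::DataType::kHALF
        || desc.type == nvinfer1::DataType::kINT8;

    return typeOk
        && desc.format == nvinfer1::TensorFormat::kLINEAR
        && desc.type == inOut[0].type;
}
```

Note that declaring support only lets the builder choose those precisions: your enqueue() kernels must actually implement the FP16/INT8 code paths, and for INT8 the per-tensor scales chosen during calibration are delivered via PluginTensorDesc::scale.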

Thanks!

Thank you so much, I'll read the blog and learn from the code.

Hi @mrsunqichang,

Hope the following doc is also helpful to you:
https://docs.nvidia.com/deeplearning/tensorrt/best-practices/index.html#optimize-plugins
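
As that section notes, reduced-precision plugin kernels are only considered when the builder is allowed to use them. A rough sketch of enabling FP16/INT8 at build time (myCalibrator is a placeholder for your own IInt8EntropyCalibrator2 instance):

```cpp
// Allow the builder to choose FP16 and INT8 kernels network-wide,
// including for plugin layers that advertise support for those types.
nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();
config->setFlag(nvinfer1::BuilderFlag::kFP16);
config->setFlag(nvinfer1::BuilderFlag::kINT8);

// INT8 needs dynamic-range information, e.g. from a calibrator
// (myCalibrator is a placeholder; see the INT8 calibration samples).
config->setInt8Calibrator(myCalibrator);
```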

Thank you.