How to quantize a TensorRT plugin


How do I quantize a TensorRT plugin to INT8 or FP16?
I want to improve inference speed, and a custom plugin is the efficiency bottleneck. How can I quantize the plugin to INT8/FP16?


TensorRT Version:
GPU Type:
Nvidia Driver Version:
CUDA Version:
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Please refer to the links below for custom plugin implementation details and a sample:
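As a general note, a plugin is not quantized automatically: TensorRT runs it in FP16 or INT8 only when the plugin advertises support for that precision in `supportsFormatCombination` (and its `enqueue` actually handles that data type), and the build is configured with the `kFP16`/`kINT8` builder flags (INT8 additionally needs a calibrator or explicit dynamic ranges). A minimal sketch of the format-negotiation hook, assuming a hypothetical `IPluginV2DynamicExt`-based plugin named `MyPlugin` (only this one method is shown; the other plugin methods are elided):

```cpp
#include <NvInfer.h>

using namespace nvinfer1;

class MyPlugin : public IPluginV2DynamicExt
{
public:
    // Called by the builder to ask which type/format combinations the
    // plugin accepts. Returning true for kHALF/kINT8 here is what allows
    // the plugin layer to be chosen in reduced precision.
    bool supportsFormatCombination(int32_t pos, PluginTensorDesc const* inOut,
                                   int32_t nbInputs, int32_t nbOutputs) noexcept override
    {
        PluginTensorDesc const& desc = inOut[pos];
        bool const typeOk = desc.type == DataType::kFLOAT
                         || desc.type == DataType::kHALF
                         || desc.type == DataType::kINT8;
        // Require linear layout and a single type shared by all I/O tensors.
        return typeOk && desc.format == TensorFormat::kLINEAR
            && desc.type == inOut[0].type;
    }

    // ... remaining IPluginV2DynamicExt overrides; enqueue() must dispatch
    //     a kernel for every data type advertised above ...
};
```

On the builder side, the corresponding flags would be set roughly like this (again a sketch, with `config` an existing `IBuilderConfig*` and `calibrator` your `IInt8Calibrator` implementation):

```cpp
config->setFlag(BuilderFlag::kFP16);
config->setFlag(BuilderFlag::kINT8);
config->setInt8Calibrator(&calibrator);
```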


Thank you so much, I'll read the blog and study the code.