How to quantize a TensorRT plugin to INT8 or FP16?
I want to improve inference speed, and a custom plugin is the performance bottleneck. How can I quantize the plugin to INT8/FP16?
Nvidia Driver Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)
Steps To Reproduce
- Exact steps/commands to build your repro
- Exact steps/commands to run your repro
- Full traceback of errors encountered