Acceleration with INT8 precision using TensorRT

Description

I have successfully converted ResNet-r50 to FP16 using TensorRT with Python and C++, but I am unable to do the same with INT8 precision. I can't quite understand the calibration step involved in INT8 acceleration from the official documentation.

Can anyone help me understand the calibration? A good tutorial or some reference links would help.

Thanks in advance.

Environment

TensorRT Version: 7.2.2.1
GPU Type: NVIDIA RTX 3080
Nvidia Driver Version: 460.27.04
CUDA Version: 11.2
Operating System + Version: LINUX 18.04
Python Version: 3.6
TensorFlow Version: 2.3.1

Hi, we recommend checking the supported features at the link below.

Thanks!

Hi @himanipatel,

Please refer to the following links.

https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/sampleINT8API
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/sampleINT8

Thank you.
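
For intuition about what the calibration step in those samples is for: INT8 inference needs a dynamic range per tensor, and calibration estimates that range by running representative inputs through the network. The sketch below is a deliberately simplified, pure-Python illustration using max-abs calibration. It is not TensorRT's entropy calibrator, and the names `calibrate_scale`, `quantize`, and `dequantize` are hypothetical helpers for this example only.

```python
# Simplified max-abs calibration for symmetric INT8 quantization.
# Assumption: this stands in for the idea of calibration only; TensorRT's
# default calibrators choose the range differently (entropy based).

def calibrate_scale(batches):
    """Return a symmetric INT8 scale from the largest |value| seen."""
    amax = 0.0
    for batch in batches:
        for x in batch:
            amax = max(amax, abs(x))
    if amax == 0.0:
        return 1.0  # all-zero data: any scale works, avoid dividing by zero
    return amax / 127.0


def quantize(x, scale):
    """Map a float to an integer in [-127, 127], clamping out-of-range values."""
    q = round(x / scale)
    return max(-127, min(127, q))


def dequantize(q, scale):
    """Map an INT8 value back to an approximate float."""
    return q * scale


# Stand-in "activations" collected from two calibration batches.
calibration_batches = [[0.1, -0.5, 2.0], [1.27, -0.3, 0.9]]
scale = calibrate_scale(calibration_batches)
```

Values outside the calibrated range are clamped, which is why the calibration data should resemble the real inputs of the custom model.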

Thank you for the quick reply.
I have checked the compatibility, and INT8 is supported on our GPU. I have also run the MNIST samples available in the following GitHub repository: https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/sampleINT8

But I do not understand how to implement the same for custom models.

I have referred to these links but I am still having difficulty in converting my custom model. If you have any good tutorials, it would be very helpful.
I am new to the field, so sorry if these queries are basic or obvious.

Check out my Demo #6: Using INT8 and DLA core of tensorrt_demos. I think you’d be able to reuse most of my calibrator.py code. And the code for building the INT8 TensorRT engine is here.
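
For readers adapting a calibrator to a custom model, here is a minimal sketch of the contract a calibrator has to satisfy, written in plain Python so it runs without TensorRT. In the TensorRT Python API, a calibrator such as a subclass of `trt.IInt8EntropyCalibrator2` implements `get_batch_size`, `get_batch`, `read_calibration_cache`, and `write_calibration_cache`. The classes `CalibrationBatchStream` and `CalibrationCache` below are hypothetical stand-ins for that logic and are not taken from calibrator.py. A real `get_batch` would also copy each batch to GPU memory and return device pointers, which is omitted here.

```python
import os
import tempfile


class CalibrationBatchStream:
    """Hypothetical helper: serves fixed-size batches, then signals exhaustion."""

    def __init__(self, samples, batch_size):
        self.samples = samples
        self.batch_size = batch_size
        self.index = 0

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self):
        end = self.index + self.batch_size
        if end > len(self.samples):
            # An incomplete final batch is dropped; None means "no more data",
            # mirroring how a TensorRT calibrator ends calibration.
            return None
        batch = self.samples[self.index:end]
        self.index = end
        return batch


class CalibrationCache:
    """Hypothetical helper: stores calibration results so reruns can skip calibration."""

    def __init__(self, path):
        self.path = path

    def read(self):
        if os.path.exists(self.path):
            with open(self.path, "rb") as f:
                return f.read()
        return None  # no cache yet, so calibration has to run

    def write(self, data):
        with open(self.path, "wb") as f:
            f.write(data)


# Usage: ten stand-in samples, batch size four, so two full batches are served.
stream = CalibrationBatchStream(samples=list(range(10)), batch_size=4)
served = []
while True:
    batch = stream.get_batch()
    if batch is None:
        break
    served.append(batch)

cache = CalibrationCache(os.path.join(tempfile.mkdtemp(), "calib.cache"))
first_read = cache.read()          # nothing cached yet
cache.write(b"example-bytes")      # placeholder for the real cache contents
second_read = cache.read()
```

For a custom model, the main work is usually the preprocessing inside the batch stream: calibration images should go through the same resizing and normalization as at inference time.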

Thank you so much.