How do I generate an INT8 calibration file with Caffe?

Supine · August 12, 2020, 10:54am

Description

I want to quantize a Caffe model with TensorRT so that it can run on NVDLA.
But I can’t find any tutorials on this.
How do I generate an INT8 calibration file with the C++ or Python API?

NVDLA Low Precision:

https://github.com/nvdla/sw/blob/1ae4738f4558cd79bd872deddce5f4e618f38217/LowPrecision.md

# Low precision support in NVDLA

Using low precision such as 8-bit, 4-bit, or even fewer bits for inference is one of the optimization methods used in deep learning. It compresses the model, which reduces the memory footprint and improves performance at the cost of a small degradation in accuracy. Using INT8 precision for inference requires quantizing pre-trained models from floating point to INT8 and programming the converters in NVDLA to scale and re-scale tensors.

### NVDLA architecture for INT8 precision support includes the following:
-	INT8 input/output data read/write
-	32-bit internal pipeline, avoids saturation in mathematical computations
-	Per-tensor input scaling using input converters
-	Per-tensor and per-kernel output re-scaling using output converters
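
As a rough illustration of how these pieces relate, the sketch below quantizes an input with a per-tensor scale, accumulates INT8 products in a wide integer, and re-scales the result to the output tensor's scale. The function names are hypothetical, and the floating-point arithmetic here is only a model of the idea; it is not how the NVDLA converters are actually programmed.

```python
def quantize(values, scale):
    """Map floats to INT8 with a per-tensor scale, so that real ~= q * scale."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def int8_dot(x_q, w_q):
    """Accumulate INT8 products in a wide integer (stand-in for the 32-bit pipeline)."""
    return sum(a * b for a, b in zip(x_q, w_q))


def rescale(acc, in_scale, w_scale, out_scale):
    """Bring the accumulator back to INT8 at the output tensor's scale."""
    real = acc * in_scale * w_scale          # accumulator expressed as a real value
    return max(-127, min(127, round(real / out_scale)))


# Hypothetical scales and values, chosen only to make the arithmetic easy to follow.
x_q = quantize([1.0, -2.0], scale=0.5)       # input converter
w_q = [3, 1]                                 # weights already quantized with w_scale
acc = int8_dot(x_q, w_q)                     # may far exceed the INT8 range
y_q = rescale(acc, in_scale=0.5, w_scale=0.25, out_scale=0.125)
```

Two INT8 values multiplied together can already exceed the INT8 range, which is why the accumulation happens at a wider width before the output converter narrows it again.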

### Steps to generate INT8 quantized model:
-	Analyze the dynamic range of per-layer tensors and calculate scale factors using TensorRT
-	Import scale factors generated using TensorRT to NVDLA JSON format
-	Quantize model weights and determine the converter parameters using scale factors
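
The import step can be sketched as follows. This assumes the TensorRT calibration cache is a text file with a `TRT-...` header line followed by `name: hex` lines, where the hex string is the big-endian IEEE-754 encoding of a 32-bit float scale. The JSON layout (`scale`, `min`, `max`, `offset` per layer) and both function names are assumptions made for illustration; check the nvdla/sw repository for the authoritative format.

```python
import json
import struct


def parse_calibration_cache(text):
    """Return {tensor_name: scale} from the text of a calibration cache."""
    scales = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("TRT-"):
            continue                          # skip blank lines and the header
        name, hex_value = line.rsplit(":", 1)
        raw = bytes.fromhex(hex_value.strip())
        scales[name.strip()] = struct.unpack("!f", raw)[0]
    return scales


def to_nvdla_json(scales):
    """Serialize scales in an assumed NVDLA-style per-layer JSON layout."""
    entries = {
        name: {"scale": scale, "min": 0, "max": 0, "offset": 0}
        for name, scale in scales.items()
    }
    return json.dumps(entries, indent=4)


# Hypothetical cache contents; the tensor names are made up.
example_cache = "TRT-5105-EntropyCalibration2\ndata: 3c000000\nconv1: 3f800000\n"
example_scales = parse_calibration_cache(example_cache)
example_json = to_nvdla_json(example_scales)
```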

#### Analyze dynamic range of per-layer tensors and calculate scale factors using TensorRT
A calibration tool collects the dynamic range of the output tensor for each layer over a dataset of images. This dynamic range information can be used to calculate per-tensor scale factors. For NVDLA, the TensorRT calibration interface is used to generate the scale factors.
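
To make the relationship between dynamic range and scale factor concrete, here is a minimal sketch that uses plain max calibration: it takes the largest absolute activation seen over the calibration batches and divides it by 127. This is a simplification with hypothetical function names. TensorRT's entropy calibrator chooses its threshold by minimizing KL divergence rather than taking the raw maximum, so the scales it reports will generally differ.

```python
def dynamic_range(batches):
    """Largest absolute activation observed across all calibration batches."""
    return max(abs(v) for batch in batches for v in batch)


def scale_from_range(max_abs, qmax=127):
    """Per-tensor scale such that max_abs maps to the largest INT8 magnitude."""
    return max_abs / qmax


# Hypothetical activations of one layer's output over two calibration batches.
activations = [[0.5, -2.54, 1.2], [1.0, 0.3]]
layer_range = dynamic_range(activations)
layer_scale = scale_from_range(layer_range)
```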

Refer to https://github.com/NVIDIA/TensorRT/tree/release/5.1/samples/opensource/sampleINT8 for a sample application that explains how to use TensorRT to generate scale factors.


AakankshaS · August 12, 2020, 4:47pm

Hi @Supine
Please refer to the link below:
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/sampleINT8

Thanks!
