Hi all,
How can I avoid implicit quantization in TensorRT to run a pre-quantized model? Is there an API for this in TensorRT 5.1.6?
Hi,
Do you mean quantization between different precisions?
If so, you can keep the nodes in FP32 to avoid quantization entirely.
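A minimal sketch (C++ builder API; the helper name is ours, and `builder`/`network` are assumed to already hold the parsed model) of how a layer can be pinned to FP32 so it is left unquantized:

```cpp
#include <string>
#include "NvInfer.h"

// Keep one named layer in FP32 while the rest of the engine may still be
// built in a lower precision.
void keepLayerInFP32(nvinfer1::IBuilder* builder,
                     nvinfer1::INetworkDefinition* network,
                     const char* layerName)
{
    builder->setStrictTypeConstraints(true);  // honor per-layer precision requests
    for (int i = 0; i < network->getNbLayers(); ++i)
    {
        nvinfer1::ILayer* layer = network->getLayer(i);
        if (std::string(layer->getName()) == layerName)
        {
            layer->setPrecision(nvinfer1::DataType::kFLOAT);      // compute in FP32
            layer->setOutputType(0, nvinfer1::DataType::kFLOAT);  // keep the output in FP32
        }
    }
}
```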
Thanks.
Hi,
My model is an INT8 model and I want to run it in INT8 mode. Is that possible?
Thanks
Hi,
You will still need a quantization function to convert the input data from FP32 into INT8.
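For illustration, a minimal sketch of the symmetric per-tensor quantization step this refers to (the calibrated absolute maximum `amax` is an assumed input):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Map one FP32 value onto the INT8 grid using a per-tensor scale.
int8_t quantizeToInt8(float x, float amax)
{
    float scale = amax / 127.0f;                 // per-tensor scale
    float q = std::round(x / scale);             // nearest integer step
    q = std::max(-127.0f, std::min(127.0f, q));  // clamp to the symmetric INT8 range
    return static_cast<int8_t>(q);
}
```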
Thanks.
Hi,
I have an already quantized (INT8) Caffe model. Can the Caffe parser read the model in 8-bit, and can I then run it in INT8 mode with the TensorRT framework?
Thanks
Hi,
It may not be supported; it’s recommended to give it a try first.
The main reason is that the quantization function may differ between TensorRT and Caffe.
Although you already have an INT8 model, you still need to convert the input data into INT8 precision.
So it’s important that the quantization of your model and of your input data is identical, or can be aligned through a calibration cache.
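If you already have a compatible calibration table, one way to reuse it is a calibrator that only serves the cache, so TensorRT applies those scales instead of recalibrating. A minimal sketch (the file name `calibration.cache` is an assumption):

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <vector>
#include "NvInfer.h"

// Calibrator that supplies no batches and relies entirely on an existing cache.
class CacheOnlyCalibrator : public nvinfer1::IInt8EntropyCalibrator2
{
public:
    int getBatchSize() const override { return 1; }

    // No calibration data: returning false tells TensorRT to use the cache instead.
    bool getBatch(void* bindings[], const char* names[], int nbBindings) override
    {
        return false;
    }

    const void* readCalibrationCache(std::size_t& length) override
    {
        std::ifstream file("calibration.cache", std::ios::binary);
        mCache.assign(std::istreambuf_iterator<char>(file),
                      std::istreambuf_iterator<char>());
        length = mCache.size();
        return mCache.empty() ? nullptr : mCache.data();
    }

    void writeCalibrationCache(const void* cache, std::size_t length) override
    {
        std::ofstream file("calibration.cache", std::ios::binary);
        file.write(static_cast<const char*>(cache), length);
    }

private:
    std::vector<char> mCache;
};
```

The calibrator would then be passed to the builder with setInt8Mode(true) and setInt8Calibrator(&calibrator).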
Thanks.
Hi Aastalll,
Sorry for the late reply.
I got very bad results.
Accuracy of the floating-point model: 72%.
Our own quantization: 69%.
On Jetson AGX Xavier, running in INT8 mode with Entropy-2 calibration, not even 30%.
Thanks.
Hi,
Let me put it this way: I have the Q formats for each layer’s input, output, and weights. Is it possible to specify these Q formats ourselves so that TensorRT uses them when quantizing the model?
Thanks,
Kalyan.
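A rough sketch of this idea using TensorRT’s per-tensor dynamic-range API (the dynamic ranges mentioned below), assuming `ranges` maps each tensor name to the absolute maximum implied by its Q format:

```cpp
#include <map>
#include <string>
#include "NvInfer.h"

// Supply explicit per-tensor ranges instead of running calibration.
void setCustomRanges(nvinfer1::IBuilder* builder,
                     nvinfer1::INetworkDefinition* network,
                     const std::map<std::string, float>& ranges)
{
    builder->setInt8Mode(true);
    builder->setInt8Calibrator(nullptr);  // no calibrator: use the explicit ranges

    // Network inputs need a range as well.
    for (int i = 0; i < network->getNbInputs(); ++i)
    {
        nvinfer1::ITensor* t = network->getInput(i);
        auto it = ranges.find(t->getName());
        if (it != ranges.end())
            t->setDynamicRange(-it->second, it->second);
    }

    // Ranges for every layer output.
    for (int i = 0; i < network->getNbLayers(); ++i)
    {
        nvinfer1::ILayer* layer = network->getLayer(i);
        for (int j = 0; j < layer->getNbOutputs(); ++j)
        {
            nvinfer1::ITensor* t = layer->getOutput(j);
            auto it = ranges.find(t->getName());
            if (it != ranges.end())
                t->setDynamicRange(-it->second, it->second);
        }
    }
}
```

With a range set on every tensor, the builder can generate the INT8 engine without a calibration step.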
Hello,
Please reply to the questions.
How can I read the quantized weights?
I don’t see any change in the outputs when I change the calibration set, supply dynamic ranges, or change the values in the calibration file; I get the same output values every time.
Thanks,
kalyan.
Hi,
Are you looking for an API to set up the model weights on your own?
If yes, please check this page:
https://github.com/NVIDIA/TensorRT/blob/release/6.0/samples/opensource/sampleMNISTAPI/sampleMNISTAPI.cpp
It’s recommended to first check whether both the input and the weights of your model are in INT8 format.
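Along the lines of sampleMNISTAPI, a rough sketch of adding a layer whose weights you supply yourself (the function name, output-map count, and weight buffers are placeholders; the builder normally consumes FP32 weight buffers and converts them itself when building an INT8 engine):

```cpp
#include <cstdint>
#include "NvInfer.h"

// Add a convolution whose kernel and bias values come from your own buffers,
// as sampleMNISTAPI does for every layer of its network.
nvinfer1::ILayer* addConvWithOwnWeights(nvinfer1::INetworkDefinition* network,
                                        nvinfer1::ITensor* input,
                                        const float* kernelData, int64_t kernelCount,
                                        const float* biasData, int64_t biasCount)
{
    nvinfer1::Weights kernel{nvinfer1::DataType::kFLOAT, kernelData, kernelCount};
    nvinfer1::Weights bias{nvinfer1::DataType::kFLOAT, biasData, biasCount};

    // 20 output maps and a 5x5 kernel are example values only.
    return network->addConvolution(*input, 20, nvinfer1::DimsHW{5, 5}, kernel, bias);
}
```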
Thanks.