INT8 quantization with Torch-TensorRT fails

ivan.prosperi · June 28, 2022, 8:25am

Description

Hi,

At our company we are having problems using Torch-TensorRT with the official NVIDIA Docker image, version 22.05.

The problems occur with INT8 quantization and have already been reported in the pytorch/TensorRT issue tracker, so far without a response [🐛 [Bug] RuntimeError 'bad optional access' on quantization notebook · Issue #1135 · pytorch/TensorRT · GitHub].

We have also hit the bug reported here [🐛 [Bug] Segmentation Fault When Trying to Quantize ResNet50 model · Issue #927 · pytorch/TensorRT · GitHub], which likewise involves INT8 quantization in Torch-TensorRT.

Environment

TensorRT Version: (Torch-TensorRT) 1.2.0a0+666a2637
GPU Type: GeForce RTX 2080 Ti
Nvidia Driver Version: GeForce RTX 2080 Ti
CUDA Version: 11.0
CUDNN Version: 8.4
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable): –
PyTorch Version (if applicable): 1.11.00
Baremetal or Container (if container which image + tag): NVIDIA NGC Container nvcr.io/nvidia/pytorch:22.05-py3
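
For anyone trying to reproduce this, the version fields above can be collected from inside the container with something like the following. This is only a minimal sketch using the Python standard library; the helper name `collect_versions` is hypothetical, and the distribution names passed to it are assumptions that may not match what a given image actually ships.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # The distribution names below are assumptions; adjust them to the image in use.
    report = collect_versions(
        ["torch", "torch-tensorrt", "tensorrt", "pytorch-quantization"]
    )
    report["python"] = platform.python_version()
    for key, value in sorted(report.items()):
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue alongside the container tag should make it easier to compare environments.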

Relevant Files

The first bug is encountered when executing this notebook [https://github.com/pytorch/TensorRT/blob/master/notebooks/vgg-qat.ipynb] in the NVIDIA PyTorch container 22.05.

The stack trace can be found in the GitHub issue linked above (#1135).

The second bug description is found in its GitHub issue page: a segmentation fault occurs when using int8 quantization with TensorRT.

Steps To Reproduce

The first bug is easily reproducible by executing the notebook inside the container version 22.05.

NVES · June 28, 2022, 8:37am

Hi, please refer to the links below to perform inference in INT8.

Thanks!

ivan.prosperi · June 28, 2022, 8:40am

Hi,

My problem concerns the Torch-TensorRT library specifically. Any advice on that?

Ivan

spolisetty · June 29, 2022, 6:17am

Hi,

Regarding the reported bugs, please wait for an update on the GitHub issues.
For QAT with PyTorch, the following link may be helpful to you.

https://docs.nvidia.com/deeplearning/tensorrt/pytorch-quantization-toolkit/docs/index.html
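
As background on what QAT simulates during training, here is a rough pure-Python sketch of symmetric INT8 "fake quantization" (quantize, clamp, then dequantize). It is not the toolkit's implementation: the function name `fake_quantize` and the narrow range of [-127, 127] are assumptions made for illustration, and `amax` stands for the calibrated absolute maximum of the tensor.

```python
def fake_quantize(values, amax, num_bits=8):
    """Quantize to a symmetric integer grid and map back to floats.

    Assumes a narrow symmetric range, i.e. [-127, 127] for 8 bits.
    """
    if amax <= 0:
        raise ValueError("amax must be positive")
    bound = 2 ** (num_bits - 1) - 1  # 127 for INT8
    scale = bound / amax             # float units -> integer steps
    result = []
    for x in values:
        q = round(x * scale)              # nearest integer step
        q = max(-bound, min(bound, q))    # clamp values outside [-amax, amax]
        result.append(q / scale)          # back to float, now on the INT8 grid
    return result


if __name__ == "__main__":
    print(fake_quantize([0.0, 0.3, 1.0, 2.5, -4.0], amax=1.0))
```

Values inside [-amax, amax] pick up a rounding error of at most half a quantization step, while values outside that range saturate at ±amax.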

Thank you.

