At our company, we are having problems using Torch-TensorRT with the official NVIDIA Docker image, version 22.05.
The problems occur with int8 quantization and have already been reported in the pytorch/TensorRT issue tracker, unfortunately without a response so far [https://github.com/pytorch/TensorRT/issues/1135].
We have also found the bug reported here [https://github.com/pytorch/TensorRT/issues/927], which is likewise related to int8 quantization in Torch-TensorRT.
TensorRT Version: (Torch-TensorRT) 1.2.0a0+666a2637
GPU Type: GeForce RTX 2080 Ti
Nvidia Driver Version: GeForce RTX 2080 Ti
CUDA Version: 11.0
CUDNN Version: 8.4
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable): –
PyTorch Version (if applicable): 1.11.00
Baremetal or Container (if container which image + tag): NVIDIA NGC container nvcr.io/nvidia/pytorch:22.05-py3
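In case it helps anyone trying to compare environments: the version fields above can be collected from inside the container with a small script like the one below. This is only a sketch; the helper name `collect_versions` is made up for illustration, and it assumes the packages expose a `__version__` attribute (packages that cannot be imported are reported as unavailable rather than raising).

```python
import importlib
import platform


def collect_versions(modules=("torch", "torch_tensorrt")):
    """Return a dict of version strings for a bug report.

    Hypothetical helper: modules that fail to import are reported
    as 'unavailable (<ErrorType>)' instead of raising.
    """
    info = {"python": platform.python_version()}
    for name in modules:
        try:
            mod = importlib.import_module(name)
            # Assumption: the package exposes __version__.
            info[name] = getattr(mod, "__version__", "unknown")
        except Exception as exc:  # ImportError, or a loader error from native libs
            info[name] = f"unavailable ({type(exc).__name__})"
    return info


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

Running it inside the 22.05 container and pasting the output makes it easier to confirm that two reports refer to the same build.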
The first bug occurs when running this notebook [https://github.com/pytorch/TensorRT/blob/master/notebooks/vgg-qat.ipynb] in the NVIDIA PyTorch container 22.05.
The stack trace is available in the corresponding GitHub issue [https://github.com/pytorch/TensorRT/issues/1135].
The second bug is described on its GitHub issue page [https://github.com/pytorch/TensorRT/issues/927]: a segmentation fault occurs when using int8 quantization with TensorRT.
The first bug is easy to reproduce: run the notebook inside the 22.05 container.