TensorRT INT8 conversion from an ONNX model


I’m encountering a segmentation fault when trying to convert an ONNX model to INT8 using trtexec.
I have tried the sample MNIST example, which converts a Caffe model to INT8 (first generating the calibration.cache file, then using trtexec to save a .trt file), and that conversion succeeded. When the same procedure is applied to any ONNX model (off the shelf or trained by us), trtexec ends in a segmentation fault.


TensorRT Version:
GPU Type: Quadro RTX 4000
Nvidia Driver Version: 460.80
CUDA Version: 11.2
CUDNN Version: 8.1
Operating System + Version: Ubuntu 20.04.1 LTS
Python Version (if applicable): 3.8.5
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt:21.02-py3

Relevant Files

All the files required to reproduce the issue (with off the shelf Resnet model) are placed in the following link.

Steps To Reproduce

The README.txt describes the steps taken to reproduce the issue.
The same steps are repeated here:

  1. Get calibration cache file
    python sample.py
  2. Use calibration cache file to save trt model in int8 format
    trtexec --onnx=ResNet50.onnx --explicitBatch --optShapes=000_net:4x3x224x224 --maxShapes=000_net:4x3x224x224 --minShapes=000_net:1x3x224x224 --shapes=000_net:4x3x224x224 --calib=calibration.cache --int8 --saveEngine=ResNet50_int8_batch4.trt
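
For anyone reproducing this, the step 2 command can also be assembled from a small script, so that the input name and the min/opt/max shapes stay consistent. This is only a sketch: `build_trtexec_int8_cmd` is a hypothetical helper, not part of TensorRT, and it uses only the flags already shown in the command above.

```python
import shlex


def build_trtexec_int8_cmd(onnx_path, input_name, min_shape, opt_shape,
                           max_shape, calib_cache, engine_path):
    """Hypothetical helper: return the trtexec argument list for an INT8 build."""
    if not (len(min_shape) == len(opt_shape) == len(max_shape)):
        raise ValueError("min/opt/max shapes must have the same rank")
    for lo, opt, hi in zip(min_shape, opt_shape, max_shape):
        if not lo <= opt <= hi:
            raise ValueError("each dimension must satisfy min <= opt <= max")

    def spec(shape):
        # Format a shape as <input_name>:<d0>x<d1>x..., as in the command above.
        return input_name + ":" + "x".join(str(d) for d in shape)

    return [
        "trtexec",
        f"--onnx={onnx_path}",
        "--explicitBatch",
        f"--optShapes={spec(opt_shape)}",
        f"--maxShapes={spec(max_shape)}",
        f"--minShapes={spec(min_shape)}",
        f"--shapes={spec(opt_shape)}",  # run with the opt shape, as in step 2
        f"--calib={calib_cache}",
        "--int8",
        f"--saveEngine={engine_path}",
    ]


cmd = build_trtexec_int8_cmd(
    onnx_path="ResNet50.onnx",
    input_name="000_net",
    min_shape=(1, 3, 224, 224),
    opt_shape=(4, 3, 224, 224),
    max_shape=(4, 3, 224, 224),
    calib_cache="calibration.cache",
    engine_path="ResNet50_int8_batch4.trt",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` inside the container; the shape check only catches inconsistent min/opt/max values before trtexec is launched.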

Hi, please refer to the links below to perform inference in INT8.


We have seen this documentation. We are doing post-training quantization using the libraries provided by NVIDIA.

We are using the Python wrapper to generate the calibration cache file, following the Python sample code, and then converting to .trt with the trtexec application.
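
One check that may help when debugging this is to confirm that calibration.cache actually contains an entry for the ONNX input tensor (`000_net` in the trtexec command above), since a cache produced for a different network would not match. The sketch below is not a confirmed diagnosis of the segmentation fault. It assumes the usual text layout of a cache file (a header line, then `tensor_name: hex` lines where the hex digits are the big-endian float32 bits of the scale); `parse_calibration_cache` and `missing_tensors` are hypothetical helpers, and the sample text is made up for illustration.

```python
import struct


def parse_calibration_cache(text):
    """Return (header, {tensor_name: scale}) from calibration cache text.

    Assumed layout: first non-empty line is a header, every following line
    is '<tensor_name>: <8 hex digits>' (big-endian float32 scale).
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty calibration cache")
    header, scales = lines[0], {}
    for ln in lines[1:]:
        # Split on the last colon, because tensor names may contain ':'.
        name, sep, hex_bits = ln.rpartition(":")
        if not sep:
            raise ValueError(f"unexpected line in cache: {ln!r}")
        scales[name.strip()] = struct.unpack("!f", bytes.fromhex(hex_bits.strip()))[0]
    return header, scales


def missing_tensors(scales, expected_names):
    """List the expected tensor names that have no entry in the cache."""
    return [n for n in expected_names if n not in scales]


# Illustrative cache text only; real files are written by the calibrator.
sample_text = """TRT-0000-EntropyCalibration2
000_net: 3c000000
some_layer:0: 3f800000
"""

header, scales = parse_calibration_cache(sample_text)
print(header, scales)
print(missing_tensors(scales, ["000_net", "not_in_cache"]))
```

For a real run, the text would come from `open("calibration.cache").read()`, and the expected names would be the input tensor names of the ONNX model.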

We want to understand why the sample works only for the Caffe model, while the same steps with an .onnx model give a segmentation fault, and how to proceed on this front.



Are you facing this issue on the latest TRT container?