TensorRT INT8 conversion from an ONNX model

krupa.gopal · December 16, 2021, 10:19am

Description

I’m encountering a segmentation fault when trying to convert an ONNX model to INT8 using trtexec.
I tried the sample MNIST example of converting a Caffe model to INT8 (first generating the calibration.cache file, then using trtexec to save a .trt file), and that conversion succeeded. When the same procedure is applied to any ONNX model (off the shelf or trained by us), trtexec ends in a segmentation fault.

Environment

TensorRT Version:
GPU Type: Quadro RTX 4000
Nvidia Driver Version: 460.80
CUDA Version: 11.2
CUDNN Version: 8.1
Operating System + Version: Ubuntu 20.04.1 LTS
Python Version (if applicable): 3.8.5
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt:21.02-py3

Relevant Files

All the files required to reproduce the issue (with an off-the-shelf ResNet model) are available at the following link.

Steps To Reproduce

The README.txt describes the steps taken to reproduce the issue.
The same steps are repeated here:

Get the calibration cache file:
python sample.py
Use the calibration cache file to save the TRT model in INT8 format:
trtexec --onnx=ResNet50.onnx --explicitBatch --optShapes=000_net:4x3x224x224 --maxShapes=000_net:4x3x224x224 --minShapes=000_net:1x3x224x224 --shapes=000_net:4x3x224x224 --calib=calibration.cache --int8 --saveEngine=ResNet50_int8_batch4.trt
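
For repeating the second step with other batch sizes or models, the same command line can be assembled in Python. This is only a sketch: `build_trtexec_int8_command` is a hypothetical helper, it uses only the flags that appear in the command above, and it does not run trtexec itself.

```python
import shlex


def build_trtexec_int8_command(onnx_path, input_name, min_shape, opt_shape,
                               max_shape, calib_cache, engine_path):
    """Return the trtexec argument list for an INT8 build (hypothetical helper)."""

    def spec(shape):
        # trtexec expects shapes as <input_name>:<d0>x<d1>x...
        return f"{input_name}:{'x'.join(str(d) for d in shape)}"

    return [
        "trtexec",
        f"--onnx={onnx_path}",
        "--explicitBatch",
        f"--optShapes={spec(opt_shape)}",
        f"--maxShapes={spec(max_shape)}",
        f"--minShapes={spec(min_shape)}",
        # As in the command above, the run shape equals the optimal shape.
        f"--shapes={spec(opt_shape)}",
        f"--calib={calib_cache}",
        "--int8",
        f"--saveEngine={engine_path}",
    ]


cmd = build_trtexec_int8_command(
    onnx_path="ResNet50.onnx",
    input_name="000_net",
    min_shape=(1, 3, 224, 224),
    opt_shape=(4, 3, 224, 224),
    max_shape=(4, 3, 224, 224),
    calib_cache="calibration.cache",
    engine_path="ResNet50_int8_batch4.trt",
)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` inside the container. Keeping the arguments as a list avoids shell quoting problems if a path contains spaces.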

NVES · December 16, 2021, 10:38am

Hi, please refer to the links below to perform inference in INT8.

Thanks!

krupa.gopal · December 16, 2021, 10:52am

We have seen this documentation. We are doing post-training quantization using the libraries provided by NVIDIA.

We are using the Python wrapper to generate the calibration cache file, following the Python sample code, and converting to .trt using the trtexec application.

We want to understand why the sample works only for the Caffe model and gives a segmentation fault when tried with an .onnx model, and how to proceed from here.

Regards,
Krupa

spolisetty · January 6, 2022, 12:47pm

Hi,

Are you facing this issue on the latest TRT container?

Egorundel · July 29, 2024, 12:36pm

@krupa.gopal Hello, have you managed to solve your problem?

I also wanted to ask why your sample.py code raises this error:

sample.py:69: DeprecationWarning: Use network created with NetworkDefinitionCreationFlag::EXPLICIT_BATCH flag instead.
  builder.max_batch_size = batch_size
Traceback (most recent call last):
  File "sample.py", line 91, in <module>
    main()
  File "sample.py", line 88, in main
    context = build_int8_engine(model_file, calib, batch_size)
  File "sample.py", line 70, in build_int8_engine
    builder.max_workspace_size = common.GiB(1)
AttributeError: 'tensorrt_bindings.tensorrt.Builder' object has no attribute 'max_workspace_size'

What could be the problem?
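
One plausible reading of the traceback, not confirmed anywhere in this thread: the installed TensorRT Python package is newer than the one sample.py was written for, and the workspace size is no longer an attribute of the builder. As far as I know, later releases set it on the builder config instead (`config.max_workspace_size`, and from TensorRT 8.4 onward `config.set_memory_pool_limit`). The sketch below tries those options in order; `set_workspace_limit` is a hypothetical helper, not part of TensorRT or of sample.py.

```python
def set_workspace_limit(builder, config, num_bytes, workspace_pool=None):
    """Set the workspace limit with whichever API the objects expose.

    Hypothetical helper. Returns the name of the API that was used.
    """
    # Newer API (assumed TensorRT 8.4+): memory pool limit on the config.
    if workspace_pool is not None and hasattr(config, "set_memory_pool_limit"):
        config.set_memory_pool_limit(workspace_pool, num_bytes)
        return "config.set_memory_pool_limit"
    # Intermediate API: attribute on the builder config.
    if hasattr(config, "max_workspace_size"):
        config.max_workspace_size = num_bytes
        return "config.max_workspace_size"
    # Old API, the one sample.py line 70 relies on.
    if hasattr(builder, "max_workspace_size"):
        builder.max_workspace_size = num_bytes
        return "builder.max_workspace_size"
    raise AttributeError("no known workspace-size API found on builder/config")


# Usage with a real installation (assumes the tensorrt package is importable):
#   import tensorrt as trt
#   builder = trt.Builder(trt.Logger(trt.Logger.WARNING))
#   config = builder.create_builder_config()
#   set_workspace_limit(builder, config, 1 << 30, trt.MemoryPoolType.WORKSPACE)
```

Here `1 << 30` is one GiB, which is what `common.GiB(1)` in the traceback appears to request. The DeprecationWarning on line 69 points the same way: `builder.max_batch_size` belongs to the older implicit-batch workflow, and the warning asks for a network created with the EXPLICIT_BATCH flag.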
