TensorRT INT8 calibration in the C++ API

Description

I have my own ONNX network and want to run it in INT8 quantized mode in a TensorRT 7 environment (C++).
I have already built and run this ONNX model with “config->setFlag(nvinfer1::BuilderFlag::kFP16)”, and that worked.

I googled and found NVIDIA's TensorRT MNIST INT8 sample here.
However, it uses the MNISTBatchStream class, which is specific to MNIST data rather than a general-purpose one.
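A batch stream does not have to be tied to MNIST. As a minimal sketch, assuming a user-supplied loader callback that decodes one image file into a fixed-size float buffer (the decoding itself, for example with an image library, is not shown), a file-based stream could look roughly like this. The class name ImageBatchStream and all of its members are hypothetical and are not part of TensorRT or its samples.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: walks a list of image paths and assembles
// fixed-size float batches using a caller-provided loader.
class ImageBatchStream
{
public:
    // The loader decodes and pre-processes one file into imageVolume floats.
    using Loader = std::function<std::vector<float>(const std::string&)>;

    ImageBatchStream(std::vector<std::string> paths, std::size_t batchSize,
                     std::size_t imageVolume, Loader loader)
        : mPaths(std::move(paths)), mBatchSize(batchSize),
          mImageVolume(imageVolume), mLoader(std::move(loader))
    {
        assert(mBatchSize > 0);
    }

    // Loads the next full batch. Returns false when fewer than batchSize
    // files remain (a trailing partial batch is dropped) or when the
    // loader returns a buffer of unexpected size.
    bool next()
    {
        if (mCursor + mBatchSize > mPaths.size())
        {
            return false;
        }
        mBatch.clear();
        for (std::size_t i = 0; i < mBatchSize; ++i)
        {
            std::vector<float> image = mLoader(mPaths[mCursor + i]);
            if (image.size() != mImageVolume)
            {
                return false;
            }
            mBatch.insert(mBatch.end(), image.begin(), image.end());
        }
        mCursor += mBatchSize;
        return true;
    }

    // Host-side data of the most recently loaded batch.
    const std::vector<float>& batch() const { return mBatch; }

private:
    std::vector<std::string> mPaths;
    std::size_t mBatchSize;
    std::size_t mImageVolume;
    Loader mLoader;
    std::size_t mCursor = 0;
    std::vector<float> mBatch;
};
```

A calibrator's getBatch implementation would then call next() and copy batch() into device memory before handing the pointer to TensorRT.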

So my questions are:

  1. Can you give me a more general example that uses ordinary image files such as JPEGs?
  2. How many calibration images (JPEG, PNG, etc.) are needed to calibrate my ONNX model?
    (Does it need the whole training set, or just a few images from it?)
  3. Do I need to pre-process the calibration inputs?
    (I applied ((input-mean)/std) normalization during model training.)
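Regarding question 3, calibration inputs would normally need to go through the same pre-processing the network sees at inference time, since the calibrator measures activation ranges on whatever it is fed. The following is a minimal sketch of the ((input-mean)/std) step, assuming an interleaved 8-bit HWC image, a planar CHW float input for the network, and mean/std values expressed in raw pixel units. The function name normalizeToCHW is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts one interleaved 8-bit image (HWC) into planar float (CHW)
// and applies (x - mean[c]) / stddev[c] per channel.
// Assumption: mean and stddev are given in the same 0..255 units as the pixels.
std::vector<float> normalizeToCHW(const std::vector<std::uint8_t>& hwc,
                                  std::size_t height, std::size_t width,
                                  std::size_t channels,
                                  const std::vector<float>& mean,
                                  const std::vector<float>& stddev)
{
    assert(hwc.size() == height * width * channels);
    assert(mean.size() == channels && stddev.size() == channels);

    std::vector<float> chw(hwc.size());
    for (std::size_t c = 0; c < channels; ++c)
    {
        for (std::size_t y = 0; y < height; ++y)
        {
            for (std::size_t x = 0; x < width; ++x)
            {
                const float pixel = static_cast<float>(hwc[(y * width + x) * channels + c]);
                chw[(c * height + y) * width + x] = (pixel - mean[c]) / stddev[c];
            }
        }
    }
    return chw;
}
```

If training scaled pixels to 0..1 before normalizing, the same scaling would have to be added here so that calibration and training see identical value ranges.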

Please help me.
Thanks.

Environment

TensorRT Version: TensorRT7
GPU Type: GTX1650
Nvidia Driver Version: 460.x
CUDA Version: 11.1
CUDNN Version: 8.4x
Operating System + Version: ubuntu18.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): tensorrt20:10-py3


Hi, please refer to the links below to perform inference in INT8.

Thanks!

Dear NVES,
What I actually want is to use the INT8 data type at the network's input/output interface.
I have already quantized my ONNX model to INT8 and run inference with it.

For example, in the sampleINT8 example, I want to change the following line in the SampleINT8::processInput function
from

std::memcpy(hostDataBuffer, data, mParams.batchSize * samplesCommon::volume(mInputDims) * sizeof(float));

to

std::memcpy(hostDataBuffer, data, mParams.batchSize * samplesCommon::volume(mInputDims) * sizeof(signed char));

Is this possible?
If so, how can I do it?

I have already tried the sampleUffPluginV2Ext example, but it writes its INT8 results into FP32 buffers, so the code still copies the data as FP32 (4 bytes per output value).
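Changing only sizeof(float) to sizeof(signed char) in the memcpy would not be enough on its own: the host data would also have to be quantized first, and the engine's input binding would have to be built as INT8 (as far as I know, via ITensor::setType with nvinfer1::DataType::kINT8 plus a dynamic range for that tensor). The host-side conversion can be sketched as follows, assuming symmetric quantization with a single per-tensor scale, rounding to nearest, and clamping to [-128, 127]. The function name quantizeToInt8 is hypothetical, and the scale is assumed to come from the input's dynamic range (for example maxAbs / 127).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Symmetric per-tensor quantization of float data to signed 8-bit:
// q = clamp(round(x / scale), -128, 127).
std::vector<std::int8_t> quantizeToInt8(const std::vector<float>& values, float scale)
{
    assert(scale > 0.0f);

    std::vector<std::int8_t> quantized;
    quantized.reserve(values.size());
    for (float v : values)
    {
        float q = std::round(v / scale);
        q = std::min(127.0f, std::max(-128.0f, q)); // saturate to the int8 range
        quantized.push_back(static_cast<std::int8_t>(q));
    }
    return quantized;
}
```

The resulting buffer occupies one byte per element, so it is the data that a memcpy sized with sizeof(signed char) would copy into an INT8 input buffer.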