Problem to quantize the INT8 model

lee9999 · January 19, 2022, 10:41am

Description

1.

I have a problem when using the C++ API to quantize a model to INT8.

The model has a ScatterND plugin plus ordinary CNN modules. It has two inputs, features and indices, and I have prepared a calibration dataset for both.

With the default log level I get the following output:

[01/19/2022-10:31:27] [I] [TRT] Calibration table does not match calibrator algorithm type.
[01/19/2022-10:31:28] [I] [TRT] Detected 2 inputs and 6 output network tensors.
[01/19/2022-10:31:29] [I] [TRT] Calibration completed in 1.77067 seconds.
[01/19/2022-10:31:29] [E] [TRT] Calibration failure occurred with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.
[01/19/2022-10:31:29] [E] [TRT] Builder failed while configuring INT8 mode.
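
A note on the log above: "no scaling factors detected" after a very short calibration run is consistent with the calibrator's getBatch returning false on its first call, or not binding every network input. With two inputs, getBatch has to fill one entry of bindings per entry of names, matched by name rather than by position. Below is a minimal sketch of that matching step only. fillBindings and deviceBuffers are hypothetical names that are not part of TensorRT or of the posted code; the parameter list mirrors IInt8Calibrator::getBatch, and in a real calibrator the pointers would be CUDA device buffers filled from feature.bin and indices.bin.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper with the same parameter shape as
// nvinfer1::IInt8Calibrator::getBatch(void* bindings[], const char* names[], int nbBindings).
// deviceBuffers maps each network input name to the buffer holding the current
// calibration batch for that input (device memory in a real calibrator).
bool fillBindings(const std::unordered_map<std::string, void*>& deviceBuffers,
                  void* bindings[], const char* names[], int nbBindings)
{
    assert(bindings != nullptr && names != nullptr);
    for (int i = 0; i < nbBindings; ++i)
    {
        auto it = deviceBuffers.find(names[i]);
        if (it == deviceBuffers.end() || it->second == nullptr)
        {
            // Unknown input name or missing data: report "no batch" instead of
            // leaving this binding uninitialized.
            return false;
        }
        bindings[i] = it->second;
    }
    // An empty binding list also counts as "no batch".
    return nbBindings > 0;
}
```

The first log line also suggests that an existing calibration cache was written by a different calibrator type. Deleting that cache file before rebuilding would rule it out as a factor.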

2. Model build code

bool build()
{
   ..........
    auto builder = SampleUniquePtr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(sample::gLogger.getTRTLogger()));
    if (!builder)
    {
        return false;
    }

    const auto explicitBatch = 1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    auto network = SampleUniquePtr<nvinfer1::INetworkDefinition>(builder->createNetworkV2(explicitBatch));
    if (!network)
    {
        return false;
    }

    auto config = SampleUniquePtr<nvinfer1::IBuilderConfig>(builder->createBuilderConfig());
    if (!config)
    {
        return false;
    }

    auto parser
            = SampleUniquePtr<nvonnxparser::IParser>(nvonnxparser::createParser(*network, sample::gLogger.getTRTLogger()));
    if (!parser)
    {
        return false;
    }
    auto constructed = constructNetwork(builder, network, config, parser);
    if (!constructed)
    {
        return false;
    }

    mEngine = std::shared_ptr<nvinfer1::ICudaEngine>(
            builder->buildEngineWithConfig(*network, *config), samplesCommon::InferDeleter());
    if (!mEngine)
    {
        return false;
    }
.....
}


bool SampleCenterPoint::constructNetwork(SampleUniquePtr<nvinfer1::IBuilder>& builder,
                                         SampleUniquePtr<nvinfer1::INetworkDefinition>& network, SampleUniquePtr<nvinfer1::IBuilderConfig>& config,
                                         SampleUniquePtr<nvonnxparser::IParser>& parser)
{
    auto parsed = parser->parseFromFile(locateFile(mParams.onnxFileName, mParams.dataDirs).c_str(),
                                        static_cast<int>(sample::gLogger.getReportableSeverity()));
    if (!parsed)
    {
        return false;
    }

    config->setMaxWorkspaceSize(1_GiB);
    if (mParams.fp16)
    {
        config->setFlag(BuilderFlag::kFP16);
    }
    else if (mParams.int8)
    {
        config->setFlag(BuilderFlag::kINT8);
    }
    builder->setMaxBatchSize(mParams.batchSize);

    if (mParams.int8)
    {
        std::unique_ptr<IInt8Calibrator> calibrator;
        CenterPointBatchStream calibrationStream(1, 10, "feature.bin", "indices.bin", {"/workspace/Data"});
        calibrator.reset(new Int8EntropyCalibrator2<CenterPointBatchStream>(
            calibrationStream, 0, "centerpoint", {"input.1", "indices_input"}));
        config->setInt8Calibrator(calibrator.get());
    }

    // samplesCommon::enableDLA(builder.get(), config.get(), mParams.dlaCore);

    return true;
}
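
One detail in the code above that may matter: calibrator is a local std::unique_ptr inside the if block of constructNetwork, while the engine is built later in build(). setInt8Calibrator only stores the raw pointer, so the calibrator has to stay alive until the engine build has finished. Here it is destroyed when the if block ends. The sketch below illustrates the ownership pattern with stand-in types. FakeCalibrator, FakeConfig and BuildContext are hypothetical and are not TensorRT classes; in the real sample the equivalent change would be to hold the calibrator as a class member, or to declare it in build() before calling constructNetwork.

```cpp
#include <cassert>
#include <memory>

// Stand-ins for illustration only; these are NOT TensorRT types.
struct FakeCalibrator
{
    explicit FakeCalibrator(bool* aliveFlag) : alive(aliveFlag) { *alive = true; }
    ~FakeCalibrator() { *alive = false; }
    bool* alive; // flipped to false when the calibrator is destroyed
};

struct FakeConfig
{
    // Like IBuilderConfig::setInt8Calibrator, this keeps only the raw pointer.
    void setInt8Calibrator(FakeCalibrator* c) { calibrator = c; }
    FakeCalibrator* calibrator = nullptr;
};

// Owner whose lifetime spans both network construction and the engine build.
struct BuildContext
{
    FakeConfig config;
    std::unique_ptr<FakeCalibrator> calibrator; // member, not a local in constructNetwork

    void constructNetwork(bool* aliveFlag)
    {
        assert(aliveFlag != nullptr);
        calibrator.reset(new FakeCalibrator(aliveFlag));
        config.setInt8Calibrator(calibrator.get());
    }

    // Stand-in for buildEngineWithConfig: succeeds only while the calibrator exists.
    bool buildEngine(const bool* aliveFlag) const
    {
        return config.calibrator != nullptr && *aliveFlag;
    }
};
```

With this layout the calibrator is released only when the owning object goes away, which is after the build step has run.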

3. Other questions

3.1 Can I choose which layers are quantized and which are not?
3.2 Since the network contains the ScatterND plugin, do I need to change the plugin code to support quantization?
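
On 3.1, TensorRT lets you request a precision per layer through ILayer::setPrecision and ILayer::setOutputType. The following is a rough sketch of walking the network and pinning selected layers to a chosen precision. pinLayers is a hypothetical helper, written as a template so the snippet does not depend on the TensorRT headers; Network is assumed to behave like nvinfer1::INetworkDefinition and Precision like nvinfer1::DataType.

```cpp
#include <cassert>
#include <set>
#include <string>

// Sketch only. Network is expected to offer getNbLayers() and getLayer(i), and
// each layer getName(), setPrecision(), getNbOutputs() and setOutputType(),
// as nvinfer1::INetworkDefinition and nvinfer1::ILayer do.
// Returns the number of layers that were pinned.
template <typename Network, typename Precision>
int pinLayers(Network& network, const std::set<std::string>& layerNames, Precision precision)
{
    int pinned = 0;
    for (int i = 0; i < network.getNbLayers(); ++i)
    {
        auto* layer = network.getLayer(i);
        assert(layer != nullptr);
        if (layerNames.count(layer->getName()) == 0)
        {
            continue; // not listed: leave the precision choice to the builder
        }
        layer->setPrecision(precision);
        for (int j = 0; j < layer->getNbOutputs(); ++j)
        {
            layer->setOutputType(j, precision);
        }
        ++pinned;
    }
    return pinned;
}
```

With the real API a call might look like pinLayers(*network, {"my_scatter_layer"}, nvinfer1::DataType::kFLOAT), where the layer name is a placeholder. In TensorRT 7.x these requests are only hints unless BuilderFlag::kSTRICT_TYPES is also set on the config. On 3.2, the data types a plugin layer can run in are whatever the plugin itself reports through supportsFormat (or supportsFormatCombination for IPluginV2DynamicExt), so a plugin that only reports FP32 should simply be kept out of INT8.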

Environment

TensorRT Version: 7.2.3
GPU Type: P4000 / RTX3090
Nvidia Driver Version: 460.91.03
Operating System + Version: Ubuntu / Docker

spolisetty · February 11, 2022, 11:22am

Hi,

Sorry for the delay in addressing this issue. Are you still facing it?
We recommend trying the latest TensorRT version, 8.2.
https://developer.nvidia.com/nvidia-tensorrt-8x-download

Thank you.

NVES · February 15, 2022, 12:28pm

Hi, please refer to the links below to perform inference in INT8.

Thanks!

