Problem quantizing a model to INT8

Description

1.

I have a problem when using the C++ API to quantize the model to INT8.

The model has a scatterND plugin and other CNN modules. There are two inputs, features and indices, and I have prepared the corresponding calibration dataset.

I got the following output under the default log level:

[01/19/2022-10:31:27] [I] [TRT] Calibration table does not match calibrator algorithm type.
[01/19/2022-10:31:28] [I] [TRT] Detected 2 inputs and 6 output network tensors.
[01/19/2022-10:31:29] [I] [TRT] Calibration completed in 1.77067 seconds.
[01/19/2022-10:31:29] [E] [TRT] Calibration failure occurred with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.
[01/19/2022-10:31:29] [E] [TRT] Builder failed while configuring INT8 mode.

2. Model build code

bool build()
{
   ..........
    auto builder = SampleUniquePtr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(sample::gLogger.getTRTLogger()));
    if (!builder)
    {
        return false;
    }

    const auto explicitBatch = 1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    auto network = SampleUniquePtr<nvinfer1::INetworkDefinition>(builder->createNetworkV2(explicitBatch));
    if (!network)
    {
        return false;
    }

    auto config = SampleUniquePtr<nvinfer1::IBuilderConfig>(builder->createBuilderConfig());
    if (!config)
    {
        return false;
    }

    auto parser
            = SampleUniquePtr<nvonnxparser::IParser>(nvonnxparser::createParser(*network, sample::gLogger.getTRTLogger()));
    if (!parser)
    {
        return false;
    }
    auto constructed = constructNetwork(builder, network, config, parser);
    if (!constructed)
    {
        return false;
    }

    mEngine = std::shared_ptr<nvinfer1::ICudaEngine>(
            builder->buildEngineWithConfig(*network, *config), samplesCommon::InferDeleter());
    if (!mEngine)
    {
        return false;
    }
.....
}


bool SampleCenterPoint::constructNetwork(SampleUniquePtr<nvinfer1::IBuilder>& builder,
                                         SampleUniquePtr<nvinfer1::INetworkDefinition>& network, SampleUniquePtr<nvinfer1::IBuilderConfig>& config,
                                         SampleUniquePtr<nvonnxparser::IParser>& parser)
{
    auto parsed = parser->parseFromFile(locateFile(mParams.onnxFileName, mParams.dataDirs).c_str(),
                                        static_cast<int>(sample::gLogger.getReportableSeverity()));
    if (!parsed)
    {
        return false;
    }

    config->setMaxWorkspaceSize(1_GiB);
    if (mParams.fp16)
    {
        config->setFlag(BuilderFlag::kFP16);
    }
    else if (mParams.int8)
    {
        config->setFlag(BuilderFlag::kINT8);
    }
    builder->setMaxBatchSize(mParams.batchSize);

    if (mParams.int8)
    {
        std::unique_ptr<IInt8Calibrator> calibrator;
        CenterPointBatchStream calibrationStream(1, 10, "feature.bin", "indices.bin", {"/workspace/Data"});
        calibrator.reset(new Int8EntropyCalibrator2<CenterPointBatchStream>(
            calibrationStream, 0, "centerpoint", {"input.1", "indices_input"}));
        config->setInt8Calibrator(calibrator.get());
    }

    // samplesCommon::enableDLA(builder.get(), config.get(), mParams.dlaCore);

    return true;
}
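
One thing worth checking in the snippet above: `calibrator` is a local `std::unique_ptr` that is destroyed when `constructNetwork` returns, while `config->setInt8Calibrator` stores only the raw pointer and `buildEngineWithConfig` runs later in `build()`. A minimal sketch of keeping the calibrator alive for the whole build (the member name `mCalibrator` is an assumption, not part of the original code):

```cpp
// Hypothetical class member, so the calibrator outlives constructNetwork():
//     std::unique_ptr<nvinfer1::IInt8Calibrator> mCalibrator;

if (mParams.int8)
{
    CenterPointBatchStream calibrationStream(1, 10, "feature.bin", "indices.bin", {"/workspace/Data"});
    mCalibrator.reset(new Int8EntropyCalibrator2<CenterPointBatchStream>(
        calibrationStream, 0, "centerpoint", {"input.1", "indices_input"}));
    // The config keeps only a raw pointer; mCalibrator must stay alive
    // until buildEngineWithConfig() has finished.
    config->setInt8Calibrator(mCalibrator.get());
}
```

Separately, the log line "Calibration table does not match calibrator algorithm type" usually indicates a stale calibration cache written by a different calibrator type; deleting the cached calibration table file forces a fresh calibration run.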


3. Other questions

3.1 Can I control which layers are quantized and which are not?
3.2 Since I added the scatterND plugin, do I need to make corresponding changes to the plugin code when quantizing?
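
Regarding 3.1, the builder API does expose per-layer precision control. A minimal sketch, assuming `network` and `config` are set up as in the build code above (the layer name here is illustrative, not from this model):

```cpp
// Ask the builder to respect per-layer precision requests instead of
// treating them as hints.
config->setFlag(nvinfer1::BuilderFlag::kINT8);
config->setFlag(nvinfer1::BuilderFlag::kSTRICT_TYPES);

// Keep a specific layer in FP32 while the rest of the network runs in INT8.
for (int i = 0; i < network->getNbLayers(); ++i)
{
    nvinfer1::ILayer* layer = network->getLayer(i);
    if (std::string(layer->getName()) == "some_sensitive_layer") // illustrative name
    {
        layer->setPrecision(nvinfer1::DataType::kFLOAT);
        for (int j = 0; j < layer->getNbOutputs(); ++j)
        {
            layer->setOutputType(j, nvinfer1::DataType::kFLOAT);
        }
    }
}
```

This uses `ILayer::setPrecision` and `ILayer::setOutputType`, which are available in the TensorRT 7.x C++ API; without `kSTRICT_TYPES` the builder may still choose a different precision for performance reasons.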

Environment

TensorRT Version: 7.2.3
GPU Type: P4000 / RTX3090
Nvidia Driver Version: 460.91.03
Operating System + Version: ubuntu/ docker

Hi,

Sorry for the delay in addressing this issue. Are you still facing it?
We recommend that you try the latest TensorRT version, 8.2.
https://developer.nvidia.com/nvidia-tensorrt-8x-download

Thank you.

Hi, please refer to the links below to perform inference in INT8.

Thanks!