Data type for NvDsInferCudaEngineGetFromTltModel

I am implementing a custom inference plugin and have reached the TLT model parsing part. The function used to generate the engine has this declaration:

extern "C"
bool NvDsInferCudaEngineGetFromTltModel(nvinfer1::IBuilder * const builder,
        nvinfer1::IBuilderConfig * const builderConfig,
        const NvDsInferContextInitParams * const initParams,
        nvinfer1::DataType dataType,
        nvinfer1::ICudaEngine *& cudaEngine);

I would expect to pass a dataType that corresponds to the precision used by my network, but the DeepStream code does this for INT8 (nvdsinfer_model_builder.cpp:654):

                /* modelDataType should be FLOAT for INT8 */
                modelDataType = nvinfer1::DataType::kFLOAT;

Can someone explain why?

If you read the surrounding code, you'll see that different branches are selected depending on the configured network mode. You should pass the dataType that corresponds to the precision of your network:

if (networkMode == NvDsInferNetworkMode_INT8)
    ...
if (networkMode == NvDsInferNetworkMode_FP16)
    ...
if (networkMode == NvDsInferNetworkMode_FP32)
    ...
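
Condensed, the dataType that ends up being passed per mode is the following (a paraphrase of nvdsinfer_model_builder.cpp, not a verbatim quote; the fallback branches are omitted):

    nvinfer1::DataType modelDataType = nvinfer1::DataType::kFLOAT;
    if (networkMode == NvDsInferNetworkMode_INT8)
        modelDataType = nvinfer1::DataType::kFLOAT;  /* see below */
    else if (networkMode == NvDsInferNetworkMode_FP16)
        modelDataType = nvinfer1::DataType::kHALF;
    else /* NvDsInferNetworkMode_FP32 */
        modelDataType = nvinfer1::DataType::kFLOAT;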

Alright, but why does the INT8 branch set the dataType to kFLOAT instead of kINT8?

    if (networkMode == NvDsInferNetworkMode_INT8)
    {
        /* Check if platform supports INT8 else use FP16 */
        if (m_Builder->platformHasFastInt8())
        {
            if (m_Int8Calibrator != nullptr)
            {
                /* Set INT8 mode and set the INT8 Calibrator */
                m_BuilderConfig->setFlag(nvinfer1::BuilderFlag::kINT8);
                m_BuilderConfig->setInt8Calibrator(m_Int8Calibrator.get());
                /* modelDataType should be FLOAT for INT8 */
                modelDataType = nvinfer1::DataType::kFLOAT;
            }
            else if (cudaEngineGetFcn != nullptr || cudaEngineGetDeprecatedFcn != nullptr)
            {
                dsInferWarning("INT8 calibration file not specified/accessible. "
                        "INT8 calibration can be done through setDynamicRange "
                        "API in 'NvDsInferCreateNetwork' implementation");
            }
            else
            {
                dsInferWarning("INT8 calibration file not specified. Trying FP16 mode.");
                networkMode = NvDsInferNetworkMode_FP16;
            }
        }
        else
        {
            dsInferWarning("INT8 not supported by platform. Trying FP16 mode.");
            networkMode = NvDsInferNetworkMode_FP16;
        }
    }

This is how TensorRT works: an INT8 engine is still built from full-precision weights, so the model is parsed as FLOAT, and the builder performs the quantization afterwards using the calibrator and the BuilderFlag::kINT8 flag set on the builder config. You don't need to follow the implementation details; you just need to pass the value accordingly. Thanks
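
Note also the warning in that code about the setDynamicRange path: if no calibration file is given but a custom engine-create function exists, you are expected to supply the INT8 ranges yourself. A sketch of that alternative (the range value is illustrative only; real ranges must come from your own analysis of the network):

    /* Assign per-tensor dynamic ranges while constructing the network,
     * instead of relying on a calibration cache. */
    for (int i = 0; i < network->getNbLayers(); ++i)
    {
        nvinfer1::ILayer *layer = network->getLayer(i);
        for (int j = 0; j < layer->getNbOutputs(); ++j)
            layer->getOutput(j)->setDynamicRange(-127.0f, 127.0f);
    }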

Alright, but since I am implementing my own plugin, should I set kFLOAT when using INT8 too?

Yes, you can follow this code and use kFLOAT when using INT8. If you encounter any problems when you run the pipeline, feel free to open a new topic on the forum.
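
For completeness, a minimal sketch of what that looks like on the implementing side. parseMyTltModel is a hypothetical placeholder for your own parsing logic, and the build call assumes TensorRT 7/8-era APIs; note that DeepStream has already set the INT8/FP16 flags and the calibrator on builderConfig before your function is called, as shown in the code above:

    #include <NvInfer.h>
    #include "nvdsinfer_custom_impl.h"

    /* Hypothetical helper: decode the TLT model and populate `network`,
     * importing weights at the precision given by dataType (kFLOAT for
     * INT8, per the discussion above). */
    static bool parseMyTltModel(nvinfer1::IBuilder *builder,
            const NvDsInferContextInitParams *initParams,
            nvinfer1::DataType dataType,
            nvinfer1::INetworkDefinition **network)
    {
        /* ... your TLT decode and network construction go here ... */
        return false; /* stub */
    }

    extern "C"
    bool NvDsInferCudaEngineGetFromTltModel(nvinfer1::IBuilder *const builder,
            nvinfer1::IBuilderConfig *const builderConfig,
            const NvDsInferContextInitParams *const initParams,
            nvinfer1::DataType dataType,
            nvinfer1::ICudaEngine *&cudaEngine)
    {
        nvinfer1::INetworkDefinition *network = nullptr;
        if (!parseMyTltModel(builder, initParams, dataType, &network))
            return false;

        /* No need to set kINT8/kFP16 here: builderConfig already carries
         * the flags and calibrator chosen by DeepStream. */
        cudaEngine = builder->buildEngineWithConfig(*network, *builderConfig);
        delete network; /* TensorRT 8 style; use destroy() on older releases */
        return cudaEngine != nullptr;
    }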
