Saving and loading a serialized engine on Windows 10

Description

I am trying to serialize an engine, save it to file, and then later load it from file and deserialize it.

I saw the documentation on this, which suggests:

IHostMemory *serializedModel = engine->serialize();
// store model to disk
// <…>
serializedModel->destroy();

And for loading:

IRuntime* runtime = createInferRuntime(gLogger);
ICudaEngine* engine = runtime->deserializeCudaEngine(modelData, modelSize, nullptr);

I also looked at this blog post, which discusses serializing and deserializing the engine.

Finally, I found this GitHub issue, which shows someone doing this in a Windows environment.

I then attempted to do this with the MNIST ONNX sample.

I added the following to the top of my code:

std::string engineFile = "SerializedEngineTest.engine";
bool serializeEngine = false;

And in the main routine:

if (serializeEngine) {
    if (!sample.build())
    {
        return gLogger.reportFail(sampleTest);
    }
}
else {
    if (!sample.derserializeEngineFromFile())
    {
        return gLogger.reportFail(sampleTest);
    }
}

I defined two functions:

bool SampleOnnxMNIST::build()
{
    auto builder = SampleUniquePtr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(gLogger.getTRTLogger()));
    if (!builder)
    {
        return false;
    }

    const auto explicitBatch = 1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    auto network = SampleUniquePtr<nvinfer1::INetworkDefinition>(builder->createNetworkV2(explicitBatch));
    if (!network)
    {
        return false;
    }

    auto config = SampleUniquePtr<nvinfer1::IBuilderConfig>(builder->createBuilderConfig());
    if (!config)
    {
        return false;
    }

    auto parser = SampleUniquePtr<nvonnxparser::IParser>(nvonnxparser::createParser(*network, gLogger.getTRTLogger()));
    if (!parser)
    {
        return false;
    }

    auto constructed = constructNetwork(builder, network, config, parser);
    if (!constructed)
    {
        return false;
    }

    mEngine = std::shared_ptr<nvinfer1::ICudaEngine>(
        builder->buildEngineWithConfig(*network, *config), samplesCommon::InferDeleter());

    if (!mEngine)
    {
        gLogInfo << "false engine..?" << std::endl;
        return false;
    }

    // Serialize engine and save to file
    std::ofstream p(engineFile.c_str(), std::ios::binary);
    IHostMemory* m_ModelStream = mEngine->serialize();
    p.write(reinterpret_cast<const char*>(m_ModelStream->data()), m_ModelStream->size());
    p.close();

    gLogInfo << "network->getNbInputs() = " << network->getNbInputs() << std::endl;
    assert(network->getNbInputs() == 1);

    mInputDims = network->getInput(0)->getDimensions();
    gLogInfo << "mInputDims = " << mInputDims << std::endl;
    gLogInfo << "mInputDims.nbDims = " << mInputDims.nbDims << std::endl;
    assert(mInputDims.nbDims == 4);

    gLogInfo << "network->getNbOutputs() = " << network->getNbOutputs() << std::endl;
    gLogInfo << "network->getOutput(0)->getDimensions() = " << network->getOutput(0)->getDimensions() << std::endl;
    assert(network->getNbOutputs() == 1);
    mOutputDims = network->getOutput(0)->getDimensions();
    gLogInfo << "mOutputDims = " << mOutputDims << std::endl;
    gLogInfo << "mOutputDims.nbDims = " << mOutputDims.nbDims << std::endl;
    assert(mOutputDims.nbDims == 2);

    return true;
}

And:

bool SampleOnnxMNIST::derserializeEngineFromFile()
{
    // Load engine from file and de-serialize
    std::vector<char> trtModelStreamfromFile;
    size_t size{ 0 };
    std::ifstream file(engineFile, std::ios::binary);
    if (file.good())
    {
        file.seekg(0, file.end);
        size = file.tellg();
        file.seekg(0, file.beg);
        trtModelStreamfromFile.resize(size);
        file.read(trtModelStreamfromFile.data(), size);
        file.close();
        IRuntime* runtime = createInferRuntime(gLogger.getTRTLogger());
        ICudaEngine* mEngine = runtime->deserializeCudaEngine(trtModelStreamfromFile.data(), size, nullptr);
    }

    if (!mEngine)
    {
        gLogInfo << "false engine..?" << std::endl;
        return false;
    }

    return true;
}

When I set

bool serializeEngine = true;

and run the program, the engine is serialized and saved to file as expected.

When I set:

bool serializeEngine = false;

and run the program, in the derserializeEngineFromFile() method, on the line:

if (!mEngine)

I find that mEngine is null.

What am I doing incorrectly?

Environment

TensorRT Version: 7.0
CUDA Version: 10.2
CUDNN Version: 7.6.5
Operating System + Version: Windows 10 64-bit
Visual Studio Version: 2019

Please refer to this sample in case it helps:

Thanks

@SunilJB Thanks for the reply. As a test, I am trying to run that sample, but I get this error:

Refer to the GitHub: sampleUffFasterRCNN/README.md file for detailed information about how this sample works, sample code, and step-by-step instructions on how to run and verify its output.

Thanks