Can we write an IHostMemory into a file, and then read the file back to deserializeCudaEngine?

How I serialize:

trtModelStream = engine->serialize();
fstream ifile;
ifile.open(filename, fstream::in | fstream::out | fstream::trunc);
std::cout << (int)trtModelStream->type() << "\n"; // why is the type kINT8? our net handles float
ifile.write((char*)trtModelStream->data(), trtModelStream->size()); // write into a file

How I deserialize:

fstream ifile;
ifile.open(engine_txt_path, fstream::in | fstream::out | std::ios::ate);
ifstream::pos_type pos = ifile.tellg();
size_t fsize = (size_t)pos;
std::cout << pos << " " << fsize << "\n";
char* tmp_data = new char[pos];
ifile.seekg(0, ios_base::beg);
ifile.read(tmp_data, pos);
nvinfer1::ICudaEngine* engine = runtime->deserializeCudaEngine(tmp_data, fsize, nullptr);

But I get an invalid memory access in deserializeCudaEngine.

What should I do?

Hello,

I don’t see obvious issues with your serialize/deserialize code. Can you share a small repro that demonstrates how you are getting the memory access error during deserialization?

Can you also post the full error logs?

Hello,
I also have the same problem. Have you solved this issue?

Hi 328703810, 1043140993,

Especially if you’re on Windows, be careful to read and write in binary mode by adding std::ios::binary to your stream flags (combined in through the | operator).
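For instance, here’s a minimal sketch of both directions. The file name is just a placeholder, and I’m assuming engine and runtime already exist; this is a sketch of the pattern, not drop-in code.

#include <fstream>
#include <vector>
// NvInfer.h assumed for IHostMemory / IRuntime / ICudaEngine

// Write the serialized engine to disk in binary mode
nvinfer1::IHostMemory* trtModelStream = engine->serialize();
std::ofstream ofs("model.engine", std::ios::out | std::ios::binary);
ofs.write(static_cast<const char*>(trtModelStream->data()), trtModelStream->size());
ofs.close();

// Read it back in binary mode and deserialize
std::ifstream ifs("model.engine", std::ios::in | std::ios::binary);
ifs.seekg(0, std::ios::end);
size_t fsize = static_cast<size_t>(ifs.tellg());
ifs.seekg(0, std::ios::beg);
std::vector<char> engineData(fsize);
ifs.read(engineData.data(), fsize);
nvinfer1::ICudaEngine* deserialized = runtime->deserializeCudaEngine(engineData.data(), fsize, nullptr);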

You can see examples of this in the trtexec source on GitHub. Reading an engine: https://github.com/NVIDIA/TensorRT/blob/07ed9b57b1ff7c24664388e5564b17f7ce2873e5/samples/common/sampleEngines.cpp#L450

And writing an engine: https://github.com/NVIDIA/TensorRT/blob/07ed9b57b1ff7c24664388e5564b17f7ce2873e5/samples/common/sampleEngines.cpp#L480

Cheers,
Tom

Hi Tom,

I tried your binary-read code for the model on Linux, but it still crashes (core dumped) when I call deserializeCudaEngine. I’m very confused.

errors:
[I] build engine 1…
[I] read gie model 1…
[I] read gie model 9…
Segmentation fault (core dumped)

Here is my code:

IRuntime* runtime = createInferRuntime(gSSHLogger.getTRTLogger());
assert(runtime != nullptr);
PluginFactory pluginFactory;
gLogInfo << "build engine 1..." << std::endl;
ICudaEngine* engine = nullptr;
if (trtModelStream != nullptr)
{
    engine = runtime->deserializeCudaEngine(
        trtModelStream->data(), trtModelStream->size(), &pluginFactory);
    gLogInfo << "build engine 3..." << std::endl;
}
else
{
    gLogInfo << "read gie model 1..." << std::endl;
    std::ifstream engineFile(cache_path, std::ios::binary);

    engineFile.seekg(0, engineFile.end);
    long int fsize = engineFile.tellg();
    engineFile.seekg(0, engineFile.beg);

    std::vector<char> engineData(fsize);
    engineFile.read(engineData.data(), fsize);

    gLogInfo << "read gie model 9..." << std::endl;

    engine = runtime->(engineData.data(), fsize, &pluginFactory);

    gLogInfo << "read gie model 2..." << std::endl;
}

Hi Amy_21,

[note: you can format code in this forum for easier readability with the </> tab]

Did you write in binary mode too?
Can you get a backtrace (e.g. in gdb) to see where TensorRT is segfaulting?
I see you have plugins; are you sure your plugins properly implement serialization? (See the sketch below.)
You can also run under valgrind to see if you have any wild reads or writes (perhaps in your serialization code) or other memory issues. This takes some time, but is often worth it; sanitizers, on the other hand, don’t seem compatible with CUDA in my experience.
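As a rough illustration of what I mean by "properly implement serialization" (the class and member names here are hypothetical, and the rest of the plugin interface is omitted): getSerializationSize() must return exactly the number of bytes that serialize() writes, and the deserializing constructor that your IPluginFactory calls must read the fields back in the same order.

#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical plugin; only the (de)serialization pieces are sketched.
class MyPlugin /* : public nvinfer1::IPluginExt */
{
public:
    // Constructor your IPluginFactory::createPlugin() would call with (serialData, serialLength)
    MyPlugin(const void* serialData, size_t serialLength)
    {
        const char* p = static_cast<const char*>(serialData);
        std::memcpy(&mChannels, p, sizeof(mChannels));
        p += sizeof(mChannels);
        std::memcpy(&mScale, p, sizeof(mScale));
        p += sizeof(mScale);
        // If this doesn't hold, serialize() and getSerializationSize() are inconsistent.
        assert(p == static_cast<const char*>(serialData) + serialLength);
    }

    size_t getSerializationSize() /* override */
    {
        return sizeof(mChannels) + sizeof(mScale);
    }

    void serialize(void* buffer) /* override */
    {
        char* p = static_cast<char*>(buffer);
        std::memcpy(p, &mChannels, sizeof(mChannels));
        p += sizeof(mChannels);
        std::memcpy(p, &mScale, sizeof(mScale));
        p += sizeof(mScale);
    }

private:
    int mChannels{0};
    float mScale{1.0f};
};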

Also, towards the end of your code, you write

engine = runtime->(engineData.data(), fsize, &pluginFactory);

which I’m assuming is a typo.

Good luck,
Tom

Hi Tom,

Thank you for your answer; the typo was a pasting error.
Yes, I write in binary mode too.
My model has a custom layer, and I can see that the NVIDIA sample uses a PluginFactory.
When I call deserializeCudaEngine on the in-memory stream, the code works, but from a disk file it core dumps.
From the gdb trace, I can’t see where the crash comes from.

// deserializeCudaEngine 
    IRuntime* runtime = createInferRuntime(gSSHLogger.getTRTLogger());
    assert(runtime != nullptr);
    PluginFactory pluginFactory;
    ICudaEngine* engine = nullptr;
    if (trtModelStream != nullptr)
    {
        engine = runtime->deserializeCudaEngine(
            trtModelStream->data(), trtModelStream->size(), &pluginFactory);
    }
    else
    {
        gLogInfo << "read gie model 1..." << std::endl;
        std::ifstream engineFile(cache_path, std::ifstream::binary);

        engineFile.seekg(0, engineFile.end);
        long int fsize = engineFile.tellg();
        engineFile.seekg(0, engineFile.beg);
    
        std::vector<char> engineData(fsize);
        engineFile.read(reinterpret_cast<char*>(engineData.data()), fsize);

        gLogInfo << "read gie model 9..." << std::endl;
 
        engine = runtime->deserializeCudaEngine(engineData.data(), fsize, &pluginFactory);

        gLogInfo << "read gie model 2..." << std::endl;
    }


        // serialize
        PluginFactory parserPluginFactory;

        caffeToTRTModel(
                        "test_ssh.prototxt",
                        "SSH.caffemodel",
                        std::vector<std::string> {OUTPUT_BLOB_NAME0, OUTPUT_BLOB_NAME1},
                        N, &parserPluginFactory, trtModelStream);
        parserPluginFactory.destroyPlugin();
        assert(trtModelStream != nullptr);
        saveGIEModel(trtModelStream, &cache_path);

void caffeToTRTModel(const std::string& deployFile,           // Name for caffe prototxt
                     const std::string& modelFile,            // Name for model
                     const std::vector<std::string>& outputs, // Network outputs
                     unsigned int maxBatchSize,               // Batch size - NB must be at least as large as the batch we want to run with
                     nvcaffeparser1::IPluginFactoryExt* pluginFactory, // factory for plugin layers
                     IHostMemory*& trtModelStream)            // Output stream for the TensorRT model
{
    // Create the builder
    IBuilder* builder = createInferBuilder(gSSHLogger.getTRTLogger());
    assert(builder != nullptr);

    // Parse the caffe model to populate the network, then set the outputs
    INetworkDefinition* network = builder->createNetwork();
    ICaffeParser* parser = createCaffeParser();
    parser->setPluginFactoryExt(pluginFactory);

    bool fp16 = builder->platformHasFastFp16();
    const IBlobNameToTensor* blobNameToTensor = parser->parse(locateMyFile(deployFile).c_str(),
                                                              locateMyFile(modelFile).c_str(),
                                                              *network, fp16 ? DataType::kHALF : DataType::kFLOAT);
    gLogInfo << "support fp16: " << fp16 << std::endl;

    // Specify which tensors are outputs
    for (auto& s : outputs)
        network->markOutput(*blobNameToTensor->find(s.c_str()));

    // Build the engine
    builder->setMaxBatchSize(maxBatchSize);
    builder->setMaxWorkspaceSize(10 << 20); // We need about 6MB of scratch space for the plugin layer for batch size 5
    builder->setFp16Mode(fp16);

    gLogInfo << "Begin building engine..." << std::endl;
    ICudaEngine* engine = builder->buildCudaEngine(*network);
    assert(engine);
    gLogInfo << "End building engine..." << std::endl;

    // We don't need the network any more, and we can destroy the parser
    network->destroy();
    parser->destroy();

    // Serialize the engine, then close everything down
    trtModelStream = engine->serialize();

    engine->destroy();
    builder->destroy();
    shutdownProtobufLibrary();
}

void saveGIEModel(IHostMemory*& trtModelStream, std::string* cache_path)
{
    std::ofstream ofs(*cache_path, std::ofstream::binary);
    ofs.write(reinterpret_cast<char*>(trtModelStream->data()), trtModelStream->size());
    ofs.close();
}

the gdb trace:

[New Thread 0x7fffcf75e700 (LWP 32186)]
[New Thread 0x7fffcef5d700 (LWP 32187)]
[New Thread 0x7fffce6db700 (LWP 32195)]
[I] build engine 1...
[I] read gie model 1...
[I] read gie model 9...

Thread 1 "sample_SSH" received signal SIGSEGV, Segmentation fault.
0x0000555555563998 in std::vector<float, std::allocator<float> >::size() const ()
(gdb)

Best regards,
Amy

Hi Amy,

When you catch the SIGSEGV in gdb, you should run bt to get a backtrace. Also, when using gdb or valgrind, it’s important to build with -g so you have debug information and get sensible backtraces. You may also require a debug build.

As I said, even though it will be slow, I’d still encourage you to run through valgrind to see if you have any memory issues.

Also, make sure your file read succeeded: after engineFile.read, you could add a block like this, from trtexec:

if (!engineFile)
{
    err << "Error loading engine file: " << engine << std::endl;
    return nullptr;
}

Cheers,
Tom

Hi Tom,

I ran bt in gdb and found that the cause of the core dump was in my custom layer code.
My serialize / read / write functions used std::vector; I changed them to write the raw float data instead, and that solved my problem.
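For anyone hitting the same thing, here is roughly what the change looked like (the function and member names are made up, not my actual layer code): serialize the element count plus the raw float data, never the std::vector object itself.

#include <cstddef>
#include <cstring>
#include <vector>

// WRONG: copying the vector object copies its internal pointers, not the floats,
// so the deserialized plugin later crashes somewhere like std::vector<float>::size().
//     std::memcpy(buffer, &weights, sizeof(weights));

// Write: element count first, then the raw float data.
void serializeWeights(void* buffer, const std::vector<float>& weights)
{
    char* p = static_cast<char*>(buffer);
    const size_t count = weights.size();
    std::memcpy(p, &count, sizeof(count));
    p += sizeof(count);
    std::memcpy(p, weights.data(), count * sizeof(float));
}

// The matching byte count to return from getSerializationSize():
size_t serializedWeightsSize(const std::vector<float>& weights)
{
    return sizeof(size_t) + weights.size() * sizeof(float);
}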

Thanks for your help.

Best regards,
Amy