Thanks for this amazing acceleration library; it delivers great inference speed with TensorRT.
But building the engine takes quite a long time.
Is there any way to save the built engine, so that I don't have to wait for the build every time I run my code?
The build phase can take considerable time, especially when running on embedded
platforms. Therefore, a typical application will build an engine once, and then serialize it
for later use.
I think you want to serialize your engine. Serializing transforms the engine into a format you can store and load later for inference; to use it, you simply deserialize it. Serializing and deserializing are optional, but since building an engine from a Network Definition can be time-consuming, you can avoid rebuilding it on every run of the application by serializing it once and deserializing it before inference. That is why, after the engine is built, users typically serialize it for later use.
@yuvaram: the TensorRT samples do quite a bit of serializing (to memory) and deserializing (from memory). Just grep the samples for serialize( and deserializeCudaEngine. To write a model to disk, your code above looks OK (if you're on Linux), but it would be better if you wrote
(and I believe this is necessary on Windows). To read it back from disk, do the inverse using (for example) a std::ifstream, and then call deserializeCudaEngine as in the samples.