Why does the program pause when deserializeCudaEngine is called?

I use engine->serialize() to serialize a Caffe model into a TensorRT engine, as follows.

  1. serialize caffemodel
    gieModelStream = engine->serialize();
    std::ofstream outfile(model_rt_file, std::ios::out | std::ios::binary);
    unsigned char* streamdata = (unsigned char*)gieModelStream->data();
    outfile.write((char*)streamdata, gieModelStream->size());
  2. read tensorRT model
    std::ifstream in_file(model_rt_file, std::ios::in | std::ios::binary);
    std::streampos begin, end;
    begin = in_file.tellg();
    in_file.seekg(0, std::ios::end);
    end = in_file.tellg();
    std::size_t size = end - begin;
    in_file.seekg(0, std::ios::beg);
    std::unique_ptr<unsigned char> engine_data(new unsigned char);
    in_file.read((char*)engine_data.get(), size);
    infer = createInferRuntime(gLogger);
    engine = infer->deserializeCudaEngine((const void*)engine_data.get(), size, &pluginFactory);

But when execution reaches engine = infer->deserializeCudaEngine((const void*)engine_data.get(), size, &pluginFactory), the program pauses and does not output any information. The problem occurs intermittently, about once in ten attempts. Has anybody met the same situation?


We haven't received a report on this topic before.
Not sure if there is something incorrect in how the serialized engine is written to or read back from the IO stream.

Could you share a complete source for us to reproduce this issue in our side?

Sorry for the late reply. For certain reasons I cannot share the complete code, but the code I pasted already tells the story. The problem appears to be GPU-dependent: it occurs on an NVIDIA Tesla P4 but never on an NVIDIA Tesla P40.


Could you try another model to check whether this issue is model-dependent?