ERROR: runtime->deserializeCudaEngine fails with "Serialization assertion sizeRead == static_cast<uint64_t>(mEnd - mCurrent) failed"


When I save an FP32 engine file, deserializeCudaEngine builds the engine fine, but when I save an FP16 engine file, deserializeCudaEngine reports this error. Did I miss a step? Here is my code, thanks.
// this is the save code:
nvinfer1::IHostMemory* serMem = mEngine->serialize();
std::stringstream gieModelStream;
gieModelStream.seekg(0, gieModelStream.beg);

gieModelStream.write(static_cast<const char*>(serMem->data()), serMem->size());

// open the output file in binary mode; on Windows, text mode rewrites
// newline bytes and corrupts the serialized engine
std::ofstream outFile("yolo_engine.engine", std::ios::binary);
outFile << gieModelStream.rdbuf();

// this is load code:
std::ifstream engineFile("./yolo_engine.engine", std::ios::binary);
if (!engineFile)
return false;

engineFile.seekg(0, engineFile.end);
long int fsize = engineFile.tellg();
engineFile.seekg(0, engineFile.beg);

std::vector<char> engineData(fsize);
engineFile.read(engineData.data(), fsize);

SampleUniquePtr<IRuntime> runtime{ createInferRuntime(sample::gLogger.getTRTLogger()) };

auto Engine = runtime->deserializeCudaEngine(engineData.data(), engineData.size(), nullptr);


TensorRT Version: 8.2
GPU Type: 3080
Nvidia Driver Version:
CUDA Version: 11.4
CUDNN Version:
Operating System + Version: win10
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi @793187758 ,
Did you try running your model using trtexec?
Please share the model logs and reproducible script with us to assist you better.
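For reference, a typical trtexec workflow for this case: build an FP16 engine from an ONNX model, save it, then load it back to verify deserialization (file names here are illustrative):

```shell
# Build an FP16 engine from an ONNX model and serialize it to disk
trtexec --onnx=yolo.onnx --fp16 --saveEngine=yolo_engine.engine

# Reload the saved engine to confirm it deserializes and runs
trtexec --loadEngine=yolo_engine.engine
```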


Thanks. I found yesterday that an engine generated by trtexec solves it.