nvinfer1::IRuntime::deserializeCudaEngine fails

I am getting the following error when calling nvinfer1::IRuntime::deserializeCudaEngine:

engine.cpp:868: bool nvinfer1::rt::Engine::deserialize(const void*, std::size_t, nvinfer1::IGpuAllocator&, nvinfer1::IPluginFactory*): Assertion `size >= bsize && "Mismatch between allocated memory size and expected size of serialized engine."' failed.

This error seems to cover a lot of ills.

  • It happens if the plan file is not found. A distinct error for that case would be much more helpful.
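
To rule out the missing-plan case on my side, a guard along these lines could sit in front of the call. This is only a sketch: loadPlan is a hypothetical helper of my own, not part of TensorRT, and it uses nothing but the C++ standard library to read the plan and fail with a readable message if the file is missing, empty, or truncated.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not a TensorRT API): read a serialized plan into memory
// and throw a descriptive error instead of handing an empty buffer to the runtime.
std::vector<char> loadPlan(const std::string& path) {
    // Open in binary mode, positioned at the end so tellg() reports the file size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) {
        throw std::runtime_error("plan file not found or unreadable: " + path);
    }

    const std::streamsize size = file.tellg();
    if (size <= 0) {
        throw std::runtime_error("plan file is empty: " + path);
    }

    // Rewind and read the whole file into the buffer.
    std::vector<char> buffer(static_cast<std::size_t>(size));
    file.seekg(0, std::ios::beg);
    if (!file.read(buffer.data(), size)) {
        throw std::runtime_error("short read on plan file: " + path);
    }
    return buffer;
}
```

The buffer would then be passed on as something like runtime->deserializeCudaEngine(plan.data(), plan.size(), nullptr), assuming the three-argument overload (blob, size, plugin factory) of the TensorRT version shipped with this JetPack release. With this guard in place, any remaining assertion failure should at least not be caused by a missing or empty file.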

What else does it cover?

Am I wrong to interpret the size parameter as the size of the serialized graph passed via the first argument? What is this error really trying to tell me?

The graph runs fine in Python 3.

Environment:

Machine: Xavier
JetPack: 4.2
Graph: retrained inception_v3, converted to a TensorRT model with FP32 precision