Error loading engine, deserialize_cuda_engine generates Segmentation fault (core dumped)


I have a custom engine that I build from an ONNX file in Python 3.5. Inference and saving the model run without a problem. However, when I try to load the engine, a "Segmentation fault (core dumped)" occurs. The crash happens inside the function "deserialize_cuda_engine".

I am using a DGX Station with the TensorRT Docker container 18.12-py3.

I use the code from the developer guide:

with open("sample.engine", "wb") as f:
    f.write(engine.serialize())
with open("sample.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
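
One cheap check before suspecting TensorRT itself is whether the file handed to the runtime is intact: a file opened with "wb" and never written to is truncated to zero bytes. The following is a minimal sketch, not a confirmed fix for this crash. The helper names read_engine_bytes and load_engine are hypothetical, and load_engine assumes the tensorrt Python package is installed.

```python
import os


def read_engine_bytes(path):
    """Return the serialized engine bytes, failing loudly on a missing or empty file."""
    if not os.path.isfile(path):
        raise FileNotFoundError("engine file not found: " + path)
    with open(path, "rb") as f:
        data = f.read()
    if not data:
        raise ValueError("engine file is empty: " + path)
    return data


def load_engine(path):
    """Deserialize the engine stored at `path` (hypothetical helper, needs TensorRT)."""
    # Imported lazily so read_engine_bytes can be used without TensorRT installed.
    import tensorrt as trt

    data = read_engine_bytes(path)
    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)
    engine = runtime.deserialize_cuda_engine(data)
    # Return the runtime as well so it stays alive as long as the engine does.
    return runtime, engine
```

Calling read_engine_bytes("sample.engine") on its own shows whether the problem is in the saved file or in the deserialization step.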


Can you provide a small repro package with the ONNX file and Python code so that we can help debug this further?

NVIDIA Enterprise Support

I get the same error. How did you finally solve it? Thanks!

I had the same problem. Have you resolved it?

No, I did not find a solution for that problem.