Segmentation fault while building an engine from an ONNX model


Hi, while running trtexec with an ONNX model, I got a segmentation fault.
According to the log, the segmentation fault occurred in the Timing Runner for a Myelin-fused foreign node.

[08/28/2023-12:01:48] [V] [TRT] --------------- Timing Runner: /features/0/0/Conv (CaskFlattenConvolution)
[08/28/2023-12:01:48] [V] [TRT] CaskFlattenConvolution has no valid tactics for this config, skipping
[08/28/2023-12:01:48] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 0x7121ec1db3f80c67
[08/28/2023-12:01:48] [V] [TRT] =============== Computing costs for 
[08/28/2023-12:01:48] [V] [TRT] *************** Autotuning format combination: Float(401408,3136,56,1) -> Float(1000,1) ***************
[08/28/2023-12:01:48] [V] [TRT] --------------- Timing Runner: {ForeignNode[/features/0/2/Constant_output_0...(Unnamed Layer* 1783) [ElementWise]]} (Myelin)
Segmentation fault (core dumped)
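
The layer that was being timed when the process died can be pulled out of a saved verbose log with a short script. The following is a minimal sketch, assuming the log lines keep the "Timing Runner: <layer> (<runner>)" format shown above; the name last_timing_runner is illustrative and not part of TensorRT.

```python
import re

# Matches trtexec verbose lines of the form seen in the log above:
#   [...] [V] [TRT] --------------- Timing Runner: <layer> (<runner>)
# The layer name may itself contain parentheses, so the runner is taken
# from the final parenthesised group on the line.
TIMING_RE = re.compile(r"Timing Runner: (?P<layer>.+) \((?P<runner>[^()]+)\)\s*$")


def last_timing_runner(log_text):
    """Return (layer, runner) from the last 'Timing Runner' line, or None."""
    last = None
    for line in log_text.splitlines():
        match = TIMING_RE.search(line)
        if match:
            last = (match.group("layer"), match.group("runner"))
    return last
```

Calling last_timing_runner(open("build.log").read()) on a saved log (the file name is hypothetical) returns the last layer and runner seen before the crash, which is useful to quote in a bug report.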

As the internal code for building an engine is not open-sourced, I’m struggling to debug this problem.
Attaching the input ONNX model and the log file.
ONNX model link (1.1 MB)


TensorRT Version: 8.5.2-1+cuda11.8
GPU Type: RTX A6000
Nvidia Driver Version: 510.108.03
CUDA Version: 12.0
CUDNN Version: 8
Operating System + Version: Ubuntu 20.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): 23.01-py3

Steps To Reproduce

  1. Build trtexec on the above docker image
  2. trtexec --onnx=/workspace/models/simple_swin_b.onnx --saveEngine=simple_swin_b.engine --buildOnly --verbose

Please share the ONNX model and the script, if not shared already, so that we can assist you better.
In the meantime, you can try a few things:

  1. Validate your model with the snippet below:

import sys
import onnx

filename = sys.argv[1]  # path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)  # raises an exception if the model is invalid

  2. Try running your model with the trtexec command.

If you still face issues, please share the trtexec --verbose log for further debugging.
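
One way to capture that verbose log, and to tell a crash apart from an ordinary build error, is to run trtexec from a small wrapper. This is only a sketch: it assumes trtexec is on PATH and a POSIX system (where Python reports death by signal as a negative return code), it uses only the flags that appear in this thread, and the helper names are illustrative.

```python
import signal
import subprocess


def build_trtexec_cmd(onnx_path, engine_path=None):
    """Assemble a build-only, verbose trtexec command line."""
    cmd = ["trtexec", f"--onnx={onnx_path}", "--buildOnly", "--verbose"]
    if engine_path:
        cmd.append(f"--saveEngine={engine_path}")
    return cmd


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"failed with exit code {returncode}"


def run_and_log(cmd, log_path):
    """Run cmd, write stdout and stderr to log_path, and return the status."""
    with open(log_path, "w") as log:
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return describe_exit(proc.returncode)
```

For example, run_and_log(build_trtexec_cmd("simple_swin_b.onnx"), "build.log") would leave the full verbose output in build.log and report "killed by SIGSEGV" for a crash like the one in this thread.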

Hi, Thank you for the reply.

I attached the ONNX model, and the log above is from a --verbose run.
The model also passed the ONNX checker.
Is there any information about what exactly the “Timing Runner” does and how to debug the error?
It seems almost impossible for a user to debug this because the code is not open.
Any suggestions would be a great help!

Thank you


We recommend using the latest TensorRT version, 8.6.1.
With that version, we could successfully build the TRT engine.

[08/29/2023-06:22:58] [I] Engine deserialized in 0.103842 sec.
[08/29/2023-06:22:58] [I] Skipped inference phase since --skipInference is added.
&&&& PASSED TensorRT.trtexec [TensorRT v8601] # trtexec --onnx=simple_swin_b.onnx --buildOnly --verbose

Thank you.

As you suggested, it worked with the latest TRT version.
Thanks a lot!
