Segmentation fault (core dumped) error on Jetson Nano

While running an ssd inception v2 coco model over a video with a TensorRT engine on Jetson Nano, the process stopped after running for about 2 minutes with a Segmentation fault (core dumped) error:

[TensorRT] INFO: Glob Size is 50485364 bytes.
[TensorRT] INFO: Added linear block of size 4292096
[TensorRT] INFO: Added linear block of size 1522176
[TensorRT] INFO: Added linear block of size 739328
[TensorRT] INFO: Added linear block of size 411648
[TensorRT] INFO: Added linear block of size 204800
[TensorRT] INFO: Added linear block of size 57344
[TensorRT] INFO: Added linear block of size 30720
[TensorRT] INFO: Added linear block of size 20992
[TensorRT] INFO: Added linear block of size 9728
[TensorRT] INFO: Added linear block of size 9216
[TensorRT] INFO: Added linear block of size 2560
[TensorRT] INFO: Added linear block of size 2560
[TensorRT] INFO: Added linear block of size 1024
[TensorRT] INFO: Added linear block of size 512
[TensorRT] INFO: Found Creator FlattenConcat_TRT
[TensorRT] INFO: Found Creator GridAnchor_TRT
[TensorRT] INFO: Found Creator FlattenConcat_TRT
[TensorRT] INFO: Found Creator NMS_TRT
[TensorRT] INFO: Deserialize required 3084695 microseconds.
Gtk-Message: 11:48:55.371: Failed to load module "canberra-gtk-module"
Segmentation fault (core dumped)

Hi,

May I know which sample you are using?
Is it jetson_inference? https://github.com/dusty-nv/jetson-inference

Thanks.

No, it’s not jetson_inference; rather, I converted my custom ssd inception v2 coco model to a TensorRT engine. It was working fine but crashed all of a sudden after 3 mins with that error.
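For reference, the engine in the log above is deserialized with the standard TensorRT runtime API; a minimal sketch of that step (the engine file name here is illustrative, not my actual path):

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.INFO)
    # Load the serialized engine and deserialize it; this corresponds to
    # the "Deserialize required ... microseconds" line in the log.
    with open('ssd_inception_v2.engine', 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())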

Hi,

Could you help provide a simple reproducible source for us to debug?
Or could you check whether this issue can be reproduced with our official sample?
Ex. /usr/src/tensorrt/samples/sampleUffSSD

Usually this is related to the type of data source you use.
Do you run the inference code with a live stream or a camera?
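If you read the frames yourself (ex. with OpenCV), please also check the return value of each read. Once a video file ends, cv2.VideoCapture starts returning empty frames, and passing an empty frame into inference can crash the process. A minimal sketch of the guarded read loop, assuming a hypothetical infer() function:

    import cv2

    cap = cv2.VideoCapture('input.mp4')  # or an index like 0 for a camera
    while True:
        ret, frame = cap.read()
        if not ret or frame is None:
            break  # end of stream: stop instead of feeding None downstream
        infer(frame)  # hypothetical: your preprocessing + TensorRT inference
    cap.release()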

Thanks.

Hi siddharthdas8,

Have you managed to get the issue resolved?
Any results you can share?

Hi,

I am hitting the same issue. The error info is:
[TensorRT] ERROR: UffParser: Validator error: truediv_14/Cast: Unsupported operation _Cast
[TensorRT] ERROR: Network must have at least one output
Segmentation fault (core dumped)
I trained a YOLOv3 model with TF 1.15.0 and then converted it to UFF. The conversion log is below:
Loading output_frozen_model2.pb
NOTE: UFF has been tested with TensorFlow 1.12.0. Other versions are not guaranteed to work
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
UFF Version 0.6.3
=== Automatically deduced input nodes ===
[name: "input"
op: "Placeholder"
attr {
  key: "dtype"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "shape"
  value {
    shape {
      dim {
        size: -1
      }
      dim {
        size: -1
      }
      dim {
        size: -1
      }
      dim {
        size: 3
      }
    }
  }
}
]

=== Automatically deduced output nodes ===
[name: "output/boxes"
op: "Identity"
input: "combined_non_max_suppression/CombinedNonMaxSuppression"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}
, name: "output/scores"
op: "Identity"
input: "combined_non_max_suppression/CombinedNonMaxSuppression:1"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}
, name: "output/labels"
op: "Identity"
input: "combined_non_max_suppression/CombinedNonMaxSuppression:2"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}
, name: "output/num_detections"
op: "Identity"
input: "combined_non_max_suppression/CombinedNonMaxSuppression:3"
attr {
  key: "T"
  value {
    type: DT_INT32
  }
}
]

1 Placeholder: "input"
2 Shape: "yolov3/Shape"
3 Const: "yolov3/strided_slice/stack"
4 Const: "yolov3/strided_slice/stack_1"
5 Const: "yolov3/strided_slice/stack_2"
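From what I understand, the _Cast error comes from the CombinedNonMaxSuppression post-processing part of the graph, which the UFF parser does not support. One possible workaround is to stop the UFF graph before those ops by naming earlier output nodes explicitly; a minimal sketch (the node name below is a placeholder, not the real name in my graph):

    import uff

    # Placeholder: the real pre-NMS output node name must be read from
    # the frozen graph (for example with Netron).
    OUTPUT_NODES = ['yolov3/detections']

    # Stopping at these nodes keeps unsupported ops such as _Cast and
    # CombinedNonMaxSuppression out of the UFF graph.
    uff.from_tensorflow_frozen_model(
        'output_frozen_model2.pb',
        output_nodes=OUTPUT_NODES,
        output_filename='yolov3.uff')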

I also trained a YOLOv3 model with PyTorch and hit the same Segmentation fault (core dumped) error. The sample is below.

import tensorrt as trt

def ONNX_build_engine(onnx_file_path):

    G_LOGGER = trt.Logger(trt.Logger.WARNING)
    EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(G_LOGGER) as builder, builder.create_network(EXPLICIT_BATCH) as network, trt.OnnxParser(network, G_LOGGER) as parser:
        print('1.1--Loading ONNX file from path {}...'.format(onnx_file_path))
        with open(onnx_file_path, 'rb') as model:
            print('1.2--Parsing ONNX model')
            # Check the parse result: a failed parse leaves the network
            # without outputs, and building it afterwards can segfault.
            if not parser.parse(model.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                return None
        print('1.3--Completed parsing of ONNX model')
        builder.max_batch_size = 100
        builder.max_workspace_size = 1 << 30
        print('1.4--Building an engine from file {}; this may take a while...'.format(onnx_file_path))
        # build_cuda_engine() takes only the network and uses the builder's
        # own settings (no separate config object in this API).
        engine = builder.build_cuda_engine(network)
        if engine is None:
            print('Engine build failed')
            return None
        serialized_engine = engine.serialize()
        with trt.Runtime(G_LOGGER) as runtime:
            runtime.deserialize_cuda_engine(serialized_engine)
        print('1.5--Completed creating Engine')
        return serialized_engine

Yup, got it running. Earlier I was using another module along with it; that module may have been causing the issue. Running only this object detection task worked for me. Thanks for the response @AastaLLL and @kayccc

Hi guo.feng,

Please open a new topic for your issue. Thanks.