Reproducible segmentation faults on restart of inference

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
Jetson Nano
• DeepStream Version
6.0
• JetPack Version (valid for Jetson only)
JetPack SDK version: # R32 (release), REVISION: 7.4, GCID: 33514132, BOARD: t210ref, EABI: aarch64, DATE: Fri Jun 9 04:25:08 UTC 2023

I have an inference.py script that runs fine. Whenever I start the script and stop it via Ctrl+C, everything works as expected.

Now I want to add a small API for start/stop. I can start the pipeline once and stop it once, but every subsequent “start” ends in a segmentation fault.
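Since the script stops cleanly on Ctrl+C when it runs on its own, one way to keep a start/stop API while avoiding an in-process restart is to launch the script as a child process on every start and stop it with SIGINT. The following is a minimal sketch under that assumption. The names ScriptRunner and inference_command, as well as the timeout value, are hypothetical and not part of the original inference.py.

```python
import signal
import subprocess
import sys


def inference_command(script="inference.py"):
    """Command line that runs the script with the current interpreter."""
    return [sys.executable, script]


class ScriptRunner:
    """Start and stop a script as a child process (hypothetical helper)."""

    def __init__(self, command, stop_timeout=10.0):
        self.command = list(command)
        self.stop_timeout = stop_timeout
        self.process = None

    def is_running(self):
        # poll() returns None while the child is still alive.
        return self.process is not None and self.process.poll() is None

    def start(self):
        """Launch a fresh child; return False if one is already running."""
        if self.is_running():
            return False
        self.process = subprocess.Popen(self.command)
        return True

    def stop(self):
        """Send SIGINT (like Ctrl+C); kill the child if it does not exit in time."""
        if not self.is_running():
            return None
        self.process.send_signal(signal.SIGINT)
        try:
            self.process.wait(timeout=self.stop_timeout)
        except subprocess.TimeoutExpired:
            self.process.kill()
            self.process.wait()
        return self.process.returncode
```

An API handler could then create `ScriptRunner(inference_command())` once and call `start()` and `stop()` on it. Each start gets a new process, so nothing from the previous run is reused. This does not explain the crash; it only helps to check whether the crash depends on restarting inside the same process.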

This is the log of my start sequence; the input is /dev/video0:

Using winsys: x11 
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
gstnvtracker: Batch processing is ON
gstnvtracker: Past frame output is ON
[NvMultiObjectTracker] Initialized
Deserialize yoloLayer plugin: yolo
0:00:11.644781964 22195   0x7f9853e230 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1900> [UID = 1]: deserialized trt engine from :/home/ubuntu/vx-ai/jetson-inference/models/primary-detector-nano/yolov7-tiny/model_b3_gpu0_fp16.engine
INFO: [Implicit Engine Info]: layers num: 5
0   INPUT  kFLOAT data            3x416x416       
1   OUTPUT kFLOAT num_detections  1               
2   OUTPUT kFLOAT detection_boxes 10647x4         
3   OUTPUT kFLOAT detection_scores 10647           
4   OUTPUT kFLOAT detection_classes 10647           

0:00:11.646134594 22195   0x7f9853e230 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2004> [UID = 1]: Use deserialized engine model: /home/ubuntu/vx-ai/jetson-inference/models/primary-detector-nano/yolov7-tiny/model_b3_gpu0_fp16.engine
0:00:12.036657142 22195   0x7f9853e230 INFO                 nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<nvinfer0> [UID 1]: Load new model:/tmp/tmpgh8zzwyw sucessfully
0:00:12.037355176 22195   0x7f98536b20 FIXME           videodecoder gstvideodecoder.c:933:gst_video_decoder_drain_out:<jpegdec0> Sub-class should implement drain()
2024-03-29 12:12:28,228 inference.py         INFO    : [eglglessink0] State changed from NULL to READY.
2024-03-29 12:12:28,229 inference.py         INFO    : [nvegltransform0] State changed from NULL to READY.
2024-03-29 12:12:28,229 inference.py         INFO    : [nvdsosd0] State changed from NULL to READY.
2024-03-29 12:12:28,229 inference.py         INFO    : [nvvideoconvert1] State changed from NULL to READY.
2024-03-29 12:12:28,230 inference.py         INFO    : [nvmultistreamtiler0] State changed from NULL to READY.
2024-03-29 12:12:28,230 inference.py         INFO    : [nvtracker0] State changed from NULL to READY.
2024-03-29 12:12:28,230 inference.py         INFO    : [nvinfer0] State changed from NULL to READY.
2024-03-29 12:12:28,231 inference.py         INFO    : [nvstreammux0] State changed from NULL to READY.
2024-03-29 12:12:28,231 inference.py         INFO    : [capsfilter1] State changed from NULL to READY.
2024-03-29 12:12:28,231 inference.py         INFO    : [nvvideoconvert0] State changed from NULL to READY.
2024-03-29 12:12:28,232 inference.py         INFO    : [videoflip0] State changed from NULL to READY.
2024-03-29 12:12:28,232 inference.py         INFO    : [jpegdec0] State changed from NULL to READY.
2024-03-29 12:12:28,232 inference.py         INFO    : [capsfilter0] State changed from NULL to READY.
2024-03-29 12:12:28,233 inference.py         INFO    : [v4l2src0] State changed from NULL to READY.
2024-03-29 12:12:28,233 inference.py         INFO    : [source-bin-0] State changed from NULL to READY.
2024-03-29 12:12:28,233 inference.py         INFO    : [pipeline0] State changed from NULL to READY.
2024-03-29 12:12:28,233 inference.py         INFO    : [nvegltransform0] State changed from READY to PAUSED.
2024-03-29 12:12:28,234 inference.py         INFO    : [nvdsosd0] State changed from READY to PAUSED.
2024-03-29 12:12:28,234 inference.py         INFO    : [nvvideoconvert1] State changed from READY to PAUSED.
2024-03-29 12:12:28,234 inference.py         INFO    : [nvmultistreamtiler0] State changed from READY to PAUSED.
2024-03-29 12:12:28,235 inference.py         INFO    : [nvtracker0] State changed from READY to PAUSED.
2024-03-29 12:12:28,235 inference.py         INFO    : [nvinfer0] State changed from READY to PAUSED.
2024-03-29 12:12:28,235 inference.py         INFO    : [nvstreammux0] State changed from READY to PAUSED.
2024-03-29 12:12:28,236 inference.py         INFO    : [capsfilter1] State changed from READY to PAUSED.
2024-03-29 12:12:28,236 inference.py         INFO    : [nvvideoconvert0] State changed from READY to PAUSED.
2024-03-29 12:12:28,236 inference.py         INFO    : [videoflip0] State changed from READY to PAUSED.
2024-03-29 12:12:28,237 inference.py         INFO    : [jpegdec0] State changed from READY to PAUSED.
2024-03-29 12:12:28,237 inference.py         INFO    : [capsfilter0] State changed from READY to PAUSED.
2024-03-29 12:12:28,237 inference.py         INFO    : [v4l2src0] State changed from READY to PAUSED.
2024-03-29 12:12:28,237 inference.py         INFO    : [source-bin-0] State changed from READY to PAUSED.
2024-03-29 12:12:28,238 inference.py         INFO    : [pipeline0] State changed from READY to PAUSED.
2024-03-29 12:12:28,238 inference.py         INFO    : [nvegltransform0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,238 inference.py         INFO    : [nvdsosd0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,239 inference.py         INFO    : [nvvideoconvert1] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,239 inference.py         INFO    : [nvmultistreamtiler0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,239 inference.py         INFO    : [nvtracker0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,240 inference.py         INFO    : [nvinfer0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,240 inference.py         INFO    : [nvstreammux0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,240 inference.py         INFO    : [capsfilter1] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,241 inference.py         INFO    : [nvvideoconvert0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,241 inference.py         INFO    : [videoflip0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,242 inference.py         INFO    : [jpegdec0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,242 inference.py         INFO    : [capsfilter0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,243 inference.py         INFO    : [v4l2src0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:28,243 inference.py         INFO    : [source-bin-0] State changed from PAUSED to PLAYING.
0:00:12.074213203 22195   0x7f98536b20 WARN          v4l2bufferpool gstv4l2bufferpool.c:790:gst_v4l2_buffer_pool_start:<v4l2src0:pool:src> Uncertain or not enough buffers, enabling copy threshold
0:00:13.024417113 22195   0x7f98536b20 FIXME           videodecoder gstvideodecoder.c:933:gst_video_decoder_drain_out:<jpegdec0> Sub-class should implement drain()
0:00:13.211542403 22195   0x7f98536b20 WARN                 v4l2src gstv4l2src.c:976:gst_v4l2src_create:<v4l2src0> lost frames detected: count = 1 - ts: 0:00:01.122992461
2024-03-29 12:12:31,076 inference.py         INFO    : [Stream 0] detection fps: 0.00 (0.00)
2024-03-29 12:12:31,140 inference.py         INFO    : [eglglessink0] State changed from READY to PAUSED.
2024-03-29 12:12:31,141 inference.py         INFO    : [eglglessink0] State changed from PAUSED to PLAYING.
2024-03-29 12:12:31,142 inference.py         INFO    : [pipeline0] State changed from PAUSED to PLAYING.

I stop the pipeline with self.pipeline.set_state(Gst.State.NULL), followed by self.loop.quit() (self.loop is a GLib.MainLoop()).

The log acknowledges the stop with:

2024-03-29 12:12:33,939 inference.py         INFO    : [nvegltransform0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,939 inference.py         INFO    : [nvdsosd0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,940 inference.py         INFO    : [nvvideoconvert1] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,940 inference.py         INFO    : [nvmultistreamtiler0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,941 inference.py         INFO    : [nvtracker0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,941 inference.py         INFO    : [nvinfer0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,942 inference.py         INFO    : [nvstreammux0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,943 inference.py         INFO    : [capsfilter1] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,943 inference.py         INFO    : [nvvideoconvert0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,944 inference.py         INFO    : [videoflip0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,944 inference.py         INFO    : [jpegdec0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,945 inference.py         INFO    : [capsfilter0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,945 inference.py         INFO    : [v4l2src0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,946 inference.py         INFO    : [source-bin-0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,947 inference.py         INFO    : [pipeline0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,947 inference.py         INFO    : [eglglessink0] State changed from PLAYING to PAUSED.
2024-03-29 12:12:33,948 inference.py         INFO    : [eglglessink0] State changed from PLAYING to READY.
2024-03-29 12:12:33,951 inference.py         INFO    : [nvegltransform0] State changed from PAUSED to READY.
2024-03-29 12:12:33,953 inference.py         INFO    : [nvdsosd0] State changed from PAUSED to READY.
2024-03-29 12:12:33,953 inference.py         INFO    : [nvvideoconvert1] State changed from PAUSED to READY.
2024-03-29 12:12:33,954 inference.py         INFO    : [nvmultistreamtiler0] State changed from PAUSED to READY.
[NvMultiObjectTracker] De-initialized
2024-03-29 12:12:33,965 inference.py         INFO    : [nvtracker0] State changed from PAUSED to READY.
2024-03-29 12:12:34,028 inference.py         INFO    : [nvinfer0] State changed from PAUSED to READY.
2024-03-29 12:12:34,031 inference.py         INFO    : [nvstreammux0] State changed from PAUSED to READY.
2024-03-29 12:12:34,032 inference.py         INFO    : [capsfilter1] State changed from PAUSED to READY.
2024-03-29 12:12:34,032 inference.py         INFO    : [nvvideoconvert0] State changed from PAUSED to READY.
2024-03-29 12:12:34,033 inference.py         INFO    : [videoflip0] State changed from PAUSED to READY.
2024-03-29 12:12:34,034 inference.py         INFO    : [jpegdec0] State changed from PAUSED to READY.
2024-03-29 12:12:34,034 inference.py         INFO    : [capsfilter0] State changed from PAUSED to READY.
2024-03-29 12:12:34,055 inference.py         INFO    : [v4l2src0] State changed from PAUSED to READY.
2024-03-29 12:12:34,056 inference.py         INFO    : [source-bin-0] State changed from PAUSED to READY.
2024-03-29 12:12:34,056 inference.py         INFO    : [pipeline0] State changed from PAUSED to READY.
2024-03-29 12:12:34,092 inference.py         INFO    : [eglglessink0] State changed from READY to NULL.
2024-03-29 12:12:34,093 inference.py         INFO    : [nvegltransform0] State changed from READY to NULL.
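In the excerpt above, only eglglessink0 and nvegltransform0 report a transition to NULL; the excerpt may simply be cut off, and bus messages can arrive late. To check this on a full log, a small helper can report the last state each element logged. This is a minimal sketch that assumes the "[name] State changed from X to Y." line format shown above; the function names are hypothetical.

```python
import re

# Matches lines such as "... : [nvinfer0] State changed from PAUSED to READY."
STATE_CHANGE = re.compile(
    r"\[(?P<element>[^\]]+)\] State changed from (?P<old>\w+) to (?P<new>\w+)\."
)


def last_states(log_lines):
    """Map each element name to the last state it reported reaching."""
    states = {}
    for line in log_lines:
        match = STATE_CHANGE.search(line)
        if match:
            states[match.group("element")] = match.group("new")
    return states


def not_in_state(log_lines, expected="NULL"):
    """Return the sorted names of elements whose last state is not `expected`."""
    return sorted(
        name
        for name, state in last_states(log_lines).items()
        if state != expected
    )
```

Running `not_in_state(open("stop.log"))` on a saved stop log (the file name is an assumption) would list every element that never logged a transition to NULL before the next start.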

A subsequent start shows this:

Using winsys: x11 
2024-03-29 12:12:40,566 inference.py         INFO    : [eglglessink1] State changed from NULL to READY.
2024-03-29 12:12:40,567 inference.py         INFO    : [nvegltransform1] State changed from NULL to READY.
2024-03-29 12:12:40,567 inference.py         INFO    : [nvdsosd1] State changed from NULL to READY.
2024-03-29 12:12:40,568 inference.py         INFO    : [nvvideoconvert3] State changed from NULL to READY.
2024-03-29 12:12:40,568 inference.py         INFO    : [nvmultistreamtiler1] State changed from NULL to READY.
2024-03-29 12:12:40,568 inference.py         INFO    : [nvtracker1] State changed from NULL to READY.
2024-03-29 12:12:40,568 inference.py         INFO    : [nvinfer1] State changed from NULL to READY.
2024-03-29 12:12:40,569 inference.py         INFO    : [nvstreammux1] State changed from NULL to READY.
2024-03-29 12:12:40,569 inference.py         INFO    : [capsfilter3] State changed from NULL to READY.
2024-03-29 12:12:40,569 inference.py         INFO    : [nvvideoconvert2] State changed from NULL to READY.
2024-03-29 12:12:40,570 inference.py         INFO    : [videoflip1] State changed from NULL to READY.
2024-03-29 12:12:40,570 inference.py         INFO    : [jpegdec1] State changed from NULL to READY.
2024-03-29 12:12:40,570 inference.py         INFO    : [capsfilter2] State changed from NULL to READY.
2024-03-29 12:12:40,635 inference.py         INFO    : [v4l2src1] State changed from NULL to READY.
2024-03-29 12:12:40,635 inference.py         INFO    : [source-bin-0] State changed from NULL to READY.
2024-03-29 12:12:40,636 inference.py         INFO    : [pipeline1] State changed from NULL to READY.
2024-03-29 12:12:40,636 inference.py         INFO    : [nvegltransform1] State changed from READY to PAUSED.
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
2024-03-29 12:12:40,639 inference.py         INFO    : [nvdsosd1] State changed from READY to PAUSED.
2024-03-29 12:12:40,644 inference.py         INFO    : [nvvideoconvert3] State changed from READY to PAUSED.
2024-03-29 12:12:40,645 inference.py         INFO    : [nvmultistreamtiler1] State changed from READY to PAUSED.
gstnvtracker: Batch processing is ON
gstnvtracker: Past frame output is ON
[NvMultiObjectTracker] Initialized
Segmentation fault (core dumped)

What can I do to narrow down the issue?

TIA

You can view the stack with the following command.

gdb -ex run --args python3 inference.py
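Besides gdb, Python’s standard faulthandler module can dump the Python tracebacks of all threads when the process receives SIGSEGV. It shows only Python frames, not the native frames inside a shared library, so it complements the gdb stack rather than replacing it. The sketch below assumes it is called at the very top of inference.py; the function name and the log path are hypothetical.

```python
import faulthandler


def enable_crash_traces(path="inference_crash.log"):
    """Dump the Python tracebacks of all threads to `path` on a fatal signal."""
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    # The caller must keep this file object open for as long as the
    # fault handler is enabled, so it is returned instead of closed.
    return log_file
```

The same handler can be enabled without code changes by starting the script with `python3 -X faulthandler inference.py`, in which case the tracebacks go to stderr.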

In addition, can you provide a sample based on test1.py/test2.py that reproduces the problem? I am not sure what causes it.

Thanks for the advice.

This is the output for a subsequent start after a stop:


Using winsys: x11 
[New Thread 0x7f0cff91f0 (LWP 8330)]
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
gstnvtracker: Batch processing is ON
gstnvtracker: Past frame output is ON
[NvMultiObjectTracker] Initialized
[New Thread 0x7f0d7fa1f0 (LWP 8331)]
[New Thread 0x7eed7fa1f0 (LWP 8332)]

Thread 43 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f0ffff1f0 (LWP 8328)]
0x0000007f82f6d454 in ?? () from /usr/lib/aarch64-linux-gnu/libnvinfer.so.8
(gdb) 

Anything I can inspect from here?
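At the (gdb) prompt, the standard commands `bt` and `thread apply all bt` print the backtrace of the crashing thread and of all threads. The frame above is shown as `??`, so the library probably has no debug symbols and the native part of the trace may stay limited. To capture the same information without typing at the prompt, gdb can run in batch mode. The following sketch only builds that command line; the function names are hypothetical.

```python
import shlex


def gdb_backtrace_command(script="inference.py", python="python3"):
    """Build a gdb command line that runs the script and prints backtraces."""
    return [
        "gdb", "-batch",              # exit after the listed commands
        "-ex", "run",                 # run until exit or a fatal signal
        "-ex", "bt",                  # backtrace of the stopped thread
        "-ex", "thread apply all bt", # backtraces of every thread
        "--args", python, script,
    ]


def format_command(command):
    """Render the argument list as a shell-safe string."""
    return " ".join(shlex.quote(part) for part in command)


if __name__ == "__main__":
    print(format_command(gdb_backtrace_command()))
```

The printed command can be pasted into a shell, with its output redirected to a file for posting.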

Could you please provide some info where I could find “test1.py/test2.py”?

I mean deepstream_test_1.py / deepstream_test_2.py.

Can you reproduce the problem based on these two samples?

I’m sorry, I must be missing something: I cannot find these files on the SD card. I flashed the card from your template and installed DS SDK 6.0 from apt. Maybe I’m missing some samples?

Do you mean these?

Yes, you should see these samples when installing pyds.

Thanks. Will give it a try and report.

I might be a bit rusty with all this after some years, but how do I install them? Installing the whl didn’t give me the sample sources. Is there a tutorial on how to install the Python samples?

EDIT: I got the sources for Py36 and 18.04 and copied them to /opt/nvidia/deepstream/deepstream-6.0/deepstream_python_apps-1.1.1. That seems to be the wrong path somehow…

EDIT2: OK, found it. It needs to be put under the samples dir.

Good. I cannot reproduce the crash with the USB cam sample.

I couldn’t even reproduce it with the slightly more complex “rtsp-in-out” sample, which comes closer to my use case. So it must be something in my code/model. I think this is a good trigger to start over :)

But I could reproduce it with the yolov7-tiny model. I guess you will say “Not supported” or so :)

If you use yolov7, you can refer to this example.

I think the problem may be related to your model.

Great. Thanks.

I can confirm that the segmentation fault no longer occurs with 6.4.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.
