Segmentation Fault in Custom YOLOv11 Parser with DeepStream 7.1 (Python Pipeline)

Details of Setup

  • Hardware Platform (Jetson / GPU): GPU
  • DeepStream Version: 7.1.0
  • TensorRT Version: 10.7.0.23-1+cuda12.6 amd64
  • NVIDIA GPU Driver Version (valid for GPU only): 560.35.03 (CUDA Version: 12.6)

Environment:

  • DeepStream 7.1
  • YOLOv11 custom parser compiled for NVIDIA TensorRT
  • Ubuntu 22.04 LTS, Dockerized deployment

Issue Type: Bug

Description of Issue:

We’re experiencing a recurring segmentation fault (signal 11) in a DeepStream 7.1 pipeline that uses a custom YOLOv11 parser compiled for TensorRT. The crash surfaces while MQTT messages are being published, but the backtrace points into the custom inference library.

Error Log Snippet:

2025-02-04T10:49:18.9161781Z ds-tracker-1   | Publish callback with reason code: Success.
2025-02-04T10:49:18.9561721Z ds-tracker-1   | [mosq_mqtt_log_callback] Client null sending PUBLISH (d0, q0, r0, m30638, 'APP_output', ... (1502 bytes))
2025-02-04T10:49:18.9562051Z ds-tracker-1   | Publish callback with reason code: Success.
2025-02-04T10:49:18.9786314Z ds-tracker-1   | [b9e1e42f68d5:21   :0:286] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x55f4434b8700)
2025-02-04T10:49:18.9961495Z ds-tracker-1   | [mosq_mqtt_log_callback] Client null sending PUBLISH (d0, q0, r0, m30639, 'APP_output', ... (1502 bytes))
2025-02-04T10:49:18.9961844Z ds-tracker-1   | Publish callback with reason code: Success.
2025-02-04T10:49:19.0189169Z ds-tracker-1   | ==== backtrace (tid:    286) ====
2025-02-04T10:49:19.0190037Z ds-tracker-1   |  0  /usr/lib/x86_64-linux-gnu/libucs.so.0(ucs_handle_error+0x2e4) [0x7f82877c6fc4]
2025-02-04T10:49:19.0190336Z ds-tracker-1   |  1  /usr/lib/x86_64-linux-gnu/libucs.so.0(+0x24fec) [0x7f82877cafec]
2025-02-04T10:49:19.0190725Z ds-tracker-1   |  2  /usr/lib/x86_64-linux-gnu/libucs.so.0(+0x251aa) [0x7f82877cb1aa]
2025-02-04T10:49:19.0191420Z ds-tracker-1   |  3  /usr/lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f8310844520]
2025-02-04T10:49:19.0191838Z ds-tracker-1   |  4  /home/ds_tracker/custom_libs/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so(+0x62265) [0x7f8249475265]
2025-02-04T10:49:19.0192413Z ds-tracker-1   |  5  /home/ds_tracker/custom_libs/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so(+0x62469) [0x7f8249475469]
2025-02-04T10:49:19.0192951Z ds-tracker-1   |  6  /home/ds_tracker/custom_libs/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so(NvDsInferParseYolo+0x34) [0x7f824947557b]
2025-02-04T10:49:19.0193740Z ds-tracker-1   |  7  /opt/nvidia/deepstream/deepstream/lib/libnvds_infer.so(_ZN9nvdsinfer19DetectPostprocessor19fillDetectionOutputERKSt6vectorI18NvDsInferLayerInfoSaIS2_EER24NvDsInferDetectionOutput+0xac) [0x7f828f987b6c]
2025-02-04T10:49:19.0194582Z ds-tracker-1   |  8  /opt/nvidia/deepstream/deepstream/lib/libnvds_infer.so(_ZN9nvdsinfer19DetectPostprocessor14parseEachBatchERKSt6vectorI18NvDsInferLayerInfoSaIS2_EER20NvDsInferFrameOutput+0x17) [0x7f828f964207]
2025-02-04T10:49:19.0195366Z ds-tracker-1   |  9  /opt/nvidia/deepstream/deepstream/lib/libnvds_infer.so(_ZN9nvdsinfer18InferPostprocessor15postProcessHostERNS_14NvDsInferBatchER27NvDsInferContextBatchOutput+0x76a) [0x7f828f96b72a]
2025-02-04T10:49:19.0196094Z ds-tracker-1   | 10  /opt/nvidia/deepstream/deepstream/lib/libnvds_infer.so(_ZN9nvdsinfer20NvDsInferContextImpl18dequeueOutputBatchER27NvDsInferContextBatchOutput+0x108) [0x7f828f9663a8]
2025-02-04T10:49:19.0196705Z ds-tracker-1   | 11  /opt/nvidia/deepstream/deepstream/lib/gst-plugins/libnvdsgst_infer.so(+0x1bd0d) [0x7f82904dfd0d]
2025-02-04T10:49:19.0197027Z ds-tracker-1   | 12  /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0(+0x89ac1) [0x7f830ffb0ac1]
2025-02-04T10:49:19.0197279Z ds-tracker-1   | 13  /usr/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3) [0x7f8310896ac3]
2025-02-04T10:49:19.0197560Z ds-tracker-1   | 14  /usr/lib/x86_64-linux-gnu/libc.so.6(+0x126850) [0x7f8310928850]
2025-02-04T10:49:19.0197759Z ds-tracker-1   | =================================
2025-02-04T10:49:32.4728772Z mqtt-broker-1  | 1738666172: Client auto-4B5A4D3B-FD27-C46B-F6DF-4A29B535CB62 closed its connection.
2025-02-04T10:49:32.5227365Z ds-tracker-1   | /opt/nvidia/deepstream/deepstream-7.1/entrypoint.sh: line 15:    21 Segmentation fault      (core dumped) /opt/nvidia/nvidia_entrypoint.sh $@
2025-02-04T10:49:33.7214370Z Aborting on container exit...
2025-02-04T10:49:33.7214740Z 
2025-02-04T10:49:33.7215944Z ds-tracker-1 exited with code 139
2025-02-04T10:49:33.7435201Z  Container vgtr3-ds-tracker-1  Stopping
2025-02-04T10:49:33.7528593Z  Container vgtr3-ds-tracker-1  Stopped
2025-02-04T10:49:33.7528861Z  Container vgtr3-redis-1  Stopping
2025-02-04T10:49:33.7529020Z  Container vgtr3-mqtt-broker-1  Stopping
2025-02-04T10:49:35.5677508Z  Container vgtr3-mqtt-broker-1  Stopped
2025-02-04T10:49:35.9499743Z  Container vgtr3-redis-1  Stopped
2025-02-04T10:49:36.1906090Z ##[error]Process completed with exit code 139.

Observations:

  1. The fault occurs only after many successful MQTT publishes, which suggests gradual memory corruption during bounding box parsing rather than an immediate logic error.
  2. The backtrace implicates NvDsInferParseYolo in the custom library, with the innermost faulting frame at offset 0x62265.

Requested Guidance:

  • Are there known issues with YOLOv11 output layer configurations in DeepStream 7.1?
  • What are best practices for debugging segmentation faults in custom parsers (e.g., heap-overflow checks, CUDA-GDB integration)? See the sketch below for the kind of checks we mean.
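
For reference, the kind of bounds checking we mean is sketched below. This is not our actual parser, just a minimal illustration against the NvDsInfer* types from nvdsinfer_custom_impl.h; the 6-floats-per-detection layout and the function name NvDsInferParseCustomYoloChecked are assumptions made for the sketch.

#include <vector>
#include "nvdsinfer_custom_impl.h"  // NvDsInfer* types, ships with DeepStream

extern "C" bool NvDsInferParseCustomYoloChecked(
    std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
    // Never dereference a missing layer or a null buffer.
    if (outputLayersInfo.empty() || outputLayersInfo[0].buffer == nullptr)
        return false;

    const NvDsInferLayerInfo &layer = outputLayersInfo[0];
    const float *data = static_cast<const float *>(layer.buffer);

    // Assumed layout: each detection is [cx, cy, w, h, confidence, classId].
    const unsigned int kValsPerDet = 6;
    // Derive the detection count from the allocated element count instead of
    // trusting a hard-coded engine value, so every read below stays in bounds.
    const unsigned int numDets = layer.inferDims.numElements / kValsPerDet;

    for (unsigned int i = 0; i < numDets; ++i) {
        const float *det = data + static_cast<size_t>(i) * kValsPerDet;
        const unsigned int classId = static_cast<unsigned int>(det[5]);
        // Guard the per-class threshold lookup as well.
        if (classId >= static_cast<unsigned int>(detectionParams.numClassesConfigured))
            continue;
        if (det[4] < detectionParams.perClassPreclusterThreshold[classId])
            continue;

        NvDsInferObjectDetectionInfo obj{};
        obj.classId = classId;
        obj.detectionConfidence = det[4];
        obj.left = det[0] - det[2] / 2.0f;
        obj.top = det[1] - det[3] / 2.0f;
        obj.width = det[2];
        obj.height = det[3];
        // Clamp to the network canvas so downstream code never sees
        // negative or out-of-frame boxes.
        if (obj.left < 0) obj.left = 0;
        if (obj.top < 0) obj.top = 0;
        if (obj.left + obj.width > networkInfo.width)
            obj.width = networkInfo.width - obj.left;
        if (obj.top + obj.height > networkInfo.height)
            obj.height = networkInfo.height - obj.top;
        objectList.push_back(obj);
    }
    return true;
}

CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomYoloChecked);

Beyond that, symbolizing the backtrace offsets helps: on a library built with -g, addr2line -f -e libnvdsinfer_custom_impl_Yolo.so 0x62265 maps the faulting frame to a source line, and rebuilding the library with -fsanitize=address usually turns delayed corruption into an immediate, precisely located abort.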

Additional Note:

  • We’ve validated the model’s ONNX conversion and TensorRT engine creation (no errors).

You can try using End2End with EfficientNMS, and use this parsing function for all YOLO models.
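
Whichever parsing function you pick, it is selected through the same two keys in the nvinfer config file. Illustrative excerpt only; the function name below is a placeholder for the one the linked reply provides, and the library path is taken from your backtrace:

[property]
# ...
# Placeholder name below: substitute the function from the linked reply.
parse-bbox-func-name=NvDsInferParseYoloE2E
custom-lib-path=/home/ds_tracker/custom_libs/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so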

Is this still a DeepStream issue that needs support? Thanks! Could you simplify the code to narrow down the issue, for example by temporarily removing the MQTT broker publishing? Which code causes the crash?

Yes, thank you for your help! The issue is resolved. We replaced NvDsInferParseYolo with NvDsInferParseYoloCuda in our YOLO inference config, among other configuration changes, and that fixed it for us. The pipeline now runs without any issues. The relevant change is sketched below.
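
For anyone who lands here later, the core change was one line in the nvinfer config. Sketch only: the path matches our backtrace, and our other configuration changes (made per the parser library's documentation) are not shown:

[property]
# ...
# was: parse-bbox-func-name=NvDsInferParseYolo
parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=/home/ds_tracker/custom_libs/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so

If you switch to the CUDA parser, also check the parser library's docs for companion keys it expects (e.g., disable-output-host-copy=1 in some implementations).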