Deepstream 7.1 Segmentation fault (core dumped)

Hi,
NVIDIA GeForce GTX 1650
deepstream:7.1-triton-multiarch
TensorRT version 10.3.0
Driver Version: 560.81
I am trying to build a pipeline with a resnet34_peoplenet_int8 PGIE model and a retinaface_resnet50 SGIE model, using sgie_src_pad_buffer_probe and osd_sink_pad_buffer_probe callbacks. The PGIE works as expected, since I based it on the deepstream sample app test2. I am encountering this crash:

# python3 multicam.py   file:///opt/nvidia/deepstream/deepstream-7.1/samples/streams/sample_1080p_h264.mp4   file:///opt/nvidia/deepstream/deepstream-7.1/samples/streams/sample-videos/face-demographics-walking-and-pause.mp4
[INFO] Creating pipeline...
[INFO] Received 2 source URIs:
[INFO] Creating nvstreammux...
[INFO] Creating nvstreammux...
[INFO] Creating source bin for stream 0...
[INFO] Linked source bin 0 to streammux
[INFO] Creating source bin for stream 1...
[INFO] Linked source bin 1 to streammux
[INFO] Creating processing elements and queues...
[INFO] Linking elements with queues...
[INFO] Pipeline elements linked successfully.
[INFO] Setting pipeline to PLAYING...
0:00:01.641733519 1476928 0x557a0064fa60 WARN                 nvinfer gstnvinfer.cpp:681:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 2]: Warning from NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1243> [UID = 2]: Warning, OpenCV has been deprecated. Using NMS for clustering instead of cv::groupRectangles with topK = 20 and NMS Threshold = 0.5
0:00:01.885364030 1476928 0x557a0064fa60 INFO                 nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2092> [UID = 2]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.1/samples/models/retinaface-onnx/weights/retinaface_resnet50.onnx_b2_gpu0_int8.engine
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:327 [FullDims Engine Info]: layers num: 4
0   INPUT  kFLOAT input           3x640x640       min: 1x3x640x640     opt: 2x3x640x640     Max: 2x3x640x640
1   OUTPUT kFLOAT bbox            16800x4         min: 0               opt: 0               Max: 0
2   OUTPUT kFLOAT confidence      16800x2         min: 0               opt: 0               Max: 0
3   OUTPUT kFLOAT landmark        16800x10        min: 0               opt: 0               Max: 0

0:00:01.885443561 1476928 0x557a0064fa60 INFO                 nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2195> [UID = 2]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.1/samples/models/retinaface-onnx/weights/retinaface_resnet50.onnx_b2_gpu0_int8.engine
0:00:01.894263501 1476928 0x557a0064fa60 WARN                 nvinfer gstnvinfer.cpp:1063:gst_nvinfer_start:<secondary1-nvinference-engine> warning: NvInfer asynchronous mode is applicable for secondaryclassifiers only. Turning off asynchronous mode
0:00:01.894788665 1476928 0x557a0064fa60 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<secondary1-nvinference-engine> [UID 2]: Load new model:face_detect_config.txt sucessfully
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
0:00:02.114490577 1476928 0x557a0064fa60 INFO                 nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2092> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.1/samples/models/Primary_Detector/resnet18_trafficcamnet_pruned.onnx_b2_gpu0_int8.engine
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:327 [FullDims Engine Info]: layers num: 3
0   INPUT  kFLOAT input_1:0       3x544x960       min: 1x3x544x960     opt: 2x3x544x960     Max: 2x3x544x960
1   OUTPUT kFLOAT output_cov/Sigmoid:0 4x34x60         min: 0               opt: 0               Max: 0
2   OUTPUT kFLOAT output_bbox/BiasAdd:0 16x34x60        min: 0               opt: 0               Max: 0

0:00:02.114564769 1476928 0x557a0064fa60 INFO                 nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2195> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.1/samples/models/Primary_Detector/resnet18_trafficcamnet_pruned.onnx_b2_gpu0_int8.engine
0:00:02.118990176 1476928 0x557a0064fa60 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary-inference> [UID 1]: Load new model:dstest2_pgie_config.txt sucessfully
[INFO] Pipeline is now playing.
[INFO] Starting GLib Main Loop. Press Ctrl+C to stop.
Failed to query video capabilities: Inappropriate ioctl for device
Failed to query video capabilities: Inappropriate ioctl for device
[DEBUG] Found video pad: video/x-raw, format=(string)Y444, width=(int)[ 1, 32768 ], height=(int)[ 1, 32768 ], framerate=(fraction)[ 0/1, 2147483647/1 ], pixel-aspect-ratio=(fraction)1/1; video/x-raw, format=(string)Y444_10LE, width=(int)[ 1, 32768 ], height=(int)[ 1, 32768 ], framerate=(fraction)[ 0/1, 2147483647/1 ], pixel-aspect-ratio=(fraction)1/1; video/x-raw, format=(string)P010_10LE, width=(int)[ 1, 32768 ], height=(int)[ 1, 32768 ], framerate=(fraction)[ 0/1, 2147483647/1 ], pixel-aspect-ratio=(fraction)1/1; video/x-raw, format=(string)NV12, width=(int)[ 1, 32768 ], height=(int)[ 1, 32768 ], framerate=(fraction)[ 0/1, 2147483647/1 ], pixel-aspect-ratio=(fraction)1/1, adding convert chain
[DEBUG] Found video pad: video/x-raw, format=(string)Y444, width=(int)[ 1, 32768 ], height=(int)[ 1, 32768 ], framerate=(fraction)[ 0/1, 2147483647/1 ], pixel-aspect-ratio=(fraction)1/1; video/x-raw, format=(string)Y444_10LE, width=(int)[ 1, 32768 ], height=(int)[ 1, 32768 ], framerate=(fraction)[ 0/1, 2147483647/1 ], pixel-aspect-ratio=(fraction)1/1; video/x-raw, format=(string)P010_10LE, width=(int)[ 1, 32768 ], height=(int)[ 1, 32768 ], framerate=(fraction)[ 0/1, 2147483647/1 ], pixel-aspect-ratio=(fraction)1/1; video/x-raw, format=(string)NV12, width=(int)[ 1, 32768 ], height=(int)[ 1, 32768 ], framerate=(fraction)[ 0/1, 2147483647/1 ], pixel-aspect-ratio=(fraction)1/1, adding convert chain
[DEBUG] Source bin pad linked via NVMM convert chain
[DEBUG] Source bin pad linked via NVMM convert chain
Segmentation fault (core dumped)

Parse function used to build the custom parser .so:

#include "nvdsinfer_custom_impl.h"
#include <algorithm>
#include <cmath>
#include <cstring>
#include <iostream>
#include <vector>

#define CONF_THRESH 0.1f    // Minimum face confidence to keep
#define NMS_THRESH  0.4f    // IoU threshold for NMS
#define VIS_THRESH  0.75f   // (optional) Additional threshold before output

struct Detection {
    float x1, y1, x2, y2;  // pixel coordinates
    float score;
};

static float iou(const Detection &a, const Detection &b)
{
    float x_left   = std::max(a.x1, b.x1);
    float y_top    = std::max(a.y1, b.y1);
    float x_right  = std::min(a.x2, b.x2);
    float y_bottom = std::min(a.y2, b.y2);

    if (x_right <= x_left || y_bottom <= y_top)
        return 0.0f;

    float inter_area = (x_right - x_left) * (y_bottom - y_top);
    float area_a     = (a.x2 - a.x1) * (a.y2 - a.y1);
    float area_b     = (b.x2 - b.x1) * (b.y2 - b.y1);
    return inter_area / (area_a + area_b - inter_area);
}

static void run_nms(std::vector<Detection> &input, float nms_thresh,
                    std::vector<Detection> &output)
{
    std::sort(input.begin(), input.end(),
              [](const Detection &a, const Detection &b) {
                  return a.score > b.score;
              });
    std::vector<bool> suppressed(input.size(), false);

    for (size_t i = 0; i < input.size(); ++i) {
        if (suppressed[i]) 
            continue;
        output.push_back(input[i]);
        for (size_t j = i + 1; j < input.size(); ++j) {
            if (suppressed[j]) 
                continue;
            float overlap = iou(input[i], input[j]);
            if (overlap > nms_thresh) {
                suppressed[j] = true;
            }
        }
    }
}

static bool NvDsInferParseRetinaface(
    std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
    const NvDsInferLayerInfo *bboxLayer       = nullptr;
    const NvDsInferLayerInfo *confLayer       = nullptr;
    const NvDsInferLayerInfo *landmarkLayer   = nullptr;

    for (auto const &layer : outputLayersInfo) {
        if (strcmp(layer.layerName, "bbox") == 0) {
            bboxLayer = &layer;
        } else if (strcmp(layer.layerName, "confidence") == 0) {
            confLayer = &layer;
        } else if (strcmp(layer.layerName, "landmark") == 0) {
            landmarkLayer = &layer;  // if you want landmarks later
        }
    }
    if (!bboxLayer || !confLayer) {
        std::cerr << "ERROR: Missing output layers (expect 'bbox' and 'confidence').\n";
        return false;
    }

    // 2. Extract pointers to raw buffers
    float *bboxData    = (float *)bboxLayer->buffer;       // shape: [16800 × 4]
    float *confData    = (float *)confLayer->buffer;       // shape: [16800 × 2]
    int numAnchors = bboxLayer->inferDims.d[0];  // (= 16800)

    std::vector<Detection> candidates;
    candidates.reserve(numAnchors);

    const float imgW = static_cast<float>(networkInfo.width);
    const float imgH = static_cast<float>(networkInfo.height);

    for (int i = 0; i < numAnchors; ++i) {
        float faceScore = confData[i * 2 + 1];
        if (faceScore < CONF_THRESH)
            continue;

        float nx1 = bboxData[i * 4 + 0];
        float ny1 = bboxData[i * 4 + 1];
        float nx2 = bboxData[i * 4 + 2];
        float ny2 = bboxData[i * 4 + 3];

        // Convert to pixel space and clip
        float x1 = std::max(0.0f, std::min(nx1 * imgW, imgW - 1.0f));
        float y1 = std::max(0.0f, std::min(ny1 * imgH, imgH - 1.0f));
        float x2 = std::max(0.0f, std::min(nx2 * imgW, imgW - 1.0f));
        float y2 = std::max(0.0f, std::min(ny2 * imgH, imgH - 1.0f));

        // Make sure box is valid
        if (x2 <= x1 || y2 <= y1)
            continue;

        // Store candidate
        Detection d;
        d.x1    = x1;
        d.y1    = y1;
        d.x2    = x2;
        d.y2    = y2;
        d.score = faceScore;
        candidates.push_back(d);
    }

    if (candidates.empty())
        return true;  // no detections above CONF_THRESH

    // 5. Run NMS (IoU > NMS_THRESH)
    std::vector<Detection> keep;
    keep.reserve(candidates.size());
    run_nms(candidates, NMS_THRESH, keep);

    // 6. Convert final boxes into DeepStream’s NvDsInferObjectDetectionInfo
    for (auto &det : keep) {
        if (det.score < VIS_THRESH)
            continue;

        NvDsInferObjectDetectionInfo out;
        out.classId            = 0;  // "face" is class 0
        out.detectionConfidence = det.score;
        out.left   = static_cast<unsigned int>(det.x1);
        out.top    = static_cast<unsigned int>(det.y1);
        out.width  = static_cast<unsigned int>(det.x2 - det.x1);
        out.height = static_cast<unsigned int>(det.y2 - det.y1);
        objectList.push_back(out);
    }

    return true;
}

extern "C" bool NvDsInferParseCustomRetinaface(
    std::vector<NvDsInferLayerInfo> const  &outputLayersInfo,
    NvDsInferNetworkInfo      const       &networkInfo,
    NvDsInferParseDetectionParams const  &detectionParams,
    std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
    return NvDsInferParseRetinaface(
        outputLayersInfo, networkInfo, detectionParams, objectList);
}

// Compile-time check that the function signature matches what DeepStream expects.
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomRetinaface);

dstest2_pgie_config.txt.txt (803 Bytes) (PGIE config)
face_detect_config.txt (914 Bytes) (SGIE config)

I think this may be related to the SGIE's post-processing library. You can debug it by running gdb --args ./your_application parameters, then viewing the stack after the crash.
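For reference, a hypothetical gdb session for this kind of crash might look like the following (the script name and URI are taken from the original post; bt prints the native stack once the segfault is hit):

```shell
# Run the app under gdb; pass the same arguments as the failing command.
gdb -q --args python3 multicam.py \
    file:///opt/nvidia/deepstream/deepstream-7.1/samples/streams/sample_1080p_h264.mp4
(gdb) run
# ... pipeline runs until Segmentation fault ...
(gdb) bt            # native backtrace; shows which .so faulted
(gdb) info sharedlibrary   # confirm the custom parser .so is loaded
```

If the top frames fall inside your custom parser library, that confirms the post-processing hypothesis; if they are in a DeepStream or GStreamer library, the probe callbacks are the more likely suspect.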

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.