I am trying to build a multi-stage inference pipeline in C++ using DeepStream 7.0. My goal is to run a PeopleNet PGIE, followed by an SCRFD Face Detector as an SGIE using nvinferserver.
However, I’m currently stuck because DeepStream is ignoring my SGIE configuration.
DeepStream logs a warning that it is resetting my SGIE from PROCESS_MODE_CLIP_OBJECTS to PROCESS_MODE_FULL_FRAME. This causes the pipeline to either crash intermittently or run without showing any face detections, even though my custom parser logs show it is finding proposals.
The exact same model (scrfd), Triton setup, and custom parser library work perfectly when I configure them as a PGIE. The issue only occurs when trying to run it as an SGIE.
Here is the warning log:
validatePluginConfig:<sgie_facenet> warning: Configuration file process_mode reset to: PROCESS_MODE_FULL_FRAME.
Setup Details:
- Hardware Platform: Jetson Orin Nano
- DeepStream SDK: 7.0
- JetPack Version: 6.0
- TensorRT Version: 8.6.2.3
- Application: C++ (using DeepStream Service Maker, main.cpp)
- Operating System: Ubuntu 22.04 (via JetPack 6.0)
- CUDA Version: bundled with JetPack 6.0
- Triton Server: running locally (gRPC, localhost:8001)
What I’m Using:
- PGIE Model: PeopleNet (unique-id=1, person class id=0)
- SGIE Model: SCRFD (unique-id=3), served via Triton
- Model Input Size: 640×640
- Model Output Tensors: 9 outputs total; for strides 8/16/32, each stride has score, bbox, and kps tensors:
  - Stride 8 (12800 anchors): score_8 (e.g., Sigmoid output, 12800x1), bbox_8 (e.g., graph output 451, shape 12800x4), kps_8 (e.g., graph output 454, shape 12800x10)
  - Stride 16 (3200 anchors): score_16 (e.g., graph output 471, shape 3200x1), bbox_16 (e.g., graph output 474, shape 3200x4), kps_16 (e.g., graph output 477, shape 3200x10)
  - Stride 32 (800 anchors): score_32 (e.g., graph output 494, shape 800x1), bbox_32 (e.g., graph output 497, shape 800x4), kps_32 (e.g., graph output 500, shape 800x10)
- Custom Parser: IInferCustomProcessor implementation (CreateInferServerCustomProcess) in a .so file, loaded via custom-lib-path in the nvinferserver config.
Issue Description:
I am using the C++ Service Maker application (main.cpp). My process is:
1. I start the service.
2. I add a PeopleNet pool as the PGIE (unique-id=1).
3. I set the environment variable DS_ENABLE_FACENET=1. This flag in my main.cpp triggers adding the nvinferserver element as an SGIE (sgie_facenet, unique-id=3).
4. The config_sgie_scrfd.txt file is loaded, which explicitly sets process_mode: PROCESS_MODE_CLIP_OBJECTS and operate_on_gie_id: 1.
5. DeepStream immediately prints the warning that it has reset process_mode to FULL_FRAME.
6. My custom parser (scrfd_custom_process.cpp) logs confirm it is not receiving the OPTION_NVDS_OBJ_META_LIST from the inOptions and is falling back to full-frame mode.
7. This fallback is unstable, leading to intermittent crashes or no face detections being attached.
As stated, the model and parser logic are correct, as they work perfectly in PGIE mode. The issue seems to be how nvinferserver handles (or ignores) the input_control settings when used as an SGIE.
DeepStream Pipeline (main.cpp snippet)
This is the logic in my main.cpp that adds the SGIE when DS_ENABLE_FACENET=1.
C++
// ... PGIE (uid=1) and optional Tracker/Analytics are added first ...
// Optional FaceNet SGIE (secondary) inserted after PGIE/tracker
std::string facenet_cfg_path;
if (enable_facenet) { // This is true
const char* facenet_cfg_env = std::getenv("DS_FACENET_CONFIG_FILE");
// This resolves to "config_sgie_scrfd.txt"
facenet_cfg_path = resolve_path(facenet_cfg_env ? std::string(facenet_cfg_env) : std::string("custom_logic/config_sgie_facenet.txt"));
std::cout << "[" << model_name << "] Enabling FaceNet SGIE with config: " << facenet_cfg_path << "\n";
// Add the nvinferserver element for the SGIE
p->add("nvinferserver", "sgie_facenet");
auto sgie_elem = (*p)["sgie_facenet"];
sgie_elem.set("config-file-path", facenet_cfg_path.c_str());
sgie_elem.set("unique-id", 3);
// We rely 100% on the config file for input_control (process_mode, operate_on)
}
// ... Pipeline linking logic follows ...
config_sgie_scrfd.txt (Gst-nvinferserver configuration)
This is the config file for the sgie_facenet element.
Protobuf text format
# Secondary GIE (SGIE) for SCRFD face detection via Triton
infer_config {
unique_id: 3
gpu_ids: [0]
max_batch_size: 4
backend {
triton {
model_name: "scrfd"
version: -1
grpc { url: "localhost:8001" }
}
}
preprocess {
network_format: IMAGE_FORMAT_BGR
tensor_order: TENSOR_ORDER_LINEAR
maintain_aspect_ratio: 1
symmetric_padding: 1
normalize { scale_factor: 1.0 }
}
postprocess { other {} }
extra {
# DS7 uses a misspelled key for the custom processor symbol
custom_process_funcion: "CreateInferServerCustomProcess"
}
custom_lib {
path: "/data/triton_models/dynamic_ds_service_cpp/build/libnvdsinferserver_custom_process_scrfd.so"
}
}
input_control {
# Run as SGIE on object crops coming from PGIE/tracker
process_mode: PROCESS_MODE_CLIP_OBJECTS
interval: 0
# Operate on PGIE with unique_id=1 (PeopleNet/YOLO primary)
operate_on_gie_id: 1
# Operate on person class from PGIE (PeopleNet/COCO typically class 0)
operate_on_class_ids: [0]
}
output_control {
output_tensor_meta: false
}
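For comparison, the behavior I am expecting from input_control is what the classic gst-nvinfer plugin expresses with the following keys (shown only to illustrate my intent; this fragment is for nvinfer, not nvinferserver, and is untested in this pipeline):

```ini
# Equivalent SGIE settings in classic gst-nvinfer config syntax (for comparison)
[property]
unique-id=3
process-mode=2            # 2 = secondary: operate on object crops, not full frames
operate-on-gie-id=1       # consume detections from the PGIE with unique-id=1
operate-on-class-ids=0    # person class from PeopleNet
```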
scrfd_custom_process.cpp (custom parser)
This is the full custom processor code for nvinferserver. It includes the fallback logic to manually filter by PGIE ROIs when OPTION_NVDS_OBJ_META_LIST is missing.
C++
/*
* SCRFD custom postprocess for DeepStream nvinferserver (Triton backend)
* - Assumes 9 outputs: for strides 8/16/32, each has {score(1), bbox(4), kps(10)} tensors
* - Generates anchors (2 per location), decodes boxes/keypoints, rescales to frame, NMS, and
* attaches NvDsObjectMeta (class "face") to NvDsFrameMeta.
*
* Notes:
* - Keep thresholds/runtime tunables here for quick iteration.
* - For DS7, ensure bInferDone is set so nvtracker knows this is a detector frame.
* - Landmarks: decoded but not attached as a special DS7 field (no stable landmark field). You
* can attach them as user meta if needed later.
*/
#include <string.h>
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>
#include "nvdsinferserver/infer_custom_process.h"
#include "nvbufsurface.h"
#include "nvdsmeta.h"
namespace dsis = nvdsinferserver;
struct Anchor { float cx, cy, w, h; };
struct Proposal {
NvDsInferObjectDetectionInfo rect; // left, top, width, height, classId
float score;
float landmarks[10]; // 5 points (x,y)
};
static float iou_rect(const NvDsInferObjectDetectionInfo& a,
const NvDsInferObjectDetectionInfo& b) {
float x1 = std::max(a.left, b.left);
float y1 = std::max(a.top, b.top);
float x2 = std::min(a.left + a.width, b.left + b.width);
float y2 = std::min(a.top + a.height, b.top + b.height);
float iw = std::max(0.0f, x2 - x1);
float ih = std::max(0.0f, y2 - y1);
float inter = iw * ih;
float ua = a.width * a.height + b.width * b.height - inter;
return ua > 0.0f ? inter / ua : 0.0f;
}
static std::vector<Proposal> nms(std::vector<Proposal>& boxes, float iou_thr) {
std::vector<Proposal> out;
if (boxes.empty()) return out;
std::sort(boxes.begin(), boxes.end(), [](auto& x, auto& y){ return x.score > y.score; });
std::vector<char> sup(boxes.size(), 0);
for (size_t i = 0; i < boxes.size(); ++i) {
if (sup[i]) continue;
out.push_back(boxes[i]);
for (size_t j = i + 1; j < boxes.size(); ++j) {
if (!sup[j] && iou_rect(boxes[i].rect, boxes[j].rect) > iou_thr) sup[j] = 1;
}
}
return out;
}
static void generate_anchors(int net_w, int net_h, int stride,
const std::vector<float>& sizes,
std::vector<Anchor>& anchors) {
int fw = net_w / stride;
int fh = net_h / stride;
anchors.reserve(anchors.size() + (size_t)fw * fh * sizes.size());
for (int y = 0; y < fh; ++y) {
for (int x = 0; x < fw; ++x) {
float cx = (x + 0.5f) * stride;
float cy = (y + 0.5f) * stride;
for (float s : sizes) anchors.push_back({cx, cy, s, s});
}
}
}
class NvInferServerCustomProcess : public dsis::IInferCustomProcessor {
public:
~NvInferServerCustomProcess() override = default;
void supportInputMemType(dsis::InferMemType& type) override { type = dsis::InferMemType::kCpu; }
bool requireInferLoop() const override { return false; }
NvDsInferStatus extraInputProcess(const std::vector<dsis::IBatchBuffer*>&,
std::vector<dsis::IBatchBuffer*>&,
const dsis::IOptions*) override { return NVDSINFER_SUCCESS; }
void notifyError(NvDsInferStatus) override {}
NvDsInferStatus inferenceDone(const dsis::IBatchArray* outputs,
const dsis::IOptions* inOptions) override;
private:
NvDsInferStatus attachObjMeta(const dsis::IOptions* inOptions,
const std::vector<Proposal>& props,
uint32_t batchIdx);
const std::vector<std::string> kLabels = { "face" };
// SCRFD usually uses two anchors per location; we'll keep two "priors" per cell
// but decode boxes as distances (ltrb) from center.
const std::vector<float> kSizesS8 = {1.0f, 1.0f};
const std::vector<float> kSizesS16 = {1.0f, 1.0f};
const std::vector<float> kSizesS32 = {1.0f, 1.0f};
};
NvDsInferStatus NvInferServerCustomProcess::inferenceDone(
const dsis::IBatchArray* outputs, const dsis::IOptions* inOptions)
{
if (!outputs || outputs->getSize() != 9) {
std::cerr << "[scrfd] Expected 9 outputs, got "
<< (outputs ? (int)outputs->getSize() : -1) << "\n";
return NVDSINFER_CUSTOM_LIB_FAILED;
}
// Determine effective batch size using metadata preference order:
// SGIE object meta list > frame meta list > surface params list > stream ids > tensor batch size
std::vector<uint64_t> streamIds; inOptions->getValueArray(OPTION_NVDS_SREAM_IDS, streamIds);
std::vector<NvBufSurfaceParams*> surfParamsList; inOptions->getValueArray(OPTION_NVDS_BUF_SURFACE_PARAMS_LIST, surfParamsList);
std::vector<NvDsFrameMeta*> frameMetaList; inOptions->getValueArray(OPTION_NVDS_FRAME_META_LIST, frameMetaList);
std::vector<NvDsObjectMeta*> objMetaList; if (inOptions->hasValue(OPTION_NVDS_OBJ_META_LIST)) {
inOptions->getValueArray(OPTION_NVDS_OBJ_META_LIST, objMetaList);
}
uint32_t B = 0;
if (!objMetaList.empty()) B = static_cast<uint32_t>(objMetaList.size());
else if (!frameMetaList.empty()) B = static_cast<uint32_t>(frameMetaList.size());
else if (!surfParamsList.empty()) B = static_cast<uint32_t>(surfParamsList.size());
else if (!streamIds.empty()) B = static_cast<uint32_t>(streamIds.size());
else if (outputs && outputs->getSize() > 0) {
auto* buf0 = outputs->getBuffer(0);
if (buf0) B = buf0->getBatchSize();
}
if (B == 0) {
// No frames in this callback (can happen in DS), nothing to do.
return NVDSINFER_SUCCESS;
}
// Tunables
// Per-stride confidence thresholds and candidate caps to curb over-detections.
// If SGIE requested but receives no ROIs (DS fallback to full-frame), relax thresholds a bit.
bool hasObjListKey = inOptions->hasValue(OPTION_NVDS_OBJ_META_LIST);
bool sgie_no_roi = hasObjListKey && objMetaList.empty();
int64_t stage_uid_dbg = 0; inOptions->getInt(OPTION_NVDS_UNIQUE_ID, stage_uid_dbg);
bool is_secondary = (stage_uid_dbg != 1);
bool sgie_fullframe = (is_secondary && !hasObjListKey);
std::cerr << "[scrfd] mode: hasObjListKey=" << (hasObjListKey?1:0)
<< " sgie_no_roi=" << (sgie_no_roi?1:0)
<< " sgie_fullframe=" << (sgie_fullframe?1:0)
<< " B=" << B << "\n";
float conf_thr_s[3]; // s8, s16, s32
if (sgie_no_roi || sgie_fullframe) {
conf_thr_s[0] = 0.45f; conf_thr_s[1] = 0.40f; conf_thr_s[2] = 0.30f; // more permissive for full-frame
} else {
conf_thr_s[0] = 0.75f; conf_thr_s[1] = 0.65f; conf_thr_s[2] = 0.50f; // tighter for PGIE/true ROI
}
const int topk_s[3] = {150, 100, 50}; // fewer candidates per stride before decode
const float nms_iou = 0.30f; // stricter NMS
const float min_face = (sgie_no_roi || sgie_fullframe) ? 16.0f : 24.0f; // allow smaller faces in full-frame sec
const int max_total_out = (sgie_no_roi || sgie_fullframe) ? 250 : 150; // allow a few more in fallback/full-frame sec
const int net_w = 640, net_h = 640; // assumed
const int strides[3] = {8, 16, 32};
// Anchors
std::vector<Anchor> A8, A16, A32;
generate_anchors(net_w, net_h, strides[0], kSizesS8, A8);
generate_anchors(net_w, net_h, strides[1], kSizesS16, A16);
generate_anchors(net_w, net_h, strides[2], kSizesS32, A32);
const std::vector<const std::vector<Anchor>*> AGRIDS = { &A8, &A16, &A32 };
for (uint32_t b = 0; b < B; ++b) {
std::vector<Proposal> props;
// Frame dimensions always refer to the original full frame (for clamping and offsets)
bool sgie_mode = (!objMetaList.empty());
uint32_t fidx = (!frameMetaList.empty() ? (sgie_mode ? 0u : std::min<uint32_t>(b, frameMetaList.size()-1)) : 0u);
float frame_w = (float)net_w, frame_h = (float)net_h;
if (b < surfParamsList.size() && surfParamsList[b]) {
frame_w = (float)surfParamsList[b]->width;
frame_h = (float)surfParamsList[b]->height;
} else if (!frameMetaList.empty() && frameMetaList[fidx]) {
// Prefer source frame dimensions from frame meta when surface params are unavailable
frame_w = (float)frameMetaList[fidx]->source_frame_width;
frame_h = (float)frameMetaList[fidx]->source_frame_height;
}
// For SGIE (clip objects), compute mapping with respect to the parent object's ROI
bool sgie_mode_roi = (!objMetaList.empty() && b < objMetaList.size() && objMetaList[b]);
float roi_x = 0.0f, roi_y = 0.0f, roi_w = frame_w, roi_h = frame_h;
if (sgie_mode_roi) {
const NvDsObjectMeta* parent = objMetaList[b];
const NvOSD_RectParams& pr = parent->rect_params;
roi_x = pr.left; roi_y = pr.top; roi_w = pr.width; roi_h = pr.height;
// Clamp to frame bounds defensively
roi_x = std::max(0.0f, std::min(roi_x, frame_w));
roi_y = std::max(0.0f, std::min(roi_y, frame_h));
roi_w = std::max(1.0f, std::min(roi_w, frame_w - roi_x));
roi_h = std::max(1.0f, std::min(roi_h, frame_h - roi_y));
}
// Preprocess mapping (maintain_aspect_ratio & symmetric_padding) from ROI to network
float r = std::min((float)net_w / roi_w, (float)net_h / roi_h);
float pad_x = (net_w - roi_w * r) * 0.5f;
float pad_y = (net_h - roi_h * r) * 0.5f;
for (int s = 0; s < 3; ++s) {
const int stride = strides[s];
const auto& anchors = *AGRIDS[s];
int N = (int)anchors.size();
auto* out_sc = outputs->getBuffer(s*3 + 0);
auto* out_bb = outputs->getBuffer(s*3 + 1);
auto* out_kp = outputs->getBuffer(s*3 + 2);
if (!out_sc || !out_bb || !out_kp) {
std::cerr << "[scrfd] missing output buffer at scale index " << s << "\n";
continue;
}
// Compute per-frame element counts from tensor dims to avoid overruns if shapes differ
auto elems_from_dims = [](const dsis::IBatchBuffer* buf) -> int {
auto d = buf->getBufDesc().dims;
long long prod = 1;
for (int i = 0; i < d.numDims; ++i) prod *= std::max(1, d.d[i]);
if (prod <= 0 || prod > INT32_MAX) return 0;
return (int)prod;
};
const int elems_sc = elems_from_dims(out_sc);
const int elems_bb = elems_from_dims(out_bb);
const int elems_kp = elems_from_dims(out_kp);
int n_from_buf = std::min({ elems_sc, (elems_bb > 0 ? elems_bb/4 : 0), (elems_kp > 0 ? elems_kp/10 : 0) });
if (n_from_buf <= 0) {
std::cerr << "[scrfd] invalid tensor shapes at scale " << s
<< " elems_sc=" << elems_sc << " elems_bb=" << elems_bb
<< " elems_kp=" << elems_kp << "\n";
continue;
}
if (N > n_from_buf) {
std::cerr << "[scrfd] anchor count(" << N << ") > buffer N(" << n_from_buf
<< ") at stride s" << stride << "; capping to prevent OOB\n";
N = n_from_buf;
}
// Batchless-output tolerant indexing: prefer b when buffer reports a batch, else index 0
auto select_index = [&](const dsis::IBatchBuffer* buf, const char* name, bool& ok) -> uint32_t {
uint32_t bs = buf->getBatchSize();
if (bs == 0) return 0; // treat as implicit batch-1
if (b >= bs) {
std::cerr << "[scrfd] batch index " << b << " out of range (" << bs << ") for " << name
<< " at scale " << s << "\n";
ok = false;
return 0;
}
return b;
};
// One-time shape print for sanity (before any early-returns)
static bool kPrintedShapes = false;
if (!kPrintedShapes) {
auto ds = out_sc->getBufDesc(); auto db = out_bb->getBufDesc(); auto dk = out_kp->getBufDesc();
auto pd = [&](const char* tag, const dsis::IBatchBuffer* buf, const auto& d){
std::cerr << "[scrfd] tensor " << tag << " dims=" << d.dims.numDims << " [";
for (int i=0;i<d.dims.numDims;++i){ std::cerr << d.dims.d[i] << (i+1<d.dims.numDims?",":""); }
std::cerr << "] batchReported=" << buf->getBatchSize() << "\n"; };
pd("score", out_sc, ds); pd("bbox", out_bb, db); pd("kps", out_kp, dk);
kPrintedShapes = true;
}
bool idx_ok = true;
uint32_t idx_sc = select_index(out_sc, "score", idx_ok);
uint32_t idx_bb = select_index(out_bb, "bbox", idx_ok);
uint32_t idx_kp = select_index(out_kp, "kps", idx_ok);
if (!idx_ok) continue;
const float* scores_base = static_cast<const float*>(out_sc->getBufPtr(idx_sc));
const float* bboxes_base = static_cast<const float*>(out_bb->getBufPtr(idx_bb));
const float* kpses_base = static_cast<const float*>(out_kp->getBufPtr(idx_kp));
if (!scores_base || !bboxes_base || !kpses_base) {
std::cerr << "[scrfd] null tensor ptr(s) at scale " << s << "\n";
continue;
}
// Adjust pointers for implicit-batch layout (concatenated per-frame) when batchReported==0
const uint32_t bs_sc = out_sc->getBatchSize();
const uint32_t bs_bb = out_bb->getBatchSize();
const uint32_t bs_kp = out_kp->getBatchSize();
const int num_cells = n_from_buf; // anchors per frame at this stride from buffer
const int step_sc = elems_sc; // scores per frame elements
const int step_bb = elems_bb; // bbox floats per frame elements
const int step_kp = elems_kp; // kps floats per frame elements
const float* scores = scores_base + ((bs_sc == 0) ? (int)b * step_sc : 0);
const float* bboxes = bboxes_base + ((bs_bb == 0) ? (int)b * step_bb : 0);
const float* kpses = kpses_base + ((bs_kp == 0) ? (int)b * step_kp : 0);
// Debug: summarize score distribution for this stride once per frame
float max_sc = 0.0f; int cnt_gt_01 = 0, cnt_gt_thr = 0;
for (int i = 0; i < N; ++i) {
float sc = scores[i];
if (sc > max_sc) max_sc = sc;
if (sc > 0.10f) ++cnt_gt_01;
if (sc > conf_thr_s[s]) ++cnt_gt_thr;
}
std::cerr << "[scrfd] b=" << b << " s" << stride
<< " N=" << N
<< " max_sc=" << max_sc
<< " gt0.1=" << cnt_gt_01
<< " gt_thr=" << cnt_gt_thr << "\n";
// Preselect top-K candidates above stride-specific threshold
std::vector<std::pair<float,int>> cand;
cand.reserve(std::min(N, topk_s[s]));
const float thr = conf_thr_s[s];
for (int i = 0; i < N; ++i) {
float score = scores[i];
if (score >= thr) cand.emplace_back(score, i);
}
// If none meet the threshold in fallback mode, still take the top-K by score to allow detections
if (cand.empty() && (sgie_no_roi || sgie_fullframe)) {
cand.reserve(std::min(N, topk_s[s]));
for (int i = 0; i < N; ++i) cand.emplace_back(scores[i], i);
}
if ((int)cand.size() > topk_s[s]) {
std::partial_sort(cand.begin(), cand.begin() + topk_s[s], cand.end(),
[](const auto& x, const auto& y){ return x.first > y.first; });
cand.resize(topk_s[s]);
} else {
std::sort(cand.begin(), cand.end(), [](const auto& x, const auto& y){ return x.first > y.first; });
}
for (const auto& kv : cand) {
int i = kv.second;
float score = kv.first;
const Anchor& a = anchors[i];
const float* bd = bboxes + i*4;
// Decode as distances from center (ltrb)
float left = a.cx - bd[0] * stride;
float top = a.cy - bd[1] * stride;
float right = a.cx + bd[2] * stride;
float bottom = a.cy + bd[3] * stride;
float w = std::max(0.0f, right - left);
float h = std::max(0.0f, bottom - top);
const float* kd = kpses + i*10;
float lm[10];
for (int k = 0; k < 5; ++k) {
lm[k*2] = a.cx + kd[k*2] * stride;
lm[k*2+1] = a.cy + kd[k*2+1] * stride;
}
Proposal p{}; p.score = score; p.rect.classId = 0;
// Rescale to ROI coordinates and then offset into full-frame coordinates
float rl = (left - pad_x) / r;
float rt = (top - pad_y) / r;
float rw = w / r;
float rh = h / r;
// Map into full-frame
float fl = roi_x + rl;
float ft = roi_y + rt;
p.rect.left = std::max(0.0f, fl);
p.rect.top = std::max(0.0f, ft);
p.rect.width = std::min(frame_w - p.rect.left, rw);
p.rect.height = std::min(frame_h - p.rect.top, rh);
for (int k = 0; k < 5; ++k) {
float lx = roi_x + (lm[k*2] - pad_x) / r;
float ly = roi_y + (lm[k*2+1] - pad_y) / r;
p.landmarks[k*2] = std::min(frame_w, std::max(0.0f, lx));
p.landmarks[k*2+1] = std::min(frame_h, std::max(0.0f, ly));
}
if (p.rect.width >= min_face && p.rect.height >= min_face) props.emplace_back(p);
}
}
std::cerr << "[scrfd] b=" << b << " props_pre_nms=" << props.size() << "\n";
auto finals = nms(props, nms_iou);
if ((int)finals.size() > max_total_out) {
std::partial_sort(finals.begin(), finals.begin() + max_total_out, finals.end(),
[](const Proposal& a, const Proposal& b){ return a.score > b.score; });
finals.resize(max_total_out);
}
std::cerr << "[scrfd] b=" << b << " props_post_nms=" << finals.size() << "\n";
if (attachObjMeta(inOptions, finals, b) != NVDSINFER_SUCCESS) {
std::cerr << "[scrfd] attachObjMeta failed for batch index " << b << "\n";
return NVDSINFER_CUSTOM_LIB_FAILED;
}
}
return NVDSINFER_SUCCESS;
}
NvDsInferStatus NvInferServerCustomProcess::attachObjMeta(
const dsis::IOptions* inOptions,
const std::vector<Proposal>& props,
uint32_t batchIdx)
{
NvDsBatchMeta* batchMeta = nullptr;
if (!inOptions->hasValue(OPTION_NVDS_BATCH_META) ||
inOptions->getObj(OPTION_NVDS_BATCH_META, batchMeta) != NVDSINFER_SUCCESS ||
!batchMeta) {
return NVDSINFER_CUSTOM_LIB_FAILED;
}
std::vector<NvDsFrameMeta*> frameMetaList;
inOptions->getValueArray(OPTION_NVDS_FRAME_META_LIST, frameMetaList);
std::vector<NvDsObjectMeta*> objMetaList;
if (inOptions->hasValue(OPTION_NVDS_OBJ_META_LIST)) {
inOptions->getValueArray(OPTION_NVDS_OBJ_META_LIST, objMetaList);
}
int64_t unique_id = 0; inOptions->getInt(OPTION_NVDS_UNIQUE_ID, unique_id);
// Resolve target frame meta robustly for both PGIE (full-frame) and SGIE (clip-objects)
NvDsFrameMeta* frameMeta = nullptr;
// For PGIE and SGIE, frameMetaList is expected to be aligned with the batch indices
if (batchIdx < frameMetaList.size()) frameMeta = frameMetaList[batchIdx];
if (!frameMeta) {
std::cerr << "[scrfd] attachObjMeta: missing frameMeta for batchIdx=" << batchIdx
<< " (frames=" << frameMetaList.size() << ", objs=" << objMetaList.size() << ")\n";
return NVDSINFER_CUSTOM_LIB_FAILED;
}
// Decide if this invocation is secondary (SGIE) vs primary (PGIE).
// Heuristics:
// - DS sets OPTION_NVDS_OBJ_META_LIST for SGIE clip-objects calls (key present; may be empty)
// - PGIE in this pipeline uses unique_id == 1
// Only treat "zero ROIs" as an SGIE condition if OPTION_NVDS_OBJ_META_LIST is actually present.
const bool hasObjListKey = inOptions->hasValue(OPTION_NVDS_OBJ_META_LIST);
int64_t stage_uid = 0; inOptions->getInt(OPTION_NVDS_UNIQUE_ID, stage_uid);
const bool likelyPGIE = (stage_uid == 1);
const bool isSecondary = (!likelyPGIE);
const bool sgie_mode = (hasObjListKey && !objMetaList.empty());
const bool sgie_fullframe = (isSecondary && !hasObjListKey);
if (hasObjListKey && objMetaList.empty()) {
// Many DS7 builds reset SGIE to full-frame if clip-objects cannot be honored; in that case,
// DeepStream still sets the OBJ_META_LIST key but leaves it empty. Proceed as full-frame to
// avoid silently dropping detections.
std::cerr << "[scrfd] attachObjMeta: SGIE zero ROIs (uid=" << stage_uid
<< ") ; FALLBACK to full-frame attach (default)\n";
}
// If secondary full-frame (key absent), emulate operate-on by filtering proposals to parent ROIs
std::vector<Proposal> filtered_props;
filtered_props.reserve(props.size());
if (sgie_fullframe && frameMeta) {
// Collect person ROIs from PGIE (class_id==0)
std::vector<NvOSD_RectParams> parent_rois;
for (NvDsMetaList* l = frameMeta->obj_meta_list; l != nullptr; l = l->next) {
NvDsObjectMeta* om = (NvDsObjectMeta*)l->data;
if (!om) continue;
if (om->class_id == 0) parent_rois.push_back(om->rect_params);
}
if (!parent_rois.empty()) {
for (const auto& p : props) {
float cx = p.rect.left + 0.5f * p.rect.width;
float cy = p.rect.top + 0.5f * p.rect.height;
bool inside = false;
for (const auto& pr : parent_rois) {
if (cx >= pr.left && cx <= pr.left + pr.width &&
cy >= pr.top && cy <= pr.top + pr.height) { inside = true; break; }
}
if (inside) filtered_props.push_back(p);
}
std::cerr << "[scrfd] attachObjMeta: filtered by PGIE ROIs: in=" << props.size()
<< " out=" << filtered_props.size() << "\n";
}
}
const std::vector<Proposal>& use_props = (!filtered_props.empty() ? filtered_props : props);
if (frameMetaList.empty() || (!frameMetaList[0])) {
std::cerr << "[scrfd] attachObjMeta: no valid frameMeta available; skipping attach (sgie="
<< (sgie_mode?1:0) << ")\n";
return NVDSINFER_SUCCESS;
}
for (const auto& p : use_props) {
NvDsObjectMeta* om = nvds_acquire_obj_meta_from_pool(batchMeta);
om->unique_component_id = unique_id;
om->confidence = p.score;
om->object_id = UNTRACKED_OBJECT_ID;
om->class_id = p.rect.classId; // 0 => face
NvOSD_RectParams& r = om->rect_params;
r.left = p.rect.left; r.top = p.rect.top;
r.width = p.rect.width; r.height = p.rect.height;
r.border_width = 2; r.has_bg_color = 0;
r.border_color = (NvOSD_ColorParams){0, 1, 0, 1};
// Minimal text/label to avoid any allocation/free issues inside OSD
NvOSD_TextParams& t = om->text_params;
om->obj_label[0] = '\0';
t.display_text = nullptr;
t.font_params.font_name = nullptr;
t.font_params.font_size = 0;
t.set_bg_clr = 0;
t.x_offset = 0; t.y_offset = 0;
// Ensure no mask meta is attached (no-op for this DS version)
// Important meta
om->detector_bbox_info.org_bbox_coords.left = r.left;
om->detector_bbox_info.org_bbox_coords.top = r.top;
om->detector_bbox_info.org_bbox_coords.width = r.width;
om->detector_bbox_info.org_bbox_coords.height = r.height;
// Validate rect to avoid downstream crashes
if (!(std::isfinite(r.left) && std::isfinite(r.top) && std::isfinite(r.width) && std::isfinite(r.height)) ||
r.width <= 0.0f || r.height <= 0.0f ||
r.left >= frameMeta->source_frame_width || r.top >= frameMeta->source_frame_height) {
// Skip invalid rect; best-effort: do not attach; DS will reclaim meta when frame ends
continue;
}
// Choose frame meta: SGIE uses first (single) frame; PGIE uses batch-aligned frame
NvDsFrameMeta* tgtFrame = frameMeta;
if (sgie_mode && !frameMetaList.empty() && frameMetaList[0]) {
tgtFrame = frameMetaList[0];
} else if (!sgie_mode && batchIdx < frameMetaList.size() && frameMetaList[batchIdx]) {
tgtFrame = frameMetaList[batchIdx];
}
// Final clamp to frame bounds (integer-safe) before attach
r.left = std::max(0.0f, std::min(r.left, (float)tgtFrame->source_frame_width - 1.0f));
r.top = std::max(0.0f, std::min(r.top, (float)tgtFrame->source_frame_height - 1.0f));
r.width = std::max(1.0f, std::min(r.width, (float)tgtFrame->source_frame_width - r.left));
r.height = std::max(1.0f, std::min(r.height, (float)tgtFrame->source_frame_height - r.top));
std::cerr << "[scrfd] attach: uid=" << stage_uid << " sgie=" << (sgie_mode?1:0)
<< " frame=" << (void*)tgtFrame
<< " rect=[" << r.left << "," << r.top << "," << r.width << "," << r.height << "]"
<< " conf=" << om->confidence << "\n";
// Attach to frame (no parent) under meta lock for maximum stability
nvds_acquire_meta_lock(batchMeta);
nvds_add_obj_meta_to_frame(tgtFrame, om, nullptr);
nvds_release_meta_lock(batchMeta);
std::cerr << "[scrfd] attach: ok\n";
if (stage_uid == 1) {
tgtFrame->bInferDone = TRUE; // Only mark PGIE (uid=1) as detector frame
}
}
return NVDSINFER_SUCCESS;
}
extern "C" dsis::IInferCustomProcessor* CreateInferServerCustomProcess(
const char* /*config*/, uint32_t /*configLen*/) {
std::cerr << "[scrfd] CreateInferServerCustomProcess() called\n";
return new NvInferServerCustomProcess();
}
What I Have Tried:
- Hardening the parser: My scrfd_custom_process.cpp is already built to handle the FULL_FRAME fallback. It manually fetches the PGIE person ROIs from frameMeta->obj_meta_list and filters the face detections. This is unstable and often crashes or fails to attach.
- Safe meta attachment: Using nvds_acquire_meta_lock / nvds_release_meta_lock in the parser when calling nvds_add_obj_meta_to_frame.
- Validating rects: Clamping all final coordinates to be within the frame dimensions before attaching metadata.
- g_object_set: I previously tried to set process-mode and operate-on-gie-id via g_object_set on the nvinferserver element in C++, but DeepStream warned these properties are not supported (which is why I am relying 100% on the config file).
Question:
1. What is the correct way to make nvinferserver respect PROCESS_MODE_CLIP_OBJECTS when used as an SGIE? Why is it being reset to FULL_FRAME?
2. Is this a known limitation in DeepStream 7.0 for nvinferserver (compared to the standard nvinfer plugin)?
3. Under what conditions should my IInferCustomProcessor receive the OPTION_NVDS_OBJ_META_LIST? Is its absence expected once process_mode is reset?
4. Given the crashes, is my fallback logic or metadata attachment (nvds_add_obj_meta_to_frame) incorrect for an SGIE? Should I be attaching to a parent object instead of the frame?
Please guide me on how to properly configure this nvinferserver SGIE!
Any suggestions or examples will be greatly appreciated.
Thank you!