Custom YOLOv8n-face and FER Model Integration into DeepStream

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): RTX 4060
• DeepStream Version: 7.1
• JetPack Version (valid for Jetson only):
• TensorRT Version: 10.3
• NVIDIA GPU Driver Version (valid for GPU only): 12.6
• Issue Type (questions, new requirements, bugs):

I have successfully implemented the YOLOv8n-face model in a DeepStream pipeline using this repo: GitHub - marcoslucianops/DeepStream-Yolo-Face: NVIDIA DeepStream SDK 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 application for YOLO-Face models

Now I wanted to extend it further, so I took the emotion model from the official repo: GitHub - JustinShenk/fer: Facial Expression Recognition with a deep neural network as a PyPI package

and I converted the model into ONNX format, then fixed its dynamic batch dimension using this code:
import onnx
from onnx import shape_inference

# Load the ONNX model
model_path = "model.onnx"
model = onnx.load(model_path)

# Modify dynamic dimensions
for input_tensor in model.graph.input:
    if input_tensor.type.tensor_type.shape.dim[0].dim_param == "unk__189":
        input_tensor.type.tensor_type.shape.dim[0].dim_param = "batch_size"
        input_tensor.type.tensor_type.shape.dim[0].dim_value = 1  # Replace with the desired batch size

# Infer shapes
inferred_model = shape_inference.infer_shapes(model)

# Save the updated ONNX model
onnx.save(inferred_model, "updated_model.onnx")
print("Updated ONNX model saved as updated_model.onnx")

and then, due to a shape issue, I changed the input shape of the model using this code:
import numpy as np
import onnx
import onnx_graphsurgeon as gs

# Load the model
graph = gs.import_onnx(onnx.load("updated_model.onnx"))

# Find the input tensor
input_tensor = graph.inputs[0]

# Transpose from NHWC to NCHW
graph.inputs[0].shape = [1, 1, 64, 64]
graph.inputs[0].dtype = np.float32

# Save the modified model
onnx.save(gs.export_onnx(graph), "emotion_classifier_transposed.onnx")

After doing everything, I checked my ONNX model in Netron, and these are the model's specifications:

Inputs
name: input_1
tensor: float32[1,1,64,64]

Outputs
name: output_0
tensor: float32[1,7]
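
For a quick offline sanity check of the export (a minimal onnxruntime sketch; illustrative only, not part of the DeepStream pipeline), a dummy 1x1x64x64 input should come back as a 1x7 score vector:

# Offline check of the exported classifier (file name from the step above).
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("emotion_classifier_transposed.onnx")
inp = sess.get_inputs()[0]
print(inp.name, inp.shape)  # expected: input_1 [1, 1, 64, 64]

# Dummy grayscale crop scaled to [0, 1], mirroring net-scale-factor = 1/255
dummy = np.random.randint(0, 256, (1, 1, 64, 64)).astype(np.float32) / 255.0
scores = sess.run(None, {inp.name: dummy})[0]
print(scores.shape, scores)  # expected: (1, 7) emotion scores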

Here is the config which I created for the SGIE:

emotion_classifier_sgie_config.txt (849 Bytes)

Now when I try to run the pipeline, I'm getting this error, with only a black display showing:
root@AAM-LAPTOP-027:/opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face# python3 deepstream.py --source=file:///opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/footage.mp4 --config-infer=config_infer_primary_yoloV8_face.txt --config-sgie=emotion_classifier_sgie_config.txt --streammux-batch-size=1 --streammux-width=1920 --streammux-height=1080 --gpu-id=0 --fps-interval=5
/opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/deepstream.py:247: DeprecationWarning: Gst.Element.get_request_pad is deprecated
streammux_sink_pad = streammux.get_request_pad(pad_name)
Unknown or legacy key specified ‘enable-custom-parser’ for group [property]
Unknown or legacy key specified ‘custom-parser-name’ for group [property]
0:00:00.619980464 2144 0x55e27e069d30 WARN nvinfer gstnvinfer.cpp:681:gst_nvinfer_logger: NvDsInferContext[UID 2]: Warning from NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1243> [UID = 2]: Warning, OpenCV has been deprecated. Using NMS for clustering instead of cv::groupRectangles with topK = 20 and NMS Threshold = 0.5
0:00:00.620304722 2144 0x55e27e069d30 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2106> [UID = 2]: Trying to create engine from model files
WARNING: …/nvdsinfer/nvdsinfer_model_builder.cpp:922 INT8 calibration file not specified. Trying FP16 mode.
0:00:31.599669273 2144 0x55e27e069d30 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2138> [UID = 2]: serialize cuda engine to file: /opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/emotion_classifier_transposed.onnx_b1_gpu0_fp16.engine successfully
Implicit layer support has been deprecated
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:327 [Implicit Engine Info]: layers num: 0

0:00:31.924694784 2144 0x55e27e069d30 INFO nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus: [UID 2]: Load new model:emotion_classifier_sgie_config.txt sucessfully
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
0:00:32.088319474 2144 0x55e27e069d30 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2092> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/yolov8n-face.onnx_b1_gpu0_fp32.engine
Implicit layer support has been deprecated
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:327 [Implicit Engine Info]: layers num: 0

0:00:32.088364480 2144 0x55e27e069d30 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2195> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/yolov8n-face.onnx_b1_gpu0_fp32.engine
0:00:32.092202563 2144 0x55e27e069d30 INFO nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus: [UID 1]: Load new model:config_infer_primary_yoloV8_face.txt sucessfully

Failed to query video capabilities: Inappropriate ioctl for device
0:00:32.911286588 2144 0x55e3135d7200 ERROR nvinfer gstnvinfer.cpp:678:gst_nvinfer_logger: NvDsInferContext[UID 2]: Error in NvDsInferContextImpl::parseBoundingBox() <nvdsinfer_context_impl_output_parsing.cpp:60> [UID = 2]: Could not find output coverage layer for parsing objects
0:00:32.911313751 2144 0x55e3135d7200 ERROR nvinfer gstnvinfer.cpp:678:gst_nvinfer_logger: NvDsInferContext[UID 2]: Error in NvDsInferContextImpl::fillDetectionOutput() <nvdsinfer_context_impl_output_parsing.cpp:736> [UID = 2]: Failed to parse bboxes
Segmentation fault (core dumped)

As I'm new to this framework and still learning, I would really appreciate it if someone experienced could point out where I'm making mistakes and how to do it right.

In emotion_classifier_sgie_config.txt, please set the following configurations. Please find the explanation in the documentation and in the open-source code at /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvinfer.
network-type=1
process-mode=2
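
For illustration, these two keys belong in the [property] group of the SGIE config (other keys in the group stay as they are):

[property]
# ... existing SGIE settings ...
network-type=1   # 1 = classifier network
process-mode=2   # 2 = secondary mode, operate on objects detected by the PGIE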


Yes, the stream is now running, but I'm only seeing the face detections and no emotions at all. I think there is something wrong with my custom parser for emotions. Can you please take a look and help me out here?
Here is the NvDsInferParseEmotion.cpp content:

/*
 * Custom Parser for Emotion Classification in DeepStream
 * Based on NvDsInferParseEmotion.cpp
 * Edited by [Your Name]
 */

#include <algorithm>
#include <iostream>
#include <vector>

#include "nvdsinfer_custom_impl.h"

// Define the parse function prototype
extern "C" bool
NvDsInferParseEmotion(std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
                      NvDsInferNetworkInfo const& networkInfo,
                      NvDsInferParseDetectionParams const& detectionParams,
                      std::vector<NvDsInferInstanceMaskInfo>& objectList);

// Utility function to clamp values
static float clamp(float val, float min_val, float max_val) {
    return std::max(min_val, std::min(val, max_val));
}

static bool
NvDsInferParseCustomEmotion(std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
                            NvDsInferNetworkInfo const& networkInfo,
                            NvDsInferParseDetectionParams const& detectionParams,
                            std::vector<NvDsInferInstanceMaskInfo>& objectList)
{
    if (outputLayersInfo.empty()) {
        std::cerr << "ERROR: Could not find output layer in emotion parsing" << std::endl;
        return false;
    }

    const NvDsInferLayerInfo& output = outputLayersInfo[0];

    const uint outputSize = output.inferDims.d[0];  // Batch size, should be 1
    const uint numClasses = output.inferDims.d[1];  // 7 emotion classes

// Ensure the output buffer is not null
if (output.buffer == nullptr) {
    std::cerr << "ERROR: Output buffer is null" << std::endl;
    return false;
}

// Assuming output buffer is float32[1,7]
const float* outputBuffer = (const float*)(output.buffer);

// Loop through each batch (only 1 in this case)
for (uint b = 0; b < outputSize; ++b) {
    // Find the class with the highest probability
    float maxProb = 0.0f;
    int maxClass = -1;
    for (uint c = 0; c < numClasses; ++c) {
        float prob = outputBuffer[b * numClasses + c];
        if (prob > maxProb) {
            maxProb = prob;
            maxClass = c;
        }
    }

    // Apply threshold
    if (maxProb < detectionParams.perClassPreclusterThreshold[0]) { // Assuming a single threshold
        continue;
    }

    // Clamp the probability to [0,1]
    maxProb = clamp(maxProb, 0.0f, 1.0f);

    // Create an instance mask info for the emotion classification
    NvDsInferInstanceMaskInfo emoInfo;
    emoInfo.classId = maxClass;
    emoInfo.detectionConfidence = maxProb;
    // Since this is classification, bounding box info is not required
    emoInfo.left = 0;
    emoInfo.top = 0;
    emoInfo.width = 0;
    emoInfo.height = 0;
    emoInfo.mask = nullptr;
    emoInfo.mask_width = 0;
    emoInfo.mask_height = 0;
    emoInfo.mask_size = 0;

    objectList.push_back(emoInfo);
}

return true;

}

extern "C" bool
NvDsInferParseEmotion(std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
                      NvDsInferNetworkInfo const& networkInfo,
                      NvDsInferParseDetectionParams const& detectionParams,
                      std::vector<NvDsInferInstanceMaskInfo>& objectList)
{
    return NvDsInferParseCustomEmotion(outputLayersInfo, networkInfo, detectionParams, objectList);
}

CHECK_CUSTOM_INSTANCE_MASK_PARSE_FUNC_PROTOTYPE(NvDsInferParseEmotion);

Update: No classifier meta list for object ID 0
No classifier meta list for object ID 1
No classifier meta list for object ID 0
No classifier meta list for object ID 1
No classifier meta list for object ID 0
No classifier meta list for object ID 1
^Z
[7]+ Stopped python3 deepstream.py --source=file:///opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/footage.mp4 --config-infer=config_infer_primary_yoloV8_face.txt --config-sgie=emotion_classifier_sgie_config.txt --streammux-batch-size=1 --streammux-width=1920 --streammux-height=1080 --gpu-id=0 --fps-interval=5

This is the only output I'm seeing in the logs; the display shows only the detected faces.

  1. Please make sure the preprocessing configurations of the SGIE are correct.
  2. If you comment out custom-lib-path and custom-parser-name, will the app output a classification? nvinfer will then use the default postprocessing function.
  3. If using a custom postprocessing function: since the SGIE is a classification model, please set parse-classifier-func-name instead of custom-parser-name (a minimal sketch of the expected parser prototype follows below).
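
For reference, a minimal sketch of a classifier-style parser matching the prototype expected by parse-classifier-func-name (the function name here is illustrative; NvDsInferAttribute and the CHECK macro come from nvdsinfer_custom_impl.h, and the flat 7-class indexing is an assumption for this model):

// Illustrative sketch only: a softmax-style classifier parser for the 7 emotions.
#include <cstring>
#include <string>
#include <vector>
#include "nvdsinfer_custom_impl.h"

extern "C" bool
NvDsInferClassifyEmotion(std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
                         NvDsInferNetworkInfo const& networkInfo,
                         float classifierThreshold,
                         std::vector<NvDsInferAttribute>& attrList,
                         std::string& descString)
{
    if (outputLayersInfo.empty() || outputLayersInfo[0].buffer == nullptr)
        return false;

    const float* probs = (const float*) outputLayersInfo[0].buffer;
    const unsigned int numClasses = 7;  // assumed for the float32[1,7] output above

    unsigned int best = 0;
    for (unsigned int c = 1; c < numClasses; ++c)
        if (probs[c] > probs[best]) best = c;

    if (probs[best] < classifierThreshold)
        return true;  // nothing confident enough, attach no attribute

    NvDsInferAttribute attr;
    attr.attributeIndex = 0;                   // single attribute: emotion
    attr.attributeValue = best;                // index into the label file
    attr.attributeConfidence = probs[best];
    attr.attributeLabel = strdup("emotion");   // placeholder label string
    attrList.push_back(attr);
    descString.append("emotion");
    return true;
}
CHECK_CUSTOM_CLASSIFIER_PARSE_FUNC_PROTOTYPE(NvDsInferClassifyEmotion);

The library built from such a file would then be referenced with custom-lib-path and parse-classifier-func-name in the SGIE config.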

I'm still looking for help here. I have done everything, but I'm still not able to get the classification metadata attached to the final output.
Here is my SGIE config:
[property]
gpu-id=0
net-scale-factor=0.003921568
onnx-file=emotion_classifier_transposed.onnx
model-engine-file=emotion_classifier_transposed.onnx_b1_gpu0_fp16.engine
labelfile-path=emotion_labels.txt
batch-size=1
num-detected-classes=7
input-object-min-width=64
input-object-min-height=64
output-blob-names=output_0
network-mode=1
process-mode=2
model-color-format=2
gpu-id=0
gie-unique-id=2
operate-on-gie-id=1
operate-on-class-ids=0
is-classifier=1
classifier-async-mode=1
classifier-threshold=0.4
#scaling-filter=0
#scaling-compute-hw=0

[parser]
enable=1

And here are the logs I'm getting:
root@AAM-LAPTOP-027:/opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face# python3 deepstream.py --source=file:///opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/women2.mp4 --config-infer=config_infer_primary_yoloV8_face.txt --config-sgie=emotion_classifier_sgie_config.txt
/opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/deepstream.py:300: DeprecationWarning: Gst.Element.get_request_pad is deprecated
streammux_sink_pad = streammux.get_request_pad(pad_name)
0:00:00.309059365 3811 0x564c8e72bd10 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2092> [UID = 2]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/emotion_classifier_transposed.onnx_b1_gpu0_fp16.engine
Implicit layer support has been deprecated
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:327 [Implicit Engine Info]: layers num: 0

0:00:00.309128073 3811 0x564c8e72bd10 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2195> [UID = 2]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/emotion_classifier_transposed.onnx_b1_gpu0_fp16.engine
0:00:00.311320799 3811 0x564c8e72bd10 INFO nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus: [UID 2]: Load new model:emotion_classifier_sgie_config.txt sucessfully
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
0:00:00.464960065 3811 0x564c8e72bd10 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2092> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/yolov8n-face.onnx_b1_gpu0_fp32.engine
Implicit layer support has been deprecated
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:327 [Implicit Engine Info]: layers num: 0

0:00:00.465001608 3811 0x564c8e72bd10 INFO nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2195> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-7.1/sources/DeepStream-Yolo-Face/yolov8n-face.onnx_b1_gpu0_fp32.engine
0:00:00.468016720 3811 0x564c8e72bd10 INFO nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus: [UID 1]: Load new model:config_infer_primary_yoloV8_face.txt sucessfully

Failed to query video capabilities: Inappropriate ioctl for device

(python3:3811): GStreamer-CRITICAL **: 08:08:17.134: gst_debug_log_valist: assertion ‘category != NULL’ failed

(python3:3811): GStreamer-CRITICAL **: 08:08:17.134: gst_debug_log_valist: assertion ‘category != NULL’ failed

(python3:3811): GStreamer-CRITICAL **: 08:08:17.134: gst_debug_log_valist: assertion ‘category != NULL’ failed

(python3:3811): GStreamer-CRITICAL **: 08:08:17.134: gst_debug_log_valist: assertion ‘category != NULL’ failed
[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[Tracker Probe] Object ID=0, class_id=0

In the screenshot, there is "face 0 angry" on the video, so it seems the SGIE is working. Do you mean the result "angry" is not correct? Did you verify the model with other tools? 1. Please make sure the preprocessing configurations of the SGIE are consistent with your testing in other tools.


No, the "angry" in the screenshot is just picking up the 0th index from the label file. I have tested on multiple videos and it gives me the same label, and if I replace that index with some other label, then it shows whatever label is present at that index.
This is not actually the real output of the model; it is just picking up the value from the 0th index for no reason.

And in the logs you can see the classification data is not being attached.

[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0

  1. If using network-mode=0 and classifier-threshold=0.1, will the app output the correct classification?
  2. nvinfer does not support that "[parser]" setting. Since your nvinfer configuration has no custom-lib-path and custom-parser-name, nvinfer will use the default postprocessing function parseAttributesFromSoftmaxLayers in ClassifyPostprocessor::fillClassificationOutput of \opt\nvidia\deepstream\deepstream\sources\libs\nvdsinfer\nvdsinfer_context_impl_output_parsing.cpp. You can add a log there to check whether all 7 classification probabilities are low (a sketch of such a log follows below).
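
For example, a log along these lines could be added there (an illustrative sketch only; "probs" and "numClasses" are placeholders for whatever buffer pointer and class count that function actually uses):

// Illustrative debug print inside parseAttributesFromSoftmaxLayers.
printf("[parseAttributesFromSoftmaxLayers] Debug: numClasses = %u\n", numClasses);
for (unsigned int c = 0; c < numClasses; ++c)
    printf("Class %u => Probability = %g\n", c, probs[c]);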

By using network-mode=0 and classifier-threshold=0.1, I still get the same results.

I only enabled the [parser] group to see if it would pick up any data from the default parser, but I'm still not able to see any emotion detection:

[SGIE Probe] Object ID=0, class_id=0

Here I can see that the metadata from the face detection model is being sent to the SGIE, but the SGIE is reporting this: [SGIE Probe] No classifier meta for object ID=0

How did you do this? nvinfer does not support this "[parser]" configuration.

How did you add this log? Please add the probe function on the src pad of the SGIE.
Where is the model's preprocessing introduction? How do you know that net-scale-factor is 0.003921568?


The preprocessing steps are already defined in the SGIE config: the color mode is grayscale and the input size is 64x64. I confirmed the scale factor of 0.003921568 by checking the model in offline mode, dividing by 255.
Enabling or disabling the parser does not make any difference anyway.

I defined the logs in the main deepstream.py pipeline to see whether the PGIE metadata is successfully being sent to the SGIE and being read by the SGIE, but after the SGIE I'm not seeing any classification output data.

Here is my deepstream.py pipeline code for reference:
#!/usr/bin/env python3

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

import os
import sys
import time
import argparse
import platform
from ctypes import *

sys.path.append('/opt/nvidia/deepstream/deepstream/lib')
import pyds

MAX_ELEMENTS_IN_DISPLAY_META = 16

# Define class IDs
PGIE_CLASS_ID_FACE = 0  # Add this line

# Global Variables
SOURCE = ''
CONFIG_INFER = ''
CONFIG_SGIE = ''
STREAMMUX_BATCH_SIZE = 1
STREAMMUX_WIDTH = 1920
STREAMMUX_HEIGHT = 1080
GPU_ID = 0
PERF_MEASUREMENT_INTERVAL_SEC = 5

# Skeleton for facial landmarks (if needed)

skeleton = [[16, 14], [14, 12], [17, 15], [15, 13], [12, 13], [6, 12], [7, 13],
[6, 7], [6, 8], [7, 9], [8, 10], [9, 11], [2, 3], [1, 2],
[1, 3], [2, 4], [3, 5], [4, 6], [5, 7]]

start_time = time.time()
fps_streams = {}

class GETFPS:
    def __init__(self, stream_id):
        global start_time
        self.start_time = start_time
        self.is_first = True
        self.frame_count = 0
        self.stream_id = stream_id
        self.total_fps_time = 0
        self.total_frame_count = 0

    def get_fps(self):
        end_time = time.time()
        if self.is_first:
            self.start_time = end_time
            self.is_first = False
        current_time = end_time - self.start_time
        if current_time > PERF_MEASUREMENT_INTERVAL_SEC:
            self.total_fps_time += current_time
            self.total_frame_count += self.frame_count
            current_fps = float(self.frame_count) / current_time
            avg_fps = float(self.total_frame_count) / self.total_fps_time
            sys.stdout.write('DEBUG: FPS of stream %d: %.2f (Average: %.2f)\n' %
                             (self.stream_id + 1, current_fps, avg_fps))
            self.start_time = end_time
            self.frame_count = 0
        else:
            self.frame_count += 1

def set_custom_bbox(obj_meta):
border_width = 6
font_size = 18
x_offset = int(min(STREAMMUX_WIDTH - 1, max(0, obj_meta.rect_params.left - (border_width / 2))))
y_offset = int(min(STREAMMUX_HEIGHT - 1, max(0, obj_meta.rect_params.top - (font_size * 2) + 1)))

obj_meta.rect_params.border_width = border_width
obj_meta.rect_params.border_color.red = 0.0
obj_meta.rect_params.border_color.green = 0.0
obj_meta.rect_params.border_color.blue = 1.0
obj_meta.rect_params.border_color.alpha = 1.0
obj_meta.text_params.font_params.font_name = 'Ubuntu'
obj_meta.text_params.font_params.font_size = font_size
obj_meta.text_params.x_offset = x_offset
obj_meta.text_params.y_offset = y_offset
obj_meta.text_params.font_params.font_color.red = 1.0
obj_meta.text_params.font_params.font_color.green = 1.0
obj_meta.text_params.font_params.font_color.blue = 1.0
obj_meta.text_params.font_params.font_color.alpha = 1.0
obj_meta.text_params.set_bg_clr = 1
obj_meta.text_params.text_bg_clr.red = 0.0
obj_meta.text_params.text_bg_clr.green = 0.0
obj_meta.text_params.text_bg_clr.blue = 1.0
obj_meta.text_params.text_bg_clr.alpha = 1.0

def parse_face_from_meta(frame_meta, obj_meta):
# For face landmarks, if your model outputs them
num_joints = int(obj_meta.mask_params.size / (sizeof(c_float) * 3))

gain = min(obj_meta.mask_params.width / STREAMMUX_WIDTH,
           obj_meta.mask_params.height / STREAMMUX_HEIGHT)
pad_x = (obj_meta.mask_params.width - STREAMMUX_WIDTH * gain) / 2.0
pad_y = (obj_meta.mask_params.height - STREAMMUX_HEIGHT * gain) / 2.0

batch_meta = frame_meta.base_meta.batch_meta
display_meta = pyds.nvds_acquire_display_meta_from_pool(batch_meta)
pyds.nvds_add_display_meta_to_frame(frame_meta, display_meta)

for i in range(num_joints):
    data = obj_meta.mask_params.get_mask_array()
    xc = int((data[i * 3 + 0] - pad_x) / gain)
    yc = int((data[i * 3 + 1] - pad_y) / gain)
    confidence = data[i * 3 + 2]

    if confidence < 0.5:
        continue

    if display_meta.num_circles == MAX_ELEMENTS_IN_DISPLAY_META:
        display_meta = pyds.nvds_acquire_display_meta_from_pool(batch_meta)
        pyds.nvds_add_display_meta_to_frame(frame_meta, display_meta)

    circle_params = display_meta.circle_params[display_meta.num_circles]
    circle_params.xc = xc
    circle_params.yc = yc
    circle_params.radius = 6
    circle_params.circle_color.red = 1.0
    circle_params.circle_color.green = 1.0
    circle_params.circle_color.blue = 1.0
    circle_params.circle_color.alpha = 1.0
    circle_params.has_bg_color = 1
    circle_params.bg_color.red = 0.0
    circle_params.bg_color.green = 0.0
    circle_params.bg_color.blue = 1.0
    circle_params.bg_color.alpha = 1.0
    display_meta.num_circles += 1

import sys
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

import pyds

# Define label map
label_map = [
    "Angry",
    "Disgust",
    "Fear",
    "Happy",
    "Sad",
    "Surprise",
    "Neutral"
]

PGIE_CLASS_ID_FACE = 0 # Assuming PGIE assigns class ID 0 to faces

# <<< ADDED: Additional Pad Probe for SGIE >>>

def sgie_src_pad_buffer_probe(pad, info, u_data):
    """
    Probe to confirm classification metadata is attached AFTER SGIE runs.
    """
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("[SGIE Probe] Unable to get GstBuffer")
        return Gst.PadProbeReturn.OK

batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
if not batch_meta:
    return Gst.PadProbeReturn.OK

l_frame = batch_meta.frame_meta_list
while l_frame is not None:
    try:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
    except StopIteration:
        break

    l_obj = frame_meta.obj_meta_list
    while l_obj is not None:
        try:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
        except StopIteration:
            break

        # Print out the object class_id so we know which IDs are coming through
        print(f"[SGIE Probe] Object ID={obj_meta.object_id}, class_id={obj_meta.class_id}")

        # If classification metadata is attached, this list won't be None
        if obj_meta.classifier_meta_list is not None:
            print(f"[SGIE Probe] Classifier meta found for object ID={obj_meta.object_id}")
        else:
            print(f"[SGIE Probe] No classifier meta for object ID={obj_meta.object_id}")

        l_obj = l_obj.next
    l_frame = l_frame.next

return Gst.PadProbeReturn.OK

def tracker_src_pad_buffer_probe(pad, info, u_data):
    """
    Existing tracker probe.
    Shows when faces are detected, tries to read classification meta.
    """
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer")
        return Gst.PadProbeReturn.OK

batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
l_frame = batch_meta.frame_meta_list

while l_frame is not None:
    try:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
    except StopIteration:
        break

    l_obj_meta = frame_meta.obj_meta_list
    while l_obj_meta is not None:
        try:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj_meta.data)
        except StopIteration:
            break

        # <<< ADDED: Print out the class_id after PGIE >>>
        print(f"[Tracker Probe] Object ID={obj_meta.object_id}, class_id={obj_meta.class_id}")

        if obj_meta.class_id == PGIE_CLASS_ID_FACE:
            try:
                if obj_meta.classifier_meta_list is not None:
                    classifier_meta = pyds.NvDsClassifierMeta.cast(obj_meta.classifier_meta_list.data)
                    if classifier_meta is not None:
                        class_info_list = classifier_meta.label_info_list  # NvDsClassifierMeta exposes label_info_list

                        if class_info_list is None:
                            print(f"[Python Probe] No class info list for object ID {obj_meta.object_id}")
                        else:
                            while class_info_list is not None:
                                try:
                                    class_info = pyds.NvDsLabelInfo.cast(class_info_list.data)  # entries are NvDsLabelInfo
                                except StopIteration:
                                    break

                                emotion_id = class_info.result_class_id
                                confidence = class_info.result_prob

                                if 0 <= emotion_id < len(label_map):
                                    emotion_label = label_map[emotion_id]
                                else:
                                    emotion_label = "Unknown"

                                # Update the display text with emotion_label
                                obj_meta.text_params.display_text = f"ID:{obj_meta.object_id} {emotion_label}"

                                print(f"[Python Probe] Object ID {obj_meta.object_id}: {emotion_label} ({confidence:.2f})")

                                class_info_list = class_info_list.next
                else:
                    print(f"[Python Probe] No classifier meta list for object ID {obj_meta.object_id}")

            except Exception as e:
                print(f"Error accessing classifier meta: {e}")

        l_obj_meta = l_obj_meta.next
    l_frame = l_frame.next

return Gst.PadProbeReturn.OK

def decodebin_child_added(child_proxy, Object, name, user_data):
    if name.find('decodebin') != -1:
        Object.connect('child-added', decodebin_child_added, user_data)
    if name.find('nvv4l2decoder') != -1:
        Object.set_property('drop-frame-interval', 0)
        Object.set_property('num-extra-surfaces', 1)
        if is_aarch64():
            Object.set_property('enable-max-performance', 1)
        else:
            Object.set_property('cudadec-memtype', 0)
            Object.set_property('gpu-id', GPU_ID)

def cb_newpad(decodebin, pad, user_data):
    streammux_sink_pad = user_data
    caps = pad.get_current_caps()
    if not caps:
        caps = pad.query_caps()
    structure = caps.get_structure(0)
    name = structure.get_name()
    features = caps.get_features(0)
    if name.find('video') != -1:
        if features.contains('memory:NVMM'):
            if pad.link(streammux_sink_pad) != Gst.PadLinkReturn.OK:
                sys.stderr.write('ERROR: Failed to link source to streammux sink pad\n')
        else:
            sys.stderr.write('ERROR: decodebin did not pick NVIDIA decoder plugin')

def create_uridecode_bin(stream_id, uri, streammux):
    bin_name = 'source-bin-%04d' % stream_id
    bin = Gst.ElementFactory.make('uridecodebin', bin_name)
    if 'rtsp://' in uri:
        pyds.configure_source_for_ntp_sync(bin)
    bin.set_property('uri', uri)
    pad_name = 'sink_%u' % stream_id
    streammux_sink_pad = streammux.get_request_pad(pad_name)
    bin.connect('pad-added', cb_newpad, streammux_sink_pad)
    bin.connect('child-added', decodebin_child_added, 0)
    fps_streams['stream{0}'.format(stream_id)] = GETFPS(stream_id)
    return bin

def bus_call(bus, message, user_data):
    loop = user_data
    t = message.type
    if t == Gst.MessageType.EOS:
        sys.stdout.write('DEBUG: EOS\n')
        loop.quit()
    elif t == Gst.MessageType.WARNING:
        err, debug = message.parse_warning()
        sys.stderr.write('WARNING: %s: %s\n' % (err, debug))
    elif t == Gst.MessageType.ERROR:
        err, debug = message.parse_error()
        sys.stderr.write('ERROR: %s: %s\n' % (err, debug))
        loop.quit()
    return True

def is_aarch64():
    return platform.machine() == 'aarch64'

def main():
Gst.init(None)

loop = GLib.MainLoop()

pipeline = Gst.Pipeline()
if not pipeline:
    sys.stderr.write('ERROR: Failed to create pipeline\n')
    sys.exit(1)

# Create StreamMuxer
streammux = Gst.ElementFactory.make('nvstreammux', 'Stream-muxer')
if not streammux:
    sys.stderr.write('ERROR: Failed to create nvstreammux\n')
    sys.exit(1)
pipeline.add(streammux)

# Create Source Bin
source_bin = create_uridecode_bin(0, SOURCE, streammux)
if not source_bin:
    sys.stderr.write('ERROR: Failed to create source_bin\n')
    sys.exit(1)
pipeline.add(source_bin)

# Create Primary Inference (PGIE)
pgie = Gst.ElementFactory.make('nvinfer', 'primary-inference')
if not pgie:
    sys.stderr.write('ERROR: Failed to create PGIE\n')
    sys.exit(1)
pgie.set_property('config-file-path', CONFIG_INFER)
pipeline.add(pgie)

# Create Tracker
tracker = Gst.ElementFactory.make('nvtracker', 'tracker')
if not tracker:
    sys.stderr.write('ERROR: Failed to create tracker\n')
    sys.exit(1)
# Configure tracker properties
tracker.set_property('tracker-width', 640)
tracker.set_property('tracker-height', 384)
tracker.set_property('gpu_id', GPU_ID)
tracker.set_property('ll-lib-file', '/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so')
tracker.set_property('ll-config-file',
                     '/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvDCF_perf.yml')
tracker.set_property('display-tracking-id', 1)
pipeline.add(tracker)

# Create Secondary Inference (SGIE - Emotion Classification)
sgie = Gst.ElementFactory.make('nvinfer', 'secondary-inference')
if not sgie:
    sys.stderr.write('ERROR: Failed to create SGIE\n')
    sys.exit(1)
sgie.set_property('config-file-path', CONFIG_SGIE)
pipeline.add(sgie)

# Link Elements: StreamMuxer -> PGIE -> Tracker -> SGIE
streammux.link(pgie)
pgie.link(tracker)
tracker.link(sgie)

# Create Video Converter for General Conversion
converter = Gst.ElementFactory.make('nvvideoconvert', 'converter')
if not converter:
    sys.stderr.write('ERROR: Failed to create converter\n')
    sys.exit(1)
pipeline.add(converter)

# Create On-Screen Display (OSD)
osd = Gst.ElementFactory.make('nvdsosd', 'nvdsosd')
if not osd:
    sys.stderr.write('ERROR: Failed to create nvdsosd\n')
    sys.exit(1)
osd.set_property('process-mode', int(pyds.MODE_GPU))
pipeline.add(osd)

# Create Sink
sink = None
if is_aarch64():
    sink = Gst.ElementFactory.make('nv3dsink', 'nv3d-sink')
    if not sink:
        sys.stderr.write('ERROR: Failed to create nv3dsink\n')
        sys.exit(1)
else:
    sink = Gst.ElementFactory.make('nveglglessink', 'nvvideo-renderer')
    if not sink:
        sys.stderr.write('ERROR: Failed to create nveglglessink\n')
        sys.exit(1)
sink.set_property('async', 1)
sink.set_property('sync', 1)
sink.set_property('qos', 1)
pipeline.add(sink)

# Link SGIE -> Converter -> OSD -> Sink
sgie.link(converter)
converter.link(osd)
osd.link(sink)

# Configure StreamMuxer Properties
streammux.set_property('batch-size', STREAMMUX_BATCH_SIZE)
streammux.set_property('batched-push-timeout', 25000)
streammux.set_property('width', STREAMMUX_WIDTH)
streammux.set_property('height', STREAMMUX_HEIGHT)
streammux.set_property('enable-padding', 0)
streammux.set_property('live-source', 1)
streammux.set_property('attach-sys-ts', 1)

# Additional Property Configurations for PGIE and Tracker
if 'file://' in SOURCE:
    streammux.set_property('live-source', 0)

if tracker.find_property('enable_batch_process') is not None:
    tracker.set_property('enable_batch_process', 1)

if tracker.find_property('enable_past_frame') is not None:
    tracker.set_property('enable_past_frame', 1)

if not is_aarch64():
    streammux.set_property('nvbuf-memory-type', 0)
    streammux.set_property('gpu_id', GPU_ID)
    pgie.set_property('gpu_id', GPU_ID)
    tracker.set_property('gpu_id', GPU_ID)
    sgie.set_property('gpu_id', GPU_ID)
    converter.set_property('gpu_id', GPU_ID)
    osd.set_property('gpu_id', GPU_ID)

# Add Probe to Tracker's Source Pad
tracker_src_pad = tracker.get_static_pad('src')
if not tracker_src_pad:
    sys.stderr.write('ERROR: Failed to get tracker src pad\n')
    sys.exit(1)
else:
    tracker_src_pad.add_probe(Gst.PadProbeType.BUFFER, tracker_src_pad_buffer_probe, 0)

# <<< ADDED: Also add a Probe to SGIE's Source Pad >>>
sgie_src_pad = sgie.get_static_pad('src')
if not sgie_src_pad:
    sys.stderr.write('ERROR: Failed to get SGIE src pad\n')
    sys.exit(1)
else:
    sgie_src_pad.add_probe(Gst.PadProbeType.BUFFER, sgie_src_pad_buffer_probe, 0)

# Add Bus Call Function
bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect('message', bus_call, loop)

# Start Playing
pipeline.set_state(Gst.State.PLAYING)
sys.stdout.write('\n')

try:
    loop.run()
except:
    pass

# Cleanup
pipeline.set_state(Gst.State.NULL)
sys.stdout.write('\n')

def parse_args():
global SOURCE, CONFIG_INFER, CONFIG_SGIE, STREAMMUX_BATCH_SIZE, STREAMMUX_WIDTH, STREAMMUX_HEIGHT, GPU_ID, PERF_MEASUREMENT_INTERVAL_SEC

parser = argparse.ArgumentParser(description='DeepStream Face Detection with Emotion Classification')
parser.add_argument('-s', '--source', required=True, help='Source stream/file')
parser.add_argument('-c', '--config-infer', required=True, help='Config infer file for PGIE')
parser.add_argument('-c_sgie', '--config-sgie', required=True, help='Config infer file for SGIE')
parser.add_argument('-b', '--streammux-batch-size', type=int, default=1, help='Streammux batch-size (default: 1)')
parser.add_argument('-w', '--streammux-width', type=int, default=1920, help='Streammux width (default: 1920)')
parser.add_argument('-e', '--streammux-height', type=int, default=1080, help='Streammux height (default: 1080)')
parser.add_argument('-g', '--gpu-id', type=int, default=0, help='GPU id (default: 0)')
parser.add_argument('-f', '--fps-interval', type=int, default=5, help='FPS measurement interval (default: 5)')
args = parser.parse_args()

if args.source == '':
    sys.stderr.write('ERROR: Source not found\n')
    sys.exit(1)
if args.config_infer == '' or not os.path.isfile(args.config_infer):
    sys.stderr.write('ERROR: Config infer not found\n')
    sys.exit(1)
if args.config_sgie == '' or not os.path.isfile(args.config_sgie):
    sys.stderr.write('ERROR: Config infer for SGIE not found\n')
    sys.exit(1)

SOURCE = args.source
CONFIG_INFER = args.config_infer
CONFIG_SGIE = args.config_sgie
STREAMMUX_BATCH_SIZE = args.streammux_batch_size
STREAMMUX_WIDTH = args.streammux_width
STREAMMUX_HEIGHT = args.streammux_height
GPU_ID = args.gpu_id
PERF_MEASUREMENT_INTERVAL_SEC = args.fps_interval

if __name__ == '__main__':
    parse_args()
    sys.exit(main())

Here I have made some modifications to this file, which contains the default parser:
\opt\nvidia\deepstream\deepstream\sources\libs\nvdsinfer\nvdsinfer_context_impl_output_parsing.cpp

And now when I run the pipeline, it shows me this output:

[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[parseAttributesFromSoftmaxLayers] Debug: Layer index = 0, numClasses = 7
Class 0 => Probability = 0.00261071
Class 1 => Probability = 7.94431e-06
Class 2 => Probability = 0.197126
Class 3 => Probability = 0.28822
Class 4 => Probability = 0.0646253
Class 5 => Probability = 0.0622103
Class 6 => Probability = 0.3852
[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[Tracker Probe] Object ID=0, class_id=0
[Python Probe] No classifier meta list for object ID 0
[SGIE Probe] Object ID=0, class_id=0

@fanzh waiting for your response, as I believe I have made progress now that I know the inference is happening.
[parseAttributesFromSoftmaxLayers] Debug: Layer index = 0, numClasses = 7
Class 0 => Probability = 0.00261071
Class 1 => Probability = 7.94431e-06
Class 2 => Probability = 0.197126
Class 3 => Probability = 0.28822
Class 4 => Probability = 0.0646253
Class 5 => Probability = 0.0622103
Class 6 => Probability = 0.3852

Did you set classifier-threshold=0.1? If so, the highest probability, 0.3852, will be selected according to the current logic in parseAttributesFromSoftmaxLayers. Please add a log to confirm. Please try other pictures to check whether the classification results are correct.


Yes, I have checked the ONNX file in offline mode and looked at the model's output, which shows the 7 class scores, and it is the same. I have checked it on several videos just to make sure, and it is working fine.

Now the only issue I'm facing is that in the final display window I'm not able to see the results attached to the bbox on the video; I only see the bbox with the face, but not the emotion label.

I have added the logs and the SGIE inference is working fine, but I'm not able to see the final output on the OSD, which should be there with the detected face.

So kindly help me get through this.

@fanzh waiting for your guidance

Here are some methods to debug.

  1. In parseAttributesFromSoftmaxLayers, please print attrString to check whether the classification result is correct.
  2. In attach_metadata_classifier of opt\nvidia\deepstream\deepstream\sources\gst-plugins\gst-nvinfer\gstnvinfer_meta_utils.cpp, please print object_meta->text_params.display_text to check whether the text includes the classification result.
  3. In the sgie_src_pad_buffer_probe you shared, please check whether obj_meta.classifier_meta_list is still null (a minimal sketch of reading the attached labels follows below).
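
For reference, a minimal helper sketch for reading the labels once classifier_meta_list is non-null (the helper name and placement are illustrative; it assumes pyds is imported as in the script and would be called from the per-object loop of sgie_src_pad_buffer_probe):

def print_emotion_labels(obj_meta):
    """Walk classifier meta attached by the SGIE (standard pyds bindings)."""
    l_classifier = obj_meta.classifier_meta_list
    while l_classifier is not None:
        classifier_meta = pyds.NvDsClassifierMeta.cast(l_classifier.data)
        l_label = classifier_meta.label_info_list
        while l_label is not None:
            label_info = pyds.NvDsLabelInfo.cast(l_label.data)
            print(f"[SGIE Probe] obj {obj_meta.object_id}: class {label_info.result_class_id} "
                  f"({label_info.result_label}, {label_info.result_prob:.2f})")
            l_label = l_label.next
        l_classifier = l_classifier.next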

I have done those changes you mentioned; now this is the output I'm getting:
[parseAttributesFromSoftmaxLayers] Debug: Layer index = 0, numClasses = 7
Class 0 => Probability = 0.0977492
Class 1 => Probability = 0.0181941
Class 2 => Probability = 0.130716
Class 3 => Probability = 0.0399927
Class 4 => Probability = 0.240364
Class 5 => Probability = 0.144125
Class 6 => Probability = 0.32886
[Tracker Probe] Object ID=0, class_id=0
[Tracker Probe] No classifier meta for ID=0
[Tracker Probe] Object ID=0, class_id=0
[Tracker Probe] No classifier meta for ID=0
[parseAttributesFromSoftmaxLayers] Debug: Layer index = 0, numClasses = 7
Class 0 => Probability = 0.105825
Class 1 => Probability = 0.0248201
Class 2 => Probability = 0.177451
Class 3 => Probability = 0.0921164
Class 4 => Probability = 0.21939
Class 5 => Probability = 0.132402
Class 6 => Probability = 0.247995
[parseAttributesFromSoftmaxLayers] Debug: Layer index = 0, numClasses = 7
Class 0 => Probability = 0.104213
Class 1 => Probability = 0.0377113
Class 2 => Probability = 0.125215
Class 3 => Probability = 0.107298
Class 4 => Probability = 0.19396
Class 5 => Probability = 0.126459
Class 6 => Probability = 0.305143
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[SGIE Probe] Current display_text for object ID=0: face 0
[Tracker Probe] Object ID=0, class_id=0
[Tracker Probe] No classifier meta for ID=0
[parseAttributesFromSoftmaxLayers] Debug: Layer index = 0, numClasses = 7
Class 0 => Probability = 0.0881634
Class 1 => Probability = 0.0595162
Class 2 => Probability = 0.154713
Class 3 => Probability = 0.0769069
Class 4 => Probability = 0.201242
Class 5 => Probability = 0.122912
Class 6 => Probability = 0.296547
[SGIE Probe] Object ID=0, class_id=0
[SGIE Probe] No classifier meta for object ID=0
[SGIE Probe] Current display_text for object ID=0: face 0
[Tracker Probe] Object ID=0, class_id=0
[Tracker Probe] No classifier meta for ID=0
[Tracker Probe] Object ID=0, class_id=0
[Tracker Probe] No classifier meta for ID=0
[parseAttributesFromSoftmaxLayers] Debug: Layer index = 0, numClasses = 7
Class 0 => Probability = 0.0958845
Class 1 => Probability = 0.0903064
Class 2 => Probability = 0.200158
Class 3 => Probability = 0.0923352
Class 4 => Probability = 0.17373
Class 5 => Probability = 0.0893197
Class 6 => Probability = 0.258266

And I'm still not able to get the emotion results on the display window:

Could you share the result of steps 1 and 2 in my last comment?

  1. Please print m_ClassifierThreshold, attrFound, and attrString in parseAttributesFromSoftmaxLayers to check whether the classification result attrString is correct.
  2. Please print object_meta->text_params.display_text, which will be drawn on the video.