Sgie inference does not work on all detected objects

My pipeline is as follows:
nvstreammux → pgie → tracker → sgie → (sgie_nvdsanalytics_srcpad) nvdsanalytics → nvvidconv → tiler → nvosd → renderer

The pgie model is PeopleNet, with the following config file (note that I intend to detect only faces):

[property]
# workspace-size=3048
gpu-id=0
net-scale-factor=0.0039215697906911373
tlt-model-key=tlt_encode
tlt-encoded-model=../models/peoplenet/resnet34_peoplenet_pruned.etlt
labelfile-path=../configs/labels_peoplenet.txt
model-engine-file=../models/peoplenet/resnet34_peoplenet_pruned.etlt_b1_gpu0_int8.engine
int8-calib-file=../models/peoplenet/resnet34_peoplenet_int8.txt
input-dims=3;544;960;0
uff-input-blob-name=input_1
batch-size=1
process-mode=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=3
cluster-mode=1
interval=0
gie-unique-id=1
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid
# output-tensor-meta=1
[class-attrs-0]
## A threshold above 1.0 effectively suppresses this class (person),
## since only faces are wanted
pre-cluster-threshold=1.4
## Set eps and minBoxes for cluster-mode=1 (DBSCAN)
eps=0.7
minBoxes=1
[class-attrs-1]
## Likewise suppresses this class (bag)
pre-cluster-threshold=1.4
eps=0.7
minBoxes=1
[class-attrs-2]
pre-cluster-threshold=0.4
## Set eps and minBoxes for cluster-mode=1 (DBSCAN)
eps=0.5
minBoxes=1

The sgie model is an ONNX model that classifies the age of the detected faces, with the following configuration:

[property]
workspace-size=3048
gpu-id=0
net-scale-factor=1
onnx-file=../models/age/age_googlenet.onnx
model-engine-file=../models/age/age_googlenet.onnx_b1_gpu0_fp32.engine
# force-implicit-batch-dim=1
input-dims=3;224;224;0
# offsets=104;117;123
offsets=123;117;104
batch-size=1
# 0=FP32 and 1=INT8 mode
network-mode=0
input-object-min-width=0
input-object-min-height=0
process-mode=2
# model-color-format=1
gie-unique-id=2
# operate-on-gie-id=1
# operate-on-class-ids=0
network-type=100
# is-classifier=1
output-blob-names=fc1
output-tensor-meta=1
# classifier-async-mode=1
# classifier-threshold=0.001
#scaling-filter=0
#scaling-compute-hw=0
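As I understand it from the nvinfer docs (an assumption, not something specific to this model), the plugin preprocesses each pixel as `y = net-scale-factor * (x - offset)` per channel, so with `net-scale-factor=1` and `offsets=123;117;104` the network sees roughly mean-centered values. A quick sketch:

```python
import numpy as np

# Hypothetical check of nvinfer's per-channel preprocessing formula:
#   y = net-scale-factor * (x - mean-offset)
net_scale_factor = 1.0
offsets = np.array([123.0, 117.0, 104.0])  # per-channel means from the config

# A dummy 2x2 "image" with 3 channels, all pixels at 255.
pixels = np.full((2, 2, 3), 255.0)

preprocessed = net_scale_factor * (pixels - offsets)
print(preprocessed[0, 0])  # -> [132. 138. 151.]
```

Whether the channel order of the offsets matches the model's expected BGR/RGB layout depends on `model-color-format`, which is commented out here.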

My sgie probe function is as follows:


import ctypes

import numpy as np
import pyds
from gi.repository import Gst


def sgie_src_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer")
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    found_infer_data = False
    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break
        
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            # Reset per object: without this, tensor meta found on an
            # earlier object would mask objects that have none.
            found_infer_data = False
            layers_info = []
            try:
                obj_meta=pyds.NvDsObjectMeta.cast(l_obj.data)
                # print(dir(obj_meta))
            except StopIteration:
                break
            
            l_user = obj_meta.obj_user_meta_list
            while l_user is not None:
                try:
                    user_meta=pyds.NvDsUserMeta.cast(l_user.data)
                except StopIteration:
                    break
                
                if user_meta.base_meta.meta_type != pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
                    # Advance the list node before skipping; a bare
                    # `continue` here would loop forever on the same node.
                    l_user = l_user.next
                    continue
                
                found_infer_data = True
                tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
                layers_info = []
                # print(f"num out layers: {tensor_meta.num_output_layers}", flush=True)
                # print(f"gpu id: {tensor_meta.gpu_id}")
                # print(f"unique id: {tensor_meta.unique_id}")
                
                ageList=['(0-2)', '(4-6)', '(8-12)', '(15-20)', '(25-32)', '(38-43)', '(48-53)', '(60-100)']
                for i in range(tensor_meta.num_output_layers):
                    layer = pyds.get_nvds_LayerInfo(tensor_meta, i)
                    layers_info.append(layer)
                try:
                    l_user = l_user.next
                except StopIteration:
                    break
            
            if found_infer_data:
                for layer in layers_info:
                    # Convert NvDsInferLayerInfo buffer to numpy array
                    ptr = ctypes.cast(pyds.get_ptr(layer.buffer), ctypes.POINTER(ctypes.c_float))
                    v = np.ctypeslib.as_array(ptr, shape=(len(ageList),))
                    
                    classifier_meta = pyds.nvds_acquire_classifier_meta_from_pool(batch_meta)
                    classifier_meta.unique_component_id = tensor_meta.unique_id
                    
                    label_info = pyds.nvds_acquire_label_info_meta_from_pool(batch_meta)
                    class_id = int(np.argmax(v))
                    label_info.result_class_id = class_id
                    label_info.result_prob = float(v[class_id])
                    label_info.result_label = ageList[class_id]
                    obj_meta.text_params.display_text = f"{obj_meta.obj_label} {obj_meta.object_id} {label_info.result_label}"
                    
                    pyds.nvds_add_label_info_meta_to_classifier(classifier_meta, label_info)
                    pyds.nvds_add_classifier_meta_to_object(obj_meta, classifier_meta)
            else:
                print(f"no infer data found for: {obj_meta.object_id}")
                print(f"{[obj_meta.tracker_bbox_info.org_bbox_coords.left, obj_meta.tracker_bbox_info.org_bbox_coords.top, obj_meta.tracker_bbox_info.org_bbox_coords.width, obj_meta.tracker_bbox_info.org_bbox_coords.height]}")
                obj_meta.text_params.display_text = f"{obj_meta.obj_label} {obj_meta.object_id} ---"
            
            try: 
                l_obj=l_obj.next
            except StopIteration:
                break
        try:
            l_frame=l_frame.next
        except StopIteration:
            break
    
    return Gst.PadProbeReturn.OK
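One caveat with the probe above: if the model's `fc1` layer emits raw logits rather than probabilities (an assumption; it depends on how `age_googlenet.onnx` was exported), then `result_prob` is not a real probability. Applying a softmax over the output vector before taking the max would fix that:

```python
import numpy as np

def softmax(v: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(v - np.max(v))
    return e / e.sum()

# Example: raw scores as they might come out of the fc1 layer
# (made-up numbers, for illustration only).
scores = np.array([0.1, 2.0, 0.3, 0.0, 4.0, 0.2, 0.1, 0.0])
probs = softmax(scores)
best = int(np.argmax(probs))
print(best, float(probs[best]))  # class index and its probability
```

The argmax (and therefore the chosen age bucket) is unchanged by the softmax; only the reported probability becomes meaningful.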

an example of the unexpected output that I get:

no infer data found for: 32
[48.4701042175293, 189.16497802734375, 13.985758781433105, 25.6199893951416]
no infer data found for: 33
[115.42562103271484, 191.5161590576172, 9.979028701782227, 22.19746971130371]
no infer data found for: 32
[48.4632682800293, 189.09146118164062, 13.934600830078125, 25.54955291748047]
no infer data found for: 33
[115.44075012207031, 191.4876251220703, 9.941106796264648, 22.13165283203125]
no infer data found for: 65
[36.47920227050781, 183.53929138183594, 12.897473335266113, 27.13695526123047]

The Issue

Most of the detected faces are classified by the sgie model, but some are not. This is not what I expect, since I want to classify every detected face.

I have noticed that the faces that do not get inferred by the sgie are often small, which is why I printed their bbox info. I made sure to set input-object-min-width=0 and input-object-min-height=0 in the sgie config, but that didn't change anything.


• Hardware Platform (Jetson / GPU)
GPU
• DeepStream Version
v5.1
• JetPack Version (valid for Jetson only)
• TensorRT Version
the one on the deepstream devel docker image
• NVIDIA GPU Driver Version (valid for GPU only)
465.19.01
• Issue Type( questions, new requirements, bugs)
questions
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)


Update:

The issue went away after I saw this thread (Very small Bounding Boxes with custom sgie model - #9 by mchi) and increased the muxer width and height.
Now all the objects detected by the pgie are inferred by the sgie.

However, this shouldn't be necessary, because I didn't specify any minimum object size for the sgie model. An explanation of why this happens would be great: I used to set the muxer shape to exactly the pgie model's input shape to avoid resizing the frame multiple times inside the pipeline, but now I have to set it to some larger value just in case a small bbox is detected.
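For reference, here is the scaling effect I suspect is at play (a back-of-the-envelope sketch with illustrative numbers, not measured values): if the source is 1920x1080 and the muxer downscales to 960x544, every detected bbox shrinks accordingly before the sgie ever crops it.

```python
# Hypothetical example of how muxer downscaling shrinks bboxes
# before the sgie crops them (numbers are illustrative only).
src_w, src_h = 1920, 1080   # source resolution
mux_w, mux_h = 960, 544     # nvstreammux output resolution

face_w, face_h = 20, 26     # a small face at source resolution
scaled_w = face_w * mux_w / src_w
scaled_h = face_h * mux_h / src_h
print(scaled_w, round(scaled_h, 1))  # -> 10.0 13.1
```

At that size, a face that was already marginal at source resolution becomes only a handful of pixels in the muxed frame, which matches the small bbox values in the log output above.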

This seems related to the model. Could it be a model issue?

Please check my reply (Sgie inference does not work on all detected objects - #3 by a7med.hish); it’s related to the muxer shape.

Is it possible to avoid resizing in the muxer in your application?

https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvstreammux.html

What do you mean?
The weird thing is that the muxer size affects which bounding boxes are inferred by the sgie, even though I am specifying the minimum input width and height for the model as 0.

Is it possible to set width and height to 0 in your use case, to avoid resizing in the muxer? The infer plugin should then do the resizing.

| Property | Meaning | Type and Range | Example Notes |
| --- | --- | --- | --- |
| batch-size | Maximum number of frames in a batch. | Integer, 0 to 4,294,967,295 | batch-size=30 |
| batched-push-timeout | Timeout in microseconds to wait after the first buffer is available to push the batch even if a complete batch is not formed. | Signed integer, -1 to 2,147,483,647 | batched-push-timeout=40000 (40 msec) |
| width | If non-zero, muxer scales input frames to this width. | Integer, 0 to 4,294,967,295 | width=1280 |
| height | If non-zero, muxer scales input frames to this height. | Integer, 0 to 4,294,967,295 | height=720 |

Thanks for your reply.
When I set the width and height to 0, the app throws this error:

ERROR from src_bin_muxer: Output width not set

So I don't think it's possible to do this.
Also, I don't understand why smaller boxes don't get any inference. If a plugin is filtering the sgie's input, that should be stated clearly; this behaviour seems vague and underdocumented.

I will have a check in my side later.

any updates?

Sorry for the late response. Can you share reproduction steps or any source code for reproducing the issue?

We encountered exactly the same issue. The configs and code supplied by @a7med.hish seem sufficient to reproduce it. For now, we moved nvtracker downstream of the nvinfer elements to ensure that no optimization is done. The following quote from the gst-nvinfer docs suggests this workaround:

When the plugin is operating as a secondary classifier along with the tracker, it tries to improve performance by avoiding re-inferencing on the same objects in every frame. It does this by caching the classification output in a map with the object’s unique ID as the key. The object is inferred upon only when it is first seen in a frame (based on its object ID) or when the size (bounding box area) of the object increases by 20% or more. This optimization is possible only when the tracker is added as an upstream element.[1]

However, it would be nice to have access to the tracker ID in the parser of an nvinfer element running in secondary mode, while also having the guarantee that no optimization is being done.

[1] gst-nvinfer docs
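The caching rule quoted above can be sketched in plain Python (names are hypothetical; the real logic lives inside the gst-nvinfer plugin and, as far as I can tell, is not configurable in DS 5.1):

```python
# Sketch of the re-inference rule described in the gst-nvinfer docs:
# infer on a tracked object only when its ID is new, or when its bbox
# area has grown by 20% or more since the cached inference.
cache = {}  # object_id -> bbox area at last inference

def should_infer(object_id: int, width: float, height: float) -> bool:
    area = width * height
    last_area = cache.get(object_id)
    if last_area is None or area >= 1.2 * last_area:
        cache[object_id] = area
        return True
    return False

print(should_infer(32, 14.0, 25.6))  # first sighting -> True
print(should_infer(32, 14.1, 25.7))  # tiny change -> False
print(should_infer(32, 20.0, 30.0))  # area grew >20% -> True
```

This would explain why moving the tracker downstream of the sgie disables the optimization: without object IDs from an upstream tracker, the plugin has no key to cache on and must infer every object.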


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.