Only one classifier meta sometimes appears when a tracker is enabled between the PGIE and two SGIEs

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): Jetson Orin NX
• DeepStream Version: 7.0
• JetPack Version (valid for Jetson only): 6.0
• TensorRT Version: 8.6.2.3
• Issue Type( questions, new requirements, bugs): questions

Pipeline

nvurisrcbin → nvvideoconvert → nvstreammux(batch-size=1) → nvinfer(PGIE, unique-id=1) → [nvtracker(IOU)] → nvinfer(SGIE#1, unique-id=2) → nvinfer(SGIE#2, unique-id=3) → identity(pad-probe) → fakesink

(PGIE = trafficcamnet, both SGIEs = vehiclemakenet with identical config files. Both SGIEs operate on PGIE’s detections (operate-on-gie-id=1).)

Expectation

Since both SGIEs are identical and operate on the same PGIE objects, each object should have either 0 or 2 classifier results (never exactly 1) on a given frame.

Actual Behavior

Only when the tracker is enabled do we intermittently observe frames where car objects (class_id=0) carry only one classifier meta. Without the tracker, we never see this. A pad-probe after the second SGIE counts classifier_meta entries per object and prints:

class_id=0 has only 1 classifier result

This appears dozens of times per run when the tracker is present.

Repro

  • Minimal Python script attached below (uses DeepStream sample video sample_1080p_h265.mp4).

  • Flip ENABLE_TRACKER=True/False to compare behavior.

Question

I am trying to create a pipeline application using DeepStream where objects are first detected by PGIE, then tracked, and afterward classified by two SGIEs for each detected object (PGIE → Tracker → SGIE1 → SGIE2).

While verifying the behavior, I noticed that only when there is a Tracker between the PGIE and SGIEs, the execution result of one of the SGIEs occasionally does not appear in the metadata. To confirm this, I conducted the following experiment:

  • I created a verification program using Gst-Python.

  • To compare the behavior with and without the Tracker, I defined ENABLE_TRACKER in the script so that the Tracker is linked between the PGIE and SGIEs only when it is set to True.

  • For the PGIE model, I used trafficcamnet.

  • For the Tracker model, I used IOU.

  • For both SGIEs, I used exactly the same configuration file and model (vehiclemakenet). Since both are set to perform inference on the PGIE detection results, they are expected to produce exactly the same output. (In other words, every object should always have either zero or two SGIE classification results attached. A situation where only one SGIE classification result is attached is not expected.)

  • I added a Probe to the pad of the identity element linked after the SGIEs. In this Probe, using PyDS, I counted the number of classification results attached to every object in every frame, and if the number was one, I printed the message “class_id=0 has only 1 classifier result.”

When I ran the experiment, dozens of “class_id=0 has only 1 classifier result” messages were printed when the Tracker was present, but no such messages were printed when the Tracker was absent. This shows that without the Tracker, both SGIEs attach inference results to the objects at exactly the same timing.

However, since the actual application needs to use a Tracker, I have been looking for a configuration that allows both SGIEs to produce exactly the same results while the Tracker is enabled, but I have not been able to find one.

Am I overlooking some configuration? I would greatly appreciate any help with this issue. Thank you in advance.

I have attached below the program I used for testing. I would appreciate it if you could review it.

import gi
gi.require_version("Gst", "1.0")
gi.require_version("GLib", "2.0")
from gi.repository import Gst, GLib
import pyds

ENABLE_TRACKER = True

INPUT_SOURCE_URI = 'file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h265.mp4'
PGIE_CONFIG_FILE = '/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt'
PGIE_ENGINE_FILE = '/opt/nvidia/deepstream/deepstream/samples/models/Primary_Detector/resnet18_trafficcamnet.etlt_b1_gpu0_int8.engine'
SGIE1AND2_CONFIG_FILE = '/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_secondary_vehiclemake.txt'
SGIE1AND2_ENGINE_FILE = '/opt/nvidia/deepstream/deepstream/samples/models/Secondary_VehicleMake/resnet18_vehiclemakenet.etlt_b1_gpu0_int8.engine'
TRACKER_LIB_FILE = '/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so'
TRACKER_CONFIG_FILE = '/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_IOU.yml'

def on_message(bus, message, data) -> bool:
    pipeline, loop = data
    msg_type = message.type
    src = message.src.name

    if msg_type == Gst.MessageType.EOS:
        print(f"{src} :: End-of-stream")
        pipeline.set_state(Gst.State.NULL)
        loop.quit()
    elif msg_type == Gst.MessageType.WARNING:
        err, debug = message.parse_warning()
        print(f"{src} :: Warning: {err}: {debug}")
    elif msg_type == Gst.MessageType.ERROR:
        err, debug = message.parse_error()
        print(f"{src} :: Error: {err}: {debug}")
        pipeline.set_state(Gst.State.NULL)
        loop.quit()
    elif msg_type == Gst.MessageType.STATE_CHANGED:
        msg = message.parse_state_changed()
        if src == "pipeline":
            print(f"{src} :: State change: {msg.oldstate.value_name} -> {msg.newstate.value_name}")
    return True

def on_pad_added(src, new_pad, sink_element):
    sink_pad = sink_element.get_static_pad("sink")
    if not sink_pad.is_linked():
        new_pad.link(sink_pad)

def callback(pad, info, u_data) -> Gst.PadProbeReturn:
    gst_buffer = info.get_buffer()
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
            l_obj = frame_meta.obj_meta_list
            while l_obj is not None:
                try:
                    obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
                    if obj_meta.class_id == 0:
                        class_length = 0
                        l_class = obj_meta.classifier_meta_list
                        while l_class is not None:
                            try:
                                class_meta = pyds.NvDsClassifierMeta.cast(l_class.data)
                                class_length += 1
                                l_class = l_class.next
                            except StopIteration:
                                break
                        if class_length == 1:
                            print("class_id=0 has only 1 classifier result")
                    l_obj = l_obj.next
                except StopIteration:
                    break
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK

def main():
    Gst.init(None)
    
    pipeline = Gst.Pipeline.new("pipeline")
    
    nvurisrcbin = Gst.ElementFactory.make("nvurisrcbin", "nvurisrcbin_0")
    nvurisrcbin.set_property("gpu-id", 0)
    nvurisrcbin.set_property("cudadec-memtype", 0)
    nvurisrcbin.set_property("uri", INPUT_SOURCE_URI)
    nvurisrcbin.set_property("file-loop", 0)
    nvurisrcbin.set_property("drop-frame-interval", 0)
    pipeline.add(nvurisrcbin)

    nvvideoconvert = Gst.ElementFactory.make("nvvideoconvert", "nvvideoconvert_0")
    pipeline.add(nvvideoconvert)
    nvurisrcbin.connect("pad-added", on_pad_added, nvvideoconvert)

    post_queue_0 = Gst.ElementFactory.make("queue", "post_queue_0")
    pipeline.add(post_queue_0)
    nvvideoconvert.link(post_queue_0)

    streammux = Gst.ElementFactory.make("nvstreammux", "streammux")
    streammux.set_property("batch-size", 1)
    streammux.set_property("sync-inputs", 0)
    pipeline.add(streammux)
    srcpad = post_queue_0.get_static_pad("src")
    sinkpad = streammux.request_pad_simple("sink_0")
    srcpad.link(sinkpad)

    preinfer_queue = Gst.ElementFactory.make("queue", "preinfer_queue")
    pipeline.add(preinfer_queue)
    streammux.link(preinfer_queue)

    primary_gie = Gst.ElementFactory.make("nvinfer", "primary_gie")
    primary_gie.set_property("unique-id", 1)
    primary_gie.set_property("batch-size", 1)
    primary_gie.set_property("config-file-path", PGIE_CONFIG_FILE)
    primary_gie.set_property("model-engine-file", PGIE_ENGINE_FILE)
    pipeline.add(primary_gie)
    preinfer_queue.link(primary_gie)

    if ENABLE_TRACKER:
        tracker = Gst.ElementFactory.make("nvtracker", "tracker")
        tracker.set_property("gpu-id", 0)
        tracker.set_property("qos", 0)
        tracker.set_property("tracker-width", 960)
        tracker.set_property("tracker-height", 544)
        tracker.set_property("display-tracking-id", 1)
        tracker.set_property("ll-lib-file", TRACKER_LIB_FILE)
        tracker.set_property("ll-config-file", TRACKER_CONFIG_FILE)
        pipeline.add(tracker)
        primary_gie.link(tracker)

    secondary_gie_1 = Gst.ElementFactory.make("nvinfer", "secondary_gie_1")
    secondary_gie_1.set_property("unique-id", 2)
    secondary_gie_1.set_property("batch-size", 1)
    secondary_gie_1.set_property("config-file-path", SGIE1AND2_CONFIG_FILE)
    secondary_gie_1.set_property("model-engine-file", SGIE1AND2_ENGINE_FILE)
    pipeline.add(secondary_gie_1)
    if ENABLE_TRACKER:
        tracker.link(secondary_gie_1)
    else:
        primary_gie.link(secondary_gie_1)

    secondary_gie_2 = Gst.ElementFactory.make("nvinfer", "secondary_gie_2")
    secondary_gie_2.set_property("unique-id", 3)
    secondary_gie_2.set_property("batch-size", 1)
    secondary_gie_2.set_property("config-file-path", SGIE1AND2_CONFIG_FILE)
    secondary_gie_2.set_property("model-engine-file", SGIE1AND2_ENGINE_FILE)
    pipeline.add(secondary_gie_2)
    secondary_gie_1.link(secondary_gie_2)

    posttracker_queue = Gst.ElementFactory.make("queue", "posttracker_queue")
    pipeline.add(posttracker_queue)
    secondary_gie_2.link(posttracker_queue)

    identity = Gst.ElementFactory.make("identity", "identity")
    pipeline.add(identity)
    posttracker_queue.link(identity)
    identity.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, callback, None)

    fakesink = Gst.ElementFactory.make("fakesink", "fakesink")
    pipeline.add(fakesink)
    identity.link(fakesink)

    loop = GLib.MainLoop()
    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus_id = bus.connect("message", on_message, (pipeline, loop))

    pipeline.set_state(Gst.State.PLAYING)
    print("Pipeline started")
    loop.run()
    
    print("Pipeline stopped")
    bus.remove_signal_watch()
    bus.disconnect(bus_id)

if __name__ == '__main__':
    main()

Thanks for sharing! Why do you need two vehiclemakenet SGIEs with identical config files? Could you try setting classifier-async-mode=0 in the SGIE's nvinfer config file?
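For reference, the suggested change is a one-line addition in the SGIE's nvinfer config. A minimal sketch is below; the surrounding keys are taken from the stock config_infer_secondary_vehiclemake.txt layout and may differ slightly in your copy:

```ini
[property]
gpu-id=0
process-mode=2
operate-on-gie-id=1
# Force synchronous classification: results are attached on the same frame,
# disabling the tracker-assisted async caching of classifier output.
classifier-async-mode=0
```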

Thank you for your reply.

We intentionally set up two identical vehiclemakenet SGIEs because we are trying to verify our pipeline behavior. Our final application will run two different classifiers at this stage, so for testing purposes we duplicated the same config/model to ensure the outputs would be identical for each object.

I tried setting classifier-async-mode=0 in the SGIE’s nvinfer config, and indeed the issue disappeared.
However, in our case we also need the equivalent setting when using nvinferserver instead of nvinfer.
We found an item named async_mode in the nvinferserver config file, which seemed equivalent, but setting it to false did not resolve the issue.

Could you please clarify if there is an equivalent parameter in nvinferserver to achieve the same behavior as classifier-async-mode=0 in nvinfer?

Right, async_mode is similar to classifier-async-mode in nvinfer. The nvinferserver plugin and its low-level library are open source, so you may add logging in GstNvInferServerImpl::processObjects, which calls shouldInferObject, to check whether shouldReinfer is 1 and isAsyncMode() is 0.

I tried running the same experiment using nvinferserver.
Even though I have set async_mode: false, the message
“class_id=0 has only 1 classifier result” is still being output.

How can I ensure that SGIE always runs for every object?
Are there any other settings besides async_mode that I should configure?

If you are using nvinferserver, you can set secondary_reinfer_interval to control the re-inference interval. Please find the explanation in the documentation, and refer to the sample config /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app-triton/config_infer_secondary_plan_engine_vehiclemake.txt.
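If it helps, the relevant nvinferserver settings live in the input_control block of the prototxt config. This is a sketch based on the sample referenced above; the field names are assumed from that sample:

```protobuf
input_control {
  process_mode: PROCESS_MODE_CLIP_OBJECTS
  operate_on_gie_id: 1
  # attach results synchronously instead of caching them on the tracked object
  async_mode: false
  # 0 = re-run classification on every frame (the sample config uses 90)
  secondary_reinfer_interval: 0
}
```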

Thank you very much.

In addition to setting async_mode: false, I also set secondary_reinfer_interval to 0 as you advised and ran the same test again, but the message
“class_id=0 has only 1 classifier result” still appears.

Are there any possible causes or settings we should check in this situation?

I have attached a ZIP file containing the test code we are using for your reference.
Thank you in advance for your time and support.

issue.zip (27.4 MB)

Thanks for sharing! Here is the reason for your case.
In nvinferserver, there is a variable that saves the history classification results. If the same object is already under inference, the current object is not sent for inference again; instead, the cached history result is attached. But sometimes that cached result is null because the probability was smaller than the threshold. Both nvinfer and nvinferserver are open source, so you can add logging in attachClassificationMetadata to check: sometimes attachClassificationMetadata is called but returns early because of this line of code: “if (objInfo.attributes.size() == 0 || objInfo.label.length() == 0)”.
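To make the failure mode concrete, here is a hypothetical Python sketch (not DeepStream code) of that early return: one SGIE attaches a real result while the other hits a cached empty result and attaches nothing, leaving exactly one classifier meta on the object. The function and field names are illustrative, not the actual C++ API:

```python
# Hypothetical sketch of attachClassificationMetadata's early return.
def attach_classification(obj_meta, attributes, label):
    """Nothing is attached when the cached result has no attributes
    or an empty label (probability was below the threshold)."""
    if not attributes or not label:
        return  # early return: no classifier meta for this SGIE
    obj_meta.setdefault("classifier_metas", []).append(label)

obj = {}
attach_classification(obj, ["make"], "Toyota")  # SGIE1: fresh result attached
attach_classification(obj, [], "")              # SGIE2: cached empty result, skipped
print(len(obj.get("classifier_metas", [])))     # → 1, i.e. "only 1 classifier result"
```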

Does this mean that as long as we use nvinferserver as is, inference skipping cannot be avoided?
When using nvinferserver, is there no official way to ensure that inference runs on every frame and always attaches classifier metadata?

Why do you need the SGIE to run inference on every frame? The classification result of an SGIE rarely changes; for example, in /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app-triton/config_infer_secondary_plan_engine_vehiclemake.txt, secondary_reinfer_interval is set to 90.
If you still need the SGIE to run inference on every frame, nvinferserver is open source, so you can modify it to customize the behavior. For example, you may comment out the following code in GstNvInferServerImpl::shouldInferObject():

        if (history->under_inference)
            return false;