Extract probability of the event for a single classifier

szymon.budziak.td · July 17, 2024, 2:12pm

• Hardware Platform (Jetson / GPU): NVIDIA Jetson AGX Orin
• DeepStream Version: 6.3
• JetPack Version (valid for Jetson only): 5
• TensorRT Version: 8.5.2
• Issue Type( questions, new requirements, bugs): questions

I have an ONNX model with an input shape of 64x224x224x3. For DeepStream, this translates to 64x3x224x224. This model outputs the probability of detecting a human in an image or video, functioning as a simple classifier for a single class without bounding boxes or any drawing. My goal is to print this probability in the terminal. However, I’m facing difficulties in extracting this probability from my Python script. Below is the DeepStream Python script I’m working with:

def osd_sink_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer\n")
        return

    # Retrieve batch metadata from the gst_buffer
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list

    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        try:
            l_frame = l_frame.next
        except StopIteration:
            break

    # WHAT SHOULD BE HERE TO EXTRACT THE PROBABILITY OF THE EVENT

    return Gst.PadProbeReturn.OK


def main(args):
    # Check input arguments
    if len(args) != 2:
        sys.stderr.write(f"usage: {args[0]} <media file or uri>\n")
        sys.exit(1)

    # Standard GStreamer initialization
    Gst.init(None)

    # Create gstreamer elements
    # Create Pipeline element that will form a connection of other elements
    print("Creating Pipeline\n")
    pipeline = Gst.Pipeline()
    if not pipeline:
        sys.stderr.write("Unable to create Pipeline\n")

    # Source element for reading from the file
    print("Creating Source\n")
    source = Gst.ElementFactory.make("filesrc", "file-source")
    if not source:
        sys.stderr.write("Unable to create Source\n")

    # Data format in the input file is elementary h264 stream, we need a parser
    print("Creating parser\n")
    parser = Gst.ElementFactory.make("h264parse", "h264-parser")
    if not parser:
        sys.stderr.write("Unable to create parser\n")

    # Use nvdec_h264 for hardware accelerated decode on GPU
    print("Creating Decoder\n")
    decoder = Gst.ElementFactory.make("nvv4l2decoder", "nvv4l2-decoder")
    if not decoder:
        sys.stderr.write("Unable to create Nvv4l2 Decoder\n")

    # Create nvstreammux instance to form batches from one or more sources.
    print("Creating Streammux\n")
    streammux = Gst.ElementFactory.make("nvstreammux", "Stream-muxer")
    if not streammux:
        sys.stderr.write("Unable to create NvStreamMux\n")

    # Use nvinfer to run inferencing on decoder's output, behaviour of inferencing is set through config file
    print("Creating Primary Infer\n")
    pgie = Gst.ElementFactory.make("nvinfer", "primary-inference")
    if not pgie:
        sys.stderr.write("Unable to create pgie\n")

    # Use convertor to convert from NV12 to RGBA as required by nvosd
    print("Creatting nvvideoconvert\n")
    nvvidconv = Gst.ElementFactory.make("nvvideoconvert", "convertor")
    if not nvvidconv:
        sys.stderr.write("Unable to create nvvidconv\n")

    # Create OSD to draw on the converted RGBA buffer
    # Do I need this when I am not drawing anyting on the display?
    print("Creating OSD\n")
    nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")
    if not nvosd:
        sys.stderr.write("Unable to create nvosd")

    # Create sink for the output
    print("Creating nv3dsink \n")
    sink = Gst.ElementFactory.make("nv3dsink", "nv3d-sink")
    if not sink:
        sys.stderr.write("Unable to create nv3dsink\n")

    print(f"Playing file {args[1]}")
    source.set_property("location", args[1])
    streammux.set_property("width", 1920)
    streammux.set_property("height", 1080)
    streammux.set_property("batch-size", 64)
    streammux.set_property("batched-push-timeout", 4000000)

    pgie.set_property("config-file-path", "config_infer_primary.txt")

    print("Adding elements to Pipeline\n")
    pipeline.add(source)
    pipeline.add(parser)
    pipeline.add(decoder)
    pipeline.add(streammux)
    pipeline.add(pgie)
    pipeline.add(nvvidconv)
    pipeline.add(nvosd)
    pipeline.add(sink)

    # we link the elements together
    print("Linking elements in the Pipeline\n")
    source.link(parser)
    parser.link(decoder)

    sinkpad = streammux.get_request_pad("sink_0")
    if not sinkpad:
        sys.stderr.write("Unable to get the sink pad of streammux\n")

    srcpad = decoder.get_static_pad("src")
    if not srcpad:
        sys.stderr.write("Unable to get source pad of decoder\n")

    srcpad.link(sinkpad)
    streammux.link(pgie)
    pgie.link(nvvidconv)
    nvvidconv.link(nvosd)
    nvosd.link(sink)

    # create an event loop and feed gstreamer bus mesages to it
    loop = GLib.MainLoop()
    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect("message", bus_call, loop)

    # Lets add probe to get informed of the meta data generated, we add probe to the sink pad
    osdsinkpad = nvosd.get_static_pad("sink")
    if not osdsinkpad:
        sys.stderr.write("Unable to get sink pad of nvosd \n")

    osdsinkpad.add_probe(Gst.PadProbeType.BUFFER, osd_sink_pad_buffer_probe, 0)

    # start play back and listen to events
    print("Starting pipeline\n")
    pipeline.set_state(Gst.State.PLAYING)

    try:
        loop.run()
    except:
        pass

    # cleanup
    pipeline.set_state(Gst.State.NULL)

I do not know what should I write in osd_sink_pad_buffer_probe to extract that probability.

Here is the configuration file “config_infer_primary.txt”:

[property]
gpu-id=0
onnx-file=my_model_b64.onnx
model-engine-file=my_model_b64.onnx_b64_gpu0_fp32.engine
batch-size=64
network-mode=0
network-type=1
process-mode=1
gie-unique-id=1
classifier-threshold=0.0
model-color-format=1
infer-dims=3;224;224

Unfortunately, I can’t provide the model, but my objective is to run deepstream-app on a video, which works correctly and I can see the video being displayed, but I want to print the probability of detecting a human in the terminal, which I cannot achieve. When it comes to nvdsosd do I really need it when I am not drawing anything on the display? I could not find the solution in deepstream_python_aps, is there such a solution there?

yuweiw · July 18, 2024, 1:33am

You can try to get the obj_meta bu referring the deepstream_test_1.py first. Then you can get the confidence from the obj_meta.

szymon.budziak.td · July 18, 2024, 7:37am

Thank You for the answer.

So this is how I changed my code:

def osd_sink_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer\n")
        return

    # Retrieve batch metadata from the gst_buffer
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list

    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        frame_number = frame_meta.frame_num

        l_obj = frame_meta.obj_meta_list
        print(
            f"Frame Number: {frame_number}; Number of objects: {frame_meta.num_obj_meta}"
        )

        print(l_obj)
        while l_obj is not None:
            try:
                obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            except StopIteration:
                break

            confidence = obj_meta.confidence
            print(f"Confidence: {confidence}")

            try:
                l_obj = l_obj.next
            except StopIteration:
                break

        try:
            l_frame = l_frame.next
        except StopIteration:
            break

    return Gst.PadProbeReturn.OK

I created obj_meta and attempted to extract the confidence from it, but the code still doesn’t work. The variable l_obj is always None, which prevents the second while loop from executing. I tested the model independently of DeepStream, and it successfully detected the event. However, when using DeepStream with a video, it fails to do so. Additionally, when I print num_obj_meta, it always returns 0.

Could it be because of the model? I checked it once again and in the output layer it has tanh function, so it returns values between -1 and 1 not exactly the probability.

yuweiw · July 18, 2024, 8:41am

OK. Since your model is a classifier model. You need to get the value from the NvDsClassifierMeta. But there are no “confidence” value in the NvDsClassifierMeta. So we cannot support get the confidence/score when you are using a classifier model.

szymon.budziak.td · July 18, 2024, 8:50am

Thank You for the reply.

Ok so maybe I will ask it in different way.
My model takes a 224x224 frame as input and outputs a single value since it has one output node with a tanh activation function. How can I retrieve this value in a Python script in DeepStream?

yuweiw · July 18, 2024, 10:13am

As I attached above, we don’t support this scenario currently. If you want to get this value, you need to parse the output tensor yourself.
We don’t have such example with Python. You can refer to our C/C++ demo code.

\opt\nvidia\deepstream\deepstream-7.0\sources\apps\sample_apps\deepstream-3d-action-recognition\deepstream_3d_action_recognition.cpp

szymon.budziak.td · July 18, 2024, 11:44am

Thank You for the reply.

In your previous answer, you suggested using NvDsClassifierMeta. I reviewed this class, and according to the NvDsClassifierMeta — Deepstream Deepstream Version: 7.0 documentation, I can extract label_info_list and then iterate over it to extract values such as label_id or result_prob. Will this approach work in my case? My model has a single output with values in the range of -1 to 1, and there are no labels or classes involved.

Is the only solution to write this in C++ based on the example that You provided in deepstream-3d-action-recognition\deepstream_3d_action_recognition.cpp?

Are there any Python examples with classifier model with just one output node?

yuweiw · July 19, 2024, 1:49am

Yes. For a normal classifier model, this value is what you expect to get.

Yes. You can refer to deepstream-ssd-parser to learn how to parse the output tensor with Python.

system · August 16, 2024, 8:16am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
How to make a parses function for my regression model DeepStream SDK	23	1693	October 12, 2021
How to output probability of each class in audio-transform? DeepStream SDK	21	287	July 23, 2024
Issue with Converting ONNX Model with different dimensions to TensorRT Engine for DeepStream DeepStream SDK deepstream	20	102	April 22, 2025
Classifier_meta_data is none for ONNX model as input DeepStream SDK pytorch , deepstream61	10	576	October 11, 2022
Making the deepstream python example 1 work in headless mode with output to file DeepStream SDK	4	388	April 7, 2024
Issue in operate-on-class-ids DeepStream SDK deepstream	9	29	May 6, 2025
Jetson Nano DeepStream SDK	14	455	July 20, 2023
Error importing model engine in deepstream TensorRT	5	961	December 12, 2022
DeepStream outputs different segmentation mask shape than ONNX model has itself DeepStream SDK	3	45	July 19, 2024
Parsing custom tensorflow model DeepStream SDK	31	570	September 4, 2023

Extract probability of the event for a single classifier

Related topics