Jpeg to nvinfer to nvosd to rects to jpeg on Jetson Deepstream 6.4

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) Jetson Orin
• DeepStream Version 6.4 docker
• JetPack Version (valid for Jetson only) 6.0DP
• TensorRT Version 8.6.2.3
• Issue Type( questions, new requirements, bugs) Question
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

#!/usr/bin/env python3

################################################################################
# SPDX-FileCopyrightText: Copyright (c) 2019-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

import sys
sys.path.append('../')
import os
import gi
gi.require_version('Gst', '1.0')
from gi.repository import GLib, Gst
from common.is_aarch_64 import is_aarch64
from common.bus_call import bus_call

import pyds

PGIE_CLASS_ID_VEHICLE = 0
PGIE_CLASS_ID_BICYCLE = 1
PGIE_CLASS_ID_PERSON = 2
PGIE_CLASS_ID_ROADSIGN = 3
MUXER_BATCH_TIMEOUT_USEC = 33000


def do_things_to_buffer(pad, info, u_data):
    print("do_things_to_buffer")
    frame_number = 0
    num_rects = 0
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer ")
        return Gst.PadProbeReturn.OK

    # Retrieve batch metadata from the gst_buffer
    # Note that pyds.gst_buffer_get_nvds_batch_meta() expects the
    # C address of gst_buffer as input, which is obtained with hash(gst_buffer)
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            # Note that l_frame.data needs a cast to pyds.NvDsFrameMeta
            # The casting is done by pyds.NvDsFrameMeta.cast()
            # The casting also keeps ownership of the underlying memory
            # in the C code, so the Python garbage collector will leave
            # it alone.
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break
        obj_counter = {
            PGIE_CLASS_ID_VEHICLE:0,
            PGIE_CLASS_ID_PERSON:0,
            PGIE_CLASS_ID_BICYCLE:0,
            PGIE_CLASS_ID_ROADSIGN:0
        }
        frame_number=frame_meta.frame_num
        num_rects = frame_meta.num_obj_meta
        if num_rects > 0:
            print(f"{num_rects = }")
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            try:
                # Casting l_obj.data to pyds.NvDsObjectMeta
                obj_meta=pyds.NvDsObjectMeta.cast(l_obj.data)
            except StopIteration:
                break
            obj_counter[obj_meta.class_id] += 1
            obj_meta.rect_params.border_color.set(0.0, 0.0, 1.0, 0.8)  # 0.8 is alpha (opacity)
            try:
                l_obj = l_obj.next
            except StopIteration:
                break

        display_meta=pyds.nvds_acquire_display_meta_from_pool(batch_meta)
        display_meta.num_labels = 1
        py_nvosd_text_params = display_meta.text_params[0]
        py_nvosd_text_params.display_text = (
            "Frame Number={} Number of Objects={} Vehicle_count={} Bicycle_count={}".format(
                frame_number, num_rects, obj_counter[PGIE_CLASS_ID_VEHICLE], obj_counter[PGIE_CLASS_ID_BICYCLE]
            )
        )
        print(obj_counter)
        # Now set the offsets where the string should appear
        py_nvosd_text_params.x_offset = 10
        py_nvosd_text_params.y_offset = 12
        # Font , font-color and font-size
        py_nvosd_text_params.font_params.font_name = "Serif"
        py_nvosd_text_params.font_params.font_size = 10
        # set(red, green, blue, alpha); set to White
        py_nvosd_text_params.font_params.font_color.set(1.0, 1.0, 1.0, 1.0)
        # Text background color
        py_nvosd_text_params.set_bg_clr = 1
        # set(red, green, blue, alpha); set to Black
        py_nvosd_text_params.text_bg_clr.set(0.0, 0.0, 0.0, 1.0)
        # Using pyds.get_string() to get display_text as string
        print(pyds.get_string(py_nvosd_text_params.display_text))
        pyds.nvds_add_display_meta_to_frame(frame_meta, display_meta)

        try:
            l_frame=l_frame.next
        except StopIteration:
            break

    return Gst.PadProbeReturn.OK	


def main(args):
    # Check input arguments
    if len(args) != 2:
        sys.stderr.write("usage: %s <media file or uri>\n" % args[0])
        sys.exit(1)

    # Standard GStreamer initialization
    Gst.init(None)

    # Create gstreamer elements
    # Create Pipeline element that will form a connection of other elements
    print("Creating Pipeline \n ")
    pipeline = Gst.Pipeline()

    if not pipeline:
        sys.stderr.write(" Unable to create Pipeline \n")

    # Source element for reading from the file
    print("Creating Source \n ")
    source = Gst.ElementFactory.make("filesrc", "file-source")
    if not source:
        sys.stderr.write(" Unable to create Source \n")

    # Since the data format in the input file is jpeg,
    # we need a jpegparser
    print("Creating jpegParser \n")
    jpegparser = Gst.ElementFactory.make("jpegparse", "jpeg-parser")
    if not jpegparser:
        sys.stderr.write("Unable to create jpegparser \n")


    # Use nvv4l2decoder for hardware accelerated JPEG decode on the GPU
    print("Creating Decoder \n")
    decoder = Gst.ElementFactory.make("nvv4l2decoder", "nvv4l2-decoder")
    if not decoder:
        sys.stderr.write(" Unable to create Nvv4l2 Decoder \n")

    # Create nvstreammux instance to form batches from one or more sources.
    streammux = Gst.ElementFactory.make("nvstreammux", "Stream-muxer")
    if not streammux:
        sys.stderr.write(" Unable to create NvStreamMux \n")

    # Use nvinfer to run inferencing on decoder's output,
    # behaviour of inferencing is set through config file
    print("Creating primary-inference engine \n")
    pgie = Gst.ElementFactory.make("nvinfer", "primary-inference")
    if not pgie:
        sys.stderr.write(" Unable to create pgie \n")

    # Use convertor to convert from NV12 to RGBA as required by nvosd
    nvvidconv = Gst.ElementFactory.make("nvvideoconvert", "convertor")
    if not nvvidconv:
        sys.stderr.write(" Unable to create nvvidconv \n")

    # Create OSD to draw on the converted RGBA buffer
    nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")

    if not nvosd:
        sys.stderr.write(" Unable to create nvosd \n")

    # Finally render the osd output
    if is_aarch64():
        print("Creating nv3dsink \n")
        # sink = Gst.ElementFactory.make("nv3dsink", "nv3d-sink")
        # sink = Gst.ElementFactory.make("fakesink", "nvvideo-renderer")
        # New stufff
        queue = Gst.ElementFactory.make("queue", "queue")
        nvvidconv2 = Gst.ElementFactory.make("nvvideoconvert", "convertor2")
        if not nvvidconv2:
            sys.stderr.write(" Unable to create nvvidconv2 \n")

        # Capsfilter to force I420, the format nvjpegenc expects
        capsfilter = Gst.ElementFactory.make("capsfilter", "capsfilter")
        if not capsfilter:
            sys.stderr.write(" Unable to create capsfilter \n")
        caps = Gst.Caps.from_string("video/x-raw(memory:NVMM), format=I420")
        capsfilter.set_property("caps", caps)


        print("Creating Code Parser \n")
        # codeparser = Gst.ElementFactory.make("mpeg4videoparse", "mpeg4-parser")
        jpegenc = Gst.ElementFactory.make("nvjpegenc", "nvjpegenc")
        if not jpegenc:
            sys.stderr.write(" Unable to create code parser \n")

        # File sink for the encoded JPEG output
        sink = Gst.ElementFactory.make("filesink", "filesink")
        if not sink:
            sys.stderr.write(" Unable to create file sink \n")
        # Set filesink properties
        output_file = "/root/workingdir/out1.jpg"
        sink.set_property('location', output_file)
        sink.set_property("sync", 0)
        sink.set_property("async", 0)
    else:
        print("Creating EGLSink \n")
        sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
        if not sink:
            sys.stderr.write(" Unable to create egl sink \n")

    print("Playing file %s " %args[1])
    source.set_property('location', args[1])
    if os.environ.get('USE_NEW_NVSTREAMMUX') != 'yes': # Only set these properties if not using new gst-nvstreammux
        streammux.set_property('width', 640)
        streammux.set_property('height', 480)
        streammux.set_property('batched-push-timeout', MUXER_BATCH_TIMEOUT_USEC)
    
    streammux.set_property('batch-size', 1)
    pgie.set_property('config-file-path', "test_config.txt")

    print("Adding elements to Pipeline \n")
    pipeline.add(source)
    pipeline.add(jpegparser)
    pipeline.add(decoder)
    pipeline.add(streammux)
    pipeline.add(pgie)
    pipeline.add(nvvidconv)
    pipeline.add(nvosd)

    # elements for the JPEG output branch
    pipeline.add(queue)
    pipeline.add(nvvidconv2)
    pipeline.add(capsfilter)
    pipeline.add(jpegenc)

    pipeline.add(sink)

    # We link the elements together:
    # file-source -> jpegparser -> nvv4l2decoder -> nvstreammux ->
    # nvinfer -> nvvidconv -> nvosd -> queue -> nvvidconv2 -> capsfilter -> nvjpegenc -> filesink
    print("Linking elements in the Pipeline \n")
    source.link(jpegparser)
    jpegparser.link(decoder)

    sinkpad = streammux.request_pad_simple("sink_0")
    if not sinkpad:
        sys.stderr.write(" Unable to get the sink pad of streammux \n")
    srcpad = decoder.get_static_pad("src")
    if not srcpad:
        sys.stderr.write(" Unable to get source pad of decoder \n")
    srcpad.link(sinkpad)
    streammux.link(pgie)
    pgie.link(nvvidconv)
    nvvidconv.link(nvosd)

    nvosd.link(queue)

    queue.link(nvvidconv2)
    nvvidconv2.link(capsfilter)
    capsfilter.link(jpegenc)
    jpegenc.link(sink)

    # create an event loop and feed gstreamer bus messages to it
    loop = GLib.MainLoop()
    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect ("message", bus_call, loop)

    # Add a probe to get informed of the generated metadata. We add the probe to
    # the sink pad of the osd element, since by that point the buffer will have
    # all the metadata.
    osdsinkpad = nvosd.get_static_pad("sink")
    if not osdsinkpad:
        sys.stderr.write(" Unable to get sink pad of nvosd \n")

    osdsinkpad.add_probe(Gst.PadProbeType.BUFFER, do_things_to_buffer, 0)

    # start play back and listen to events
    print("Starting pipeline \n")
    # Gst.debug_bin_to_dot_file(pipeline, Gst.DebugGraphDetails.ALL, 'graph_lowlevel.dot')
    pipeline.set_state(Gst.State.PLAYING)
    try:
        loop.run()
    except:
        pass
    # cleanup
    pipeline.set_state(Gst.State.NULL)

if __name__ == '__main__':
    sys.exit(main(sys.argv))


I am trying to take a JPEG, push it through nvinfer, get rects, and output a JPEG with the rects drawn on it (using nvosd?).
My model is trained on RGB, so I need to feed it RGB, but I believe nvjpegenc requires I420 format, so I need to convert back to that afterwards. I am kind of lost.
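
For the RGB side, my understanding is that nvinfer does the color conversion to the model's expected format itself, driven by model-color-format in its config file, so only the nvjpegenc side needs an explicit I420 conversion. A minimal sketch of the relevant config lines (values other than model-color-format are from my config further below):

[property]
# model-color-format: 0=RGB, 1=BGR, 2=GRAY
model-color-format=0
net-scale-factor=0.00392156862745098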

I think this is what I want? At least it is what I have mostly working:
file-source → jpegparser → nvv4l2decoder → nvinfer → nvvidconv → nvosd → nvvidconv2 → nvjpegenc

Above is my attempt so far. Without the pgie, I was able to get the downsized image back out.

But when I try to run with the pgie, do_things_to_buffer never seems to get called, and out1.jpg has zero bytes. Please advise on my pipeline and code; I am trying to learn DeepStream.

One thing I tried: Gst.debug_bin_to_dot_file, but I wasn't able to find the .dot file to visualize. I feel this might be helpful for me, but I'm not sure what I am doing wrong.

The nvstreammux is missing.

Please refer to deepstream_python_apps/apps/deepstream-test1 at master · NVIDIA-AI-IOT/deepstream_python_apps (github.com) for the basic pipeline.

For dot graph dump, please refer to DeepStream SDK FAQ - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums

For the basic knowledge of DeepStream, please start with Welcome to the DeepStream Documentation — DeepStream documentation 6.4 documentation. Please make sure that you are familiar with GStreamer (GStreamer: open source multimedia framework) before you start with DeepStream.

Thanks for your links. My code block (posted in the original post) generally follows the deepstream-test1 sample you suggested.

With your suggestions for the dot graph dump, I was able to generate the dot file and visualize it by setting

os.environ["GST_DEBUG_DUMP_DOT_DIR"] = "/tmp"

as early as possible, and then, once the graph is fully created and just before starting the main loop, calling:

Gst.debug_bin_to_dot_file(pipeline, Gst.DebugGraphDetails.ALL, "pipeline")

/tmp/pipeline.dot then appears when the program runs.
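
To render the dot file as an image (assuming Graphviz is installed):

dot -Tpng /tmp/pipeline.dot -o pipeline.png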

With nvv4l2decoder in the pipeline, the pipeline finishes without error but never calls the do_things_to_buffer function.

https://i.vgy.me/HRjNU4.png

My attempt with nvjpegdec in the pipeline is posted here.

https://i.vgy.me/xpJQXD.png

which results in this:

[JPEG Decode] BeginSequence Display WidthxHeight 4912x3684
nvstreammux: Successfully handled EOS for source_id=0
Error: gst-stream-error-quark: Internal data stream error. (1): ../libs/gst/base/gstbaseparse.c(3681): gst_base_parse_loop (): /GstPipeline:pipeline0/GstJpegParse:jpeg-parser:
streaming stopped, reason not-negotiated (-4)
[JPEG Decode] NvMMLiteJPEGDecBlockPrivateClose done
[JPEG Decode] NvMMLiteJPEGDecBlockClose done

Which one should I use, and can you see from the dot graph dumps what my errors are?

Checking for updates. Thank you

Can the following pipeline output a correct JPEG file on your device?

gst-launch-1.0 filesrc location=./test.jpg ! jpegparse ! nvv4l2decoder mjpeg=true ! nvvideoconvert ! mux.sink_0 nvstreammux name=mux batch-size=1 width=960 height=576 ! nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt ! nvmultistreamtiler rows=1 columns=1 width=960 height=576 ! nvdsosd ! nvvideoconvert ! nvjpegenc ! filesink location=./convert.jpg

Yes, your pipeline worked. Thank you.

This is the Python code (modified for my model):
file-source → jpegparser → nvv4l2decoder → nvvidconv → nvstreammux → nvinfer → tiler → nvosd → nvvidconv2 → nvjpegenc

def main(args):
    # Check input arguments
    if len(args) != 2:
        sys.stderr.write("usage: %s <media file or uri>\n" % args[0])
        sys.exit(1)

    # Standard GStreamer initialization
    Gst.init(None)

    # Create gstreamer elements
    # Create Pipeline element that will form a connection of other elements
    print("Creating Pipeline \n ")
    pipeline = Gst.Pipeline()

    if not pipeline:
        sys.stderr.write(" Unable to create Pipeline \n")

    # Source element for reading from the file
    print("Creating Source \n ")
    source = Gst.ElementFactory.make("filesrc", "file-source")
    if not source:
        sys.stderr.write(" Unable to create Source \n")

    # Since the data format in the input file is jpeg,
    # we need a jpegparser
    print("Creating jpegParser \n")
    jpegparser = Gst.ElementFactory.make("jpegparse", "jpeg-parser")
    if not jpegparser:
        sys.stderr.write("Unable to create jpegparser \n")


    # Use nvv4l2decoder (with mjpeg=true) for hardware accelerated JPEG decode on the GPU
    print("Creating Decoder \n")
    decoder = Gst.ElementFactory.make("nvv4l2decoder", "nvv4l2-decoder")
    if not decoder:
        sys.stderr.write(" Unable to create Nvv4l2 Decoder \n")
    decoder.set_property('mjpeg', True)

    # Create nvstreammux instance to form batches from one or more sources.
    streammux = Gst.ElementFactory.make("nvstreammux", "Stream-muxer")
    if not streammux:
        sys.stderr.write(" Unable to create NvStreamMux \n")

    # Use nvinfer to run inferencing on decoder's output,
    # behaviour of inferencing is set through config file
    print("Creating primary-inference engine \n")
    pgie = Gst.ElementFactory.make("nvinfer", "primary-inference")
    if not pgie:
        sys.stderr.write(" Unable to create pgie \n")

    # Use convertor to convert from NV12 to RGBA as required by nvosd
    nvvidconv = Gst.ElementFactory.make("nvvideoconvert", "convertor")
    if not nvvidconv:
        sys.stderr.write(" Unable to create nvvidconv \n")

    # Create OSD to draw on the converted RGBA buffer
    nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")

    if not nvosd:
        sys.stderr.write(" Unable to create nvosd \n")

    # On Jetson, encode the OSD output to JPEG and write it to a file
    if is_aarch64():
        nvvidconv2 = Gst.ElementFactory.make("nvvideoconvert", "nvvidconv2")
        if not nvvidconv2:
            sys.stderr.write(" Unable to create nvvidconv2 \n")

        # nvmultistreamtiler
        tiler = Gst.ElementFactory.make("nvmultistreamtiler", "nvtiler")
        if not tiler:
            sys.stderr.write(" Unable to create tiler \n")
        tiler.set_property("rows", 1)
        tiler.set_property("columns", 1)
        tiler.set_property("width", 640)
        tiler.set_property("height", 480)

        print("Creating jpegenc \n")
        jpegenc = Gst.ElementFactory.make("nvjpegenc", "nvjpegenc")
        if not jpegenc:
            sys.stderr.write(" Unable to create jpegenc \n")

        # File sink
        sink = Gst.ElementFactory.make("filesink", "filesink")
        if not sink:
            sys.stderr.write(" Unable to create file sink \n")
        # Set filesink properties
        output_file = "/root/workingdir/out1.jpg"
        sink.set_property('location', output_file)
        sink.set_property("sync", 0)
        sink.set_property("async", 0)
    else:
        print("Creating EGLSink \n")
        sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
        if not sink:
            sys.stderr.write(" Unable to create egl sink \n")

    print("Playing file %s " %args[1])
    source.set_property('location', args[1])
    if os.environ.get('USE_NEW_NVSTREAMMUX') != 'yes': # Only set these properties if not using new gst-nvstreammux
        streammux.set_property('width', 640)
        streammux.set_property('height', 480)
        streammux.set_property('batched-push-timeout', MUXER_BATCH_TIMEOUT_USEC)

    streammux.set_property('batch-size', 1)
    pgie.set_property('config-file-path', "test_config.txt")

    print("Adding elements to Pipeline \n")
    pipeline.add(source)
    pipeline.add(jpegparser)
    pipeline.add(decoder)
    pipeline.add(streammux)
    pipeline.add(pgie)
    pipeline.add(nvvidconv)
    pipeline.add(nvosd)
    pipeline.add(tiler)
    pipeline.add(nvvidconv2)
    pipeline.add(jpegenc)

    pipeline.add(sink)

    # We link the elements together:
    # file-source -> jpegparser -> nvv4l2decoder -> nvvidconv -> nvstreammux ->
    # nvinfer -> tiler -> nvosd -> nvvidconv2 -> nvjpegenc -> filesink
    print("Linking elements in the Pipeline \n")
    source.link(jpegparser)
    jpegparser.link(decoder)
    decoder_src_out = decoder.get_static_pad("src")
    if not decoder_src_out:
        sys.stderr.write(" Unable to get source pad of decoder \n")
    sinkpad = streammux.request_pad_simple("sink_0")
    if not sinkpad:
        sys.stderr.write(" Unable to get the sink pad of streammux \n")
    nvvidconv_sink_pad = nvvidconv.get_static_pad("sink")
    nvvidconv_src_pad = nvvidconv.get_static_pad("src")
    decoder_src_out.link(nvvidconv_sink_pad)
    nvvidconv_src_pad.link(sinkpad)
    streammux.link(pgie)
    pgie.link(tiler)
    tiler.link(nvosd)
    nvosd.link(nvvidconv2)

    nvvidconv2.link(jpegenc)
    jpegenc.link(sink)

    # create an event loop and feed gstreamer bus messages to it
    loop = GLib.MainLoop()
    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect ("message", bus_call, loop)

    # Add a probe to get informed of the generated metadata. We add the probe to
    # the sink pad of the osd element, since by that point the buffer will have
    # all the metadata.
    osdsinkpad = nvosd.get_static_pad("sink")
    if not osdsinkpad:
        sys.stderr.write(" Unable to get sink pad of nvosd \n")

    osdsinkpad.add_probe(Gst.PadProbeType.BUFFER, do_things_to_buffer, 0)

    # start play back and listen to events
    print("Starting pipeline \n")
    Gst.debug_bin_to_dot_file(pipeline, Gst.DebugGraphDetails.ALL, 'pipeline')
    pipeline.set_state(Gst.State.PLAYING)
    try:
        loop.run()
    except:
        pass
    # cleanup
    pipeline.set_state(Gst.State.NULL)

If I may continue, the next step for me is to collect the objects detected by my model and draw them on the output frame.

I was able to create a custom parser in C++ and build it into a .so file. That .so file is used in the pipeline, and in it I fill std::vector<NvDsInferObjectDetectionInfo> &objectList. At the end of the parser, I verified that objectList holds many objects.

Now the callback (in my case do_things_to_buffer, created based on osd_sink_pad_buffer_probe in deepstream_test_1.py) is being called. However, my current blocker is that frame_meta.num_obj_meta is 0.

My thought was that it has something to do with non-maximum suppression, so I tried switching to cluster-mode=4 to disable clustering, but that was not successful.

I don't understand why the list of objects from the parser is not passed through to the Python callback. Might it have something to do with nvosd no longer being directly attached to the pgie? Please advise.
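
As a next debugging step, I am thinking of adding a second buffer probe on the pgie src pad, to see whether the object metadata already exists before the tiler. A minimal sketch, mirroring the metadata walk in do_things_to_buffer (the add_probe lines would go in main):

def pgie_src_probe(pad, info, u_data):
    # Count NvDsObjectMeta right after nvinfer, before the tiler.
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        print("pgie src: num_obj_meta =", frame_meta.num_obj_meta)
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK

pgie_srcpad = pgie.get_static_pad("src")
if pgie_srcpad:
    pgie_srcpad.add_probe(Gst.PadProbeType.BUFFER, pgie_src_probe, 0)

If the count is non-zero here but zero at the nvosd sink pad, the objects are being lost between nvinfer and nvosd (i.e., around the tiler).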

The callback is the same do_things_to_buffer function posted above.


The config is below:

[property]
gpu-id=0
net-scale-factor=0.00392156862745098
model-engine-file=/root/workingdir/nchw_model-2024-07-01-f577a1-016.onnx_b1_gpu0_fp16.engine
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
# num-detected-classes=4
num-detected-classes=1
interval=0
gie-unique-id=1
cluster-mode=4
infer-dims=3;480;640
# custom parser
parse-bbox-func-name=NvDsInferParseCustomEast
custom-lib-path=../../../../lib/libnvds_infercustomparser.so

[class-attrs-all]
pre-cluster-threshold=0.2
eps=0.2
group-threshold=1
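
For reference, my understanding of these settings: cluster-mode=4 disables clustering entirely, while pre-cluster-threshold in [class-attrs-all] is still applied as a confidence filter before clustering. So one sanity check is to drop the threshold to near zero (debug only):

[class-attrs-all]
# near-zero confidence threshold, just to rule out confidence filtering
pre-cluster-threshold=0.01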

Context:
I should add that the eventual goal is to get (1) rectangle crops of the detections from the full-resolution image (I will need to scale the detected rects back up), and (2) the full-resolution image resized to something smaller (but not neural-net size) with the nearest-neighbor method (probably a tee at the beginning of the pipeline).
The source will be an Argus camera, but I feel it is probably best to work with a file first.
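
For the scale-up in (1), I expect something like this inside the probe (a sketch: 4912x3684 is the source size reported in the decode log above, 640x480 is the streammux resolution I configured):

SRC_W, SRC_H = 4912, 3684   # full-resolution source (from the decode log)
MUX_W, MUX_H = 640, 480     # nvstreammux output resolution

def scale_rect_to_source(rect_params):
    # rect_params is in muxer coordinates; map the bbox back to source pixels
    sx = SRC_W / MUX_W
    sy = SRC_H / MUX_H
    return (rect_params.left * sx, rect_params.top * sy,
            rect_params.width * sx, rect_params.height * sy)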

There is an nvdsosd in the pipeline; it will draw the bboxes on the frame, and you can see the bboxes in the output JPEG file.

It should. But the probe added with osdsinkpad.add_probe(Gst.PadProbeType.BUFFER, osd_sink_pad_buffer_probe, 0) receives no objects for the frame (frame_meta.num_obj_meta is 0). Since there are no objects, no rectangles are drawn in the output JPEG file.

The parsing function is definitely getting called and adding objects to objectList; it can report objectList.size().

I don't understand where the objects are being dropped. The two potential places I can think of are some clustering mechanism, or the tiler (which sits between nvinfer and nvdsosd). But maybe there are other possibilities. Please advise.

Please debug and check your code. The gst-nvinfer plugin is open source.

It turns out that nvinfer drops a detected object when the object's height exceeds the image (neural-net input) height.

It doesn't seem to do that for width. But regardless, I think nvinfer would be happier with the entire rectangle inside the image dimensions.

I probably still have bugs in the way my parsing function reads the buffer, even though I am able to produce rectangles.

The gst-nvinfer is open source, you make customize according to your own requirements.