How to append DeepStream Metadata in Python without using Streammux / nvinfer for parallel branch?

I have a complex multi-source pipeline for which I wish to also pass-through the stream.
This pass-through stream should just enter and exit the pipeline as quickly as possible, but it should still go through the DeepStream/GStreamer pipeline.

The only caveat: I would like to attach DeepStream metadata to it, so that I can then compare each frame with the same frame that goes through inference (via output-tensor-meta).

Can this be done? In Python?

I’m aware I could use streammux just to have it attach the metadata, but that seems like overkill?


• Hardware Platform (Jetson / GPU) Jetson Xavier AGX
• DeepStream Version 6.3
• JetPack Version (valid for Jetson only) 5.1.2
• Issue Type( questions, new requirements, bugs) Question

What do you want to append to the metadata? Is it different for every frame? Where do you get the data?

I want to keep track of frames across the pipeline branches, so that I can compare raw frames to processed frames.

Some of the data doesn’t have to pass through DeepStream plugins (i.e. no NvStreammux), so it never receives DeepStream metadata, which includes the unique IDs for each frame.

My temporary solution would be to pass all streams through an initial streammux + demux just so they get the metadata, but this seems like overkill?
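
Roughly what I mean, as a minimal sketch (placeholder sizes and names, not a working app): push the pass-through branch through an nvstreammux + nvstreamdemux pair purely so its buffers get NvDsBatchMeta attached.

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)
pipeline = Gst.Pipeline()

# mux/demux pair whose only job is to attach DeepStream batch metadata
meta_mux = Gst.ElementFactory.make("nvstreammux", "meta-mux")
meta_mux.set_property("batch-size", 1)
meta_mux.set_property("width", 1280)
meta_mux.set_property("height", 720)

meta_demux = Gst.ElementFactory.make("nvstreamdemux", "meta-demux")

pipeline.add(meta_mux)
pipeline.add(meta_demux)
meta_mux.link(meta_demux)

# Wiring to the rest of the pipeline (source and queue exist elsewhere):
# source_src_pad.link(meta_mux.get_request_pad("sink_0"))
# meta_demux.get_request_pad("src_0").link(passthrough_queue.get_static_pad("sink"))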

Thank you

Do you mean there are some videos processed without DeepStream? How did you construct the pipeline? The requirement details are important for us to evaluate a solution.

What I mean by that is that some branches don’t use DeepStream plugins, just plain GStreamer.

The (simplified) pipeline looks like this:

src1 ---- T --- Queue1 ----- nvinfer ----- ... ------ sink     [ AI branch ]
          |---- Queue2 ----- nvconv ----- nvdsosd --- sink     [ replication branch ]
          |---- Queue3 ----- ... ------------------ udpsink    [ restream branch ]

This is a very high-level overview of the pipeline.

The AI branch takes the input stream and processes it with a custom model, using output-tensor-meta=1 to extract the output tensor.

The replication branch aims to replicate the preprocessing steps that happen inside the nvinfer module, so that we can probe them and compare them to the output from the AI branch.

The restream branch, restreams the input.

I need to figure out how best to compare each input with the model’s output for each unique frame.
Thus I need to:

  1. keep track of exact frame numbers (this is where DeepStream metadata comes in? see the probe sketch at the end of this post)
  2. extract the exact input to the model (this is where the replication branch comes in)
  3. do something when input/output satisfy some condition.

The full pipeline comprises more branches, but they don’t interact with these three; it also has multiple sources which all do the same.
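
For point 1, what I have in mind is a pad probe along these lines (just a sketch; it assumes the buffers on that branch already carry NvDsBatchMeta, and the pad it is attached to is illustrative):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds


def frame_tracking_probe(pad, info, user_data):
    """Log the DeepStream frame identifiers so frames can be matched across branches."""
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    if not batch_meta:
        return Gst.PadProbeReturn.OK

    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        # source_id + frame_num (+ buf_pts) should uniquely identify a frame
        print(frame_meta.source_id, frame_meta.frame_num, frame_meta.buf_pts)
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK

# e.g. attach it on the replication branch:
# queue2.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, frame_tracking_probe, None)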

The branches are not necessary. All of this can be done in one branch.

I agree, but my biggest pain point, which I need resolved, is how to access the raw tensor data that is fed to the model inside nvinfer.

If you help me solve that - preferably in Python - I can finish the rest in one go.

Thanks.

If you want to access the raw tensor freely, please use Gst-nvdspreprocess (Alpha) — DeepStream documentation 6.4 documentation + Gst-nvinfer — DeepStream documentation 6.4 documentation instead of using Gst-nvinfer — DeepStream documentation 6.4 documentation only. Then you can get the input tensor data from NVIDIA DeepStream SDK API Reference: GstNvDsPreProcessBatchMeta Struct Reference | NVIDIA Docs.

Please refer to /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-preprocess-test for how to construct the pipeline, and refer to gst_nvinfer_process_tensor_input() in /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvinfer/gstnvinfer.cpp for how to get the tensor data from GstNvDsPreProcessBatchMeta.
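
In Python, the suggested chain is roughly the following (a sketch only; the config file paths are placeholders):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

preprocess = Gst.ElementFactory.make("nvdspreprocess", "preprocess")
preprocess.set_property("config-file", "config_preprocess.txt")  # placeholder path

pgie = Gst.ElementFactory.make("nvinfer", "pgie")
pgie.set_property("config-file-path", "config_infer.txt")  # placeholder path
pgie.set_property("input-tensor-meta", True)  # consume tensors prepared by nvdspreprocess

# then link: streammux -> (nvvideoconvert / capsfilter) -> preprocess -> pgie -> ...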

For simple scaling/resizing of the input images, would the given custom library and function be enough, and are they even needed?

libcustom2d_preprocess.so
CustomTensorPreparation 

Or could that even be achieved with the config file alone, in the [property] section?

# processing width/height at which the image is scaled
processing-width=120
processing-height=60
    # tensor shape based on network-input-order
network-input-shape= 1;1;60;120

Thank you

You can configure the sample customized nvpreprocess lib to do scaling only. It is open source; you can refer to the source code to see what happens inside the sample customized nvpreprocess lib.

I’m not an expert in C++, but after looking through it a little, would I be correct that I mostly need to implement this in:

NvDsPreProcessStatus NvDsPreProcessTensorImpl::prepare_tensor(
    NvDsPreProcessBatch* batch, void*& devBuf)
{…}

And then maybe implement the rescaling/resizing transformation function itself in the CUDA file? (Similar to NvDsPreProcessConvert_C1ToP1Float in /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvdspreprocess/nvdspreprocess_lib/nvdspreprocess_conversion.cu)

If only scaling is needed, the sample /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvdspreprocess/nvdspreprocess_lib already meets your request; there is no need to change the code. The only thing to do is configure the nvpreprocess configuration file correctly.

You can refer to /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-preprocess-test/config_preprocess.txt; change the configuration to process the whole frame instead of ROIs, and enable scaling only by setting “pixel-normalization-factor=1.0”.

There is documentation for the parameters: Gst-nvdspreprocess (Alpha) — DeepStream documentation 6.4 documentation

Hi,

So, I think the pipeline works; I can see the video being shown from the sink.

Now my problem is, I cannot access the tensors using the probes.

Neither nvinfer (src and sink) nor nvdspreprocess (src) seems to populate l_user at any point (it’s always “None”).

Could you help me out?

Here is the current testing pipeline, graph and code:

import sys
import os
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GObject

sys.path.append('../')
from common.bus_call import bus_call
from common.is_aarch_64 import is_aarch64
import pyds
from utils import cb_newpad, decodebin_child_added, create_source_bin, create_and_add_element

os.environ["GST_DEBUG_DUMP_DOT_DIR"] = "debugging/"

# Render the output on screen (True) or discard it with a fakesink (False)
DISPLAY = True


# Callback function for the probe on the nvdspreprocess src pad
def nvdspreprocess_src_pad_probe(pad, info, user_data):
    print("Buffer from nvdspreprocess")
    gst_buffer = info.get_buffer()
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    if not batch_meta:
        print("No batch meta found")
        return Gst.PadProbeReturn.OK

    # GstNvDsPreProcessBatchMeta is attached as batch-level user meta,
    # so iterate batch_meta.batch_user_meta_list (not the per-frame list)
    l_user = batch_meta.batch_user_meta_list
    while l_user is not None:
        user_meta = pyds.NvDsUserMeta.cast(l_user.data)
        print(f"Found user meta of type: {user_meta.base_meta.meta_type}")
        try:
            l_user = l_user.next
        except StopIteration:
            break

    # Per-frame metadata (frame numbers etc.)
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        print(f"Frame number: {frame_meta.frame_num}")
        try:
            l_frame = l_frame.next
        except StopIteration:
            break

    return Gst.PadProbeReturn.OK

# Callback function for the probe on nvinfer src pad
def nvinfer_src_pad_probe(pad, info, user_data):
    #@TODO:
    return Gst.PadProbeReturn.OK

def create_graph(pipeline):
    print("Graph done")
    Gst.debug_bin_to_dot_file(pipeline, Gst.DebugGraphDetails.ALL, "test_preprocess_dot")
    return False  # one-shot: do not reschedule the timeout


# Initialize GStreamer
Gst.init(None)

# Create the GStreamer pipeline
pipeline = Gst.Pipeline()

# Define the elements of the pipeline
source = create_source_bin(0, "rtsp://user:pw@<IP>")
# source = create_source_bin(0, "file:///opt/nvidia/deepstream/deepstream/sources/deepstream_python_apps/apps/XX/video_output_cam0.mp4")
streammux = create_and_add_element("nvstreammux", "streammux", pipeline)
streammux.set_property('batch-size', 1)
streammux.set_property('width', 1280)
streammux.set_property('height', 720)
streammux.set_property('live-source', 0)
streammux.set_property('batched-push-timeout', 4000000)

nvconv = Gst.ElementFactory.make("nvvideoconvert", "nvconv")
nvdspreprocess = Gst.ElementFactory.make("nvdspreprocess", "nvdspreprocess")
nvinfer = Gst.ElementFactory.make("nvinfer", "nvinfer")

nvvidconv = Gst.ElementFactory.make("nvvideoconvert", "convertor")
nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")
# Finally render the osd output
if DISPLAY:
    if is_aarch64():
        print("Creating nv3dsink \n")
        sink = Gst.ElementFactory.make("nv3dsink", "nv3d-sink")
        if not sink:
            sys.stderr.write(" Unable to create nv3dsink \n")
    else:
        print("Creating EGLSink \n")
        sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
        if not sink:
            sys.stderr.write(" Unable to create egl sink \n")
else:
    sink = Gst.ElementFactory.make("fakesink", "in_fakesink")


if not pipeline or not source or not nvdspreprocess or not nvinfer or not sink:
    sys.stderr.write("One element could not be created. Exiting.\n")
    sys.exit(1)

# Set properties for the elements
# nvinfer configuration
nvinfer.set_property("config-file-path", "/opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/XX/c_nvinfer_1.txt")
nvinfer.set_property("input-tensor-meta", 1)  # Use preprocessed input from nvdspreprocess
nvdspreprocess.set_property('config-file', '/opt/nvidia/deepstream/deepstream/sources/deepstream_python_apps/apps/XX/configs/nvdspreprocess_conf.txt')
# Add elements to the pipeline
pipeline.add(source)
pipeline.add(nvconv)
pipeline.add(nvdspreprocess)
pipeline.add(nvinfer)

pipeline.add(nvvidconv)
pipeline.add(nvosd)
pipeline.add(sink)



# Create capsfilter
capsfilter = Gst.ElementFactory.make("capsfilter", "capsfilter")
capsfilter.set_property("caps", Gst.caps_from_string("video/x-raw(memory:NVMM),format=RGBA"))
pipeline.add(capsfilter)



sinkpad = streammux.get_request_pad("sink_1")
if not sinkpad:
    sys.stderr.write("Unable to create sink pad bin \n")
srcpad = source.get_static_pad("src")
if not srcpad:
    sys.stderr.write("Unable to create src pad bin \n")
srcpad.link(sinkpad)

# Link the elements together
streammux.link(nvconv)
nvconv.link(capsfilter)
capsfilter.link(nvdspreprocess)
# Insert into the pipeline
nvdspreprocess.link(nvinfer)

nvinfer.link(nvvidconv)
nvvidconv.link(nvosd)
nvosd.link(sink)



# Add probes
nvdspreprocess_src_pad = nvdspreprocess.get_static_pad("src")
if nvdspreprocess_src_pad:
    nvdspreprocess_src_pad.add_probe(Gst.PadProbeType.BUFFER, nvdspreprocess_src_pad_probe, 0)


nvinfer_src_pad = nvinfer.get_static_pad("src")
if nvinfer_src_pad:
    nvinfer_src_pad.add_probe(Gst.PadProbeType.BUFFER, nvinfer_src_pad_probe, 0)

# nvinfer_snk_pad = nvinfer.get_static_pad("sink")
# if nvinfer_snk_pad:
#     nvinfer_snk_pad.add_probe(Gst.PadProbeType.BUFFER, nvinfer_snk_pad_probe, 0)

# Start the pipeline
pipeline.set_state(Gst.State.PLAYING)
mainloop = GObject.MainLoop()
bus = pipeline.get_bus()
bus.add_signal_watch()

def on_message(bus, message):
    if message.type == Gst.MessageType.ERROR:
        err, debug = message.parse_error()
        print(f"Error from {message.src.get_name()}: {err}, {debug}")
    elif message.type == Gst.MessageType.WARNING:
        warn, debug = message.parse_warning()
        print(f"Warning from {message.src.get_name()}: {warn}, {debug}")
    # ... handle other messages as needed

bus.connect("message", on_message)
Gst.debug_bin_to_dot_file(pipeline, Gst.DebugGraphDetails.ALL, "test_preprocess_dotnorun")
# Dump a second graph once the pipeline has been running for 3 s (caps negotiated)
GObject.timeout_add_seconds(3, create_graph, pipeline)
try:
    mainloop.run()
except KeyboardInterrupt:
    print("Exiting on user request.")
    mainloop.quit()

# Clean up
pipeline.set_state(Gst.State.NULL)

Here’s the nvdspreprocess config:

[property]
enable=1
    # list of component gie-id for which tensor is prepared
target-unique-ids=1
    # 0=NCHW, 1=NHWC, 2=CUSTOM
network-input-order=0
    # 0=process on objects 1=process on frames
process-on-frame=1
    #uniquely identify the metadata generated by this element
unique-id=0
    # gpu-id to be used
gpu-id=0
    # if enabled maintain the aspect ratio while scaling
maintain-aspect-ratio=1
    # if enabled pad symmetrically with maintain-aspect-ratio enabled
symmetric-padding=1
    # processing width/height at which the image is scaled
processing-width=120
processing-height=60
    # max buffer in scaling buffer pool
scaling-buf-pool-size=6
    # max buffer in tensor buffer pool
tensor-buf-pool-size=6
    # tensor shape based on network-input-order
network-input-shape= 1;1;60;120
    # 0=RGB, 1=BGR, 2=GRAY
network-color-format=2
    # 0=FP32, 1=UINT8, 2=INT8, 3=UINT32, 4=INT32, 5=FP16
tensor-data-type=0
    # tensor name same as input layer name
tensor-name=input_1
    # 0=NVBUF_MEM_DEFAULT 1=NVBUF_MEM_CUDA_PINNED 2=NVBUF_MEM_CUDA_DEVICE 3=NVBUF_MEM_CUDA_UNIFIED
scaling-pool-memory-type=0
    # 0=NvBufSurfTransformCompute_Default 1=NvBufSurfTransformCompute_GPU 2=NvBufSurfTransformCompute_VIC
scaling-pool-compute-hw=0
    # Scaling Interpolation method
    # 0=NvBufSurfTransformInter_Nearest 1=NvBufSurfTransformInter_Bilinear 2=NvBufSurfTransformInter_Algo1
    # 3=NvBufSurfTransformInter_Algo2 4=NvBufSurfTransformInter_Algo3 5=NvBufSurfTransformInter_Algo4
    # 6=NvBufSurfTransformInter_Default
scaling-filter=0
    # custom library .so path having custom functionality
custom-lib-path=/opt/nvidia/deepstream/deepstream/lib/gst-plugins/libcustom2d_preprocess.so
    # custom tensor preparation function name having predefined input/outputs
    # check the default custom library nvdspreprocess_lib for more info
custom-tensor-preparation-function=CustomTensorPreparation

[user-configs]
   # Below parameters get used when using default custom library nvdspreprocess_lib
   # network scaling factor
pixel-normalization-factor=1.0
   # mean file path in ppm format
#mean-file=
   # array of offsets for each channel
#offsets=

#[group-0]
#src-ids=0
#custom-input-transformation-function=CustomAsyncTransformation
#process-on-roi=0

#roi-params-src-0=0;540;900;500;960;0;900;500;0;0;540;900;
#roi-params-src-1=0;540;900;500;960;0;900;500;0;0;540;900;
#roi-params-src-2=0;540;900;500;960;0;900;500;0;0;540;900;
#roi-params-src-3=0;540;900;500;960;0;900;500;0;0;540;900;

Here is the nvinfer config:


[property]
gpu-id=0
#net-scale-factor=0.0039215697906911373
net-scale-factor=0.0039215686274509803921568627451
#onnx-file=/home/YYY/Desktop/MODELS/modelrgb2.onnx
#model-engine-file=/home/YYY/Desktop/MODELS/modelrgb2.onnx_b1_gpu0_fp16.engine
model-engine-file=/opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/XX/Models/model_db.onnx_b1_gpu0_fp16.engine
onnx-file=/opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/XX/Models/model_db.onnx
batch-size=1
network-mode=1
network-type=100
output-tensor-meta=1
#num-detected-classes=4
interval=0
gie-unique-id=1
output-blob-names=conv2d_2
model-color-format=2    # 0: RGB, 1: BGR, 2: GRAY, 3: RGBA, 4: BGRx, 5: ???
uff-input-order=1       # 0: NCHW, 1: NHWC, 2: NC
#onnx-input-order=1       # 0: NCHW, 1: NHWC, 2: NC
#infer-dims=60;120;1
input-tensor-from-meta=1

It is wrong. Please refer to gst_nvinfer_process_tensor_input() in /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvinfer/gstnvinfer.cpp for how to get the tensor data from GstNvDsPreProcessBatchMeta.

Can you please be more specific?

I also tried to stay closer to the C++ source code, but, for instance, the meta type NVDS_PREPROCESS_BATCH_META doesn’t exist in this version yet?

And accessing

l_user = frame_meta.frame_user_meta_list
while l_user is not None:
    user_meta = pyds.NvDsUserMeta.cast(l_user.data)

results in l_user being None at all times.

Any little help would be appreciated, as I’m stuck and can’t continue :/

Thanks!

The GstNvDsPreProcessBatchMeta is in the batch user meta, not in the frame user meta. Please refer to gst_nvinfer_process_tensor_input() in /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvinfer/gstnvinfer.cpp. It is important to read the code before you try to write your own code.

Please, I understood this, and I’ve been through it.

Look, in this section of the gst_nvinfer_process_tensor_input() code

NvDsUserMeta *user_meta = (NvDsUserMeta *) (l_user->data);
if (user_meta->base_meta.meta_type != NVDS_PREPROCESS_BATCH_META)
  continue;
GstNvDsPreProcessBatchMeta *preproc_meta =
    (GstNvDsPreProcessBatchMeta *) user_meta->user_meta_data;

I’m here:

l_user = batch_meta.batch_user_meta_list
while l_user is not None:
    user_meta = pyds.NvDsUserMeta.cast(l_user.data)
    print("    --- NvDsUserMeta ---")
    print("user meta.user_meta_data ", user_meta.user_meta_data)
    print("DIR user meta.user_meta_data ", dir(user_meta.user_meta_data))
    try:
        l_user = l_user.next
    except StopIteration:
        break

This results in:

> user meta.user_meta_data  <capsule object NULL at 0xffff83a00c00>
> DIR user meta.user_meta_data  ['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

Right?
I need to access the NvDsPreprocessBatch structure, but I don’t know how to cast it or access it correctly.
Is this possible in DS 6.3?

I’m not even sure it’s being processed correctly.

  1. I can see the stream being shown at the end through the sink.
    1.1. The stream is in colour and at full resolution, although in the nvdspreprocess and nvinfer configs we set the colour format and do the scaling. I imagine this is normal, as we just pass through the original stream and convert it for the model only; can you confirm?

There is no Python binding for NvDsPreprocessBatch. Maybe you need to generate the Python binding yourself: deepstream_python_apps/bindings at master · NVIDIA-AI-IOT/deepstream_python_apps (github.com)
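
If you do generate such a binding, the Python side could then look roughly like this (pyds.GstNvDsPreProcessBatchMeta.cast is hypothetical and does not exist in the stock pyds; check nvdspreprocess_meta.h for the exact meta type string):

# Hypothetical sketch: assumes a custom pyds binding exposing GstNvDsPreProcessBatchMeta
preprocess_meta_type = pyds.nvds_get_user_meta_type("NVIDIA.NVDSPREPROCESS.BATCH_META")

l_user = batch_meta.batch_user_meta_list
while l_user is not None:
    user_meta = pyds.NvDsUserMeta.cast(l_user.data)
    if user_meta.base_meta.meta_type == preprocess_meta_type:
        # the cast below is the hypothetical custom binding, not stock pyds
        preproc_meta = pyds.GstNvDsPreProcessBatchMeta.cast(user_meta.user_meta_data)
        # preproc_meta.tensor_meta would then expose the raw input tensor buffer and shape
    try:
        l_user = l_user.next
    except StopIteration:
        break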

Yes. The scaled RGB tensor data is sent to the model only; the original video is not impacted.


I see. So, as I meant initially, this cannot be done easily in DS 6.3 with Python.

Does this binding exist in 6.4?
Would I be able to access the nvdspreprocess tensor data easily?

Again, as I said, my expertise in C++ is limited, and I would probably need longer to generate the bindings, so if upgrading DeepStream to 6.4 is easier/faster I would try it that way.

Thank you very much for your patience and understanding!

No.

With C/C++ code, it is very easy.