DeepStream Custom Temporal Preprocessing Plugin Question

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 7.1
• JetPack Version (valid for Jetson only) NA
• TensorRT Version 10.6
• NVIDIA GPU Driver Version (valid for GPU only) Latest
• Issue Type( questions, new requirements, bugs) Question
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing) NA
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

Hi,

I have a question regarding the possibility of operating the preprocessing plugin in frame mode while having the downstream nvinfer plugin in object mode. Let me explain my reasoning.

I need to create temporal batches of RGB frames related to a person in the stream, i.e., cropped RGB windows. The crop is a minimum encompassing bounding box (bbox), computed from the minimum and maximum extents of the person’s bounding boxes over a sequence of frames.
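Concretely, that box is just the element-wise min/max of the buffered per-frame boxes, e.g. something like the following helper (an illustrative sketch only, not my actual plugin code; NvOSD_RectParams is the bbox type carried in NvDsObjectMeta::rect_params):

#include <algorithm>
#include <vector>

#include "nvdsmeta.h"   // pulls in NvOSD_RectParams

// Illustrative helper: union ("minimum encompassing") bbox over a
// person's buffered per-frame bboxes. Assumes `boxes` is non-empty.
static NvOSD_RectParams min_encompassing_bbox(const std::vector<NvOSD_RectParams>& boxes)
{
    float left   = boxes.front().left;
    float top    = boxes.front().top;
    float right  = left + boxes.front().width;
    float bottom = top  + boxes.front().height;

    for (const auto& b : boxes) {
        left   = std::min(left,   b.left);
        top    = std::min(top,    b.top);
        right  = std::max(right,  b.left + b.width);
        bottom = std::max(bottom, b.top  + b.height);
    }

    NvOSD_RectParams out{};
    out.left   = left;
    out.top    = top;
    out.width  = right - left;
    out.height = bottom - top;
    return out;
}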

With the preprocessing plugin I am creating, I need to buffer frames and persons along with their corresponding bounding boxes. I initially decided to operate the preprocessing plugin in frame mode so that the converted_frame_ptr in an NvDsPreProcessUnit points to a scaled version of the full frame. According to the 3D action recognition sample, we are safe to use this memory as it is allocated from the preprocessing plugin’s memory pool. With this approach, I buffer the full frames and can later crop the window of frames according to a person’s minimum encompassing bounding box.

My plan is then to attach this tensor to the person and send this person downstream. However, this is the part I am unsure about. Given that I have initially set the preprocessing plugin to operate in frame mode, it is likely assumed that the output from this stage should be frames, not objects. Is that correct? In this case, how should I handle this?

I hope this is clear. Thank you in advance.

P.S. For simplicity, assume batch size is 1, i.e., we are working with one stream for now.

Is your pipeline like the following?

pgie --> nvdspreprocess --> sgie
              |
       attach NVDS_PREPROCESS_BATCH_META metadata

You can use it like this, but I think nvdspreprocess may not work as you expect. As you know, NVDS_PREPROCESS_BATCH_META is attached to the batch, not to the object (person).

If you are sure that you need to attach the tensor processed from the frame to the object, you can consider using nvds_add_user_meta_to_obj to attach user-specific metadata at the object level, for example as sketched below.
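Something roughly like this (a minimal sketch; MyTemporalTensor, the copy/release callbacks and the meta type string are placeholders you would define yourself, only the nvds_* calls are DeepStream API):

#include "nvdsmeta.h"

// Placeholder for whatever you want to carry per object (e.g. the cropped window).
struct MyTemporalTensor { /* ... */ };

static gpointer my_meta_copy(gpointer data, gpointer user_data)
{
    NvDsUserMeta *user_meta = (NvDsUserMeta *) data;
    MyTemporalTensor *src = (MyTemporalTensor *) user_meta->user_meta_data;
    return new MyTemporalTensor(*src);   // deep copy for downstream consumers
}

static void my_meta_release(gpointer data, gpointer user_data)
{
    NvDsUserMeta *user_meta = (NvDsUserMeta *) data;
    delete (MyTemporalTensor *) user_meta->user_meta_data;
    user_meta->user_meta_data = nullptr;
}

static void attach_tensor_to_object(NvDsBatchMeta *batch_meta,
                                    NvDsObjectMeta *obj_meta,
                                    MyTemporalTensor *tensor)
{
    NvDsUserMeta *user_meta = nvds_acquire_user_meta_from_pool(batch_meta);
    user_meta->user_meta_data         = tensor;
    user_meta->base_meta.meta_type    = nvds_get_user_meta_type((gchar *) "MYAPP.PERSON.TEMPORAL_TENSOR");
    user_meta->base_meta.copy_func    = my_meta_copy;
    user_meta->base_meta.release_func = my_meta_release;
    nvds_add_user_meta_to_obj(obj_meta, user_meta);
}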

Hi Jun,

The full pipeline is:
nvurisrcbin → nvstreammux → pgie (person detection) → nvtracker → nvdspreprocess (attach NVDS_PREPROCESS_BATCH_META) → sgie (person behavior analysis) → TODO

I mark the end of the pipeline as TODO because I am still undecided as to how to approach this final piece. The end goal would be to save the last ten seconds of the video stream whenever there is a positive prediction, i.e. a person has been classified as the positive class, with the person’s bbox annotated over the clip. For now I suppose just saving the cropped RGB window (the preprocess tensor) will suffice, as I should be able to access this in the batch. I will address this later.

So currently I have the pipeline seemingly working (everything up to ‘TODO’ plus a fakesink), i.e. it seems the sgie is actually running on the cropped RGB windows, as indicated by the increased GPU utilization and latency of the pipeline. I now want to confirm this by attaching a buffer probe to see the raw tensor output from the model. I have created two buffer probes. The first looks at the objects (persons) in the batch, and it seems to be working well, i.e. the person count reflects the supplied video. The second probe tries to read the raw tensor data output by the sgie, but so far it sees nothing. I have tried probing both of the locations described in the documentation:

“1. When operating as primary GIE, NvDsInferTensorMeta is attached to each frame’s (each NvDsFrameMeta object’s) frame_user_meta_list. When operating as secondary GIE, NvDsInferTensorMeta is attached to each NvDsObjectMeta object’s obj_user_meta_list.”

I have also tried aligning the process-mode of the sgie, to no avail.
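For completeness, the check I am trying to do corresponds roughly to the following with the plain C metadata API (a minimal sketch, meant to be attached to the sgie’s src pad; function and variable names are illustrative):

#include <gst/gst.h>

#include "gstnvdsmeta.h"
#include "gstnvdsinfer.h"

// Illustrative pad probe: look for NVDSINFER_TENSOR_OUTPUT_META on each
// object's obj_user_meta_list (where a secondary GIE should attach it).
static GstPadProbeReturn
sgie_tensor_probe(GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
{
    GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER(info);
    NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta(buf);
    if (!batch_meta)
        return GST_PAD_PROBE_OK;

    for (NvDsMetaList *l_frame = batch_meta->frame_meta_list; l_frame; l_frame = l_frame->next) {
        NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
        for (NvDsMetaList *l_obj = frame_meta->obj_meta_list; l_obj; l_obj = l_obj->next) {
            NvDsObjectMeta *obj_meta = (NvDsObjectMeta *) l_obj->data;
            for (NvDsMetaList *l_user = obj_meta->obj_user_meta_list; l_user; l_user = l_user->next) {
                NvDsUserMeta *user_meta = (NvDsUserMeta *) l_user->data;
                if (user_meta->base_meta.meta_type == NVDSINFER_TENSOR_OUTPUT_META) {
                    NvDsInferTensorMeta *tensor_meta =
                        (NvDsInferTensorMeta *) user_meta->user_meta_data;
                    g_print("object %lu: tensor meta from gie-unique-id %u, %u output layer(s)\n",
                            (unsigned long) obj_meta->object_id,
                            tensor_meta->unique_id, tensor_meta->num_output_layers);
                }
            }
        }
    }
    return GST_PAD_PROBE_OK;
}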

Any insight on this issue?

Here is my full deepstream-app.cpp file:


#include "pipeline.hpp"
#include <iostream>

#include "nvdsmeta.h"

#include <gst/gst.h>


using namespace deepstream;

class TensorSanityChecker : public BufferProbe::IBatchMetadataObserver
{
    public:
    TensorSanityChecker() {}
    virtual ~TensorSanityChecker() {}

    virtual probeReturn handleData(BufferProbe& probe, const BatchMetadata& data) {
        // Define the meta_type for NVDSINFER_TENSOR_OUTPUT_META
        const int USER_META_TYPE = NVDSINFER_TENSOR_OUTPUT_META; // 12

        // Iterate over each FrameMetadata in BatchMetadata
        data.iterate([&](const FrameMetadata& frame_data) {
            auto tensor_output_count = 0;

            // Iterate over each UserMetadata within the current FrameMetadata
            frame_data.iterate([&](const UserMetadata& user_data) {

                std::cout << "here" << std::endl;

                // Attempt to cast UserMetadata to TensorOutputUserMetadata
                const TensorOutputUserMetadata* tensor_meta = dynamic_cast<const TensorOutputUserMetadata*>(&user_data);

                if (tensor_meta) {
                    // Successfully casted; process the TensorOutputUserMetadata
                    unsigned int unique_id = tensor_meta->uniqueId();
                    std::cout << "Tensor Output - Unique ID: " << unique_id << std::endl;
                    tensor_output_count++;
                } else {
                    // Handle the case where the cast fails (optional)
                    std::cerr << "Warning: Failed to cast UserMetadata to TensorOutputUserMetadata." << std::endl;
                }

            }, USER_META_TYPE); // Pass the meta_type here
            
            // Output the count and frame information
            std::cout << "Tensor Output Counter: " 
                      << " Pad Idx = " << frame_data.padIndex() 
                      << " Frame Number = " << frame_data.frameNum() 
                      << " Tensor Output Count = " << tensor_output_count 
                      << std::endl;
        });

        return probeReturn::Probe_Ok;
    }

};



class ObjectCounter : public BufferProbe::IBatchMetadataObserver
{
    public:
    ObjectCounter() {}
    virtual ~ObjectCounter() {}

    virtual probeReturn handleData(BufferProbe& probe, const BatchMetadata& data) {
        data.iterate([](const FrameMetadata& frame_data) {
            auto person_count = 0;
            frame_data.iterate([&](const ObjectMetadata& object_data) {
                auto class_id = object_data.classId();
                if (class_id == 0) {
                    person_count++;
                }
            });
            std::cout << "Object Counter: " <<
                " Pad Idx = " << frame_data.padIndex() <<
                " Frame Number = " << frame_data.frameNum() <<
                " Person Count = " << person_count << std::endl;
        });

        return probeReturn::Probe_Ok;
    }
};


int main(int argc, char *argv[])
{
    try {
        Pipeline pipeline("my-pipeline", "/home/uname/service_maker_app/config.yaml");
        pipeline["src"].set("uri", argv[1]);
        pipeline.attach("infer2", new BufferProbe("counter", new ObjectCounter));
        pipeline.start().wait();
    } catch (const std::exception &e) {
        std::cerr << e.what() << std::endl;
        return -1;
    }
    return 0;
}

Here is the YAML config as well:

deepstream:
  nodes:
    - type: nvurisrcbin
      name: src
      properties:
        drop-frame-interval: 4
    - type: nvstreammux
      name: mux
      properties:
        batch-size: 1
        width: 960
        height: 544
    - type: nvinfer
      name: infer
      properties:
        config-file-path: /home/uname/service_maker_app/config_infer_primary.txt
    - type: nvtracker
      name: tracker
      properties:
        tracker-width: 960
        tracker-height: 544
        ll-lib-file: /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
        ll-config-file: /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvDCF_accuracy.yml
    - type: nvdspreprocess
      name: temporalbatcher
      properties:
        config-file: /home/uname/service_maker_app/config_preprocess_sgie.txt
    - type: nvinfer
      name: infer2
      properties:
        config-file-path: /home/uname/service_maker_app/config_infer_secondary.txt
    - type: nvosdbin
      name: osd
    - type: nvvideoconvert
      name: nvvidconv
    - type: nvv4l2h264enc
      name: encoder
      properties:
        bitrate: 4000000
    - type: h264parse
      name: h264parser
    - type: qtmux
      name: muxer
    - type: filesink
      name: sink
      properties:
        location: output.mp4
        sync: false
        async: false
  edges:
    src: mux
    mux: infer
    infer: tracker
    tracker: temporalbatcher
    temporalbatcher: infer2
    infer2: osd
    osd: nvvidconv
    nvvidconv: encoder
    encoder: h264parser
    h264parser: muxer
    muxer: sink

OK, it seems like it’s an issue with my second inference model. If I adjust the config for model 1 to output-tensor-meta=1, I see user metadata attached to the frame metadata, but there is still no user metadata when output-tensor-meta=1 is set for the second model. Note that the first model is TAO PeopleNet, i.e. a built-in network, while the second model is a custom classifier exported to TensorRT format.
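For context, the relevant part of a secondary nvinfer config that consumes nvdspreprocess tensors typically looks roughly like this (a minimal sketch; the engine path, gie-unique-id and network-type are placeholders rather than my actual values):

[property]
gpu-id=0
# placeholder path to the custom classifier TensorRT engine
model-engine-file=/home/uname/service_maker_app/models/classifier.engine
gie-unique-id=2
# consume the tensors prepared by nvdspreprocess instead of scaling internally
input-tensor-from-meta=1
# attach the raw output tensors as user metadata
output-tensor-meta=1
# 1 = classifier
network-type=1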

Any suggestions on how to investigate this issue further?

I am a little confused, so your current problem is that, after setting output-tensor-meta=1 in the sgie configuration file, you still cannot find NVDSINFER_TENSOR_OUTPUT_META in the obj_user_meta_list, right?

Try debugging the attach_tensor_output_meta function in the /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvinfer/gstnvinfer_meta_utils.cpp file.

This is where nvinfer adds the NVDSINFER_TENSOR_OUTPUT_META user meta.

Hey Jun,

So I think I have figured out the issue. I was probing for the incorrect meta type, i.e. NVDSINFER_TENSOR_OUTPUT_META; the actual type is NVDS_ROI_META.

The documentation found here is slightly misleading:
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvinfer.html#:~:text=as%20it%20is.-,Tensor%20Metadata,-%23

It indicates that the output meta should be attached as type NVDSINFER_TENSOR_OUTPUT_META (NvDsInferTensorMeta). But, as can be traced in the code, if the nvinfer plugin is operating on NVDS_PREPROCESS_BATCH_META data, then the output is attached as type NVDS_ROI_META.


Thanks for your feedback.