Using DeepStream in a custom application

• Hardware Platform (Jetson / GPU) : Jetson
• DeepStream Version : 5.0 DP
• JetPack Version (valid for Jetson only) : 4.4 Developer Kit
• TensorRT Version : 7.1.3

Hi guys.
I want to know: is it possible to use DeepStream for custom apps? In my opinion, the DeepStream SDK is just for showing off the capability of Jetson platforms, right? Suppose I want to use PeopleNet or any detection network or another task; how can I reuse the results? How can I get the outputs of the network for other apps? Is it possible? In my opinion, the DeepStream SDK is hard-coded. I want to know: what is the advantage of the DeepStream SDK besides showing the results?

Yes, you can write your own custom apps based on the DeepStream SDK, per your requirements.

@bcao
In your opinion, isn’t it a hard task to do that kind of programming, given that the DeepStream SDK is closed source and hard-coded?

It’s easy to implement your own app; refer to the NVIDIA DeepStream SDK Developer Guide — DeepStream 6.1.1 Release documentation.

DeepStream is just a bunch of plugins for GStreamer. As long as you understand GStreamer, you’ll be fine, but that itself is a lot of work (that is worth it). To understand GStreamer, just retype the tutorials and read their explanations.

I say “retype” because it’s important to do that in order to learn. If you don’t retype it, you won’t learn. It’s not magic. If you need any help, no matter how simple the problem, ask here or on the GStreamer forums. It is “hard code”, but it’s worth the investment and frustration.

The advantage is speed. DeepStream (and GStreamer) is not friendly, but it is fast, and you’ll learn ways to make it friendlier the more experienced you get. It helps to not write GStreamer in C, but you should still know GStreamer in C because that’s what’s under the hood.
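To make that concrete, here’s a minimal sketch of driving a GStreamer pipeline from Python (standard GObject-introspection bindings; the test elements are placeholders you’d swap for DeepStream’s nvstreammux, nvinfer, and friends):

    import gi
    gi.require_version('Gst', '1.0')
    from gi.repository import Gst

    Gst.init(None)
    # DeepStream elements slot into a launch string the same way any
    # other GStreamer plugin does.
    pipeline = Gst.parse_launch(
        "videotestsrc num-buffers=100 ! videoconvert ! autovideosink")
    pipeline.set_state(Gst.State.PLAYING)
    # Block until EOS or error, then tear down.
    bus = pipeline.get_bus()
    bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                           Gst.MessageType.EOS | Gst.MessageType.ERROR)
    pipeline.set_state(Gst.State.NULL)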


Thanks so much, @mdegans @bcao.
I want to get the output of the tracker in the app: I want to know how many objects there are. This number is shown on the screen, and above each of the objects, but I want to access that number programmatically. How can I access it?

YW, @LoveNvidia
So, when learning about GStreamer in the tutorials, you’ll eventually get to a tutorial with pad probes. In the example you’re probably referring to, a pad probe function is connected in such a way that on every buffer that flows across that pad, a function is called.

It’s in such a function that the things you’re interested in can be obtained. It’s now possible to do this in Python as well, but the interface looks like C and is frankly sometimes easier to work with in C. It’s worth learning even if you stick with a higher-level language later, since understanding GStreamer itself will make your life easier in the long term.
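For the object-count question specifically, a probe can look something like this sketch. It assumes the pyds bindings from NVIDIA’s deepstream_python_apps samples and a pad downstream of nvinfer/nvtracker; num_obj_meta is the per-frame count the OSD draws:

    import pyds
    from gi.repository import Gst

    def counter_probe(pad, info, u_data):
        gst_buffer = info.get_buffer()
        if not gst_buffer:
            return Gst.PadProbeReturn.OK
        # Batch metadata hangs off the Gst.Buffer; hash() hands pyds the
        # underlying C pointer.
        batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
        l_frame = batch_meta.frame_meta_list
        while l_frame is not None:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
            print("source %d: %d objects"
                  % (frame_meta.pad_index, frame_meta.num_obj_meta))
            try:
                l_frame = l_frame.next
            except StopIteration:
                break
        return Gst.PadProbeReturn.OK

    # attached e.g. with:
    # osd_sink_pad.add_probe(Gst.PadProbeType.BUFFER, counter_probe, 0)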

If you have any trouble with the tutorials, don’t hesitate to ask, even if it’s a simple thing. From experience, there are some assumptions about knowledge in those tutorials that should not be there, so it’s expected you’ll get stuck. Don’t let that stop you from learning.

Hi @mdegans
Thanks for your advice, you’re really right, but I don’t like the GStreamer forums: the forum design is bad, it isn’t a place where problems get solved easily, and the community is bad at answering. I really enjoy the NVIDIA forums; in my opinion, they’re the best I’ve ever seen. I can ask any question on my mind and get a solution, or at least a hint. Is it possible to ask GStreamer questions on the NVIDIA forums? Or do you know another GStreamer forum that is like NVIDIA’s?

@LoveNvidia

So, everything FOSS is ugly (90s web design and a visceral hatred of JavaScript), but the code quality also tends to be better than proprietary software’s. If you want advice on DeepStream specifically, here is probably the best place to ask. Otherwise, if it’s a general GStreamer question, you’re likely to get the best answer on a less aesthetically pleasing forum :)

Also, they can certainly be rude, but that’s also par for the course with FOSS. Really, the best suggestion I can make, if you want to learn GStreamer and DeepStream, is to retype the tutorials and ask if you get stuck (there is breakage, so that will likely happen). It has a steep learning curve, but there really aren’t many (good, mature) alternatives for streaming video or these sorts of analytics. It’s hard, and sometimes it might feel impossible, but stick with it and ask questions and you’ll learn.


Thanks so much.
What are the caps and bins in GStreamer?

Bins are a kind of Element that contain other elements. They’re basically element groups, but you can treat them as a kind of Element. A Pipeline is itself a kind of Bin, actually.

Caps are structures that let GStreamer know whether Pads (on Elements) can be connected to each other. For example, a video source Element might have a src Pad with Caps of video/x-raw, and that can only be connected to another video Pad of the appropriate direction. Some elements have ANY caps and can accept a connection from anything.
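Here’s a tiny sketch of both concepts in Python (plain GStreamer, no DeepStream needed):

    import gi
    gi.require_version('Gst', '1.0')
    from gi.repository import Gst

    Gst.init(None)

    # A Bin groups elements but can be added and linked like one Element.
    bin_ = Gst.Bin.new("my-bin")
    src = Gst.ElementFactory.make("videotestsrc", "src")
    conv = Gst.ElementFactory.make("videoconvert", "conv")
    bin_.add(src)
    bin_.add(conv)
    src.link(conv)

    # Caps describe what a pad carries; two pads only link if their caps
    # intersect.
    caps = Gst.Caps.from_string("video/x-raw,format=RGBA,width=640,height=480")
    print(caps.to_string())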

This page probably explains it better and has diagrams explaining the major concepts:
https://gstreamer.freedesktop.org/documentation/application-development/introduction/basics.html?gi-language=c

Thanks so much.
Please keep the GStreamer discussion going in this thread.

Re: other thread, it’s probably better if it stays in one thread since this is related.

A pipeline is composed of Elements, which have Pads that connect to each other the way pieces of equipment are connected via ports. Pads have:

  • Direction: source or sink
  • Caps: what is sent over a pad (video, audio, text, etc.)
  • Availability: always, sometimes, request

All of those affect whether an element can be linked via its pads. A source cannot be linked to a source, for example; a video pad cannot be linked to an audio pad; and a pad that does not exist (because it has not been created or requested) cannot be connected to a static pad that always exists.
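As an illustration (plain GStreamer; tee exposes request pads named src_%u, while queue’s pads are always pads):

    import gi
    gi.require_version('Gst', '1.0')
    from gi.repository import Gst

    Gst.init(None)
    pipeline = Gst.Pipeline.new("demo")
    src = Gst.ElementFactory.make("videotestsrc", "src")
    tee = Gst.ElementFactory.make("tee", "tee")
    queue = Gst.ElementFactory.make("queue", "q")
    sink = Gst.ElementFactory.make("fakesink", "sink")
    for e in (src, tee, queue, sink):
        pipeline.add(e)
    src.link(tee)
    queue.link(sink)

    # An 'always' pad exists as soon as the element does:
    q_sink = queue.get_static_pad("sink")
    # A 'request' pad must be asked for explicitly:
    t_src = tee.get_request_pad("src_%u")  # request_pad_simple() on newer GStreamer
    t_src.link(q_sink)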

In this case, the former part (video/x-raw) is the what. The (memory:NVMM) part specifies the where: NVMM buffers live in NVIDIA device memory rather than ordinary system memory.

In this case, yes. You can use converter elements to send buffers back and forth if you need to.

A bit more than a function. If you look at the source code of an element, there are a whole bunch of them related to setup, teardown, property setting, and the business of how to handle buffers that flow through the element. You don’t really need to know any of that, however. You can treat elements like black boxes and just interact with their publicly exposed properties (e.g. a file source has a property so you can set/get the source file).
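For example (filesrc and its location property are part of core GStreamer):

    import gi
    gi.require_version('Gst', '1.0')
    from gi.repository import Gst

    Gst.init(None)
    src = Gst.ElementFactory.make("filesrc", "src")
    src.set_property("location", "/path/to/video.mp4")
    print(src.get_property("location"))  # -> /path/to/video.mp4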

4- Does each element have only one sink pad and one source pad? Is it possible to have more than one sink/src pad on one element?

Yes, it’s possible to have any number of source or sink pads on a given element. Sometimes they are always/static pads, which always exist. Sometimes they are “sometimes” pads that are created by the element itself in response to something (e.g. a source that might or might not have an audio pad). Sometimes they are request pads that you request from an element. You’ll see examples of all of them if you do the tutorials, and it’s probably better explained at the link above.

Thanks so much, @mdegans; I understood all of your explanations. Excellent. If possible, and when you have free time, please keep this thread going so we can gather and share more knowledge.

I don’t get this part:

Availability: always, sometimes, request

Does that mean that if we set the pad to always, that pad only works with specific caps like video/x-raw; if we set the pad to sometimes, that pad can work sometimes with video, sometimes with audio, sometimes with text, and so on; and if we set the pad to request, that pad is only for handling requests? If so, is it possible to have two input pads, one for video and one for request?

You don’t really need to know any of that,

If I want to add my custom TensorFlow deep model (face recognition), which is like a classification task with a 100-d output dimension, is it possible to put my model in place of the secondary classification task in the DeepStream pipeline? If yes, I need one more step: searching for the face embedding in a database after the face-recognition step. For that part (the face-search part), in your opinion, should I write a custom element and add it to the DeepStream pipeline, or is it better to get buffers from the classification element in a sink-pad buffer-probe function and then process its outputs in another Python function?

What I mean is this: if you look at this link, it uses tiler_sink_pad_buffer_probe to capture the frames into a numpy array, but I want to do it this way:

def tiler_sink_pad_buffer_probe(pad, info, u_data):
    ...
    search_face(metadata)
    ...
    return Gst.PadProbeReturn.OK

def search_face(data):
    # do processing on data
    ...

As you know, it’s better not to put heavy processing in the probe function, because the system’s performance drops. But I want to do something like Python asyncio’s loop.create_task(search_face(metadata)) inside the tiler_sink_pad_buffer_probe function. In your opinion, is it possible?
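For reference, one common pattern is to keep the probe cheap by copying out what you need and handing it to a worker thread, so the streaming thread is never blocked. This is only a sketch; search_face and extract_metadata stand in for your own code:

    import queue
    import threading
    from gi.repository import Gst

    work_q = queue.Queue(maxsize=32)

    def worker():
        while True:
            data = work_q.get()
            if data is None:
                break
            search_face(data)  # your database lookup, off the streaming thread
            work_q.task_done()

    threading.Thread(target=worker, daemon=True).start()

    def tiler_sink_pad_buffer_probe(pad, info, u_data):
        metadata = extract_metadata(info)  # hypothetical: copy out embeddings, IDs
        try:
            work_q.put_nowait(metadata)    # drop work rather than stall the pipeline
        except queue.Full:
            pass
        return Gst.PadProbeReturn.OK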

That’s what I thought too, but out of about 3-4 questions asked there on nabble, I only got 1 useful answer. You’ll most likely get answers to easy questions, but the annoying thing is that for easy questions you can already read the documentation. When there’s a really nasty GStreamer problem, nobody can answer it.

If we set the pad to always, that pad only works with specific caps like video/x-raw; if we set the pad to sometimes, that pad can work sometimes with video, sometimes with audio, sometimes with text, and so on,

I think you’re confusing pad availability (sometimes, always, request) with pad capabilities (audio, video, etc). Pad availability isn’t something you set; rather, it’s part of an element’s design (you can check it with gst-inspect-1.0 followed by an element name).

It’s possible to have two (or more) static pads that always exist, two or more request pads that you create on demand, or two or more sometimes pads that the element creates (and that you handle with an event handler function).

Yeah, I can’t help with that, sorry. No experience. ¯\_(ツ)_/¯

Yeah. Then the answer is buried in the source somewhere :)

Thanks, @mdegans.
If you have experience with the DeepStream Python apps, please answer this question if possible.

@mdegans,
Some of the elements in DeepStream don’t seem to exist in gst-inspect-1.0; they are custom elements for DeepStream, like this one:

tiler = Gst.ElementFactory.make("nvmultistreamtiler", "nvtiler")

Q1- For these elements that don’t appear in gst-inspect-1.0, how can I access their properties? How can I access their source code?

Q2- In the DeepStream Python apps, what do lines 211 and 212 do?

Q3- If I want a different nvvideoconvert element for each RTSP source, how can I define this? My goal is to set a different ROI (src-crop, dest-crop) property on nvvideoconvert for each RTSP stream.

gst-inspect should list any element that you can use, as long as it’s in a path GStreamer searches. Example:

 $ gst-inspect-1.0 nvmultistreamtiler
Factory Details:
  Rank                     primary (256)
  Long-name                Stream Tiler DS 4.0
  Klass                    Generic
  Description              Tile input multistream buffer into a 2D array
  Author                   NVIDIA Corporation. Post on Deepstream for Tesla forum for any queries @ https://devtalk.nvidia.com/default/board/209/

Plugin Details:
  Name                     nvdsgst_multistreamtiler
  Description              NVIDIA Multistream Tiler plugin
  Filename                 /usr/lib/aarch64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_multistreamtiler.so
  Version                  5.0.0
  License                  Proprietary
  Source module            nvmultistreamTiler
  Binary package           NVIDIA Multistream Plugins
  Origin URL               http://nvidia.com/

GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstBaseTransform
                         +----GstNvMultiStreamTiler

Pad Templates:
  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw(memory:NVMM)
                 format: { (string)NV12, (string)RGBA }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
  
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-raw(memory:NVMM)
                 format: { (string)NV12, (string)RGBA }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]

Element has no clocking capabilities.
Element has no URI handling capabilities.

Pads:
  SINK: 'sink'
    Pad Template: 'sink'
  SRC: 'src'
    Pad Template: 'src'

Element Properties:
  name                : The name of the object
                        flags: readable, writable
                        String. Default: "nvmultistreamtiler0"
  parent              : The parent of the object
                        flags: readable, writable
                        Object of type "GstObject"
  qos                 : Handle Quality-of-Service events
                        flags: readable, writable
                        Boolean. Default: false
  columns             : Number of columns in the Tiled 2D output
                        flags: readable, writable
                        Unsigned Integer. Range: 1 - 4294967295 Default: 1 
  rows                : Number of rows in the Tiled 2D output
                        flags: readable, writable
                        Unsigned Integer. Range: 1 - 4294967295 Default: 1 
  width               : Width of the tiled output in pixels
                        flags: readable, writable
                        Unsigned Integer. Range: 16 - 4294967295 Default: 1920 
  height              : Height of the tiled output in pixels
                        flags: readable, writable
                        Unsigned Integer. Range: 16 - 4294967295 Default: 1080 
  gpu-id              : Set GPU Device ID
                        flags: readable, writable
                        Unsigned Integer. Range: 0 - 4294967295 Default: 0 
  show-source         : ID of the source to be shown. If -1 all the sources will be tiled else only a single source will be scaled into the output buffer.
                        flags: readable, writable
                        Integer. Range: -1 - 2147483647 Default: -1 
  nvbuf-memory-type   : Type of NvBufSurface Memory to be allocated for output buffers
                        flags: readable, writable, changeable only in NULL or READY state
                        Enum "GstNvBufMemoryType" Default: 0, "nvbuf-mem-default"
                           (0): nvbuf-mem-default - Default memory allocated, specific to particular platform
                           (4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
  custom-tile-config  : Specifies individual tile resolution for all involved sources
                        flags: writable
                        Pointer. Write only

    uri_decode_bin.connect("pad-added", cb_newpad, nbin)
    uri_decode_bin.connect("child-added", decodebin_child_added, nbin)

The uridecodebin has sometimes pads, which are created by the element itself. To handle this, you need to connect the pad-added signal to a callback with the proper signature, as shown. In other words, the first line says: “when you get a new pad, call the cb_newpad function with nbin as the third parameter”.

The second line calls a function every time a child is added to uridecodebin. Bins are groups of elements, so they have child elements, and you can attach a function to modify what happens when a child is added. In this case, I believe NVIDIA uses it to set some decoder options when the decoder is added to the bin.
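A rough sketch of what those two callbacks do (the real versions are in the deepstream_python_apps sources; this is simplified):

    def cb_newpad(decodebin, decoder_src_pad, nbin):
        # Only expose the new pad if it carries video.
        caps = decoder_src_pad.get_current_caps()
        if caps is None:
            caps = decoder_src_pad.query_caps(None)
        name = caps.get_structure(0).get_name()
        if name.startswith("video"):
            ghost = nbin.get_static_pad("src")  # the source bin's ghost pad
            ghost.set_target(decoder_src_pad)

    def decodebin_child_added(child_proxy, obj, name, user_data):
        # Recurse into nested decodebins; decoder properties can be
        # tweaked here once the decoder child appears.
        if "decodebin" in name:
            obj.connect("child-added", decodebin_child_added, user_data)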

So, it sounds like you’ll have to add nvvideoconvert elements after your decodebins but before your stream muxer. You can probably do this in cb_newpad. You’ll need to create a new element, add it to the bin (the final parameter given to the func), and link it to the stream muxer. You’ll have to use gst-inspect-1.0 to figure out all of the necessary properties you must set.
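A hedged sketch of that for Q3, assuming nvvideoconvert’s src-crop/dest-crop string properties (format "left:top:width:height" per the DeepStream docs) and an nvstreammux with request pads named sink_%u; the crop values here are made up:

    crops = {0: "0:0:640:480", 1: "100:100:800:600"}  # hypothetical per-stream ROIs

    for i, crop in crops.items():
        conv = Gst.ElementFactory.make("nvvideoconvert", "conv_%d" % i)
        conv.set_property("src-crop", crop)
        pipeline.add(conv)
        # In cb_newpad: link the decodebin's new src pad to conv's sink pad,
        # then link conv to the muxer's per-stream request pad.
        mux_pad = streammux.get_request_pad("sink_%d" % i)
        conv.get_static_pad("src").link(mux_pad)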