DeepStream overview for newbies

• Hardware Platform (Jetson / GPU)
Jetson Xavier NX
• DeepStream Version
5.1
• JetPack Version (valid for Jetson only)
4.5.1
• Issue Type( questions, new requirements, bugs)
Questions

Hello Nvidia family and friends,

I’m new to Deepstream and I have to admit that I am struggling quite a bit with some concepts.
I thought, “Hey, it would be good if I could just spend 15 minutes talking to a guy who knows well how this works”, then I remembered the forums :-)

Let me first summarise what I’m trying to do.
As an artistic project, I have a blimp hovering in a museum room, which carries a camera on a gimbal, and a Jetson NX.
On the ground, I have a Jetson AGX plugged to a monitor.
On the blimp, the camera looks at the people on the ground. The Jetson NX captures the video stream, runs people-detection software, and controls the gimbal so that when a human is detected, the camera follows them wherever they move. It then forwards the video feed to the ground, where the Jetson AGX performs action recognition, draws the skeleton on the body (posture detection), and displays some information related to the action of the visitor.

This whole process is somewhat working, but in the end I only get 4-5 FPS, which is not ideal.
This is when I discovered Deepstream.

I’m developing everything in Python. I have been experimenting with the code provided here and reading the documentation, but I have reached the limit of what I can work out on my own.

So here are my questions:

  • In the example code, a ResNet detects cars and people, for instance. I understand that we can probe for some metadata, like the number of persons or the confidence of the detection, but where and when can we decide to draw (or not draw, in my case) the bounding boxes around the detected objects?
    In my case, for instance, in the first step I would like to detect only humans and get the bounding box coordinates, so that I can control the camera to always keep the bounding box in the center of the image, but I don’t want to draw it.
  • Still related to this example code, where can we decide to detect only people and not cars? In the configuration file one can set the number of detected classes, but how does one decide which of those classes to detect?
  • On another point, I don’t understand the purpose of the deepstream-app tool. It takes a configuration file, but how does that help me with my code? I often see explanations, on GitHub for instance, where people tell you to run deepstream-app -c some-config-file. How does that help in the development of a project?
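The first two questions can be illustrated with a minimal sketch in plain Python. It assumes the detections have already been read inside a pad probe (in deepstream-test1.py this is done by iterating over frame_meta.obj_meta_list and casting each entry with pyds.NvDsObjectMeta.cast). The function person_offsets and the stand-in fake_object are hypothetical names used only for illustration; PERSON_CLASS_ID = 2 assumes the class order of the sample 4-class detector, and setting border_width to 0 to hide a box is an assumption to verify on your DeepStream version.

```python
from types import SimpleNamespace

# Assumption: index of the "person" class in the sample 4-class detector.
PERSON_CLASS_ID = 2


def fake_object(class_id, left, top, width, height):
    """Hypothetical stand-in for an object meta with class_id and rect_params."""
    rect = SimpleNamespace(left=left, top=top, width=width,
                           height=height, border_width=3)
    return SimpleNamespace(class_id=class_id, rect_params=rect)


def person_offsets(objects, frame_width, frame_height,
                   person_class_id=PERSON_CLASS_ID):
    """Return the (dx, dy) offset of each person box centre from the frame centre.

    Offsets are normalised to [-1, 1]; (0, 0) means the person is centred.
    Every box gets border_width = 0 so that a downstream OSD element,
    if present, would not draw a visible outline (assumed behaviour).
    """
    half_w = frame_width / 2.0
    half_h = frame_height / 2.0
    offsets = []
    for obj in objects:
        rect = obj.rect_params
        rect.border_width = 0  # hide the outline for all classes
        if obj.class_id != person_class_id:
            continue  # ignore cars, bicycles, road signs, ...
        centre_x = rect.left + rect.width / 2.0
        centre_y = rect.top + rect.height / 2.0
        offsets.append(((centre_x - half_w) / half_w,
                        (centre_y - half_h) / half_h))
    return offsets
```

The returned offsets could feed the gimbal controller directly: a negative dx would mean the person is left of centre, a positive dy that they are below centre. Filtering by class in the probe does not stop the network from detecting the other classes; it only ignores them afterwards.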

Well, sorry for the long post, as you can see I am brand new to deepstream and I’m still struggling with some of the very basic concepts and ideas that come with it.
If someone could shed some light my way, that would be greatly appreciated.
Thank you very much

Cheers

You can try removing the display meta if you don’t want the bounding boxes displayed.

You can use PeopleNet instead of the default detector.

Hello bcao, thank you very much for your reply.

I did find the PeopleNet network, but it is one of the things that led me to post in this forum.
Here is where they explain how to use it with DeepStream.
Basically, what I understand from these explanations is: “there are some config files located here, then run deepstream-app”.
I’m sorry, but I still don’t understand the use of deepstream-app or how it helps me.
If I take the Python code example here, for instance, it uses one configuration file. In the case of PeopleNet, they mention two config files, one of which should be used with deepstream-app.

I think that what I’m missing is some high-level understanding of the DeepStream SDK.

Regarding the drawing of the bounding boxes: since I mentioned this Python code earlier, I guess that if I don’t want to draw them, I just need to not create the nvosd object, right?

# Create OSD to draw on the converted RGBA buffer
nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")

Anyway, thank you very much for having taken your time to reply to me.
Cheers

Well, about removing the nvosd object to stop the bounding boxes from being drawn, I’m now facing two new problems.
The first is that the pipeline no longer runs. I get the following error:

Error: gst-stream-error-quark: Internal data stream error. (1): /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(1984): gst_nvinfer_output_loop (): /GstPipeline:pipeline0/GstNvInfer:primary-inference:
streaming stopped, reason not-linked (-1)

Here is a link to my modified version of the Python deepstream-test1.py, in which I try to remove the OSD (I just commented out the parts I thought should be removed, but it’s apparently not working), along with the result in the console.

Additionally, the probe that allows for the extraction of the metadata from the pipeline is lost, as it was attached to this object. According to the documentation it is the best place to put it, as all metadata should be complete at this point.

Hey customer,
Would you mind creating 2 separate topics for the 2 questions? We would like each topic to track a single issue/question.
Also, the topic title is not suitable for tracking the 2 questions.

Hello bcao,

Thank you for the advice.
So, here is my new topic related to OSD, and here is my new topic related to the use of peoplenet.

Cheers

Wow, I finally understood the use of deepstream-app.
Basically, if you have a config file with a sufficiently detailed configuration, you can run a basic app without writing code.
On the other hand, you could also almost do without any config file and set all the properties directly in the code.
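To make that idea concrete, here is a minimal sketch of what "the config file drives the app" means, using only Python's standard configparser. It is not how deepstream-app is implemented; the sample text only borrows the group style of deepstream-app config files (such as [source0], [primary-gie], [osd]), and the keys and paths shown are illustrative, not a complete or validated configuration. The helper name enabled_groups is hypothetical.

```python
import configparser

# Illustrative text in the style of a deepstream-app config file.
# The keys and values are placeholders, not a working configuration.
SAMPLE_CONFIG = """
[source0]
enable=1
uri=file:///path/to/video.mp4

[primary-gie]
enable=1
config-file=config_infer_primary.txt

[osd]
enable=0
"""


def enabled_groups(config_text):
    """Return {group: {key: value}} for every group that has enable=1."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    groups = {}
    for section in parser.sections():
        if parser.getint(section, "enable", fallback=0) != 1:
            continue  # a disabled group would simply not be built
        groups[section] = {key: value
                           for key, value in parser.items(section)
                           if key != "enable"}
    return groups
```

A config-driven app would build one pipeline element per enabled group and apply the remaining keys as properties; in hand-written Python code the equivalent is calling set_property on each element yourself, as deepstream-test1.py does for the nvinfer config file path.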

For some reason this was not obvious to me. I only realised it when I stumbled upon a very detailed config file.
Anyway, just saying in case this would help anyone…

Great, so can we close this topic now?

Hello bcao,

Yes, I think this topic can be closed.

Thank you for your help