• Hardware Platform (Jetson / GPU)
Jetson Xavier NX
• DeepStream Version
5.1
• JetPack Version (valid for Jetson only)
4.5.1
• Issue Type( questions, new requirements, bugs)
Questions
Hello Nvidia family and friends,
I’m new to DeepStream and I have to admit that I am struggling quite a bit with some concepts.
I thought, “Hey, it would be good if I could just spend 15 minutes talking to someone who knows how this works”, then I remembered the forums :-)
Let me first summarise what I’m trying to do.
As an artistic project, I have a blimp hovering in a museum room, which carries a camera on a gimbal, and a Jetson NX.
On the ground, I have a Jetson AGX plugged into a monitor.
On the blimp, the camera looks at the people on the ground, and the Jetson NX captures the video stream, runs people detection software, and controls the gimbal so that when a human is detected, the camera follows him wherever he moves. It then forwards the video feed to the ground, where the Jetson AGX performs action recognition, draws the skeleton on the body (posture detection), and displays some information related to the action of the visitor.
This whole process is more or less working, but in the end I only get 4-5 FPS, which is not ideal.
This is when I discovered DeepStream.
I’m developing everything in Python, and I have been playing with the code provided here while reading the documentation, but I’m reaching the limits of what I can figure out on my own.
So here are my questions:
- In the example code, a ResNet detects cars and people, for instance. I understand that we can probe for some metadata, like the number of persons or the confidence of the detection, but where and when can we decide to draw (or, in my case, not draw) the bounding boxes around the detected objects?
In my case, as a first step I would like to detect only humans and get the bounding box coordinates, so that I can control the camera to always keep the bounding box in the center of the image, but I don’t want to draw it.
- Still related to this example code, where can we decide to detect only people and not cars? In the configuration file one can set the number of detected classes, but how does one decide which of those classes to detect?
- On another point, I don’t understand the purpose of the deepstream-app tool. It takes a configuration file, but how does that help me with my own code? I often see explanations, on GitHub for instance, where people tell you to run
deepstream-app -c some-config-file
How does that help in the development of a project?
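To make the first question more concrete, here is a rough sketch of the centering step I have in mind once the bounding boxes are available. It does not use any DeepStream API. It assumes detections arrive as (class_id, left, top, width, height) tuples in pixels and that class id 2 means "person", which depends on the model's label file. All names here (Detection, PERSON_CLASS_ID, centering_error) are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical detection tuple: (class_id, left, top, width, height), in pixels.
Detection = Tuple[int, float, float, float, float]

# Assumption: the class id for "person" depends on the model's label file.
PERSON_CLASS_ID = 2


def centering_error(
    detections: Iterable[Detection],
    frame_w: float,
    frame_h: float,
    wanted_class: int = PERSON_CLASS_ID,
) -> Optional[Tuple[float, float]]:
    """Return the normalized (dx, dy) offset of the largest wanted-class box
    from the frame center, or None if no such box was detected.

    dx and dy are in [-1, 1] when the box center lies inside the frame;
    positive dx means the target is right of center, positive dy means below.
    """
    # Keep only the class we care about (e.g. people, not cars).
    wanted = [d for d in detections if d[0] == wanted_class]
    if not wanted:
        return None

    # Follow the largest box by area; other selection rules are possible.
    _, left, top, width, height = max(wanted, key=lambda d: d[3] * d[4])

    # Center of the selected bounding box.
    cx = left + width / 2
    cy = top + height / 2

    # Offset from the frame center, normalized by half the frame size.
    dx = (cx - frame_w / 2) / (frame_w / 2)
    dy = (cy - frame_h / 2) / (frame_h / 2)
    return dx, dy
```

The returned offsets would then feed whatever controller drives the gimbal; nothing needs to be drawn on the frame for this to work.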
Well, sorry for the long post. As you can see, I am brand new to DeepStream and I’m still struggling with some of its very basic concepts and ideas.
If someone could shed some light my way, that would be greatly appreciated.
Thank you very much
Cheers