• Hardware Platform (Jetson / GPU) - Jetson Nano
• DeepStream Version - 5.1
• JetPack Version (valid for Jetson only) - 4.5.1
• TensorRT Version - 7.1.3
• Issue Type (questions, new requirements, bugs) - questions
I am attempting to augment my home security system using DeepStream on a Jetson Nano. My pipeline is six cameras coming in over RTSP into the default example app, which uses the resnet10 caffemodel. After playing with it for a while, I’m trying to figure out the best way to fit it into my use case, and have a few somewhat haphazard questions. My background may help some: I’ve been developing C++ software for 20+ years, and have barely dipped my toe into the perception and machine learning realm. I can sling code with the best of them, but when it comes to tensors and models and such I’m a dunce.
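For reference, here’s roughly what my source sections look like in the deepstream-app config (the URI below is a placeholder, not my actual camera):

```ini
[source0]
enable=1
# type=4 selects an RTSP source in deepstream-app
type=4
uri=rtsp://<camera-ip>:554/stream2
num-sources=1
gpu-id=0
```

There’s one of these per camera, [source0] through [source5].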
The default resnet10 model is built for detecting Car, Bicycle, Person, and Roadsign. In my setup I definitely don’t need Roadsign, and I’d love to add in Animal, or even Possum, etc. if possible (not super important though). Would I need to modify the resnet model somehow to remove the Roadsign class? Or can I simply remove the things I don’t need from the labels file? My ideal list would be Raccoon, but I’d settle for Animal. I assume that having a small-ish, narrow list will keep bad detections to a minimum. I recall running YOLO at some point and it thought my front sidewalk was a surfboard, and I have no need to detect surfboards in my yard.
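One thing I was wondering about trying: from skimming the nvinfer config docs, it looks like per-class thresholds might let me suppress a class without touching the model itself. Something like this, assuming Roadsign ends up as class id 3 in the default labels ordering (I haven’t verified that id):

```ini
# excerpt from the primary infer config
# [class-attrs-<id>] overrides per-class detection parameters;
# a threshold above 1.0 should effectively drop that class
[class-attrs-3]
pre-cluster-threshold=1.1
```

Is that the intended way to do it, or is retraining the only real option?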
My cameras are all capable of roughly 2K resolution, and can simultaneously output a second stream at a lower resolution. I like having the full-size resolution at 15 fps on the main stream for my recording system (Blue Iris), but I’ve noticed some models seem to want a very specific input resolution. Should I be setting the camera’s output resolution to something the model prefers, or the highest possible? How about framerate (plus constant vs. variable) or color format? The Jetson Nano seems to be able to run this model on all six streams at full res at 15 fps without issue, but I’m uncertain whether that’s best for the model or not.
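In case it matters, my understanding (possibly wrong) is that nvstreammux scales everything to one batch resolution and nvinfer then rescales to the network’s own input size, so the camera resolution doesn’t have to match the model exactly. My muxer section currently looks like:

```ini
[streammux]
# live RTSP sources
live-source=1
batch-size=6
# mux output resolution; nvinfer rescales again to the network's
# own input dimensions regardless of what's set here
width=1920
height=1080
batched-push-timeout=40000
```

If that understanding is wrong, I’d love to know what the resolution actually affects.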
While I can jump in and modify the deepstream-app, I’d love to use it as-is with my customized input files. Is there a sink or something I can use that will simply spit out the bounding boxes, confidences, and classes detected (and on which stream they’re detected) over something like MQTT or another transport? If that doesn’t exist, it seems like it’d be super useful in the base app. If it’s something I need to add myself, is the all_bbox_generated() callback the best place to start?
Are the models typically trained only on color images? I.e., when it’s nighttime and all my cameras switch over to infrared, are the models now junk? Or do I need to run a different model when the cameras are in night mode?
This is probably the most difficult part, but if I have an object that’s in my yard 24/7 that seems troublesome for the detector (it always thinks it’s a person, but it’s not), is there something I can do in the model to say “ignore this object” by giving it some pictures? Even screenshots from the exact camera?
I know these are a lot of questions on different subjects, but it seems like I’m not that far off from accomplishing the goal, which is just to turn off Blue Iris’s junk motion detection and feed it start/stops when people or cars are detected. Any help would be greatly appreciated.