Good day
I’ve written a small Python application that does object detection on still images fetched over HTTP from the eight 1080p channels of my HikVision DVR.
The basic flow is:
- Get images from all 8 channels asynchronously, decode them to numpy arrays, convert those to CUDA memory using cudaFromNumpy, and keep them in memory for the detection routine
- Run the 8 frames through detectNet using SSD-Mobilenet-v2
- Ultimately, start a recording on a channel if a person is detected (this still needs to be implemented)
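To make the flow concrete, here is a simplified sketch of one pass. It is not the actual code from the repo. The thread pool, the helper names (`grab_frames`, `channels_with_person`, `run_once`) and the `fetch_frame` / `detect` callables are placeholders for illustration: in the real application `fetch_frame` would do the HTTP request and the cudaFromNumpy conversion, and `detect` would wrap the detectNet call. Treating class ID 1 as "person" is an assumption based on the COCO labels that SSD-Mobilenet-v2 normally ships with. Timing the two stages separately should show whether grabbing or detecting dominates the total.

```python
import time
from concurrent.futures import ThreadPoolExecutor

CHANNELS = list(range(1, 9))  # the DVR's 8 channels


def grab_frames(fetch_frame, channels=CHANNELS):
    """Fetch one frame per channel concurrently; returns {channel: frame}."""
    with ThreadPoolExecutor(max_workers=len(channels)) as pool:
        return dict(zip(channels, pool.map(fetch_frame, channels)))


def channels_with_person(frames, detect, person_class_id=1):
    """Return the channels whose detections include the person class.

    `detect(frame)` is assumed to return objects with a `ClassID` attribute.
    """
    hits = []
    for channel, frame in frames.items():
        if any(d.ClassID == person_class_id for d in detect(frame)):
            hits.append(channel)
    return hits


def run_once(fetch_frame, detect):
    """One pass: grab all frames, run detection, and time both stages."""
    t0 = time.perf_counter()
    frames = grab_frames(fetch_frame)
    t1 = time.perf_counter()
    hits = channels_with_person(frames, detect)
    t2 = time.perf_counter()
    return hits, {"grab_s": t1 - t0, "detect_s": t2 - t1}
```

Calling `run_once(fetch_frame, detect)` returns the list of channels with a person plus the time spent in each stage, so the slower stage is easy to spot.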
Please find a link to my GitHub repo: JetsonSecVision
Currently the whole routine (grab frames, then detect) takes about 2-3 seconds per pass. Writing to disk when a person is detected will add a further time penalty.
My question is: is there a quicker way to achieve similar functionality?
- Perhaps I could take every nth RTSP frame from all 8 channels, add it to a queue, and then process the queue?
  - This seems like the most complex option, since my tests have shown that RTSP feeds are not very stable in the long term. I could be wrong.
- Would retraining the model to detect only persons also make the network faster, and thereby increase overall speed?
  - This seems like an option to explore regardless, since the model gives a lot of false positives.
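For the first option (every nth RTSP frame into a queue), this is roughly what I have in mind. It is only a sketch using the standard library: the function names, the bounded queue size, and the choice to drop frames when the queue is full are all my assumptions, and each stream here is just an iterable of frames standing in for a real RTSP reader. It does not deal with reconnecting when a feed drops, which is the part I expect to be hard.

```python
import queue
import threading


def sample_every_nth(frames, channel, n, out_queue):
    """Push every nth frame of one channel's stream onto a shared queue."""
    for index, frame in enumerate(frames):
        if index % n == 0:
            try:
                out_queue.put_nowait((channel, index, frame))
            except queue.Full:
                pass  # detector is behind: drop the frame instead of blocking the reader


def drain(in_queue, handle, stop_event):
    """Consume queued frames until the readers are done and the queue is empty."""
    while not (stop_event.is_set() and in_queue.empty()):
        try:
            channel, index, frame = in_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        handle(channel, index, frame)  # e.g. run detection on this frame


def run(streams, n, handle, maxsize=64):
    """One reader thread per channel, one consumer thread for detection."""
    frame_queue = queue.Queue(maxsize=maxsize)
    stop_event = threading.Event()
    consumer = threading.Thread(target=drain, args=(frame_queue, handle, stop_event))
    consumer.start()
    readers = [
        threading.Thread(target=sample_every_nth, args=(frames, channel, n, frame_queue))
        for channel, frames in streams.items()
    ]
    for reader in readers:
        reader.start()
    for reader in readers:
        reader.join()
    stop_event.set()  # readers finished; let the consumer empty the queue and exit
    consumer.join()
```

A single consumer keeps all detection calls on one thread, and the bounded queue means a slow detector costs dropped frames rather than growing memory.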
Or perhaps someone has a better suggestion I could explore?
Thanks in advance
Ohan Smit