I’ve created a DeepStream app that identifies objects in a video stream (RTSP in my case, so real time). In the demo, the output is a live video with bounding boxes and identifying labels.
What I’d like to do is extract only the X and Y coordinates and the length and width of the bounding box for each detected instance, in CSV or a similar format, so that I can feed them to a remote system over a TCP/IP connection or similar (think of an overhead camera guiding a robot to a goal).
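To make the goal concrete, here is a minimal Python sketch of the output side I have in mind. It assumes the box values have already been pulled out of the pipeline somehow (that is the part I’m asking about). The `Box` class, its field names, and the column order are hypothetical and not a DeepStream type; the sketch only shows formatting the values as CSV and writing them to an already connected TCP socket.

```python
import csv
import io
import socket
from dataclasses import dataclass


@dataclass
class Box:
    """One detected object. Hypothetical container, not a DeepStream type."""
    frame: int     # frame number the detection belongs to
    label: str     # class label, e.g. "person"
    left: float    # X of the top-left corner, in pixels
    top: float     # Y of the top-left corner, in pixels
    width: float   # box width, in pixels
    height: float  # box height, in pixels


def boxes_to_csv(boxes):
    """Return one CSV line per box: frame,label,left,top,width,height."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for b in boxes:
        writer.writerow([b.frame, b.label, b.left, b.top, b.width, b.height])
    return buf.getvalue()


def send_boxes(sock, boxes):
    """Send the CSV lines for one frame over an already connected socket."""
    sock.sendall(boxes_to_csv(boxes).encode("utf-8"))
```

On the sending side this would be used with something like `socket.create_connection((host, port))`, where `host` and `port` are placeholders for the remote system, calling `send_boxes` once per frame. Newline-terminated lines keep the receiver simple, since it can read line by line.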
Looking at the [sink] portion of deepstream_app_config_yoloV3.txt:
[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0
There’s a File type (3), but I’m pretty sure that if I choose it, it will output an MP4 with the bounding boxes and labels drawn in, rather than the coordinates themselves.
If you can help me with the best method of implementing this (the more practical the better), or point me to an example that has implemented it, that would be great.
Thank you,
DougM