Hi!
I’ve enabled saving the bbox coordinates (by adding the KITTI output export folder in the DeepStream config .txt). This outputs a file with the detections and their bounding boxes, like this one:
Here, I understand that the numbers 1564.00 129.00 1702.00 355.00 are the bbox coordinates, which is explained in the write_kitti_output function in deepstream_app.c.
However, these bbox coordinates don’t match the video dimensions. In particular, the video I’m working with is 640x480, and the .mp4 output it saves in sink0 is 1280x720, neither of which is consistent with the left 1564.00 and right 1702.00 coordinates. I’m a bit lost right now; can someone explain what I’m missing?
Would you mind explaining more about your use case?
Please note that different model architectures have different output bounding box formats.
If you are using a YOLO model, the bounding boxes should be parsed with the YOLO parser rather than the detectnet one.
So are you trying to understand the output format of YOLO?
If yes, you can find the parser directly in our DeepStream sample.
I’ll be glad to, @AastaLLL!
I’m running tests on my Jetson Nano to count the people who enter and exit a store.
To accomplish this, I need to:
1. process the videos with Yolov3
2. export the bounding box info
3. post-process it with our own object tracking and counting algorithms (I do this in a separate Python script)
I used the objectDetector_Yolo example, which accomplishes 1 and 2 once the bounding box export is enabled in the DeepStream config file: [application] gie-kitti-output-dir=/home/ubuntu/kitti_data/
However, I’m a bit confused about the bounding boxes exported to the files. For example:
model: Yolov3 (height = width = 416)
input video: .mp4, 640x480
output video: .mp4, 1280x720 (I don’t think this is relevant, but I add it just in case)
This generates a lot of .txt files with the detections. For example, one detection in one of the files is:
, where I understand that 1564.00 129.00 1702.00 355.00 are the bbox coordinates. This is what confuses me: I was expecting the bounding box coordinates to be within the range of the input video, so that I could “draw” or analyze the movement of people relative to the elements of the store.
So my question is: how do I convert the KITTI bbox coordinates to the original input coordinates? I need this because I know where the entrance of the store is in the input image (for example, a vertical line 100 pixels from the left).
I found the problem. It was the width and height parameters in [streammux] in the DeepStream configuration file. I guess they have to do with the output video display or something like that. Setting those parameters to the input video’s width and height achieves what I needed.
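For anyone who cannot change the [streammux] resolution, the exported boxes could probably be rescaled in the post-processing script instead. The following is only a sketch: it assumes the KITTI coordinates are expressed in the [streammux] width/height and that frames are stretched to that size without letterbox padding (for example, enable-padding left at 0). The function name rescale_bbox and the 1920x1080 muxer size are made up for illustration; the thread does not state the original muxer resolution.

```python
def rescale_bbox(bbox, mux_size, src_size):
    """Map (left, top, right, bottom) from muxer resolution to source resolution.

    Assumes a plain stretch from source to muxer size (no padding),
    so each axis is scaled independently.
    """
    left, top, right, bottom = bbox
    mux_w, mux_h = mux_size
    src_w, src_h = src_size
    sx = src_w / mux_w  # horizontal scale factor
    sy = src_h / mux_h  # vertical scale factor
    return (left * sx, top * sy, right * sx, bottom * sy)


# Hypothetical muxer size of 1920x1080; the source video is 640x480 as in the post.
print(rescale_bbox((1564.0, 129.0, 1702.0, 355.0), (1920, 1080), (640, 480)))
```

If the muxer keeps the aspect ratio by padding, a single scale factor plus an offset would be needed instead, so it is worth checking a few boxes against the input frames before relying on this.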
/**
* Function to dump bounding box data in kitti format. For this to work,
* property "gie-kitti-output-dir" must be set in configuration file.
* Data of different sources and frames is dumped in separate file.
*/
static void
write_kitti_output (AppCtx * appCtx, NvDsBatchMeta * batch_meta)
{
...
}
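On the reading side, a line from one of the dumped files could be parsed in the Python post-processing script roughly as below. This is a sketch under assumptions: it presumes the usual KITTI field order (label, truncated, occluded, alpha, then left, top, right, bottom) and a single-word label, which should be checked against the format string in write_kitti_output for the DeepStream version in use. The names Detection and parse_kitti_line and the sample line are hypothetical.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    label: str
    left: float
    top: float
    right: float
    bottom: float


def parse_kitti_line(line: str) -> Detection:
    """Extract the label and 2D box from one whitespace-separated KITTI line."""
    fields = line.split()
    # Assumed layout: fields[0] is the label, fields[4:8] are the box corners.
    # A label containing spaces would shift these indices.
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return Detection(fields[0], left, top, right, bottom)


# Made-up sample line using the coordinates discussed in the thread.
sample = "person 0.0 0 0.0 1564.00 129.00 1702.00 355.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
det = parse_kitti_line(sample)
print(det.label, det.left, det.top, det.right, det.bottom)
```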