Kitti BBox coordinates output

eimarinb.telefonica · April 15, 2020, 8:22pm

Hi!
I’ve enabled saving of the bbox coordinates (by adding the KITTI output export folder to the DeepStream config .txt). This outputs a file with the detections and their bounding boxes, like this one:

person 0.0 0 0.0 1564.00 129.00 1702.00 355.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Here, I understand that the numbers 1564.00 129.00 1702.00 355.00 are the bbox coordinates, which is explained in the write_kitti_output function in deepstream_app.c
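For readers following along, here is a minimal Python sketch of how such a line could be read back in a post-processing script. It assumes the standard KITTI field order (type, truncated, occluded, alpha, then the 2D box as left, top, right, bottom) and that the class name contains no spaces. The names `KittiDetection` and `parse_kitti_line` are made up for this illustration and are not part of DeepStream.

```python
from typing import NamedTuple


class KittiDetection(NamedTuple):
    """Class label plus 2D box, in whatever pixel space the exporter used."""
    label: str
    left: float
    top: float
    right: float
    bottom: float


def parse_kitti_line(line: str) -> KittiDetection:
    """Parse one line of a KITTI label file, keeping only the label and 2D box."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(fields)}")
    # Fields 0-3: type, truncated, occluded, alpha.
    # Fields 4-7: bbox left, top, right, bottom (pixels).
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return KittiDetection(fields[0], left, top, right, bottom)


det = parse_kitti_line(
    "person 0.0 0 0.0 1564.00 129.00 1702.00 355.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
)
print(det.label, det.left, det.top, det.right, det.bottom)
```

The remaining fields (3D dimensions, location, rotation) are all zero in the sample line, so the sketch ignores them.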

github.com

NVIDIA/DIGITS/blob/v4.0.0-rc.3/digits/extensions/data/objectDetection/README.md

# Object Detection Data Extension

This data extension creates DIGITS datasets for object detection networks such as [DetectNet](https://github.com/NVIDIA/caffe/tree/caffe-0.15/examples/kitti).

DIGITS uses the KITTI format for object detection data.
When preparing your own data for ingestion into a dataset, you must follow the same format.

#### Table of contents

* [Folder structure](#folder-structure)
* [Label format](#label-format)
* [Custom class mappings](#custom-class-mappings)

## Folder structure

You should have one folder containing images, and another folder containing labels.

* Image filenames are formatted like `IDENTIFIER.EXTENSION` (e.g. `000001.png` or `foo.jpg`).
* Label filenames are formatted like `IDENTIFIER.txt` (e.g. `000001.txt` or `foo.txt`).


However, these bbox coordinates don’t match the video dimensions. In particular, the video I’m working with is 640x480, and the .mp4 output saved by sink0 is 1280x720, neither of which is consistent with the left 1564.00 and right 1702.00 coordinates. I’m a bit lost right now; can someone explain if I’m missing something?

thanks a lot in advance,
Enrique

AastaLLL · April 16, 2020, 2:53am

Hi,

The output is related to the model input size.
Could you share the dimension of your prototxt data layer with us first?

Ex.

layer {
  name: "deploy_data"
  type: "Input"
  top: "data"
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 384
      dim: 1248
    }
  }
  include: { phase: TEST not_stage: "val" }
}

Thanks.

eimarinb.telefonica · April 16, 2020, 2:38pm

Thanks for the quick reply @AastaLLL
Where can I check this?
I’m using the objectDetector_Yolo example, with YOLOv3 with height = 416 and width = 416.

thanks,

AastaLLL · April 17, 2020, 2:24am

Hi,

Would you mind explaining more about your use case?

Please note that different model architectures have different output bounding box formats.
If you are using the YOLO model, the bounding box should be parsed with the YOLO parser rather than the DetectNet one.

So are you trying to understand the output format of YOLO?
If yes, you can find the parser directly in our DeepStream sample.

 /opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/

Thanks.

eimarinb.telefonica · April 17, 2020, 10:15pm

I’ll be glad to, @AastaLLL!
I’m running tests on my Jetson Nano to count the people who enter and exit a store.

To accomplish this, I need to

1. process the videos with YOLOv3
2. export the bounding box info
3. post-process it with our own object tracking and counting algorithms (I do this in a separate Python script)

I used the objectDetector_Yolo example, which accomplishes 1 and 2 by setting the bounding box output directory in the DeepStream config file:

[application]
gie-kitti-output-dir=/home/ubuntu/kitti_data/

However, I’m a bit confused about the bounding boxes exported to the files. For example, this setup:

model: Yolov3 (height = width = 416)
input video: .mp4, 640x480
output video: .mp4, 1280x720 (I don’t think this is relevant, but I add it just in case)

generates a lot of .txt files with the detections. For example, one detection in one of the files is:

person 0.0 0 0.0 1564.00 129.00 1702.00 355.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0

where I understand that 1564.00 129.00 1702.00 355.00 are the bbox coordinates. This is what confuses me: I was expecting the bounding box coordinates to be within the range of the input video, so that I could “draw” or analyze the movement of the people relative to the elements of the store.
So my question is: how do I map the KITTI bbox coordinates to the original input coordinates? I need this because I know where the entrance of the store is in the input image (for example, a vertical line at 100 pixels from the left).
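To make the goal concrete, here is a rough Python sketch of the kind of check described above. It assumes the boxes are already expressed in input-video pixels and that a separate tracker has already grouped detections into one track per person. `count_crossings` and the sample positions are hypothetical, and which direction counts as "entering" depends on the camera placement.

```python
def box_center_x(left: float, right: float) -> float:
    """Horizontal center of a bounding box."""
    return (left + right) / 2.0


def count_crossings(center_xs, line_x=100.0):
    """Count crossings of the vertical line x = line_x for one track.

    center_xs holds the box center x of the same person in consecutive frames.
    Returns (left_to_right, right_to_left).
    """
    left_to_right = right_to_left = 0
    for prev, curr in zip(center_xs, center_xs[1:]):
        if prev < line_x <= curr:
            left_to_right += 1
        elif curr < line_x <= prev:
            right_to_left += 1
    return left_to_right, right_to_left


# Hypothetical track: the person moves right across the line, then back.
track = [80.0, 95.0, 105.0, 120.0, 98.0]
print(count_crossings(track, line_x=100.0))
```

This only works if the line position and the box coordinates share the same pixel space, which is exactly the mismatch being asked about.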

Thanks in advance,

eimarinb.telefonica · April 21, 2020, 2:51pm

I found the problem. It was the width and height parameters in [streammux] in the DeepStream configuration file. I guess that this has to do with the output video display or something like that. Setting those parameters to the video width and height achieves what I needed.
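For KITTI files that were already exported with a different [streammux] size, an alternative is to rescale the boxes afterwards. The sketch below assumes the frame was stretched to the muxer resolution without padding, so x and y scale independently. The 1920x1080 muxer size is only a guess for illustration, since the thread never states the original values, and `rescale_box` is a hypothetical helper.

```python
def rescale_box(box, mux_size, video_size):
    """Map (left, top, right, bottom) from muxer pixels to source-video pixels.

    Assumes plain stretching with no letterbox padding.
    """
    mux_w, mux_h = mux_size
    vid_w, vid_h = video_size
    sx = vid_w / mux_w  # horizontal scale factor
    sy = vid_h / mux_h  # vertical scale factor
    left, top, right, bottom = box
    return (left * sx, top * sy, right * sx, bottom * sy)


# The muxer size here is an assumption; read the real one from [streammux].
scaled = rescale_box(
    (1564.0, 129.0, 1702.0, 355.0),
    mux_size=(1920, 1080),
    video_size=(640, 480),
)
print(scaled)
```

Setting the [streammux] width and height to the video size, as described above, avoids this extra step entirely.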

thanks for the help!

AastaLLL · April 22, 2020, 6:24am

Hi,

Yes. The bounding box coordinates are based on the display size.

Also, the KITTI output function is open source.
You can find the details in this file:

/opt/nvidia/deepstream/deepstream-4.0/sources/apps/sample_apps/deepstream-app/deepstream_app.c

/**
 * Function to dump bounding box data in kitti format. For this to work,
 * property "gie-kitti-output-dir" must be set in configuration file.
 * Data of different sources and frames is dumped in separate file.
 */
static void
write_kitti_output (AppCtx * appCtx, NvDsBatchMeta * batch_meta)
{
    ...
}

Thanks.
