Object meta dimensions don't match source

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) Jetson Orin
• DeepStream Version 6.2
• JetPack Version (valid for Jetson only) 5.1

Hi

I am a little confused by the numbers that I am seeing for the top, left, width, and height values of the NvOSD_RectParams on the objects in my frame meta.

If I print the CV2 image shape, I get the resolution 960 x 544.
If I print the source_width and source_height of the frame meta (NvDsFrameMeta) I get 704 x 576 (which matches the original camera res)

However, if I print the properties of the object rectangle (NvOSD_RectParams) I get the following values (as an example):
Top: 677
Left: 1015
Width: 6
Height: 22

These dimensions don't fit inside either of the images that I would expect them to fit inside.

Do you know what these dimensions represent? The docs just say "pixels", but I'm struggling to work out which pixels they are relative to. The application I am using is structured similarly to deepstream_imagedata-multistream_redaction.py.

How did you print these values? As you say, the values represent pixels.

Can you share the code and configuration files? Or tell me how to reproduce your problem using sample code.

This is how I am printing them (obj_meta is a NvDsObjectMeta):

    ...
    # obj_meta is a pyds.NvDsObjectMeta pulled from the frame meta's object list
    rect_params = obj_meta.rect_params
    top = int(rect_params.top)
    left = int(rect_params.left)
    width = int(rect_params.width)
    height = int(rect_params.height)
    # classification and confidence are set elsewhere in my probe
    print(top, left, width, height, classification, confidence)

My pipeline has:

  • an nvstreammux with dimensions of 960 x 544
  • an nvinfer PGIE with the default config_infer_primary_peoplenet.yml config from the DS TAO apps (infer-dims: 3;544;960)
  • an nvtracker with dimensions of 640 x 384
  • an nvdsanalytics with the same dimensions as the tracker
  • an nvvideoconvert
  • a capsfilter
  • a fakesink, to which my probe is attached (same setup as deepstream_imagedata-multistream_redaction.py)

The pipeline elements are linked in the order described above; a rough sketch of the wiring follows.
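A minimal sketch of that wiring in Python (simplified: property values follow the list above, but source bins, the probe attachment, and error handling are omitted, and instance names are placeholders rather than my exact code):

    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    pipeline = Gst.Pipeline.new("pipeline")

    # Batcher scales every source to 960 x 544
    streammux = Gst.ElementFactory.make("nvstreammux", "muxer")
    streammux.set_property("width", 960)
    streammux.set_property("height", 544)
    streammux.set_property("batch-size", 1)

    # PeopleNet primary detector (infer-dims: 3;544;960)
    pgie = Gst.ElementFactory.make("nvinfer", "pgie")
    pgie.set_property("config-file-path", "config_infer_primary_peoplenet.yml")

    # Tracker runs at 640 x 384
    tracker = Gst.ElementFactory.make("nvtracker", "tracker")
    tracker.set_property("tracker-width", 640)
    tracker.set_property("tracker-height", 384)

    analytics = Gst.ElementFactory.make("nvdsanalytics", "analytics")
    conv = Gst.ElementFactory.make("nvvideoconvert", "conv")
    caps = Gst.ElementFactory.make("capsfilter", "caps")
    caps.set_property("caps", Gst.Caps.from_string("video/x-raw(memory:NVMM), format=RGBA"))
    sink = Gst.ElementFactory.make("fakesink", "sink")  # buffer probe goes on its sink pad

    for element in (streammux, pgie, tracker, analytics, conv, caps, sink):
        pipeline.add(element)
    streammux.link(pgie)
    pgie.link(tracker)
    tracker.link(analytics)
    analytics.link(conv)
    conv.link(caps)
    caps.link(sink)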

As stated in the original question, if I do:

    n_frame = pyds.get_nvds_buf_surface(hash(gst_buffer), frame_meta.batch_id)
    print(n_frame.shape)  # numpy shape order is (height, width, channels)

I get 960 x 544 as the resolution.

If I do:

    frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
    print(frame_meta.source_width, frame_meta.source_height)

I get 704 x 576, which is the resolution that the camera is set to.

Do you mean frame_meta.source_frame_width and frame_meta.source_frame_height?

As described in the documentation:

  • source_frame_width (int): Holds the width of the frame at input to Gst-streammux.
  • source_frame_height (int): Holds the height of the frame at input to Gst-streammux.

So they refer to the unscaled width and height of the video.

https://docs.nvidia.com/metropolis/deepstream/6.2/dev-guide/python-api/PYTHON_API/NvDsMeta/NvDsFrameMeta.html

This one refers to the width and height of the frame in the batched buffer, i.e. after scaling by nvstreammux.

https://docs.nvidia.com/metropolis/deepstream/6.2/dev-guide/python-api/PYTHON_API/Methods/methodsdoc.html#get-nvds-buf-surface
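To make the distinction concrete, a probe can print both; this sketch assumes the usual batch-meta iteration from the sample apps:

    # l_frame comes from iterating batch_meta.frame_meta_list, as in the sample apps
    frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
    # Frame size *before* nvstreammux scaling (704 x 576 in your case):
    print(frame_meta.source_frame_width, frame_meta.source_frame_height)
    # Scaled, batched surface (960 x 544 in your case); numpy order is (height, width, channels):
    n_frame = pyds.get_nvds_buf_surface(hash(gst_buffer), frame_meta.batch_id)
    print(n_frame.shape)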

Sorry, yes, I do mean source_frame_width / source_frame_height.

So if the unscaled dimensions of the frame are 704 x 576, and the scaled dimensions are 960 x 544, how is it possible that the bounding box in the NvOSD_RectParams can start at left: 1015, top: 677 (see original question)?

Those coordinates are outside both the unscaled and the scaled dimensions of the frame.

For more context: I am using the function draw_bounding_boxes from the examples to try to draw the bounding boxes on the images, but the boxes are outside of the frame.

Can you provide a test stream to reproduce the problem?

I tested deepstream-imagedata-multistream-redaction and did not find this problem.

I use /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264 to test.

I just tried my code with this stock video and the bounding boxes were drawn correctly, so I assume the issue is related to the stream that I am using.

The issue also seems to be intermittent with our own streams. I have tried ~200 streams and the problem exists for some but not others.

It would help me to know how the NvOSD_RectParams values are calculated. The documentation says that they are just "pixels", but what are the pixels based on? The scaled image or the unscaled image? Or something else?

The scaled image. The coordinates will be drawn by nvdsosd.

So how is it possible for the “left” coordinate to be >1000 if the scaled image is only 960px wide?

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

I think this is normal; it depends on the output of the model, whose raw detections can extend past the frame boundary.

If you can upgrade to DS 6.4, setting the nvinfer crop-objects-to-roi-boundary property to true can avoid this problem.

https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvinfer.html#id2
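If you cannot upgrade, one workaround is to clamp each box to the frame bounds in your probe before using it. A minimal sketch, assuming the 960 x 544 nvstreammux output resolution (clamp_rect is a hypothetical helper, not a DeepStream API):

    def clamp_rect(rect_params, frame_w=960, frame_h=544):
        # Clip an NvOSD_RectParams box to the frame; returns (left, top, width, height).
        left = max(0, min(int(rect_params.left), frame_w - 1))
        top = max(0, min(int(rect_params.top), frame_h - 1))
        right = min(int(rect_params.left + rect_params.width), frame_w)
        bottom = min(int(rect_params.top + rect_params.height), frame_h)
        return left, top, max(0, right - left), max(0, bottom - top)

A detection that lies entirely outside the frame then collapses to zero width or height and can simply be skipped.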
