No sgie metadata for some pgie detections using pyds

Hi. I’m running a DeepStream pipeline containing a pgie detector + an sgie classifier. For some of the detections extracted from a frame, I’m not able to extract the corresponding classification metadata.

  • PIPELINE: The pipeline I’m using is the following (parsed with Gst.parse_launch):

    multifilesrc
      name=src
      location=cars/%012d.jpg
      caps="image/jpeg"
    ! jpegdec
      name=decoder
    ! nvvideoconvert
    ! video/x-raw(memory:NVMM), format=NV12, width=1280, height=720
    ! m.sink_0
      nvstreammux
      name=m
      batch-size=1
      width=1280
      height=720
    ! nvinfer
      config-file-path=models/Primary_Detector/config_infer_primary.txt
    ! nvinfer
      config-file-path=models/Secondary_CarColor/config_infer_secondary_carcolor.txt
    ! fakesink name=monitor
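
For reference, here is a minimal sketch of how a launch string like this can be parsed and the buffer probe attached to the monitor element. PIPELINE_STR and buffer_probe_callback are placeholder names (the callback is the one described under BUFFER_PROBE_CALLBACK below), not my exact code:

    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst, GLib

    Gst.init(None)

    # PIPELINE_STR holds the launch string shown above
    pipeline = Gst.parse_launch(PIPELINE_STR)

    # attach the buffer probe to the sink pad of the fakesink named "monitor"
    monitor = pipeline.get_by_name("monitor")
    monitor.get_static_pad("sink").add_probe(
        Gst.PadProbeType.BUFFER, buffer_probe_callback, 0)

    loop = GLib.MainLoop()
    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect("message::eos", lambda bus, msg: loop.quit())

    pipeline.set_state(Gst.State.PLAYING)
    try:
        loop.run()
    finally:
        pipeline.set_state(Gst.State.NULL)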
    
  • SGIE CONFIGURATION FILE:

    [property]
    gie-unique-id=2
    operate-on-gie-id=1
    network-type=1
    process-mode=2
    classifier-async-mode=0
    
    net-scale-factor=1
    model-file=../../models/Secondary_CarColor/resnet18.caffemodel
    proto-file=../../models/Secondary_CarColor/resnet18.prototxt
    model-engine-file=../../models/Secondary_CarColor/resnet18.caffemodel_b32_gpu0_int8.engine
    int8-calib-file=../../models/Secondary_CarColor/cal_trt.bin
    mean-file=../../models/Secondary_CarColor/mean.ppm
    labelfile-path=../../models/Secondary_CarColor/labels.txt
    model-color-format=1
    network-mode=0
    output-blob-names=predictions/Softmax
    
    force-implicit-batch-dim=1
    batch-size=32
    
    classifier-threshold=0
    input-object-min-width=0
    input-object-min-height=0
    input-object-max-width=0
    input-object-max-height=0
    
  • DATA: data generated from the sample cars video with this command:

    ffmpeg -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264 -start_number 0 -vframes 1088 cars/%012d.jpg
    
  • BUFFER_PROBE_CALLBACK: metadata is extracted just like in the sample callback up to this point; the probe is attached to the monitor element (a sketch of the surrounding loop follows after the snippet). Then:

    ...
    obj_meta=pyds.NvDsObjectMeta.cast(l_obj.data)
    
    report_obj_meta_counter(obj_meta)
    classifier_meta_objects = obj_meta.classifier_meta_list  # THIS IS NONE SOMETIMES, but generally a `pyds.GList`
    while classifier_meta_objects is not None:
        try:
            classifier_metadata = pyds.NvDsClassifierMeta.cast(classifier_meta_objects.data)
        except StopIteration:
            break
    
        label_info_list = classifier_metadata.label_info_list  # iterate this classifier's labels (multilabel case); label_info_list lives on the classifier meta, not on obj_meta
        while label_info_list is not None:
            try:
                label_info = pyds.NvDsLabelInfo.cast(label_info_list.data)
            except StopIteration:
                break
            report_label_info_counter(label_info)
            try:
                label_info_list = label_info_list.next
            except StopIteration:
                break
        try:
            classifier_meta_objects = classifier_meta_objects.next
        except StopIteration:
            break
    ...
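
For context, the elided part of the callback (the ... above and below the snippet) follows the standard sample pattern of walking batch → frame → object metadata. A sketch, with buffer_probe_callback as an illustrative name:

    import pyds
    from gi.repository import Gst

    def buffer_probe_callback(pad, info, u_data):
        gst_buffer = info.get_buffer()
        if not gst_buffer:
            return Gst.PadProbeReturn.OK

        # walk batch -> frame -> object metadata, as in the DeepStream Python samples
        batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
        l_frame = batch_meta.frame_meta_list
        while l_frame is not None:
            try:
                frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
            except StopIteration:
                break
            l_obj = frame_meta.obj_meta_list
            while l_obj is not None:
                # ... the obj_meta / classifier_meta iteration shown above goes here ...
                try:
                    l_obj = l_obj.next
                except StopIteration:
                    break
            try:
                l_frame = l_frame.next
            except StopIteration:
                break
        return Gst.PadProbeReturn.OK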
    
    

• NOTES

  1. Using the snippet above, the number of classifications (15249 as reported by report_label_info_counter) is slightly lower than the number of detections (16496 as reported by report_obj_meta_counter). There are cases where obj_meta.classifier_meta_list is None, instead of a pyds.GList.

  2. The same number of frames (1088) is processed if I change the decoder from jpegdec to nvjpegdec. If I turn on raw-output-tensors, both the pgie and the sgie output 1088, consistent with the number of frames. However, I lose 11 detections (from 16496 to 16485); on the other hand, the number of classifications goes up from 15249 to 15252. These numbers are consistent across >10 runs each.

  3. I can manually force the SGIE to “skip” detections, e.g. by increasing input-object-min-width to 200. In that case, when also enabling raw-output-tensors, the number of sgie output tensors decreases (to 365), which makes me doubt it’s an sgie config file issue…

  4. Other

    • Using DLA does not affect the numbers.
    • network-mode does not affect the numbers.
    • Using jpegdec vs nvjpegdec does, but only slightly (maybe a sync or flush thing?), and does not solve the problem.

• QUESTIONS

  1. What happened to those obj_meta which do not have corresponding classifier_meta_list?
  2. Is this a pyds or nvinfer issue?

• Hardware Platform (Jetson / GPU)

JETSON_TYPE=AGX Xavier [16GB]
JETSON_CHIP_ID=25
JETSON_SOC=tegra194
JETSON_MACHINE=NVIDIA Jetson AGX Xavier [16GB]
JETSON_CODENAME=galen
JETSON_BOARD=P2822-0000
JETSON_MODULE=P2888-0001

• DeepStream Version: 5.0 (GCID: 23607587)
• JetPack Version (valid for Jetson only): 4.4
• TensorRT Version: 7.1.3.0


Hey, could you share a repro with us?
Also, have you tried using deepstream-app to run your pipeline?

Hi, thanks for answering…

Here’s an MWE for Jetson Xavier.

I have not. Can I use it from Python? I’m not comfortable with C++, but if it ends up being a pyds issue, I’m open to considering other alternatives. I’ll take a look in the meantime.

Any updates on this? I’m experiencing a similar issue.

Nothing so far. I’ve also tried extracting label metadata directly from the batch. Same number of “lost” detections:

...
batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
...
plist = batch_meta.label_info_meta_pool.full_list
while plist is not None:
  try:
    label_info = pyds.NvDsLabelInfo.cast(plist.data)
  except StopIteration:
    break
  report_label2_meta_counter(label_info)
  try:
    plist=plist.next
  except StopIteration:
    break
...

Hello there, any updates regarding this issue? I am facing a similar problem using C++ DeepStream.

I am using a primary model to detect people and a custom secondary model (classifier). However, not all detected objects have metadata attached from the secondary, even after setting classifier-threshold to 0.1 and even 0.

Hey @mohammad1, so it’s not related to Python; it seems to be caused by metadata management. Would you mind sharing your repro with us? I would like to use the native C/C++ sample to debug.

Thanks @bcao for following up on this issue. The code is available at GitHub - zaka-ai/ds_example2_classification_problem

It is the same deepstream-test2 example (vehicle + color + type + make) with small modifications. I am printing the classification output of each object. As shown in the attached image, classification output exists for some objects but not all, and for some classification models but not all.

[image attachment: classification output per object]

These are some relevant issues, but the solutions are either unsatisfactory or didn’t work.

Hello @bcao , any updates regarding this issue?

Hey, we are looking into it, will reply to you ASAP.

I checked the issue and the reason should be that the object width or height is < 16.
You can check should_infer_object() in gstnvinfer.cpp for more details.
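
In other words, gst-nvinfer skips secondary inference on objects below a fixed minimum size. A simplified Python sketch of just that size check, for illustration only (the real C++ function also applies the input-object-min/max-* values from the config file, among other checks):

    # Macros defined in gstnvinfer.cpp (16x16 by default); objects smaller than
    # this in either dimension are skipped and therefore get no classifier metadata.
    MIN_INPUT_OBJECT_WIDTH = 16
    MIN_INPUT_OBJECT_HEIGHT = 16

    def should_infer_object(obj_width, obj_height):
        # simplified stand-in for the size check in gstnvinfer.cpp
        return (obj_width >= MIN_INPUT_OBJECT_WIDTH
                and obj_height >= MIN_INPUT_OBJECT_HEIGHT)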

Thanks @bcao !

  • Every single detection without a corresponding classification has at least one dimension lower than 16.

  • Increasing MIN_INPUT_OBJECT_WIDTH and MIN_INPUT_OBJECT_HEIGHT (around line 60) increases the difference between detections and classifications.

I cannot lower the values (segfault). A comment in the code even says:

/* Should not infer on objects smaller than MIN_INPUT_OBJECT_WIDTH x MIN_INPUT_OBJECT_HEIGHT
   * since it will cause hardware scaling issues. */

Is there a way around this?

It would be good to have in the docs somewhere that a value lower than 16 in the pgie-config-file is ignored :)
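
For reference, a sketch of how the first point above can be checked inside the per-object loop of the probe (counter names are illustrative):

    from collections import Counter
    counters = Counter()  # defined once, outside the probe

    # inside the per-object loop, after casting obj_meta:
    rect = obj_meta.rect_params
    if obj_meta.classifier_meta_list is None:
        counters["unclassified"] += 1
        if rect.width < 16 or rect.height < 16:
            counters["unclassified_small"] += 1
    # in my runs, counters["unclassified_small"] always equalled counters["unclassified"]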

Could you set scaling-compute-hw=1 and then lower those values and try again?

SOLUTION:

  1. lower the MIN_INPUT_OBJECT_WIDTH and MIN_INPUT_OBJECT_HEIGHT macro values in gstnvinfer.cpp
  2. set scaling-compute-hw=1 in the sgie config

The experiments behind it:

Baseline, with the original macro values:

  1. revert to the original values
  2. make clean && make install
  3. sgie → scaling-compute-hw={0,1,2}

Result: same as the original run, no changes (16496 vs 15249).

With lowered macros:

  1. change the macros to (8,8)
  2. cd /opt/nvidia/deepstream/deepstream-5.0/sources/gst-plugins/gst-nvinfer && make clean && make install
  3. sgie → scaling-compute-hw=2 → segfault most of the time:
    /dvs/git/dirty/git-master_linux/nvutils/nvbufsurftransform/nvbufsurftransform.cpp:3115: => VIC Configuration failed image scale factor exceeds 16, use GPU for Transformation
    0:00:03.629223122  5083      0xfa42e80 WARN                 nvinfer gstnvinfer.cpp:1268:convert_batch_and_push_to_input_thread:<nvinfer1> error: NvBufSurfTransform failed with error -2 while converting buffer
    0:00:03.629371225  5083      0xfa42e80 WARN                 nvinfer gstnvinfer.cpp:1975:gst_nvinfer_output_loop:<nvinfer0> error: Internal data stream error.
    0:00:03.629393434  5083      0xfa42e80 WARN                 nvinfer gstnvinfer.cpp:1975:gst_nvinfer_output_loop:<nvinfer0> error: streaming stopped, reason error (-5)
    0:00:03.630007863  5083      0xfa42e80 WARN                 nvinfer gstnvinfer.cpp:1268:convert_batch_and_push_to_input_thread:<nvinfer1> error: NvBufSurfTransform failed with error -3 while converting buffer
    Segmentation fault (core dumped)
    

which, according to nvbufsurftransform.h, mean “NvBufSurfTransformError_Invalid_Params” and “NvBufSurfTransformError_Execution_Error”

  • When it did not crash, I got 795 (the same value for both detector and classifier!). I think this might have happened regardless of the macro change, if I had tried enough times…

This got me thinking:

  • The docs say: Integer; 0: Platform default (GPU on dGPU, VIC on Jetson); 1: GPU; 2: VIC (Jetson only)
  • The log asked to use GPU
  • sgie → scaling-compute-hw=1 => 16496 detections and 16496 classifications!!!

Thanks again for the insights @bcao. I’ll run more tests to ensure this is reproducible with different values, but this workaround completely fulfills my needs. If I am able to reproduce it after more experimentation, I’ll come back and mark this as solved :)

I’ve tried changing the macros all the way down to 0, and scaling-compute-hw=1 allows correct execution.

  • NOTE: the smallest detections I got from the sample data had bboxes in the 8-16 range, so while I can confirm this works for bboxes smaller than the original setting (lower than 16), I have not tested that everything works / won’t crash if the detector produces bboxes with dimensions lower than 8.

Great work. I have corrected my comments. Yes, scaling-compute-hw should be 1, since the known hardware scaling issue is from VIC.

But keep in mind that we also didn’t test such a case.