No sgie metadata for some pgie detections using pyds

Hi. I’m running a DeepStream pipeline containing a pgie detector + an sgie classifier. For some of the detections extracted from a frame, I’m not able to extract the corresponding classification metadata.

  • PIPELINE: The pipeline I’m using is the following (parsed with Gst.parse_launch):

    multifilesrc
      name=src
      location=cars/%012d.jpg
      caps="image/jpeg"
    ! jpegdec
      name=decoder
    ! nvvideoconvert
    ! video/x-raw(memory:NVMM), format=NV12, width=1280, height=720
    ! m.sink_0
      nvstreammux
      name=m
      batch-size=1
      width=1280
      height=720
    ! nvinfer
      config-file-path=models/Primary_Detector/config_infer_primary.txt
    ! nvinfer
      config-file-path=models/Secondary_CarColor/config_infer_secondary_carcolor.txt
    ! fakesink name=monitor
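
For reference, here is a minimal sketch of how a launch string like this can be parsed and the buffer probe attached to the monitor element. PIPELINE_STR and buffer_probe_callback are placeholder names (the callback is the one described under BUFFER_PROBE_CALLBACK below), not my exact code:

    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst, GLib

    Gst.init(None)

    # PIPELINE_STR holds the launch string shown above
    pipeline = Gst.parse_launch(PIPELINE_STR)

    # attach the buffer probe to the sink pad of the fakesink named "monitor"
    monitor = pipeline.get_by_name("monitor")
    monitor.get_static_pad("sink").add_probe(
        Gst.PadProbeType.BUFFER, buffer_probe_callback, 0)

    loop = GLib.MainLoop()
    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect("message::eos", lambda bus, msg: loop.quit())

    pipeline.set_state(Gst.State.PLAYING)
    try:
        loop.run()
    finally:
        pipeline.set_state(Gst.State.NULL)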
    
  • SGIE CONFIGURATION FILE:

    [property]
    gie-unique-id=2
    operate-on-gie-id=1
    network-type=1
    process-mode=2
    classifier-async-mode=0
    
    net-scale-factor=1
    model-file=../../models/Secondary_CarColor/resnet18.caffemodel
    proto-file=../../models/Secondary_CarColor/resnet18.prototxt
    model-engine-file=../../models/Secondary_CarColor/resnet18.caffemodel_b32_gpu0_int8.engine
    int8-calib-file=../../models/Secondary_CarColor/cal_trt.bin
    mean-file=../../models/Secondary_CarColor/mean.ppm
    labelfile-path=../../models/Secondary_CarColor/labels.txt
    model-color-format=1
    network-mode=0
    output-blob-names=predictions/Softmax
    
    force-implicit-batch-dim=1
    batch-size=32
    
    classifier-threshold=0
    input-object-min-width=0
    input-object-min-height=0
    input-object-max-width=0
    input-object-max-height=0
    
  • DATA: data generated from the sample cars video with this command:

    ffmpeg -i /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264 -start_number 0 -vframes 1088 cars/%012d.jpg
    
  • BUFFER_PROBE_CALLBACK: metadata is extracted just like in the sample callback up to this point; the probe is attached to the monitor element (a sketch of the surrounding loop follows after the snippet). Then:

    ...
    obj_meta=pyds.NvDsObjectMeta.cast(l_obj.data)
    
    report_obj_meta_counter(obj_meta)
    classifier_meta_objects = obj_meta.classifier_meta_list  # THIS IS NONE SOMETIMES, but generally a `pyds.GList`
    while classifier_meta_objects is not None:
        try:
            classifier_metadata = pyds.NvDsClassifierMeta.cast(classifier_meta_objects.data)
        except StopIteration:
            break
    
        label_info_list = classifier_metadata.label_info_list  # iterate this classifier's labels (multilabel case); label_info_list lives on the classifier meta, not on obj_meta
        while label_info_list is not None:
            try:
                label_info = pyds.NvDsLabelInfo.cast(label_info_list.data)
            except StopIteration:
                break
            report_label_info_counter(label_info)
            try:
                label_info_list = label_info_list.next
            except StopIteration:
                break
        try:
            classifier_meta_objects = classifier_meta_objects.next
        except StopIteration:
            break
    ...
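
For context, the elided part of the callback (the ... above and below the snippet) follows the standard sample pattern of walking batch → frame → object metadata. A sketch, with buffer_probe_callback as an illustrative name:

    import pyds
    from gi.repository import Gst

    def buffer_probe_callback(pad, info, u_data):
        gst_buffer = info.get_buffer()
        if not gst_buffer:
            return Gst.PadProbeReturn.OK

        # walk batch -> frame -> object metadata, as in the DeepStream Python samples
        batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
        l_frame = batch_meta.frame_meta_list
        while l_frame is not None:
            try:
                frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
            except StopIteration:
                break
            l_obj = frame_meta.obj_meta_list
            while l_obj is not None:
                # ... the obj_meta / classifier_meta iteration shown above goes here ...
                try:
                    l_obj = l_obj.next
                except StopIteration:
                    break
            try:
                l_frame = l_frame.next
            except StopIteration:
                break
        return Gst.PadProbeReturn.OK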
    
    

• NOTES

  1. Using the snippet above, the number of classifications (15249 as reported by report_label_info_counter) is slightly lower than the number of detections (16496 as reported by report_obj_meta_counter). There are cases where obj_meta.classifier_meta_list is None, instead of a pyds.GList.

  2. The same number of frames (1088) is processed if I change the decoder from jpegdec to nvjpegdec. If I turn on raw-output-tensors, both the pgie and the sgie output 1088, consistent with the number of frames. However, I lose 11 detections (from 16496 to 16485); on the other hand, the number of classifications goes up from 15249 to 15252. These numbers are consistent across >10 runs each.

  3. I can manually force the SGIE to “skip” detections, e.g. by increasing input-object-min-width to 200. In that case, when also enabling raw-output-tensors, the number of sgie output tensors decreases (to 365), which makes me doubt it’s an sgie config file issue…

  4. Other

    • Using DLA does not affect the numbers.
    • network-mode does not affect the numbers.
    • Using jpegdec vs nvjpegdec does, but only slightly (maybe a sync or flush thing?), and does not solve the problem.

• QUESTIONS

  1. What happened to those obj_meta which do not have corresponding classifier_meta_list?
  2. Is this a pyds or nvinfer issue?

• Hardware Platform (Jetson / GPU)

JETSON_TYPE=AGX Xavier [16GB]
JETSON_CHIP_ID=25
JETSON_SOC=tegra194
JETSON_MACHINE=NVIDIA Jetson AGX Xavier [16GB]
JETSON_CODENAME=galen
JETSON_BOARD=P2822-0000
JETSON_MODULE=P2888-0001

• DeepStream Version: 5.0 (GCID: 23607587)
• JetPack Version (valid for Jetson only): 4.4
• TensorRT Version: 7.1.3.0


Hey, could you share a repro with us?
Also, have you tried using deepstream-app to run your pipeline?

Hi, thanks for answering…

Here’s an MWE for Jetson Xavier.

I have not. Can I use it from Python? I’m not comfortable with C++, but if it ends up being a pyds issue, I’m open to considering other alternatives. I’ll take a look in the meantime.

Any updates on this? I’m experiencing a similar issue.

Nothing so far. I’ve also tried extracting label metadata directly from the batch. Same number of “lost” detections:

...
batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
...
plist = batch_meta.label_info_meta_pool.full_list
while plist is not None:
  try:
    label_info = pyds.NvDsLabelInfo.cast(plist.data)
  except StopIteration:
    break
  report_label2_meta_counter(label_info)
  try:
    plist=plist.next
  except StopIteration:
    break
...

Hello there, any updates regarding this issue? I am facing a similar problem using C++ DeepStream.

I am using a primary model to detect people and a custom secondary model (classifier). However, not all detected objects have metadata attached from the secondary, even after setting classifier-threshold to 0.1 and even 0.

Hey @mohammad1, so it’s not related to Python; it seems to be caused by metadata management. Would you mind sharing your repro with us? I would like to use the native C/C++ sample to debug.

Thanks @bcao for following up on this issue. The code is available at GitHub - zaka-ai/ds_example2_classification_problem

It is the same deepstream-test2 example (vehicle + color + type + make) with small modifications. I am printing the classification output of each object. As shown in the attached image, classification output exists for some objects but not all, and for some classification models but not all.

[image attachment: classification output per object]

These are some relevant issues, but the solutions are either unsatisfactory or didn’t work.

Hello @bcao , any updates regarding this issue?

Hey, we are looking into it, will reply to you ASAP.

I checked the issue and the reason should be that the object width or height is < 16.
You can check should_infer_object() in gstnvinfer.cpp for more details.
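
In other words, gst-nvinfer skips secondary inference on objects below a fixed minimum size. A simplified Python sketch of just that size check, for illustration only (the real C++ function also applies the input-object-min/max-* values from the config file, among other checks):

    # Macros defined in gstnvinfer.cpp (16x16 by default); objects smaller than
    # this in either dimension are skipped and therefore get no classifier metadata.
    MIN_INPUT_OBJECT_WIDTH = 16
    MIN_INPUT_OBJECT_HEIGHT = 16

    def should_infer_object(obj_width, obj_height):
        # simplified stand-in for the size check in gstnvinfer.cpp
        return (obj_width >= MIN_INPUT_OBJECT_WIDTH
                and obj_height >= MIN_INPUT_OBJECT_HEIGHT)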

Thanks @bcao !

  • Every single detection without a corresponding classification has at least one dimension lower than 16.

  • Increasing MIN_INPUT_OBJECT_WIDTH and MIN_INPUT_OBJECT_HEIGHT (around line 60) increases the difference between detections and classifications.

I cannot lower the values (segfault). A comment in the code even says:

/* Should not infer on objects smaller than MIN_INPUT_OBJECT_WIDTH x MIN_INPUT_OBJECT_HEIGHT
   * since it will cause hardware scaling issues. */

Is there a way around this?

It would be good to have in the docs somewhere that a value lower than 16 in the pgie-config-file is ignored :)
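
For reference, a sketch of how the first point above can be checked inside the per-object loop of the probe (counter names are illustrative):

    from collections import Counter
    counters = Counter()  # defined once, outside the probe

    # inside the per-object loop, after casting obj_meta:
    rect = obj_meta.rect_params
    if obj_meta.classifier_meta_list is None:
        counters["unclassified"] += 1
        if rect.width < 16 or rect.height < 16:
            counters["unclassified_small"] += 1
    # in my runs, counters["unclassified_small"] always equalled counters["unclassified"]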

Could you set scaling-compute-hw=1 and then lower those values and try again?

SOLUTION:

  1. lower the MIN_INPUT_OBJECT_WIDTH and MIN_INPUT_OBJECT_HEIGHT macro values in gstnvinfer.cpp
  2. set scaling-compute-hw=1 in the sgie config

The experiments behind it:

Baseline, with the original macro values:

  1. revert to the original values
  2. make clean && make install
  3. sgie → scaling-compute-hw={0,1,2}

Result: same as the original run, no changes (16496 vs 15249).

With lowered macros:

  1. change the macros to (8,8)
  2. cd /opt/nvidia/deepstream/deepstream-5.0/sources/gst-plugins/gst-nvinfer && make clean && make install
  3. sgie → scaling-compute-hw=2 → segfault most of the time:
    /dvs/git/dirty/git-master_linux/nvutils/nvbufsurftransform/nvbufsurftransform.cpp:3115: => VIC Configuration failed image scale factor exceeds 16, use GPU for Transformation
    0:00:03.629223122  5083      0xfa42e80 WARN                 nvinfer gstnvinfer.cpp:1268:convert_batch_and_push_to_input_thread:<nvinfer1> error: NvBufSurfTransform failed with error -2 while converting buffer
    0:00:03.629371225  5083      0xfa42e80 WARN                 nvinfer gstnvinfer.cpp:1975:gst_nvinfer_output_loop:<nvinfer0> error: Internal data stream error.
    0:00:03.629393434  5083      0xfa42e80 WARN                 nvinfer gstnvinfer.cpp:1975:gst_nvinfer_output_loop:<nvinfer0> error: streaming stopped, reason error (-5)
    0:00:03.630007863  5083      0xfa42e80 WARN                 nvinfer gstnvinfer.cpp:1268:convert_batch_and_push_to_input_thread:<nvinfer1> error: NvBufSurfTransform failed with error -3 while converting buffer
    Segmentation fault (core dumped)
    

which, according to nvbufsurftransform.h, mean “NvBufSurfTransformError_Invalid_Params” and “NvBufSurfTransformError_Execution_Error”

  • When it did not crash, I got 795 (the same value for both detector and classifier!). I think this might have happened regardless of the macro change, if I had tried enough times…

This got me thinking:

  • The docs say: Integer; 0: Platform default (GPU on dGPU, VIC on Jetson); 1: GPU; 2: VIC (Jetson only)
  • The log asked to use GPU
  • sgie → scaling-compute-hw=1 => 16496 detections and 16496 classifications!!!

Thanks again for the insights @bcao. I’ll run more tests to ensure this is reproducible with different values, but this workaround completely fulfills my needs. If I am able to reproduce it after more experimentation, I’ll come back and mark this as solved :)

I’ve tried changing the macros all the way down to 0, and scaling-compute-hw=1 allows correct execution.

  • NOTE: the smallest detections I got from the sample data had bboxes in the 8-16 range, so while I can confirm this works for bboxes smaller than the original setting (lower than 16), I have not tested that everything works / won’t crash if the detector produces bboxes with dimensions lower than 8.

Great work. I have corrected my comments. Yes, scaling-compute-hw should be 1, since the known hardware scaling issue is from VIC.

But keep in mind that we also didn’t test such a case.