Input tensor is unexpectedly modified before being fed to the primary detector

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 6.4
• JetPack Version (valid for Jetson only)
• TensorRT Version: 8.6.1.6-1+cuda12.0
• NVIDIA GPU Driver Version (valid for GPU only): Driver Version: 525.147.05
• Issue Type( questions, new requirements, bugs): bugs
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

  • Why are we raising this issue?
    The primary detector gives different bounding boxes compared to the results obtained by directly calling a Triton Inference Server with the same model.engine file.

The bounding boxes are not entirely incorrect, but rather slightly altered (and slightly worse), resembling floating-point processing errors.

  • Pipeline setup:

My pipeline:
uridecodebin -> nvstreammux -> queue -> nvinfer (primary detector)
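
For context, a minimal sketch of how such a pipeline can be assembled with the GStreamer Python bindings (element names, the file URI, and the config file name pgie_config.yml are illustrative, not taken from the actual application):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)
pipeline = Gst.Pipeline.new("pipeline")

source = Gst.ElementFactory.make("uridecodebin", "source")
streammux = Gst.ElementFactory.make("nvstreammux", "streammux")
queue = Gst.ElementFactory.make("queue", "queue")
pgie = Gst.ElementFactory.make("nvinfer", "primary-detector")

source.set_property("uri", "file:///path/to/video.mp4")    # illustrative path
pgie.set_property("config-file-path", "pgie_config.yml")   # the nvinfer config below
# streammux width/height/batch-size etc. are set as listed in the element config below

for elem in (source, streammux, queue, pgie):
    pipeline.add(elem)

# uridecodebin exposes its decoded pad dynamically; link it to a streammux request pad
def on_pad_added(decodebin, pad):
    caps = pad.get_current_caps()
    if caps and caps.get_structure(0).get_name().startswith("video"):
        pad.link(streammux.get_request_pad("sink_0"))

source.connect("pad-added", on_pad_added)

streammux.link(queue)
queue.link(pgie)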

  • Element config:
  1. Input source is a video of size H x W = 640 x 640.
  2. Streammux
      streammux.set_property("width", 640)
      streammux.set_property("height", 640)
      streammux.set_property("batch-size", num_sources)
      streammux.set_property("batched-push-timeout", self.config["batched-push-timeout"])
      streammux.set_property("enable-padding", 0)
      streammux.set_property("interpolation-method", 4)
  3. nvinfer (primary detector)
property:
  gpu-id: 0
  net-scale-factor: 0.0039215697906911373
  offsets: 0;0;0
  model-color-format: 0
  onnx-file: ../models/peoplenet_yolov8x/yolov8x.onnx
  model-engine-file: ../models/peoplenet_yolov8x/yolov8x.onnx_b1_gpu0_fp32.engine
  #int8-calib-file=calib.table
  labelfile-path: ../models/peoplenet_yolov8x/labels.txt
  batch-size: 1
  network-mode: 0
  num-detected-classes: 80
  interval: 0
  gie-unique-id: 1
  #operate-on-class-ids=0
  filter-out-class-ids: 1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;44;45;46;47;48;49;50;51;52;53;54;55;56;57;58;59;60;61;62;63;64;65;66;67;68;69;70;71;72;73;74;75;76;77;78;79
  process-mode: 1
  network-type: 0
  cluster-mode: 2
  maintain-aspect-ratio: 0
  symmetric-padding: 0
  #force-implicit-batch-dim=1
  workspace-size: 1000
  parse-bbox-func-name: NvDsInferParseYolo
  #parse-bbox-func-name=NvDsInferParseYoloCuda
  custom-lib-path: ../custom_parser/libnvds_infercustomparser_tao.so
  output-tensor-meta: 1
  # engine-create-func-name: NvDsInferYoloCudaEngineGet
  crop-objects-to-roi-boundary: 1

class-attrs-all:
  pre-cluster-threshold: 0.21666836936549047
  nms-iou-threshold: 0.5645207000469065
  minBoxes: 2
  dbscan-min-score: 0.693671458753017
  eps: 0.15584185873130887
  detected-min-w: 20
  detected-min-h: 20
  • Our efforts/investigation:
  • We resized the video to 640x640, matching the width and height set on the streammux.
  • We tried different values of interpolation-method, but the difference still occurs.
  • We did a simple test: by replacing the nvinfer element with nvinferserver using a Python backend, we are able to receive and dump the input tensors before they are fed to the detector (a minimal sketch of such a backend is shown below). We see that the input tensors differ slightly from the original frames extracted from the input video.
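
For reference, a minimal sketch of the kind of Triton Python backend used for this test (the tensor names "images"/"output0", the output shape, and the dump path are assumptions for a typical YOLOv8 ONNX export, not the actual repository contents):

# model.py for the Python-backend model (e.g. "peoplenet_yolov8x_py") -- a sketch only
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        self.frame_idx = 0

    def execute(self, requests):
        responses = []
        for request in requests:
            # "images" is the assumed input tensor name of the YOLOv8 ONNX export
            in_tensor = pb_utils.get_input_tensor_by_name(request, "images")
            batch = in_tensor.as_numpy()  # NCHW float32; raw 0-255 values, since the
                                          # nvinferserver preprocess block uses scale_factor: 1
            # Mirror nvinfer's normalization (y = net-scale-factor * (x - offsets))
            # before logging, then dump the tensor for offline comparison.
            net_scale_factor = 0.0039215697906911373
            np.save(f"/tmp/input_tensor_{self.frame_idx:05d}.npy", batch * net_scale_factor)
            self.frame_idx += 1

            # The real backend would run the engine here; for the dump test a dummy
            # output with the assumed YOLOv8 shape (84 x 8400) keeps the pipeline alive.
            out = pb_utils.Tensor(
                "output0", np.zeros((batch.shape[0], 84, 8400), dtype=np.float32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses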

Image 1: Second channel of the 1st frame, logged before being fed to the model in nvinferserver
Image 2: Second channel of the 1st frame, extracted using opencv-python.

Video for testing:
videos.zip (5.8 MB)

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

  1. Could you share some screenshots showing the different bboxes?
  2. Please refer to this FAQ for Debug Tips for DeepStream Accuracy Issue.
  3. How did you log the preprocessed data to get those two screenshots?
  1. Let me put some together and send them to you later.
  2. Yes, I tried to tune those parameters. I can observe how they work, but I still cannot make the bounding boxes match the results obtained from a standalone Triton Inference Server (with the same engine file).
  • The first image was obtained by writing a Python-backend nvinferserver model, in which I added Python code to log the input tensors after multiplying them by net-scale-factor and adding the offsets. The pipeline is now uridecodebin -> nvstreammux -> queue -> nvinferserver.
  • The second image was obtained by adding a probe to nvdsosd. The pipeline is now uridecodebin -> nvstreammux -> queue -> nvinfer (pgie) -> nvvideoconvert -> capsfilter -> nvdsosd. The additional elements after the pgie are only there to capture the image.
  • Interestingly, the frames of the input video read by cv2.VideoCapture() and the frames captured by both approaches differ in a pixel-by-pixel comparison (a comparison sketch is shown below).
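
For completeness, a rough sketch of the pixel comparison (file names are hypothetical; the dumped tensor is assumed to be NCHW RGB float32, already multiplied by net-scale-factor as described above, so it is mapped back to 0-255 before comparing):

import cv2
import numpy as np

NET_SCALE_FACTOR = 0.0039215697906911373  # from the nvinfer config, ~1/255
OFFSETS = np.array([0.0, 0.0, 0.0]).reshape(3, 1, 1)

# Tensor logged by the Python backend (hypothetical file name), first image in the batch
tensor = np.load("/tmp/input_tensor_00000.npy")[0]              # CHW, RGB, float32
restored = np.clip(tensor / NET_SCALE_FACTOR + OFFSETS, 0, 255).astype(np.uint8)
restored = restored.transpose(1, 2, 0)                          # HWC, RGB

# First frame decoded directly from the input video with OpenCV
cap = cv2.VideoCapture("videos/test_640x640.mp4")               # hypothetical path
_, frame_bgr = cap.read()
frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)

diff = cv2.absdiff(restored, frame_rgb)
print("max abs diff:", diff.max(), "mean abs diff:", diff.mean())

# Second channel of the first frame, as in the two images above
cv2.imwrite("second_channel_pipeline.png", restored[:, :, 1])
cv2.imwrite("second_channel_opencv.png", frame_rgb[:, :, 1])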

Config of the nvinferserver element:

infer_config {
  unique_id: 5
  gpu_ids: [0]
  max_batch_size: 1
  backend {
    trt_is {
      model_name: "peoplenet_yolov8x_py"
      version: -1
      model_repo {
        root: "/iva/model_repository_triton"
        log_level: 2
        tf_gpu_memory_fraction: 0.4
        tf_disable_soft_placement: 0
      }
    }
  }

  preprocess {
    network_format: IMAGE_FORMAT_RGB
    tensor_order: TENSOR_ORDER_LINEAR
    maintain_aspect_ratio: 0
    normalize {
      scale_factor: 1
      channel_offsets: [0, 0, 0]
    }
  }

  extra {
    copy_input_to_host_buffers: true
  }

  custom_lib {
    path: "/opt/nvidia/deepstream/deepstream/lib/libnvds_infercustomparser.so"
  }
}
input_control {
  process_mode: PROCESS_MODE_FULL_FRAME
  interval: 0
}
output_control {
  output_tensor_meta: true
}

Python code to get the image by adding a probe to nvdsosd:

gst_buffer = info.get_buffer()
if not gst_buffer:
    logging.error("Unable to get GstBuffer")
    return

batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
l_frame = batch_meta.frame_meta_list
while l_frame is not None:
    try:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
    except StopIteration:
        break
    # Map the RGBA NvBufSurface of this frame into a NumPy array
    img = pyds.get_nvds_buf_surface(hash(gst_buffer), frame_meta.batch_id)
    img_copy = np.array(img, copy=True, order='C')
    img_copy = cv2.cvtColor(img_copy, cv2.COLOR_RGBA2BGRA)
    # ... dump img_copy to disk here ...
    # Advance to the next frame meta in the batch
    try:
        l_frame = l_frame.next
    except StopIteration:
        break

capsfilter settings:

caps.set_property("caps", Gst.Caps.from_string("video/x-raw(memory:NVMM), format=RGBA"))
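
For reference, a short sketch of how the extra elements after the pgie are added and the probe is attached (element names and the callback name osd_sink_pad_buffer_probe are illustrative, assuming the snippet above is wrapped in such a callback):

nvvidconv = Gst.ElementFactory.make("nvvideoconvert", "convertor")
capsfilter = Gst.ElementFactory.make("capsfilter", "capsfilter")
capsfilter.set_property(
    "caps", Gst.Caps.from_string("video/x-raw(memory:NVMM), format=RGBA"))
nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")

for elem in (nvvidconv, capsfilter, nvosd):
    pipeline.add(elem)

pgie.link(nvvidconv)
nvvidconv.link(capsfilter)
capsfilter.link(nvosd)

# Attach the frame-dump probe to the nvdsosd sink pad
nvosd.get_static_pad("sink").add_probe(
    Gst.PadProbeType.BUFFER, osd_sink_pad_buffer_probe, 0)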

  1. About “directly calling a Triton Inference server”, do you mean you are using Python + Triton to do inference without DeepStream?
  2. You are doing inference with a DeepStream pipeline including nvinfer, and with Python + Triton without DeepStream, respectively, and the DeepStream results are worse. Am I right? Please refer to this yolov8 sample. Let's focus on nvinfer in this topic if using nvinferserver also gives worse results.
  3. In theory, if the bboxes are different, we need to compare the preprocessing data, the inference results, and the postprocessing. Here is the method to dump the preprocessing and postprocessing data.
  1. Yes.
  2. Yes, the output bounding boxes from DeepStream (A) are different from the output bounding boxes from the Triton Inference Server (B) mentioned in question 1. Both outputs A and B look reasonable; B seems worse than A.
  3. Thanks, I will try it (a probe-based sketch for dumping the output tensor meta is included below).
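
Since output-tensor-meta: 1 is already set on the pgie, one additional way to compare the raw inference outputs (besides the dump method in the FAQ) is to read NvDsInferTensorMeta in a probe on the pgie source pad. A rough sketch, following the pattern of the deepstream-ssd-parser Python sample; the per-layer element count (84 x 8400) and the dump path are assumptions for a YOLOv8-style output head:

import ctypes
import numpy as np
import pyds
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

def pgie_src_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        l_user = frame_meta.frame_user_meta_list
        while l_user is not None:
            user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
                tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
                for i in range(tensor_meta.num_output_layers):
                    layer = pyds.get_nvds_LayerInfo(tensor_meta, i)
                    num_elems = 84 * 8400  # assumed size of the YOLOv8 output layer
                    ptr = ctypes.cast(pyds.get_ptr(layer.buffer),
                                      ctypes.POINTER(ctypes.c_float))
                    out = np.ctypeslib.as_array(ptr, shape=(num_elems,))
                    np.save(f"/tmp/pgie_out_f{frame_meta.frame_num}_l{i}.npy", out.copy())
            try:
                l_user = l_user.next
            except StopIteration:
                break
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK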

By doing a 2D object detection evaluation on a dataset, I am now OK with the output bounding boxes from the DeepStream pipeline; it gives nearly the same performance as the results obtained from the Triton Inference Server with the same model file. Thanks.

Sorry for the late reply. Is this still a DeepStream issue to support? Thanks!

Sorry for the late reply; this is not an issue for us anymore, so I will close it. Thanks for the support.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.