Different output of neural network using TensorRT model directly and using in DeepStream

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
AGX Xavier
• DeepStream Version
5.0
• JetPack Version (valid for Jetson only)
4.4
• TensorRT Version
7.1.3
• Issue Type( questions, new requirements, bugs)
question, bug

I implemented a custom DeeplabV3+/MobilenetV3 FP16 TRT model in DeepStream. While debugging the model I noticed that the output segmentation maps show slight differences (they only match by 96.3%). I experimented with many parameters (interpolation method, video encoding, dataset, …) but could not get the similarity above this value. I wonder whether there is some other parameter in the DeepStream pipeline that could lead to this difference in the detections, or whether something in DeepStream's processing is fundamentally different and causes this slight mismatch.
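
For reference, the similarity value is a simple pixel-wise agreement between the two saved masks, roughly like the minimal sketch below (the file names are placeholders; the masks are assumed to be grayscale images with 0 = background, 255 = track, as produced by the callback further down):

import numpy as np
from PIL import Image

# Placeholder file names for the masks saved from the standalone TRT run
# and from the DeepStream appsink callback (0 = background, 255 = track).
mask_trt = np.array(Image.open("mask_trt.png").convert("L"))
mask_ds = np.array(Image.open("mask_deepstream.png").convert("L"))

assert mask_trt.shape == mask_ds.shape
# Fraction of pixels with identical class assignment (~0.963 in my case).
similarity = np.mean(mask_trt == mask_ds)
print("pixel-wise agreement: {:.3%}".format(similarity))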

Here is my pipeline:

"filesrc location=input.mp4 ! qtdemux ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m batch-size=1 width=2000 height=1500 enable-padding=0 ! nvinfer config-file-path=config_infer_primary_deeplab.txt ! appsink name=appsink_deeplabv3 emit-signals=True"

Here are some examples of the outputs (1st result: running the TensorRT model directly through the Python API; 2nd result: output from the appsink in my DeepStream pipeline):


Appsink callback to read segmentation map:

static GstFlowReturn AppsinkCallbackDeeplabV3(GstAppSink *sink, gpointer u_data) {
  GstSample *sample = gst_app_sink_pull_sample(sink);
  GstBuffer *buf = gst_sample_get_buffer(sample);
  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta(buf);
  NvDsMetaList *l_frame = NULL;
  NvDsMetaList *l_obj = NULL;

  for (l_frame = batch_meta->frame_meta_list; l_frame != NULL;
       l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) (l_frame->data);

    // Access segmentation map
    guint8 image_map[1024][1024];
    guint width_map;
    guint height_map;
    gint *class_map;

    for (l_obj = frame_meta->frame_user_meta_list; l_obj != NULL;
         l_obj = l_obj->next) {
      NvDsUserMeta *user_meta = (NvDsUserMeta *)(l_obj->data);

      if (user_meta->base_meta.meta_type == NVDSINFER_SEGMENTATION_META) {
        NvDsInferSegmentationMeta *user_seg_data =
            (NvDsInferSegmentationMeta *)(user_meta->user_meta_data);

        width_map = user_seg_data->width;
        height_map = user_seg_data->height;
        class_map = user_seg_data->class_map;

        // Basic code to save the output mask as a grayscale image
        // (0 = background, 255 = track)
        for (guint i = 0; i < height_map; i++) {
          for (guint j = 0; j < width_map; j++) {
            // 0 = tracks
            if (*(class_map + (width_map * i + j)) == 0) {
              image_map[i][j] = 255;
            }
            // -1 = background
            else if (*(class_map + (width_map * i + j)) == -1) {
              image_map[i][j] = 0;
            }
          }
        }
      }
    }
  }

  // Release the sample; it owns the buffer pulled above.
  gst_sample_unref(sample);

  return GST_FLOW_OK;
}

DeepStream config file:

[property]
gpu-id=0
net-scale-factor=1
enable-dla=0
use-dla-core=0
model-color-format=0
model-engine-file=deeplab.engine
labelfile-path=labels_segment.txt
network-mode=2
num-detected-classes=2
gie-unique-id=1
network-type=2
cluster-mode=4
maintain-aspect-ratio=0
output-tensor-meta=0
segmentation-threshold=0.0
scaling-filter=1
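
In case the difference comes from preprocessing, here is a minimal sketch of how I try to reproduce the nvinfer-style preprocessing in the standalone Python script for comparison, assuming the usual y = net-scale-factor * (pixel - offset) formula, RGB input (model-color-format=0) and bilinear scaling (scaling-filter=1); the 1024x1024 network resolution and the input file name are placeholders:

import cv2
import numpy as np

NET_W, NET_H = 1024, 1024                 # assumed network input resolution
NET_SCALE_FACTOR = 1.0                    # net-scale-factor from the config
OFFSETS = np.zeros(3, dtype=np.float32)   # no offsets set in the config

frame_bgr = cv2.imread("frame.png")                      # placeholder test frame
frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)   # model-color-format=0 -> RGB
# maintain-aspect-ratio=0, scaling-filter=1 -> plain bilinear resize to network size
resized = cv2.resize(frame_rgb, (NET_W, NET_H), interpolation=cv2.INTER_LINEAR)
# y = net-scale-factor * (x - offset), then NCHW float32
tensor = NET_SCALE_FACTOR * (resized.astype(np.float32) - OFFSETS)
tensor = np.transpose(tensor, (2, 0, 1))[np.newaxis, ...]  # shape 1x3xHxW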

I would appreciate any advice that would help me solve this issue.

Hi @blubthefish,

Could you refer to 2. [DS5.0GA_Jetson_GPU_Plugin] Dump the Inference Input to dump the input of the TRT inference in DeepStream, and compare it to the input of your standalone TRT inference?
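
Once you have both inputs dumped, a quick way to compare them is a sketch like the one below, assuming both are saved as raw float32 buffers of the same shape (the file names and the shape are placeholders):

import numpy as np

SHAPE = (1, 3, 1024, 1024)  # placeholder: batch x channels x height x width

# Placeholder file names for the dumped, preprocessed inputs.
ds_input = np.fromfile("deepstream_input.bin", dtype=np.float32).reshape(SHAPE)
trt_input = np.fromfile("standalone_input.bin", dtype=np.float32).reshape(SHAPE)

diff = np.abs(ds_input - trt_input)
print("max abs diff :", diff.max())
print("mean abs diff:", diff.mean())
print("fraction differing by > 1e-3:", np.mean(diff > 1e-3))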