NVDSINFER_TENSOR_OUTPUT_META missing when nvinfer is in pgie mode with both input-tensor-meta and output-tensor-meta enabled

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): dGPU
• DeepStream Version: 7.0
• TensorRT Version: 8.6.1
• Issue Type: bug?
• How to reproduce the issue: Enable both input-tensor-meta and output-tensor-meta on nvinfer and try to read NVDSINFER_TENSOR_OUTPUT_META in a pad probe (see the sketch below)
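For reference, a minimal sketch of enabling both properties on the nvinfer element in Python; the element name, config file name and the rest of the pipeline are placeholders, not taken from the actual setup:

    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)

    # Placeholder names; only the two properties from the title matter here.
    pgie = Gst.ElementFactory.make("nvinfer", "primary-inference")
    pgie.set_property("config-file-path", "pgie_config.txt")  # placeholder config
    # Consume input tensors attached by nvdspreprocess instead of preprocessing internally
    pgie.set_property("input-tensor-meta", True)
    # Attach raw inference output tensors as NVDSINFER_TENSOR_OUTPUT_META
    pgie.set_property("output-tensor-meta", True)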

Normally, when only output-tensor-meta is enabled, the output tensor is attached to each frame meta and can be accessed with:

        batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
        l_frame = batch_meta.frame_meta_list
        
        while l_frame is not None:
            try:
                frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
            except StopIteration:
                break
            l_user = frame_meta.frame_user_meta_list
            
            while l_user is not None:
                frame_object_list = []
                try:
                    # Note that l_user.data needs a cast to pyds.NvDsUserMeta
                    # The casting also keeps ownership of the underlying memory
                    # in the C code, so the Python garbage collector will leave
                    # it alone.
                    user_meta = pyds.NvDsUserMeta.cast(l_user.data)
                except StopIteration:
                    break

                if (
                        user_meta.base_meta.meta_type
                        != pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META
                ):
                    l_user = l_user.next
                    continue

But with both input-tensor-meta and output-tensor-meta enabled, the NVDSINFER_TENSOR_OUTPUT_META is attached to the batch_meta instead and must be accessed with:

        l_user = batch_meta.batch_user_meta_list
        while l_user is not None:
            try:
                user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            except StopIteration:
                break
            if (
                    user_meta.base_meta.meta_type
                    == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META
            ):
                tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
                layers_info = ProbesHandler.get_layer_by_name(tensor_meta, "output")
                ptr = ctypes.cast(pyds.get_ptr(layers_info.buffer), ctypes.POINTER(ctypes.c_float))
                dims = np.trim_zeros(layers_info.inferDims.d, 'b')

                v = np.ctypeslib.as_array(ptr, shape=dims).copy()
                vector_list.append(v)
                #print(v.shape)
                #print(v)

            try:
                l_user=l_user.next
            except StopIteration:
                break

The problem is that I can only get one NVDSINFER_TENSOR_OUTPUT_META on my batch_meta even though my batch size is higher (5-10). I should be getting 5-10 NVDSINFER_TENSOR_OUTPUT_META entries, one for each frame in the batch. Is this a known issue?
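For reference, a minimal sketch of confirming this inside the same probe by counting the batch-level tensor metas (`batch_meta` as obtained above; `num_frames_in_batch` is assumed to be exposed by the bindings):

        # Count batch-level NVDSINFER_TENSOR_OUTPUT_META entries and compare
        # against the number of frames in the batch.
        count = 0
        l_user = batch_meta.batch_user_meta_list
        while l_user is not None:
            user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
                count += 1
            l_user = l_user.next
        print(f"batch-level tensor metas: {count}, frames in batch: {batch_meta.num_frames_in_batch}")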

Does each frame have a valid inference result? Do you mean the size of batch_meta.batch_user_meta_list is 1?
When input-tensor-meta is 1, one frame may include multiple ROIs, so the output tensor needs ROI info. Since the nvinfer plugin is open source, please refer to attach_tensor_output_meta in DS 7.1; you can use DS 7.1 or port the code to DS 7.0.

Yes, each frame has a valid input. I had a look at the source code for attaching the tensor output in nvinfer: if both process-on-frame and input-tensor-meta are enabled, it attaches only one user meta to the batch.

/* Attaches the raw tensor output to the GstBuffer as metadata. */
void
attach_tensor_output_meta (GstNvInfer *nvinfer, GstMiniObject * tensor_out_object,
    GstNvInferBatch *batch, NvDsInferContextBatchOutput *batch_output)
{
  NvDsBatchMeta *batch_meta = (nvinfer->process_full_frame
      || nvinfer->input_tensor_from_meta) ? batch->frames[0].
      frame_meta->base_meta.batch_meta : batch->frames[0].obj_meta->base_meta.
      batch_meta;

  /* Create and attach NvDsInferTensorMeta for each frame/object. Also
   * increment the refcount of GstNvInferTensorOutputObject. */
  for (size_t j = 0; j < batch->frames.size (); j++) {
    GstNvInferFrame & frame = batch->frames[j];

    /* Processing on ROIs (not frames or objects) skip attaching tensor output
     * to frames or objects. */
    NvDsInferTensorMeta *meta = new NvDsInferTensorMeta;
    meta->unique_id = nvinfer->unique_id;
    meta->num_output_layers = nvinfer->output_layers_info->size ();
    meta->output_layers_info = (NvDsInferLayerInfo*)g_memdup2(nvinfer->output_layers_info->data (),
     meta->num_output_layers * sizeof (NvDsInferLayerInfo)) ;
    meta->out_buf_ptrs_host = new void *[meta->num_output_layers];
    meta->out_buf_ptrs_dev = new void *[meta->num_output_layers];
    meta->gpu_id = nvinfer->gpu_id;
    meta->priv_data = gst_mini_object_ref (tensor_out_object);
    meta->network_info = nvinfer->network_info;
    meta->maintain_aspect_ratio = nvinfer->maintain_aspect_ratio;

    for (unsigned int i = 0; i < meta->num_output_layers; i++) {
      NvDsInferLayerInfo & info = meta->output_layers_info[i];
      meta->out_buf_ptrs_dev[i] =
          (uint8_t *) batch_output->outputDeviceBuffers[i] +
          info.inferDims.numElements * get_element_size (info.dataType) * j;
      meta->out_buf_ptrs_host[i] =
          (uint8_t *) batch_output->hostBuffers[info.bindingIndex] +
          info.inferDims.numElements * get_element_size (info.dataType) * j;
    }

    NvDsUserMeta *user_meta = nvds_acquire_user_meta_from_pool (batch_meta);
    user_meta->user_meta_data = meta;
    user_meta->base_meta.meta_type =
        (NvDsMetaType) NVDSINFER_TENSOR_OUTPUT_META;
    user_meta->base_meta.release_func = release_tensor_output_meta;
    user_meta->base_meta.copy_func = copy_tensor_output_meta;
    user_meta->base_meta.batch_meta = batch_meta;

    if (nvinfer->input_tensor_from_meta) {
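      /* Input tensors came from nvdspreprocess meta: the per-sample tensor
       * meta is attached to the corresponding ROI meta (and also to the
       * object meta when the ROI wraps an object), not to the frame meta. */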
      nvds_add_user_meta_to_roi (frame.roi_meta, user_meta);
      /* if object is roi itself */
      if (frame.obj_meta) {
        nvds_add_user_meta_to_obj (frame.obj_meta, user_meta);
      }
    } else if (nvinfer->process_full_frame) {
      nvds_add_user_meta_to_frame (frame.frame_meta, user_meta);
    } else {
      nvds_add_user_meta_to_obj (frame.obj_meta, user_meta);
    }
  }

  /* NvInfer is receiving input from tensor meta, also attach output tensor meta
   * at batch level. */
  if (nvinfer->input_tensor_from_meta) {
    NvDsInferTensorMeta *meta = new NvDsInferTensorMeta;
    meta->unique_id = nvinfer->unique_id;
    meta->num_output_layers = nvinfer->output_layers_info->size ();
    meta->output_layers_info = (NvDsInferLayerInfo*)g_memdup2(nvinfer->output_layers_info->data (),
     meta->num_output_layers * sizeof (NvDsInferLayerInfo)) ;
    meta->out_buf_ptrs_host = new void *[meta->num_output_layers];
    meta->out_buf_ptrs_dev = new void *[meta->num_output_layers];
    meta->gpu_id = nvinfer->gpu_id;
    meta->priv_data = gst_mini_object_ref (tensor_out_object);
    meta->network_info = nvinfer->network_info;

    for (unsigned int i = 0; i < meta->num_output_layers; i++) {
      NvDsInferLayerInfo & info = meta->output_layers_info[i];
      meta->out_buf_ptrs_dev[i] =
          (uint8_t *) batch_output->outputDeviceBuffers[i];
      meta->out_buf_ptrs_host[i] =
          (uint8_t *) batch_output->hostBuffers[info.bindingIndex];
    }

    NvDsUserMeta *user_meta = nvds_acquire_user_meta_from_pool (batch_meta);
    user_meta->user_meta_data = meta;
    user_meta->base_meta.meta_type =
        (NvDsMetaType) NVDSINFER_TENSOR_OUTPUT_META;
    user_meta->base_meta.release_func = release_tensor_output_meta;
    user_meta->base_meta.copy_func = copy_tensor_output_meta;
    user_meta->base_meta.batch_meta = batch_meta;

    nvds_add_user_meta_to_batch (batch_meta, user_meta);
  }
}

Does 7.1 have a solution for me? I don't see any way to extract the model output in Python with pgie + input-tensor-meta enabled.

Sorry for the late reply. Please refer to the sample code for how to access the output in Python.

Yes, but if input-tensor-meta is enabled, the output meta is attached to batch_meta instead. And the problem is that when extracting the output meta I can extract the whole batch, but I don't know which output belongs to which frame.
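Based on the offsets in attach_tensor_output_meta above (sample j sits at offset j * numElements in each batched host buffer), a rough sketch of splitting the batch-level tensor by sample index could look like this; the assumption that sample j corresponds to the j-th frame/ROI of the batch is exactly what is unverified here, so treat it as a sketch only (`user_meta`/`batch_meta` as in the batch-level snippet above):

    # Sketch: split the batch-level output tensor per sample, mirroring the
    # `* j` offsets used in attach_tensor_output_meta.
    tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
    layer = pyds.get_nvds_LayerInfo(tensor_meta, 0)           # first output layer
    per_sample_dims = np.trim_zeros(layer.inferDims.d, 'b')   # dims exclude the batch dim
    n_samples = batch_meta.num_frames_in_batch                # assumed: one sample per frame
    ptr = ctypes.cast(pyds.get_ptr(layer.buffer), ctypes.POINTER(ctypes.c_float))
    batched = np.ctypeslib.as_array(ptr, shape=(n_samples, *per_sample_dims)).copy()
    outputs_per_sample = list(batched)  # outputs_per_sample[j] -> assumed j-th frame/ROI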

Right now I have to use a trick: add a dummy full-frame object for each frame, then preprocess and run inference on that object instead, so the output is attached to my object meta:

    def add_clip_object_meta_probe(pad, info, u_data):
        gst_buffer = info.get_buffer()
        if not gst_buffer:
            return Gst.PadProbeReturn.OK

        batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
        l_frame = batch_meta.frame_meta_list

        while l_frame is not None:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
            src_id = frame_meta.source_id

            # assign a phase offset if new source
            if src_id not in source_phase_offset:
                source_phase_offset[src_id] = len(source_phase_offset) % INFER_EVERY_N

            # update total frame count
            frame_counters[src_id] = frame_counters.get(src_id, 0) + 1
            cnt = frame_counters[src_id]

            # check if this frame is processed for inference
            if (cnt + source_phase_offset[src_id]) % INFER_EVERY_N == 0:
                # update processed count
                processed_counters[src_id] = processed_counters.get(src_id, 0) + 1

                frame_width = frame_meta.source_frame_width
                frame_height = frame_meta.source_frame_height
                full_frame_object = {
                    "bbox": [0, 0, frame_width, frame_height],
                    "score": 1.0,
                }
                ProbesHandler.add_obj_meta_to_frame(
                    frame_object=full_frame_object,
                    batch_meta=batch_meta,
                    frame_meta=frame_meta,
                    unique_id=13,
                    allow_downstream_infer=True,
                    visualize=False,
                    parent=None
                )

            l_frame = l_frame.next

        # print stats
        # print("=== Frame Stats ===")
        # for src_id in frame_counters:
        #     total = frame_counters[src_id]
        #     processed = processed_counters.get(src_id, 0)
        #     print(f"Source {src_id}: total={total}, processed={processed}, fraction={processed/total:.2f}")
        # print("===================")

        return Gst.PadProbeReturn.OK
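
The ProbesHandler.add_obj_meta_to_frame helper is not shown in the thread; for context, a hypothetical sketch of such a helper, following the object-meta pattern from the deepstream-ssd-parser Python sample (the signature, the class_id value and the unused allow_downstream_infer flag are assumptions, not the actual implementation):

    # Hypothetical sketch; in practice this would be a @staticmethod on ProbesHandler.
    def add_obj_meta_to_frame(frame_object, batch_meta, frame_meta, unique_id,
                              allow_downstream_infer=True, visualize=False, parent=None):
        obj_meta = pyds.nvds_acquire_obj_meta_from_pool(batch_meta)

        left, top, width, height = frame_object["bbox"]
        rect = obj_meta.rect_params
        rect.left, rect.top, rect.width, rect.height = left, top, width, height
        rect.border_width = 1 if visualize else 0
        rect.has_bg_color = 0

        obj_meta.confidence = frame_object["score"]
        obj_meta.class_id = 0                      # assumed class id for the dummy object
        obj_meta.object_id = 0xffffffffffffffff    # untracked
        obj_meta.unique_component_id = unique_id   # lets downstream elements identify the source

        # allow_downstream_infer is accepted for signature compatibility only;
        # whether a downstream nvinfer/nvdspreprocess processes this object is
        # governed by its operate-on-gie-id / operate-on-class-ids settings.
        pyds.nvds_add_obj_meta_to_frame(frame_meta, obj_meta, parent)
        return obj_meta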


    def clip_src_pad_buffer_probe(pad, info, u_data):
        start_time = time.time()
        gst_buffer = info.get_buffer()
        if not gst_buffer:
            print("Unable to get GstBuffer")
            return Gst.PadProbeReturn.OK
        #print("In probe")
        vector_list = []
        frame_metas = []
        batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))

        l_frame = batch_meta.frame_meta_list
        
        while l_frame is not None:
            try:
                frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
                
            except StopIteration:
                break
            l_obj = frame_meta.obj_meta_list
            while l_obj is not None:
                obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
                
                l_user = obj_meta.obj_user_meta_list
                while l_user is not None:
                    user_meta = pyds.NvDsUserMeta.cast(l_user.data)
                    #print(f"found user meta {user_meta.base_meta.meta_type}")
                    if user_meta.base_meta.meta_type != pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
                        l_user = l_user.next
                        continue
                    
                    tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)

                    layers_info = pyds.get_nvds_LayerInfo(tensor_meta, 0)
                    ptr = ctypes.cast(pyds.get_ptr(layers_info.buffer), ctypes.POINTER(ctypes.c_float))
                    dims = np.trim_zeros(layers_info.inferDims.d, 'b')
                    
                    v = np.ctypeslib.as_array(ptr, shape=dims).copy()
                    

                    vector_list.append(v)
      
                    try:
                        l_user = l_user.next
                    except StopIteration:
                        break    
                try:
                    l_obj = l_obj.next
                except StopIteration:
                    break
            try:
                l_frame = l_frame.next
            except StopIteration:
                break

        if len(vector_list) > 0: 
            features = np.stack(vector_list)
            print(features.shape)
        return Gst.PadProbeReturn.OK
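
For completeness, a sketch of how these two probes might be attached; element names and pad placement are assumptions (the dummy-object probe has to run upstream of the nvdspreprocess/nvinfer pair, e.g. on the nvstreammux src pad, and the tensor-reading probe on the nvinfer src pad), and the probes are assumed to be staticmethods of ProbesHandler:

    # Assumed element variables: `streammux` (nvstreammux), `clip_infer` (nvinfer).
    streammux_src = streammux.get_static_pad("src")
    streammux_src.add_probe(Gst.PadProbeType.BUFFER,
                            ProbesHandler.add_clip_object_meta_probe, 0)

    infer_src = clip_infer.get_static_pad("src")
    infer_src.add_probe(Gst.PadProbeType.BUFFER,
                        ProbesHandler.clip_src_pad_buffer_probe, 0)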

Is this still a DeepStream issue to support? Thanks! Please refer to my last two comments. In DS 7.1, when input-tensor-meta is 1, the user_meta is added to the ROI meta, and the ROI meta is added to the frame meta.
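For illustration, a rough sketch of what reading the per-ROI tensor meta could look like once the relevant bindings are available; the pyds names (GstNvDsPreProcessBatchMeta, NvDsRoiMeta, NVDS_PREPROCESS_BATCH_META) are assumptions based on the C structures in nvdspreprocess_meta.h / nvds_roi_meta.h and, as discussed below, may not exist in the 7.x bindings:

    # Hypothetical: requires preprocess/ROI meta bindings in pyds (e.g. ported
    # from the 8.0 bindings mentioned below). Field names mirror the C structs.
    l_user = batch_meta.batch_user_meta_list
    while l_user is not None:
        user_meta = pyds.NvDsUserMeta.cast(l_user.data)
        # How to obtain the NVDS_PREPROCESS_BATCH_META constant from pyds is an assumption.
        if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDS_PREPROCESS_BATCH_META:
            preprocess_meta = pyds.GstNvDsPreProcessBatchMeta.cast(user_meta.user_meta_data)
            for roi_meta in preprocess_meta.roi_vector:      # one entry per ROI
                frame_meta = roi_meta.frame_meta             # the frame this ROI belongs to
                l_roi_user = roi_meta.roi_user_meta_list
                while l_roi_user is not None:
                    roi_user = pyds.NvDsUserMeta.cast(l_roi_user.data)
                    if roi_user.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
                        tensor_meta = pyds.NvDsInferTensorMeta.cast(roi_user.user_meta_data)
                        # ... read layers as in the snippets above ...
                    l_roi_user = l_roi_user.next
        l_user = l_user.next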

I don't see RoiMeta in the DeepStream Python bindings 7.1. Is it only available in DeepStream 8.0?

It seems the binding exists in the 8.0 deepstream_python_apps. You can port the related code, then rebuild and reinstall the bindings according to this doc.

Is it possible to install the 8.0 Python bindings in my DeepStream 7.0 NGC environment? Right now I install the 7.1 Python bindings (commands below) on DeepStream 7.0 and it works.

    RUN /opt/nvidia/deepstream/deepstream/user_additional_install.sh && \
        /opt/nvidia/deepstream/deepstream/user_deepstream_python_apps_install.sh -v 1.1.1

    RUN wget https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/releases/download/v1.2.0/pyds-1.2.0-cp310-cp310-linux_x86_64.whl

    RUN pip install ./pyds-1.2.0-cp310-cp310-linux_x86_64.whl

The binding version needs to be consistent with the DeepStream version; otherwise some bindings may not find the corresponding implementation.

Okay, thank you. I might need to patch the bindings manually.
