Not able to access raw tensor output as metadata

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) Jetson Orin AGX
• DeepStream Version 7.1
• JetPack Version (valid for Jetson only) 6.2
• TensorRT Version 10.3.0
• NVIDIA GPU Driver Version (valid for GPU only) 540.4.0
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

Hi,
I have a model running as an SGIE, configured with input-tensor-meta=1 as well as output-tensor-meta=1. I was able to access its raw tensor output as metadata in DeepStream 7.0 using the following code in gie_processing_done_buf_prob(). However, with the same code in 7.1 I am not getting anything. I have confirmed that the model output itself is correct by saving the data in the custom parser.

// Iterate through user metadata to find tensor output
for (l_user_meta = obj_meta->obj_user_meta_list; l_user_meta != NULL; l_user_meta = l_user_meta->next) {
    user_meta = (NvDsUserMeta *) l_user_meta->data;

    // Check if the metadata is tensor output
    if (user_meta->base_meta.meta_type == NVDSINFER_TENSOR_OUTPUT_META) {
        NvDsInferTensorMeta *meta = (NvDsInferTensorMeta *) user_meta->user_meta_data;
        if (meta->unique_id == 4) {
            // Access raw tensor data as metadata
        }
    }
}
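
For reference, a minimal hedged sketch of what the "Access raw tensor data as metadata" step above could look like, assuming the first output layer is a float tensor and that nvinfer attached host buffer pointers (out_buf_ptrs_host) along with the tensor meta; the helper name is purely illustrative.

// Hypothetical helper, for illustration only: assumes the first output layer
// is a float tensor and that nvinfer attached host buffer pointers
// (out_buf_ptrs_host) with the tensor meta.
#include <iostream>
#include "nvdsinfer.h"      // NvDsInferLayerInfo, NvDsInferDims
#include "gstnvdsinfer.h"   // NvDsInferTensorMeta

static void dump_first_output_layer (NvDsInferTensorMeta *meta)
{
    if (!meta || meta->num_output_layers == 0)
        return;

    NvDsInferLayerInfo *layer = &meta->output_layers_info[0];

    // Total number of elements = product of the layer dimensions
    unsigned int num_elems = 1;
    for (unsigned int d = 0; d < layer->inferDims.numDims; d++)
        num_elems *= layer->inferDims.d[d];

    float *data = (float *) meta->out_buf_ptrs_host[0];
    if (data && num_elems > 0) {
        std::cout << "unique_id=" << meta->unique_id
                  << " layer=" << (layer->layerName ? layer->layerName : "?")
                  << " elems=" << num_elems
                  << " first=" << data[0] << std::endl;
    }
}

It could be called as dump_first_output_layer(meta) inside the unique_id check above. The model (SGIE) config file used is below.
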
[property]
gpu-id=0
input-tensor-from-meta=1
model-engine-file=inswapper_128.onnx_b1_gpu0_fp32.engine
onnx-file=inswapper_128.onnx
batch-size=1
model-color-format=0
network-mode=0
process-mode=2
network-type=3
interval=0
infer-dims=3;128;128
cluster-mode=4
custom-lib-path=./parser/libnvdsinfer_custom_impl.so
parse-bbox-instance-mask-func-name=NvDsInferParseSwap
#maintain-aspect-ratio=1
#symmetric-padding=1
output-tensor-meta=1

[class-attrs-all]
pre-cluster-threshold=0.5

Please note that there was an issue observed earlier while running this model; it was fixed in the topic "Segfault at nvds_destroy_batch_meta with non-image input layer model".

Regards

Update: I printed the metadata using nvds_get_current_metadata_info() inside gie_processing_done_buf_prob() and found that the output tensor metadata is not present. So there is certainly an issue with DS 7.1.

******************FRAME META POOL STATUS ***************
meta_type = 2
max_elements_in_pool = 1
num_empty_elements = 0
num_full_elements = 1

FULL META 0 ptr = 0xffff18049e40
******************OBJ META POOL STATUS *****************
meta_type = 3
max_elements_in_pool = 8
num_empty_elements = 5
num_full_elements = 3

FREE META 0 ptr = 0xffff1804a810
FREE META 1 ptr = 0xffff1804a5e0
FREE META 2 ptr = 0xffff1804a3b0
FREE META 3 ptr = 0xffff1804a180
FREE META 4 ptr = 0xffff18049f50
FULL META 0 ptr = 0xffff1804aa40
x = 605.425232 y = 252.643570 w = 69.310959 h = 90.754807

FULL META 1 ptr = 0xffff1804ac70
x = 519.513977 y = 225.938507 w = 242.774094 h = 649.854004

FULL META 2 ptr = 0xffff1804aea0
x = 827.550110 y = 479.899658 w = 217.181580 h = 575.485779

FRAME 0
FRAME META ptr = 0xffff18049e40 source_id = 0 port id = 0

  OBJ 0
  OBJ META ptr = 0xffff1804aa40 component_id = 1,class_id = 0 object id = 13 x =605.425232, y =252.643570, w =69.310959, h = 90.754807

  OBJ 1
  OBJ META ptr = 0xffff1804ac70 component_id = 2,class_id = 0 object id = 18446744073709551615 x =519.513977, y =225.938507, w =242.774094, h = 649.854004

  OBJ 2
  OBJ META ptr = 0xffff1804aea0 component_id = 2,class_id = 0 object id = 18446744073709551615 x =827.550110, y =479.899658, w =217.181580, h = 575.485779
  1. The nvinfer plugin is open source. You can add logs in gst_nvinfer_output_loop of /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvinfer/gstnvinfer.cpp to check whether output_tensor_meta is true and whether attach_tensor_output_meta is called.
  2. In attach_tensor_output_meta, if input_tensor_from_meta is true, the inference output user meta is attached to frame.roi_meta.

Yes, I have added a debug print and attach_tensor_output_meta() is getting called. I compared this function between DS 7.0 and 7.1 and did not find any change. Since I was getting the raw tensor in obj_user_meta earlier, why am I not getting it now? As you mentioned, I also checked roi_meta, but did not get anything there either. Please check the code below, which I added in gie_processing_done_buf_prob().

for (l_user_meta = frame_meta->frame_user_meta_list; l_user_meta != NULL;
     l_user_meta = l_user_meta->next) {
    NvDsUserMeta *roi_user_meta = (NvDsUserMeta *) l_user_meta->data;
    if (roi_user_meta->base_meta.meta_type != NVDS_ROI_META)
        continue;
    /* convert to roi metadata */
    NvDsRoiMeta *roi_meta = (NvDsRoiMeta *) roi_user_meta->user_meta_data;
    for (NvDsUserMetaList *r_user = roi_meta->roi_user_meta_list;
         r_user != NULL; r_user = r_user->next) {
        NvDsUserMeta *tensor_user_meta = (NvDsUserMeta *) r_user->data;
        if (tensor_user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META)
            continue;
        std::cout << "raw tensor data received" << std::endl;
    }
}

However, I did not get any print. I found that someone else is also using the same model and trying to implement it this way, but is facing the same problem.

Questions

  1. Please check the above code for roi_meta and let me know if anything is wrong.
  2. Do you have a similar sample app where both input and output tensor meta are enabled for an SGIE instance-segmentation model?

Regards,
Vishal

Hi,
I have gone further through the DS 7.1 documentation and found the link below.

This link clearly says:
"When operating as secondary GIE, NvDsInferTensorMeta is attached to each NvDsObjectMeta object's obj_user_meta_list."

Therefore the implementation done for 7.0 should also work with 7.1. Since I am not getting the raw tensor, please look into this and help me resolve it.

Regards

  1. Thanks for sharing! If you are testing with deepstream-app, could you share the cfg file of deepstream-app?
  2. Could you add the code above in attach_tensor_output_meta, right after the inference tensor user meta is added? If the method is right, the user meta should be accessible there.
  3. If the inference tensor user meta can be accessed in attach_tensor_output_meta, could you check whether it is also present in analytics_done_buf_prob? analytics_done_buf_prob is a probe function placed after the SGIE.

Here are the point-wise answers:

  1. Yes, I am testing with deepstream-app. Please see the main config file below; I already shared the model config file in an earlier post.
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

[tiled-display]
enable=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
type=4
uri=rtsp://127.0.0.1:8555/stream
num-sources=1
gpu-id=0
# (0): memtype_device   - Memory type Device
# (1): memtype_pinned   - Memory type Host Pinned
# (2): memtype_unified  - Memory type Unified
cudadec-memtype=0
rtsp-reconnect-interval-sec=1
select-rtp-protocol=4
latency=500
udp-buffer-size=100000000
#rtsp-reconnect-attempts=10

[sink0]
enable=0
#Type - 1=FakeSink 2=EglSink/nv3dsink (Jetson only) 3=File
type=1
container=1
#sync=0
codec=1
enc-type=0
source-id=0
gpu-id=0
nvbuf-memory-type=0

[sink1]
enable=1
type=4
#1=mp4 2=mkv
container=1
#1=h264 2=h265
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
#iframeinterval=10
bitrate=3000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
# set profile only for hw encoder, sw encoder selects profile based on sw-preset
profile=0
rtsp-port=8550
udp-buffer-size=100000000

[osd]
enable=1
gpu-id=0
border-width=1
text-size=7
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0
display-bbox=1
display-text=1

[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
#live-source=0
live-source=1
batch-size=1
buffer-pool-size=500
## Set muxer output width and height
width=1920
height=1080
#width=640
#height=480
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=1
nvbuf-memory-type=0
## If set to TRUE, system timestamp will be attached as ntp timestamp
## If set to FALSE, ntp timestamp from rtspsrc, if available, will be attached
attach-sys-ts-as-ntp=1
batched-push-timeout=33333
sync-inputs=1
#frame-duration=1000


[primary-gie]
enable=1
gpu-id=0
batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=0;0;1;1
bbox-border-color1=0;0;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;0;1;1
interval=0
#Required by the app for SGIE, when used along with config-file property
gie-unique-id=1
nvbuf-memory-type=0
config-file=scrfd_config.txt

[secondary-pre-process0]
enable=1
operate-on-gie-id=1
config-file=primary_preprocess.txt

[secondary-gie0]
enable=1
gpu-id=0
batch-size=1
gie-unique-id=2
input-tensor-meta=1
operate-on-gie-id=1
nvbuf-memory-type=0
config-file=yolo_config.txt

[secondary-pre-process1]
enable=1
operate-on-gie-id=1
config-file=secondary_preprocess_recog.txt

[secondary-gie1]
enable=1
gpu-id=0
batch-size=1
gie-unique-id=3
operate-on-gie-id=1
nvbuf-memory-type=0
input-tensor-meta=0
config-file=recog_config.txt

[secondary-pre-process3]
enable=1
operate-on-gie-id=1
config-file=secondary_preprocess.txt

[secondary-gie2]
enable=1
gpu-id=0
batch-size=1
gie-unique-id=4
input-tensor-meta=1
operate-on-gie-id=1
nvbuf-memory-type=0
config-file=swap_config.txt

[secondary-gie3]
enable=1
gpu-id=0
batch-size=1
gie-unique-id=5
input-tensor-meta=0
operate-on-gie-id=2
nvbuf-memory-type=0
config-file=solider_config.txt

[tracker]
enable=1
# For NvDCF and DeepSORT tracker, tracker-width and tracker-height must be a multiple of 32, respectively
tracker-width=640
tracker-height=640
ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvSORT.yml
display-tracking-id=1

[tests]
file-loop=0
  2. I have added the code in attach_tensor_output_meta() and I am able to access the raw tensor there.
    if (nvinfer->input_tensor_from_meta) {
      /* Attach tensor meta as part of ROI meta */
      nvds_add_user_meta_to_roi (frame.roi_meta, user_meta);

      /* Attach ROI meta to frame meta */
      NvDsUserMeta *user_meta = nvds_acquire_user_meta_from_pool (batch_meta);
      user_meta->user_meta_data = frame.roi_meta;
      user_meta->base_meta.meta_type = (NvDsMetaType) NVDS_ROI_META;
      user_meta->base_meta.release_func = release_user_meta_at_frame_level;
      user_meta->base_meta.copy_func = NULL;
      user_meta->base_meta.batch_meta = batch_meta;
      nvds_add_user_meta_to_frame (frame.frame_meta, user_meta);

      /* if object is roi itself */
      if (frame.obj_meta) {
        nvds_add_user_meta_to_obj (frame.obj_meta, user_meta);
        //std::cout << "[DEBUG] Attached ROI meta to Object: " << frame.obj_meta << std::endl;

        /* Post-attachment debug: walk the frame user meta list and look for
           the tensor output meta inside the attached ROI meta */
        NvDsFrameMeta *om = frame.frame_meta;
        std::cout << __func__ << "Frame Num: " << om->frame_num << " " << om << std::endl;
        for (NvDsUserMetaList *l = om->frame_user_meta_list; l; l = l->next) {
          NvDsUserMeta *meta = (NvDsUserMeta *) l->data;
          std::cout << "  [POST] meta_type: " << meta->base_meta.meta_type << std::endl;
          NvDsRoiMeta *roi_meta = (NvDsRoiMeta *) meta->user_meta_data;
          for (NvDsUserMetaList *r_user = roi_meta->roi_user_meta_list;
               r_user != NULL; r_user = r_user->next) {
            NvDsUserMeta *tensor_user_meta = (NvDsUserMeta *) r_user->data;
            if (tensor_user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META)
              continue;
            NvDsInferTensorMeta *tensor_meta =
                (NvDsInferTensorMeta *) tensor_user_meta->user_meta_data;
            NvDsInferLayerInfo *layer_info = &tensor_meta->output_layers_info[0];
            float *output_data = (float *) layer_info->buffer;
            std::cout << tensor_meta->unique_id << std::endl;
          }
        }
      }
    }

I got the below debug prints:

attach_tensor_output_metaFrame Num: 116 0xfffe50064100
  [POST] meta_type: 29 (I confirmed with header file that it is NVDS_ROI_META)
4 (unique_id)

attach_tensor_output_metaFrame Num: 117 0xfffe5005c500
  [POST] meta_type: 29 (I confirmed with header file that it is NVDS_ROI_META)
4 (unique_id)

attach_tensor_output_meta Frame Num: 118 0xfffe50058b60
  [POST] meta_type: 29  (I confirmed with header file that it is NVDS_ROI_META)
4 (unique_id)
  3. Then I added the code below in analytics_done_buf_prob(). Here also I found the raw tensor data.
static GstPadProbeReturn
analytics_done_buf_prob (GstPad * pad, GstPadProbeInfo * info, gpointer u_data)
{
  NvDsMetaList *l_frame = NULL;
  NvDsMetaList *l_user_meta = NULL;

  NvDsInstanceBin *bin = (NvDsInstanceBin *) u_data;
  guint index = bin->index;
  AppCtx *appCtx = bin->appCtx;
  GstBuffer *buf = (GstBuffer *) info->data;
  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
  if (!batch_meta) {
    NVGSTDS_WARN_MSG_V ("Batch meta not found for buffer %p", buf);
    return GST_PAD_PROBE_OK;
  }

  // Iterate through the frame metadata list
  for (l_frame = batch_meta->frame_meta_list; l_frame != NULL; l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
    std::cout << __func__ << " " << frame_meta << " " << frame_meta->frame_num << std::endl;

    for (l_user_meta = frame_meta->frame_user_meta_list; l_user_meta != NULL;
         l_user_meta = l_user_meta->next) {
      NvDsUserMeta *roi_user_meta = (NvDsUserMeta *) l_user_meta->data;
      if (roi_user_meta->base_meta.meta_type != NVDS_ROI_META)
        continue;
      /* convert to roi metadata */
      NvDsRoiMeta *roi_meta = (NvDsRoiMeta *) roi_user_meta->user_meta_data;
      for (NvDsUserMetaList *r_user = roi_meta->roi_user_meta_list;
           r_user != NULL; r_user = r_user->next) {
        NvDsUserMeta *tensor_user_meta = (NvDsUserMeta *) r_user->data;
        if (tensor_user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META)
          continue;
        NvDsInferTensorMeta *meta = (NvDsInferTensorMeta *) tensor_user_meta->user_meta_data;
        std::cout << "raw tensor data received: " << meta->unique_id << std::endl;
      }
    }
  }
  return GST_PAD_PROBE_OK;
}

I got the below debug prints:

analytics_done_buf_prob 0xfffe5005c500 117
raw tensor data received: 4

analytics_done_buf_prob 0xfffe50058b60 118
raw tensor data received: 4

analytics_done_buf_prob 0xfffe500585c0 119
raw tensor data received: 4

I then added debug prints in gie_processing_done_buf_prob() to print the frame_meta pointer address and frame_num.

static GstPadProbeReturn
gie_processing_done_buf_prob (GstPad * pad, GstPadProbeInfo * info,
    gpointer u_data)
{
    // Retrieve the GstBuffer and the instance bin and app context
    GstBuffer *buf = (GstBuffer *) info->data;
    NvDsInstanceBin *bin = (NvDsInstanceBin *) u_data;
    guint index = bin->index;
    AppCtx *appCtx = bin->appCtx;
    NvDsObjectMeta *obj_meta = NULL;
    NvDsMetaList *l_frame = NULL;
    NvDsMetaList *l_obj = NULL;
    NvDsMetaList *l_user_meta = NULL;
    NvDsUserMeta *user_meta = NULL;
    unsigned char *user_meta_data = NULL;
  
    // Check if the buffer is writable, if so, process the buffer
    if (gst_buffer_is_writable(buf)) {
        process_buffer(buf, appCtx, index);
    }    

    // Retrieve the NvDsBatchMeta from the GstBuffer (contains metadata for batch processing)
    unsigned int count = 0; 
    std::vector<cv::Point2f> keypoints; // Vector to store keypoints
    std::vector<cv::Mat> images;       // Vector to store inferred images
    std::vector<cv::Mat> M_list;       // List to store transformation matrices
    // Vector to store all detected objects and their info
    std::vector<ObjectInfo> detected_objects;
    //nvds_get_current_metadata_info(batch_meta);
    NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta(buf);
    // Iterate through the frame metadata list
    for (l_frame = batch_meta->frame_meta_list; l_frame != NULL; l_frame = l_frame->next) {
        NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) (l_frame->data);    
        NvDsDisplayMeta *display_meta = NULL;

        if (!frame_meta) continue;
        std::cout << "frame_meta: " << frame_meta << std::endl;
        guint frame_width = frame_meta->source_frame_width;
        guint frame_height = frame_meta->source_frame_height;
        // ... (rest of the probe unchanged)

frame_meta: 0xffff040568e0
No frame user meta for 117

frame_meta: 0xffff04056c10
No frame user meta for 118

frame_meta: 0xffff04018a10
No frame user meta for 119

From this I concluded that the data is flowing from attach_tensor_output_meta → analytics_done_buf_prob, but not to gie_processing_done_buf_prob. I also noticed that, for the same frame_num, the frame_meta pointer address matches between attach_tensor_output_meta and analytics_done_buf_prob but is different in gie_processing_done_buf_prob. This is the reason I am not able to get the metadata.

I request that you resolve this on priority. I believe you are aware that there was another issue linked to input-tensor-meta=1 and output-tensor-meta=1; I have put the link in my earlier post. I am highlighting it so that you can check for any side effects caused by that fix.

Regards

nvstreamdemux creates a new frame meta. You can get the inference user meta in analytics_done_buf_prob. Or, since deepstream-app is open source, you can add a probe function on the sink pad of nvstreamdemux and get the inference user meta in that function, as in the sketch below.
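
A minimal sketch of that suggestion, assuming "demux" is the GstElement * for nvstreamdemux obtained from the pipeline and that the meta layout is the same ROI path used in analytics_done_buf_prob above; the probe name and registration lines are illustrative, not code from deepstream-app.

// Hedged sketch: probe on the nvstreamdemux sink pad, looking for
// NVDSINFER_TENSOR_OUTPUT_META inside the ROI user meta.
static GstPadProbeReturn
demux_sink_probe (GstPad * pad, GstPadProbeInfo * info, gpointer u_data)
{
  GstBuffer *buf = (GstBuffer *) info->data;
  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
  if (!batch_meta)
    return GST_PAD_PROBE_OK;

  for (NvDsMetaList *l_frame = batch_meta->frame_meta_list; l_frame; l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
    for (NvDsMetaList *l_user = frame_meta->frame_user_meta_list; l_user; l_user = l_user->next) {
      NvDsUserMeta *user_meta = (NvDsUserMeta *) l_user->data;
      if (user_meta->base_meta.meta_type != NVDS_ROI_META)
        continue;
      NvDsRoiMeta *roi_meta = (NvDsRoiMeta *) user_meta->user_meta_data;
      for (NvDsMetaList *r_user = roi_meta->roi_user_meta_list; r_user; r_user = r_user->next) {
        NvDsUserMeta *tensor_user_meta = (NvDsUserMeta *) r_user->data;
        if (tensor_user_meta->base_meta.meta_type == NVDSINFER_TENSOR_OUTPUT_META)
          g_print ("tensor meta found on nvstreamdemux sink pad, frame %d\n", frame_meta->frame_num);
      }
    }
  }
  return GST_PAD_PROBE_OK;
}

// Registration (illustrative):
// GstPad *sinkpad = gst_element_get_static_pad (demux, "sink");
// gst_pad_add_probe (sinkpad, GST_PAD_PROBE_TYPE_BUFFER, demux_sink_probe, NULL, NULL);
// gst_object_unref (sinkpad);
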

I need to access the user meta in gie_processing_done_buf_prob because I need the other models' metadata there to swap the face. Since I was able to access it in DS 7.0, why can't it be accessed in 7.1? Can I request you to resolve this bug?

Here is a workaround.

  1. Replace attach_tensor_output_meta with the function code from DS 7.0.
  2. Use the method from the issue description to get the inference tensor user meta, i.e. get it from obj_meta->obj_user_meta_list.

I compared attach_tensor_output_meta() between 7.1 and 7.0 and both functions are the same. So this issue does not seem to be due to this function.

Thinking about it again: if a new frame meta is created, how am I able to access NVDS_OBJ_META and classifier_meta in gie_processing_done_buf_prob? Does that mean there is logic which copies data from the old frame meta to the new one, and it misses copying the user meta?

Please compare the two versions again; they are different: two-verson.txt (7.7 KB). In particular, please rebuild and replace the library after modifying the code.

After replacing attach_tensor_output_meta I started getting the prints indicating that the data is received, but when I try to access the data I get a segfault. Please help.

Please refer to the following code and log. After replacing attach_tensor_output_meta with the DS 7.0 version, I can find the TENSOR_OUTPUT_META user meta.
get.txt (2.0 KB) log.txt (8.7 KB)

Yes, I am able to access the raw tensor metadata after replacing attach_tensor_output_meta. After collecting the swap model output I am trying to replace the face in the same function, gie_processing_done_buf_prob. I was getting an error from the code below. Again, this code was working in DS 7.0.

        // Declare surface buffer and map info for accessing buffer memory
        NvBufSurface *in_surf = nullptr;
        GstMapInfo in_map_info;
        memset(&in_map_info, 0, sizeof(in_map_info)); // Initialize map info structure

        // Map the GstBuffer to read data
        if (!gst_buffer_map(buf, &in_map_info, GST_MAP_READ)) {
            std::cerr << "Error: Failed to map GstBuffer" << std::endl;
            return GST_PAD_PROBE_OK;
        }

        // Assign the mapped buffer data to the surface pointer
        in_surf = reinterpret_cast<NvBufSurface *>(in_map_info.data);

        if (!in_surf) {
            std::cerr << "Error: NvBufSurface is null" << std::endl;
            gst_buffer_unmap(buf, &in_map_info);
            return GST_PAD_PROBE_OK;
        }

        // Map the surface to read memory for processing
        if (NvBufSurfaceMap(in_surf, -1, -1, NVBUF_MAP_READ) != 0) { 
            std::cerr << "Error: Failed to map NvBufSurface for read" << std::endl;
            gst_buffer_unmap(buf, &in_map_info);
            return GST_PAD_PROBE_OK;
        }

        // Sync the surface memory for CPU access
        if (NvBufSurfaceSyncForCpu(in_surf, -1, -1) != 0) { 
            printf("Sync CPU failed for in_surf\n");
        }

        // Check if the mapped address for surface is valid
        if (!in_surf->surfaceList[0].mappedAddr.addr[0]) {
            std::cerr << "Error: Mapped address is null for buffer " << std::endl;
        }

        // Get the surface dimensions (height, width, and pitch)
        guint height = in_surf->surfaceList[0].height;
        guint width = in_surf->surfaceList[0].width;
        guint pitch = in_surf->surfaceList[0].planeParams.pitch[0];

        // Create an OpenCV Mat in RGBA format using the surface data
        unsigned char *frame_data = (unsigned char *)in_surf->surfaceList[0].mappedAddr.addr[0];
        if(frame_data)
            std::cout << "height = " << height << "width = " << width << "pitch = " << pitch << "frame_data = " << *frame_data << std::endl;
        cv::Mat frame = cv::Mat(height, width, CV_8UC4, frame_data, pitch); // CV_8UC4: 8-bit unsigned, 4 channels (RGBA)

When the line cv::Mat frame = cv::Mat(height, width, CV_8UC4, frame_data, pitch); is executed, I get the following error:

height = 1080, width = 1920, pitch = 1920

terminate called after throwing an instance of 'cv::Exception'
  what():  OpenCV(4.10.0) /home/wvtex/Downloads/workspace/opencv-4.10.0/modules/core/src/matrix.cpp:434: error: (-215:Assertion failed) _step >= minstep in function 'Mat'

To rule out nvstreamdemux (which copies the user meta): if you add this code in analytics_done_buf_prob or attach_tensor_output_meta, does it work there?

Let me try that and come back. Meanwhile, I just want to understand: this code is only accessing the input surface and has nothing to do with meta, so why are you relating it to the user meta? Am I missing something?

Your method only works for RGBA. Please check whether the format is NV12. Please refer to this FAQ for dumping the pipeline.

Please see the link below for the full 7.0 pipeline and the 7.1 pipeline.

Thanks for sharing! From ds-app-71.png, the format is always NV12 after nvstreamdemux. You can print the format of in_surf->surfaceList[0]; see the sketch below.
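
A side note on the earlier assertion: for CV_8UC4, OpenCV requires step >= width * 4 = 7680 bytes, while the reported pitch of 1920 matches a single-byte NV12 Y plane, which is why the Mat constructor failed. Below is a minimal hedged sketch of checking the surface format before wrapping it; the NV12 branch using cv::cvtColorTwoPlane is only one illustrative option (NvBufSurfTransform could instead convert the surface to RGBA).

// Sketch only: assumes in_surf has already been mapped with NvBufSurfaceMap()
// and synced with NvBufSurfaceSyncForCpu() as in the code above.
// Requires <opencv2/imgproc.hpp> and "nvbufsurface.h".
NvBufSurfaceParams *p = &in_surf->surfaceList[0];
std::cout << "colorFormat = " << p->colorFormat
          << ", width = " << p->width
          << ", height = " << p->height
          << ", pitch = " << p->planeParams.pitch[0] << std::endl;

cv::Mat frame_rgba;
if (p->colorFormat == NVBUF_COLOR_FORMAT_RGBA) {
    // RGBA: one plane, pitch must be >= width * 4 bytes
    frame_rgba = cv::Mat (p->height, p->width, CV_8UC4,
                          p->mappedAddr.addr[0], p->planeParams.pitch[0]);
} else if (p->colorFormat == NVBUF_COLOR_FORMAT_NV12) {
    // NV12 is semi-planar: Y plane (plane 0) + interleaved UV plane (plane 1)
    cv::Mat y_plane (p->height, p->width, CV_8UC1,
                     p->mappedAddr.addr[0], p->planeParams.pitch[0]);
    cv::Mat uv_plane (p->height / 2, p->width / 2, CV_8UC2,
                      p->mappedAddr.addr[1], p->planeParams.pitch[1]);
    cv::cvtColorTwoPlane (y_plane, uv_plane, frame_rgba, cv::COLOR_YUV2RGBA_NV12);
} else {
    std::cerr << "Unhandled color format: " << p->colorFormat << std::endl;
}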