I am getting a mask shift issue.

Please provide complete information as applicable to your setup.
def osd_sink_pad_buffer_probe(pad, info, u_data):
    """
    GStreamer buffer probe for processing each frame.

    Args:
        pad: GStreamer pad
        info: Buffer info
        u_data: User data

    Returns:
        Gst.PadProbeReturn.OK: Probe result
    """
    global main_window, processed_object_ids, detection_counter, cumulative_obj_counter

    if main_window is None:
        logging.error("Main window reference not set")
        return Gst.PadProbeReturn.OK

    # Get detection confirmation count
    detection_confirm_count = config['detection']['detection_confirm_count']

    # Get buffer from pad
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        logging.error("Unable to get GstBuffer. Possible camera issue.")
        return Gst.PadProbeReturn.OK

    # Get metadata from buffer
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    if not batch_meta:
        logging.debug("No metadata found in the current frame. Skipping frame.")
        return Gst.PadProbeReturn.OK

    # Process each frame in the batch
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            logging.error("Error processing frame_meta")
            break

        frame_number = frame_meta.frame_num
        n_frame_surface = pyds.get_nvds_buf_surface(hash(gst_buffer), frame_meta.batch_id)

        # Copy the frame image once - np.array with copy=True already makes a deep copy
        fix_frame_image = np.array(n_frame_surface, copy=True, order='C')
        fix_frame_image = cv2.cvtColor(fix_frame_image, cv2.COLOR_RGBA2BGR)

        l_obj = frame_meta.obj_meta_list

        # Initialize frame-specific data
        fix_frame_class_ids = []    # Class IDs in this frame
        fix_frame_masks = []        # Masks in this frame
        fix_frame_bboxes = []       # Bounding boxes in this frame
        frame_to_process = False    # Flag to indicate if this frame needs processing
        fix_real_world_pos = None   # Real-world position for processing
        image_h, image_w = fix_frame_image.shape[:2]  # Get image dimensions

        # Overall bounding box that encompasses all objects in the frame
        min_x, min_y = image_w, image_h
        max_x, max_y = 0, 0

        # First pass: collect class IDs and masks for all objects and update the overall bounding box
        current_l_obj = l_obj
        while current_l_obj is not None:
            try:
                obj_meta = pyds.NvDsObjectMeta.cast(current_l_obj.data)
            except StopIteration:
                break

            obj_id = obj_meta.object_id
            class_id = obj_meta.class_id
            rect_params = obj_meta.rect_params
            mask_params = obj_meta.mask_params

            # Object bounding box in frame coordinates
            x1 = int(rect_params.left)
            y1 = int(rect_params.top)
            x2 = x1 + int(rect_params.width)
            y2 = y1 + int(rect_params.height)

            # Update overall bounding box
            min_x, min_y = min(min_x, x1), min(min_y, y1)
            max_x, max_y = max(max_x, x2), max(max_y, y2)

            # Add class ID to frame's class IDs if not already there
            if class_id not in fix_frame_class_ids:
                fix_frame_class_ids.append(class_id)

            # Extract and store mask
            if mask_params and mask_params.data:
                # Get the mask array and binarize it
                mask = mask_params.get_mask_array()
                mask = mask.reshape((mask_params.height, mask_params.width))
                mask = (mask > 0).astype(np.uint8) * 255

                # Store copies in our fixed arrays (the metadata buffers may be reused)
                fix_frame_masks.append(mask.copy())
                fix_frame_bboxes.append([x1, y1, x2, y2])
            else:
                # Add empty mask
                empty_mask = np.zeros((int(rect_params.height), int(rect_params.width)), dtype=np.uint8)
                fix_frame_masks.append(empty_mask)
                fix_frame_bboxes.append([])
                logging.debug(f"No mask data for object ID {obj_id}, class ID {class_id}")

            # Only track objects with class IDs other than 0 or 1
            if class_id not in [0, 1]:
                # Track detection count
                if obj_id not in detection_counter:
                    detection_counter[obj_id] = 1
                else:
                    detection_counter[obj_id] += 1

                # Check if this object's detection count matches the threshold
                if detection_counter[obj_id] == detection_confirm_count and obj_id not in processed_object_ids:
                    # Mark for processing and save position data
                    frame_to_process = True
                    processed_object_ids.add(obj_id)

                    # Calculate real-world position
                    fix_real_world_pos, _ = calculate_real_world_position(
                        frame_meta.source_frame_width,
                        rect_params.left,
                        rect_params.width
                    )

            # Move to next object
            try:
                current_l_obj = current_l_obj.next
            except StopIteration:
                break

        # Move to next frame
        try:
            l_frame = l_frame.next
        except StopIteration:
            break

    return Gst.PadProbeReturn.OK

• Hardware Platform: Jetson
• DeepStream Version: 6.3
• JetPack Version: 5.1.2
• TensorRT Version: 12
• NVIDIA GPU Driver Version (valid for GPU only): 535
• Issue Type (questions, new requirements, bugs): mask shift issue
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing.)
• Requirement details (This is for new requirements. Include the module name, i.e. which plugin or which sample application, and the function description.)

Can you upgrade to DeepStream 7.0 GA?

What kind of issue have you met? Can you describe more details about your issue?

  1. The bbox I am getting does not fit the actual item present, so when I try to fit the mask to the bounding box, the mask is also misaligned.

I am attaching a sample image where the segmentation mask is for the bottle area (label excluded for the bottle). Because the bounding box is wrong, my mask is forced to fit into it, and hence the mask appears shifted/misaligned.

It seems your model is an instance segmentation model. There is already a mask drawing function in nvdsosd. Why did you draw it yourself?

Please refer to the instance segmentation sample in NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream
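For reference, the TAO instance segmentation samples let nvdsosd draw the masks by enabling instance-mask output in the model config; a rough sketch of the relevant nvinfer settings (the exact values depend on your model, so treat this as an assumption to verify against your own config):

```ini
# nvinfer model config (fragment)
[property]
network-type=3            # instance segmentation
output-instance-mask=1    # attach mask_params to each object's metadata
```

On the nvdsosd element itself, the `display-mask` property then enables built-in mask drawing (as far as I know this is supported in CPU process mode).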

The picture you posted shows that the mask and bbox positions are shifted. Have you checked the tensor parsing function you use with your model? We also don't know your whole pipeline and configuration, so please provide the complete pipeline and configuration files. Where in the pipeline did you place the code you posted?
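On the parsing point: a common cause of shifted boxes and masks is mapping network-space coordinates back to frame space without undoing the preprocessing scale. A minimal sketch, assuming maintain-aspect-ratio letterboxing with padding at the bottom/right (`net_to_frame_bbox` is a hypothetical helper; if your preprocessing pads symmetrically, the pad offsets must be subtracted before scaling):

```python
def net_to_frame_bbox(x1, y1, x2, y2, net_w, net_h, frame_w, frame_h):
    """Map a bbox from network input space back to frame space.

    Assumes the frame was scaled by min(net_w/frame_w, net_h/frame_h)
    and padded at the bottom/right to reach the network resolution.
    """
    scale = min(net_w / frame_w, net_h / frame_h)
    # Undo the scale applied during preprocessing
    fx1, fy1 = x1 / scale, y1 / scale
    fx2, fy2 = x2 / scale, y2 / scale
    # Clamp to the frame so rounding errors cannot push the box out of bounds
    fx1, fy1 = max(0.0, fx1), max(0.0, fy1)
    fx2, fy2 = min(float(frame_w), fx2), min(float(frame_h), fy2)
    return fx1, fy1, fx2, fy2
```

If the custom parser scales by `frame_w/net_w` and `frame_h/net_h` independently while the preprocessing kept the aspect ratio (or vice versa), every box lands slightly off, and the mask inherits the shift.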

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.