NvDsInferLayerInfo.buffer and NvDsInferTensorMeta.out_buf_ptrs_host (and out_buf_ptrs_dev) are consistently NULL

Hello NVIDIA DeepStream Team,

I am trying to integrate a RetinaFace model (ResNet50 backbone) into DeepStream 7.1, using the gst-nvinfer plugin and Python (via pyds) to parse custom tensor outputs. I am hitting a persistent issue: NvDsInferLayerInfo.buffer and NvDsInferTensorMeta.out_buf_ptrs_host (and out_buf_ptrs_dev) are consistently NULL (Python None) for all output layers, even though gst-nvinfer loads the engine successfully and reports the correct layer names and dimensions. This prevents me from accessing the raw tensor data in my Python probe.

I’ve performed extensive debugging and have narrowed down the problem to a very specific behavior.

System Information:
DeepStream Version: 7.1 
TensorRT Version: 10.3.0.26-1+cuda12.5
Python Version: 3.10 
pyds Version: 1.2
GPU: NVIDIA GeForce RTX 3090 / GTX 1660 SUPER / GTX 1080
OS: Ubuntu (Docker base image) under WSL
Model Details:
Model: RetinaFace (ResNet50 backbone)
Input Dimensions: 1x3x640x640 (FP32)
Output Layer Names (confirmed via Netron and trtexec --verbose):
    loc: 1x16800x4 (FP32)
    conf: 1x16800x2 (FP32)
    landms: 1x16800x10 (FP32)
ONNX Export Details: PyTorch to ONNX, opset_version=12.

Problem Description:

I am running a DeepStream pipeline with gst-nvinfer configured as a Primary GIE (process-mode=1). My goal is to parse the raw tensor outputs in a Python pad probe using output-tensor-meta=1.

pgie_retinaface_config.txt (Current Version):
[property]
gpu-id=0
batch-size=1
gie-unique-id=1
model-engine-file=/deepface/deepstream_retinaface_project/models/retinaface/retinaface_manual.engine
onnx-file=/deepface/deepstream_retinaface_project/models/retinaface/retinaface_resnet50.onnx
network-type=100
output-tensor-meta=1
# output-blob-names=loc;conf;landms  (This was commented out for latest tests)

# Preprocessing parameters (added to try to match NVIDIA sample behavior)
net-scale-factor=0.00392156862745098
model-color-format=0 # Assuming RGB, though RetinaFace can be BGR
offsets=0.0;0.0;0.0
num-detected-classes=1
maintain-aspect-ratio=0

[class-attrs-all]
threshold=0.2
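For context on the preprocessing values above: gst-nvinfer normalizes each input pixel as y = net-scale-factor * (x - offset) per channel, so net-scale-factor=1/255 with zero offsets maps 8-bit pixels into [0, 1]. A quick standalone sanity check of that arithmetic (plain NumPy, independent of DeepStream):

```python
import numpy as np

NET_SCALE_FACTOR = 0.00392156862745098  # 1/255, as in the config above
OFFSETS = np.array([0.0, 0.0, 0.0])     # per-channel offsets, as in the config

# A dummy 2x2 RGB frame with 8-bit pixel values.
frame = np.array([[[0, 128, 255]] * 2] * 2, dtype=np.float32)

# The normalization gst-nvinfer applies before inference.
normalized = NET_SCALE_FACTOR * (frame - OFFSETS)

assert 0.0 <= normalized.min() and normalized.max() <= 1.0
```

If the model was instead trained on BGR input with ImageNet-style mean subtraction (common for RetinaFace), net-scale-factor, offsets, and model-color-format would all need to change accordingly.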
Python Probe Code Snippet (relevant part for tensor access):
import ctypes

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import numpy as np
import pyds

# Module-level constants referenced below (definitions elided from this snippet):
# NUM_LANDMARKS, FACE_CONFIDENCE_THRESHOLD,
# RETINAFACE_MODEL_INPUT_WIDTH, RETINAFACE_MODEL_INPUT_HEIGHT

def pgie_src_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        # print("Probe: Invalid GstBuffer.", flush=True) # Usually too verbose
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    if not batch_meta:
        # print("Probe: No NvDsBatchMeta found.", flush=True) # Usually too verbose
        return Gst.PadProbeReturn.OK

    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        l_user = frame_meta.frame_user_meta_list
        while l_user is not None:
            try:
                user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            except StopIteration:
                break

            if user_meta.base_meta.meta_type == pyds.NVDSINFER_TENSOR_OUTPUT_META:
                tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)

                # These prints are for debugging the PyCapsule NULL issue
                print(f"--- Frame {frame_meta.frame_num}, TensorMeta Debug ---", flush=True)
                print(f"  num_output_layers: {tensor_meta.num_output_layers}", flush=True)
                # Check whether these attributes exist and what they hold
                print(f"  Does out_buf_ptrs_host exist? {hasattr(tensor_meta, 'out_buf_ptrs_host')}", flush=True)
                if hasattr(tensor_meta, 'out_buf_ptrs_host'):
                    print(f"  out_buf_ptrs_host: {tensor_meta.out_buf_ptrs_host}", flush=True)
                print(f"  Does out_buf_ptrs_dev exist? {hasattr(tensor_meta, 'out_buf_ptrs_dev')}", flush=True)
                if hasattr(tensor_meta, 'out_buf_ptrs_dev'):
                    print(f"  out_buf_ptrs_dev: {tensor_meta.out_buf_ptrs_dev}", flush=True)


                if tensor_meta.num_output_layers < 3:
                    print(f"Probe ERROR: Expected 3 output layers (loc, conf, landms), got {tensor_meta.num_output_layers}", flush=True)
                    for i in range(tensor_meta.num_output_layers):
                        # Use the correct API from the sample
                        layer_info_item = pyds.get_nvds_LayerInfo(tensor_meta, i)
                        print(f"  Layer {i}: Name: {layer_info_item.layerName if layer_info_item.layerName else 'N/A'}, Dims: {layer_info_item.dims.d[:layer_info_item.dims.numDims]}", flush=True)
                    l_user = l_user.next
                    continue

                try:
                    # <<<<< Use the correct API: pyds.get_nvds_LayerInfo() >>>>>
                    bboxes_layer = pyds.get_nvds_LayerInfo(tensor_meta, 0)
                    scores_layer = pyds.get_nvds_LayerInfo(tensor_meta, 1)
                    landmarks_layer = pyds.get_nvds_LayerInfo(tensor_meta, 2)
                    
                    print(f"--- Debugging Layer Info (Frame: {frame_meta.frame_num}) ---", flush=True)
                    if bboxes_layer:
                        print(f"  BBox Layer (idx 0): Name='{bboxes_layer.layerName}', DataType={bboxes_layer.dataType}, "
                              f"NumDims={bboxes_layer.dims.numDims}, Dims={bboxes_layer.dims.d[:bboxes_layer.dims.numDims]}, "
                              f"Buffer Ptr={bboxes_layer.buffer}", flush=True)
                    else: print("  BBox Layer (idx 0) is None!", flush=True)

                    if scores_layer:
                         print(f"  Scores Layer (idx 1): Name='{scores_layer.layerName}', DataType={scores_layer.dataType}, "
                               f"NumDims={scores_layer.dims.numDims}, Dims={scores_layer.dims.d[:scores_layer.dims.numDims]}, "
                               f"Buffer Ptr={scores_layer.buffer}", flush=True)
                    else: print("  Scores Layer (idx 1) is None!", flush=True)

                    if landmarks_layer:
                        print(f"  Landmarks Layer (idx 2): Name='{landmarks_layer.layerName}', DataType={landmarks_layer.dataType}, "
                              f"NumDims={landmarks_layer.dims.numDims}, Dims={landmarks_layer.dims.d[:landmarks_layer.dims.numDims]}, "
                              f"Buffer Ptr={landmarks_layer.buffer}", flush=True)
                    else: print("  Landmarks Layer (idx 2) is None!", flush=True)

                except Exception as e:
                    print(f"Probe ERROR: Failed to get NvDsInferLayerInfo objects or print debug info: {e}", flush=True)
                    import traceback; traceback.print_exc()
                    l_user = l_user.next; continue
                
                # These checks are now critical! If Buffer Ptr is still None, we need to know.
                # Note: bboxes_layer.buffer is not 'None' in Python sense, but a PyCapsule wrapping NULL.
                # ctypes.cast on a PyCapsule(NULL) is what causes the ArgumentError
                if bboxes_layer.buffer is None: # This check might not catch PyCapsule(NULL) but will catch Python None
                    print(f"Probe FATAL: Bbox layer (Name: '{bboxes_layer.layerName}') buffer is NULL (Python None). Skipping.", flush=True)
                    l_user = l_user.next; continue
                if scores_layer.buffer is None:
                    print(f"Probe FATAL: Scores layer (Name: '{scores_layer.layerName}') buffer is NULL (Python None). Skipping.", flush=True)
                    l_user = l_user.next; continue
                if landmarks_layer.buffer is None:
                    print(f"Probe FATAL: Landmarks layer (Name: '{landmarks_layer.layerName}') buffer is NULL (Python None). Skipping.", flush=True)
                    l_user = l_user.next; continue

                # The rest of your processing logic which assumes a valid buffer pointer
                # from bboxes_layer.buffer, scores_layer.buffer, landmarks_layer.buffer
                if bboxes_layer.dataType == pyds.NvDsInferDataType.FLOAT:
                    # THIS IS THE LINE THAT CAUSES ctypes.ArgumentError with PyCapsule(NULL)
                    ptr_bboxes = ctypes.cast(bboxes_layer.buffer, ctypes.POINTER(ctypes.c_float))
                    num_elements_bbox = np.prod(bboxes_layer.dims.d[:bboxes_layer.dims.numDims])
                    if num_elements_bbox == 0:
                        print(f"Probe WARN: Bbox layer (Name: '{bboxes_layer.layerName}') has zero elements. Dims: {bboxes_layer.dims.d[:bboxes_layer.dims.numDims]}", flush=True)
                        l_user = l_user.next; continue
                    bboxes_flat = np.ctypeslib.as_array(ptr_bboxes, shape=(int(num_elements_bbox),))
                    num_priors_bbox = bboxes_flat.shape[0] // 4
                    if bboxes_flat.shape[0] % 4 != 0:
                        print(f"Probe WARN: Bbox tensor size {bboxes_flat.shape[0]} not divisible by 4. Dims: {bboxes_layer.dims.d[:bboxes_layer.dims.numDims]}", flush=True)
                        l_user = l_user.next; continue
                    bboxes = bboxes_flat.reshape(num_priors_bbox, 4)
                else:
                    print(f"Probe WARN: Bbox layer data type is {bboxes_layer.dataType}, not FLOAT.", flush=True)
                    l_user = l_user.next; continue
                
                if scores_layer.dataType == pyds.NvDsInferDataType.FLOAT:
                    ptr_scores = ctypes.cast(scores_layer.buffer, ctypes.POINTER(ctypes.c_float))
                    num_elements_score = np.prod(scores_layer.dims.d[:scores_layer.dims.numDims])
                    if num_elements_score == 0:
                        print(f"Probe WARN: Scores layer (Name: '{scores_layer.layerName}') has zero elements. Dims: {scores_layer.dims.d[:scores_layer.dims.numDims]}", flush=True)
                        l_user = l_user.next; continue
                    scores_flat = np.ctypeslib.as_array(ptr_scores, shape=(int(num_elements_score),))
                    num_priors_score = scores_flat.shape[0] // 2
                    if scores_flat.shape[0] % 2 != 0:
                        print(f"Probe WARN: Score tensor size {scores_flat.shape[0]} not divisible by 2. Dims: {scores_layer.dims.d[:scores_layer.dims.numDims]}", flush=True)
                        l_user = l_user.next; continue
                    scores = scores_flat.reshape(num_priors_score, 2)
                else:
                    print(f"Probe WARN: Scores layer data type is {scores_layer.dataType}, not FLOAT.", flush=True)
                    l_user = l_user.next; continue

                if landmarks_layer.dataType == pyds.NvDsInferDataType.FLOAT:
                    ptr_landmarks = ctypes.cast(landmarks_layer.buffer, ctypes.POINTER(ctypes.c_float))
                    num_elements_landm = np.prod(landmarks_layer.dims.d[:landmarks_layer.dims.numDims])
                    if num_elements_landm == 0:
                        print(f"Probe WARN: Landmarks layer (Name: '{landmarks_layer.layerName}') has zero elements. Dims: {landmarks_layer.dims.d[:landmarks_layer.dims.numDims]}", flush=True)
                        l_user = l_user.next; continue
                    landmarks_flat = np.ctypeslib.as_array(ptr_landmarks, shape=(int(num_elements_landm),))
                    num_priors_landm = landmarks_flat.shape[0] // (NUM_LANDMARKS * 2)
                    if landmarks_flat.shape[0] % (NUM_LANDMARKS*2) != 0:
                        print(f"Probe WARN: Landmark tensor size {landmarks_flat.shape[0]} not divisible by {NUM_LANDMARKS*2}. Dims: {landmarks_layer.dims.d[:landmarks_layer.dims.numDims]}", flush=True)
                        l_user = l_user.next; continue
                    landmarks = landmarks_flat.reshape(num_priors_landm, NUM_LANDMARKS * 2)
                else:
                    print(f"Probe WARN: Landmarks layer data type is {landmarks_layer.dataType}, not FLOAT.", flush=True)
                    l_user = l_user.next; continue
                
                if not (num_priors_bbox == num_priors_score == num_priors_landm and num_priors_bbox > 0):
                    print(f"Probe WARN: Mismatch or zero in num priors: B:{num_priors_bbox}, S:{num_priors_score}, L:{num_priors_landm}", flush=True)
                    l_user = l_user.next; continue
                
                face_scores = scores[:, 1] 
                good_detections_indices = np.where(face_scores > FACE_CONFIDENCE_THRESHOLD)[0]

                if len(good_detections_indices) > 0:
                     print(f"Probe: Frame {frame_meta.frame_num}, Found {len(good_detections_indices)} potential faces.", flush=True)

                display_meta = pyds.nvds_acquire_display_meta_from_pool(batch_meta)
                display_meta.num_labels = 0; display_meta.num_rects = 0; display_meta.num_circles = 0

                for i in good_detections_indices:
                    obj_meta = pyds.nvds_acquire_obj_meta_from_pool(batch_meta)
                    obj_meta.class_id = 0; obj_meta.confidence = float(face_scores[i])
                    
                    raw_box = bboxes[i]
                    obj_meta.rect_params.left = np.clip(raw_box[0] * RETINAFACE_MODEL_INPUT_WIDTH, 0, RETINAFACE_MODEL_INPUT_WIDTH)
                    obj_meta.rect_params.top = np.clip(raw_box[1] * RETINAFACE_MODEL_INPUT_HEIGHT, 0, RETINAFACE_MODEL_INPUT_HEIGHT)
                    obj_meta.rect_params.width = np.clip((raw_box[2] - raw_box[0]) * RETINAFACE_MODEL_INPUT_WIDTH, 0, RETINAFACE_MODEL_INPUT_WIDTH - obj_meta.rect_params.left)
                    obj_meta.rect_params.height = np.clip((raw_box[3] - raw_box[1]) * RETINAFACE_MODEL_INPUT_HEIGHT, 0, RETINAFACE_MODEL_INPUT_HEIGHT - obj_meta.rect_params.top)
                    
                    pyds.nvds_add_obj_meta_to_frame(frame_meta, obj_meta, None)

                    raw_landmarks = landmarks[i]
                    for k in range(NUM_LANDMARKS):
                        if display_meta.num_circles < pyds.MAX_ELEMENTS_IN_DISPLAY_META:
                            circle_params = display_meta.circle_params[display_meta.num_circles]
                            lk_x = np.clip(raw_landmarks[k*2] * RETINAFACE_MODEL_INPUT_WIDTH, 0, RETINAFACE_MODEL_INPUT_WIDTH)
                            lk_y = np.clip(raw_landmarks[k*2+1] * RETINAFACE_MODEL_INPUT_HEIGHT, 0, RETINAFACE_MODEL_INPUT_HEIGHT)
                            circle_params.xc = int(lk_x); circle_params.yc = int(lk_y)
                            circle_params.radius = 2; circle_params.circle_color.set(0.0, 1.0, 0.0, 1.0) 
                            display_meta.num_circles += 1
                        else: break
                pyds.nvds_add_display_meta_to_frame(frame_meta, display_meta)

            try:
                l_user = l_user.next
            except StopIteration: break
        try:
            l_frame = l_frame.next
        except StopIteration: break
    return Gst.PadProbeReturn.OK

Observed Behavior (The Core Problem):

When running the pipeline, gst-nvinfer logs show successful engine loading (either rebuilding from ONNX, or loading retinaface_manual.engine built by trtexec). The Python probe is successfully called, tensor_meta.num_output_layers is 3, and pyds.get_nvds_LayerInfo(tensor_meta, i).layerName correctly shows “loc”, “conf”, “landms” with correct dimensions.

However, bboxes_layer.buffer (and likewise the buffers for the scores and landms layers) consistently reports Buffer Ptr=None (or <capsule object NULL …>).

Attempting to cast this buffer with ctypes.cast results in either ValueError: NULL pointer access or ctypes.ArgumentError: wrong type.
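Both failure modes are reproducible outside DeepStream with the standard library alone. Casting Python None silently yields a NULL pointer that only fails on dereference, while handing ctypes.cast a PyCapsule (the type pyds wraps .buffer in) fails immediately, since a capsule is not convertible to c_void_p. A minimal sketch (the capsule here is hand-made for illustration):

```python
import ctypes

# Case 1: casting Python None gives a NULL pointer; the error is deferred
# until the pointer is dereferenced.
null_ptr = ctypes.cast(None, ctypes.POINTER(ctypes.c_float))
assert not null_ptr  # NULL pointers are falsy
try:
    _ = null_ptr[0]
    deref_error = ""
except ValueError as e:
    deref_error = str(e)  # "NULL pointer access"

# Case 2: a PyCapsule (what pyds hands back for layer.buffer) is not a
# valid first argument to ctypes.cast and raises ctypes.ArgumentError.
PyCapsule_New = ctypes.pythonapi.PyCapsule_New
PyCapsule_New.restype = ctypes.py_object
PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]

buf = (ctypes.c_float * 4)(1.0, 2.0, 3.0, 4.0)
capsule = PyCapsule_New(ctypes.addressof(buf), None, None)

cast_failed = False
try:
    ctypes.cast(capsule, ctypes.POINTER(ctypes.c_float))
except ctypes.ArgumentError:
    cast_failed = True
```

This matches the two exceptions observed above: ValueError: NULL pointer access and ctypes.ArgumentError: wrong type.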

Debugging Steps Taken & Findings:

ONNX Model Validity: Verified ONNX model with Netron and trtexec --verbose. trtexec successfully builds the engine and reports valid named output bindings (loc, conf, landms) with correct dimensions and FP32 type.
output-blob-names Behavior:

When gst-nvinfer rebuilds the engine from onnx-file (with output-blob-names=loc;conf;landms in config): Probe called, layerName correct, but Buffer Ptr=None.

When gst-nvinfer loads trtexec-built model-engine-file (with output-blob-names=loc;conf;landms in config): Segmentation Fault (core dumped), crashing before probe prints.

When gst-nvinfer loads trtexec-built model-engine-file (with output-blob-names commented out): No SegFault, layerName correct, but Buffer Ptr=None.

net-scale-factor and Preprocessing: Initially, setting net-scale-factor in [property] caused a parsing error with network-type=100. This was later resolved (likely a config-file formatting issue) after more parameters were added. Buffer Ptr=None persists whether or not these preprocessing parameters are included.

pyds API: Tested tensor_meta.output_layers_info(i) and pyds.get_nvds_LayerInfo(tensor_meta, i). Both retrieve NvDsInferLayerInfo objects, but their .buffer field is consistently None.


You can refer to our FAQ: "How to convert NvDsInferTensorMeta output data to numpy ndarray".
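For anyone landing here later: the FAQ approach boils down to unwrapping the capsule to a raw integer address (pyds.get_ptr(layer_info.buffer) in recent bindings) and then viewing that address as a NumPy array. Below is a standalone sketch of the same unwrap using only ctypes and NumPy; the capsule is hand-made here, whereas in a real probe it comes from pyds and get_ptr performs the equivalent of the PyCapsule_GetPointer call:

```python
import ctypes
import numpy as np

# Build a fake FP32 "tensor buffer" and wrap its address in a PyCapsule,
# standing in for what layer_info.buffer looks like from pyds.
buf = (ctypes.c_float * 8)(*[float(i) for i in range(8)])

PyCapsule_New = ctypes.pythonapi.PyCapsule_New
PyCapsule_New.restype = ctypes.py_object
PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
capsule = PyCapsule_New(ctypes.addressof(buf), None, None)

# Unwrap the capsule to an integer address (this is what pyds.get_ptr
# does for NvDsInferLayerInfo.buffer).
PyCapsule_GetPointer = ctypes.pythonapi.PyCapsule_GetPointer
PyCapsule_GetPointer.restype = ctypes.c_void_p
PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]
addr = PyCapsule_GetPointer(capsule, None)

# An integer address IS a valid ctypes.cast source, unlike the capsule itself.
ptr = ctypes.cast(addr, ctypes.POINTER(ctypes.c_float))
tensor = np.ctypeslib.as_array(ptr, shape=(8,)).reshape(2, 4)
```

In the probe above, this means replacing ctypes.cast(bboxes_layer.buffer, …) with ctypes.cast(pyds.get_ptr(bboxes_layer.buffer), …), and likewise for the other layers.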

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.