I’m trying to run two object detection models (both as `nvinferserver`) where the second detector uses the crops from the first detector as its input. The config for the second model specifies `process_mode: PROCESS_MODE_CLIP_OBJECTS` and `operate_on_gie_id: 1`, and the config for the first model specifies `unique_id: 1`. Both models do their parsing/postprocessing in Python inside probe functions, and the probe for the first model adds the detection metadata to the frame so that the crops are used as input for the second model.
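For reference, this is roughly how the relevant parts of my two configs look (a sketch in the nvinferserver protobuf text format; only the fields mentioned above plus `output_tensor_meta`, which I have enabled so the raw tensors reach my probe, are shown):

```
# first (primary) detector
infer_config {
  unique_id: 1
  # ... model/backend settings elided ...
}

# second detector, running on crops from the first
infer_config {
  unique_id: 2
  # ... model/backend settings elided ...
}
input_control {
  process_mode: PROCESS_MODE_CLIP_OBJECTS
  operate_on_gie_id: 1
}
output_control {
  output_tensor_meta: true
}
```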
Where I’m currently stuck: the postprocessing for the second model needs the original crop size from the first model so that the bounding boxes can be scaled back to the original video coordinates, but right now I’m only seeing one output per frame from the second model regardless of how many detections the first model produced. Here’s my second probe function, showing how I convert the model outputs to NumPy arrays for postprocessing and my debugging so far of how this metadata aligns:
```python
import ctypes

import numpy as np
import pyds
from gi.repository import Gst


def sgie_src_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer")
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        # Count object meta (detections from the first model) and any
        # user meta attached per object.
        l_obj = frame_meta.obj_meta_list
        len_obj_meta = 0
        len_obj_user_meta = 0
        while l_obj is not None:
            try:
                obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            except StopIteration:
                break
            len_obj_meta += 1
            l_user = obj_meta.obj_user_meta_list
            while l_user is not None:
                user_meta = pyds.NvDsUserMeta.cast(l_user.data)
                len_obj_user_meta += 1
                try:
                    l_user = l_user.next
                except StopIteration:
                    break
            try:
                l_obj = l_obj.next
            except StopIteration:
                break

        # Count frame-level user meta (where the second model's output
        # tensors show up) and pull out the raw output layers.
        l_user = frame_meta.frame_user_meta_list
        len_user_meta = 0
        while l_user is not None:
            len_user_meta += 1
            user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
            layers_info = []
            for i in range(tensor_meta.num_output_layers):
                layer = pyds.get_nvds_LayerInfo(tensor_meta, i)
                layers_info.append(layer)
            cls_scores = next(layer for layer in layers_info if layer.layerName == 'cls_scores')
            bbox_preds = next(layer for layer in layers_info if layer.layerName == 'bbox_preds')
            ptr = ctypes.cast(pyds.get_ptr(bbox_preds.buffer), ctypes.POINTER(ctypes.c_float))
            bbox_preds = np.ctypeslib.as_array(ptr, shape=BBOX_PRED_SHAPE)
            ptr = ctypes.cast(pyds.get_ptr(cls_scores.buffer), ctypes.POINTER(ctypes.c_float))
            cls_scores = np.ctypeslib.as_array(ptr, shape=CLS_SCORE_SHAPE)
            print(bbox_preds.shape, cls_scores.shape)
            try:
                l_user = l_user.next
            except StopIteration:
                break

        print(len_obj_meta, len_user_meta, len_obj_user_meta)
        try:
            l_frame = l_frame.next
        except StopIteration:
            break

    return Gst.PadProbeReturn.OK
```
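For context, the per-crop rescaling I’m trying to do once the metadata lines up looks roughly like this (a sketch only; `NET_W`/`NET_H` are placeholders for my second model’s input size, and I’m assuming `[x1, y1, x2, y2]` boxes in network-input coordinates, with the crop geometry coming from the first detector’s `obj_meta.rect_params`):

```python
import numpy as np

NET_W, NET_H = 320, 320  # hypothetical second-model input size


def crop_to_frame(bboxes, crop_left, crop_top, crop_w, crop_h):
    """Map [x1, y1, x2, y2] boxes from network-input space back to
    full-frame coordinates, given the crop's rect_params geometry."""
    bboxes = np.asarray(bboxes, dtype=np.float32).copy()
    sx = crop_w / NET_W  # horizontal scale: net input -> crop pixels
    sy = crop_h / NET_H  # vertical scale
    bboxes[:, [0, 2]] = bboxes[:, [0, 2]] * sx + crop_left
    bboxes[:, [1, 3]] = bboxes[:, [1, 3]] * sy + crop_top
    return bboxes
```

This is why I need to know which crop each output tensor belongs to.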
`len_obj_meta` tracks the number of detections from the first model, and I can get the crop sizes from `obj_meta.rect_params`, but I’m only getting one output from the second model regardless of how many detections/crops came from the first model: `len_user_meta` is always 1. I also tried checking `obj_meta.obj_user_meta_list` for the model outputs, but that list is always empty (`len_obj_user_meta` is 0).

Is it possible the second model is only getting one crop as its input, or what else might be causing this mismatch? Thanks in advance.