Nvinfer operating in secondary mode not always giving tensor output

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) T4
• DeepStream Version 5.0.1
• TensorRT Version 7.2.1
• CuDNN Version 8.0.4
• NVIDIA GPU Driver Version (valid for GPU only) 450.80.02
• Container: Modified from the DeepStream 5.0.1 20.09 base

We have a Python-based DeepStream 5 pipeline running one primary detector and several secondary ones. For one of the secondary models, what we are interested in is the embedding from the last layer. Sometimes, although very rarely, the tensor is not there. We access the data at the end of the pipeline in a GStreamer appsink.

Is this known/expected?
If not, are there any pointers to things we can investigate? It doesn't seem to print any log.

Since it happens very seldom it's not a prioritized issue for us, but we'll try to have something reproducible to share.

I.e., the following path

batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
l_frame = batch_meta.frame_meta_list            # first frame in the batch
frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
l_obj = frame_meta.obj_meta_list                # first detected object
obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
obj_meta.obj_user_meta_list                     # expected to hold the tensor meta

ends with obj_meta.obj_user_meta_list being None, even though

class_id = obj_meta.class_id
confidence = obj_meta.confidence
rect = obj_meta.rect_params
obj_label = obj_meta.obj_label
bbox = np.asarray([rect.left, rect.top, rect.left + rect.width, rect.top + rect.height])

all give perfectly sensible results.
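For reference, when the meta is present we read the embedding roughly like this (a sketch only: the single output layer, the 128-float embedding size and the helper name read_embedding are placeholders, and the ctypes read-out is the commonly used pattern rather than our exact code):

import ctypes
import numpy as np
import pyds

EMBEDDING_SIZE = 128  # placeholder; depends on the model's last layer

def read_embedding(obj_meta):
    # Returns the embedding attached by the secondary nvinfer,
    # or None when the tensor meta is missing (the rare case above).
    l_user = obj_meta.obj_user_meta_list
    while l_user is not None:
        user_meta = pyds.NvDsUserMeta.cast(l_user.data)
        if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
            tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
            layer = pyds.get_nvds_LayerInfo(tensor_meta, 0)  # assume one output layer
            ptr = ctypes.cast(pyds.get_ptr(layer.buffer), ctypes.POINTER(ctypes.c_float))
            return np.ctypeslib.as_array(ptr, shape=(EMBEDDING_SIZE,)).copy()
        try:
            l_user = l_user.next
        except StopIteration:
            break
    return None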

The pipeline is something like:

gst-launch-1.0 rtspsrc location=rtsp://xxxxxxxxx1 ! rtph265depay ! nvv4l2decoder ! nvstreammux name=mux batch-size=1 batched-push-timeout=400000 live-source=true width=1280 height=768 ! nvinfer config-file-path=./config_infer_primary_yoloV4.txt ! queue ! nvinfer config-file-path=config_infer_secondary_1.txt ! queue ! nvinfer config-file-path=config_infer_secondary_2.txt ! appsink wait-on-eos=false drop=true max-buffers=1 enable-last-sample=false eos=false sync=0 async=0 emit-signals=true
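To narrow down whether the meta is dropped somewhere downstream or never attached by nvinfer, we can also hang a buffer probe on the src pad of the affected secondary gie and log objects that carry no user meta at all. A sketch (the element handle sgie2 and the logging are placeholders; objects not matching operate-on-class-ids are expected to have no tensor meta):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds

def sgie_src_probe(pad, info, _user_data):
    # Log objects coming out of the secondary nvinfer without any attached user meta.
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            if obj_meta.obj_user_meta_list is None:
                print("no user/tensor meta: frame %d, object %d, class %d"
                      % (frame_meta.frame_num, obj_meta.object_id, obj_meta.class_id))
            try:
                l_obj = l_obj.next
            except StopIteration:
                break
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK

# "sgie2" is a placeholder for the second secondary nvinfer element:
# sgie2.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, sgie_src_probe, None)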

And the config file for the affected model is

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-engine-file=model.engine
batch-size=4
network-mode=0
network-type=1
process-mode=2
model-color-format=0
gie-unique-id=2
operate-on-gie-id=1
operate-on-class-ids=0
output-tensor-meta=1

The model is custom, but we’ve not had to implement any custom layers.

Thanks

Since the tensor output is the model output, you need to check whether there is model output from your classifier model when the issue happens. Even if the output of the detector is correct, it does not mean the classifier model can classify the object correctly. That is decided by the model, not by DeepStream.