Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) Jetson Xavier AGX
• DeepStream Version 5.0
• JetPack Version (valid for Jetson only) 4.4
• TensorRT Version 7.1.3
• Issue Type (questions, new requirements, bugs) question
I'm trying to obtain the raw mask output of my custom semantic segmentation model (DeepLabV3+ / MobileNetV3) using the DeepStream Python API and a buffer probe.
My code is based on the deepstream_test_1.py sample. I was able to read the metadata of an object detection model (YOLO) this way, but I cannot read the output of my segmentation model. Below is my buffer probe function:
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst
import pyds

def osd_sink_pad_buffer_probe(pad, info, u_data):
    frame_number = 0
    num_rects = 0
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer")
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        user_meta_list = frame_meta.frame_user_meta_list
        print(user_meta_list)
        while user_meta_list is not None:
            try:
                user_meta = pyds.NvDsUserMeta.cast(user_meta_list.data)
                print(user_meta)
            except StopIteration:
                break

            print('there is some user metadata')
            user_meta_data = user_meta.user_meta_data
            print(user_meta_data)
            if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_SEGMENTATION_META:
                print('segmentation meta data found')
                # this is the line that raises:
                # AttributeError: 'PyCapsule' object has no attribute 'classes'
                classes = user_meta.user_meta_data.classes
                print(f'classes: {classes}')

            try:
                user_meta_list = user_meta_list.next
            except StopIteration:
                break

        try:
            l_frame = l_frame.next
        except StopIteration:
            break

    return Gst.PadProbeReturn.OK
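For completeness, the probe is attached to the sink pad of the nvdsosd element the same way the sample does it (nvosd and sys come from the sample script):

osdsinkpad = nvosd.get_static_pad("sink")
if not osdsinkpad:
    sys.stderr.write(" Unable to get sink pad of nvosd \n")
osdsinkpad.add_probe(Gst.PadProbeType.BUFFER, osd_sink_pad_buffer_probe, 0)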
From the documentation of NvDsInferSegmentationMeta I assumed that the mask information should be accessible through the classes property, but I get the error: AttributeError: 'PyCapsule' object has no attribute 'classes'.
It seems that some step is missing in my workflow. I assumed that a cast to NvDsInferSegmentationMeta might be needed, but the Python binding of NvDsInferSegmentationMeta does not provide a cast() method.
https://docs.nvidia.com/metropolis/deepstream/python-api/PYTHON_API/NvDsInfer/NvDsInferSegmentationMetaDoc.html
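In other words, based on the fields listed in those docs (classes, width, height, class_map, class_probabilities_map), this is roughly what I expected to be able to do on the Python side, and it is exactly what fails with the AttributeError above:

seg_meta = user_meta.user_meta_data   # in practice this is just a PyCapsule
classes = seg_meta.classes            # AttributeError: 'PyCapsule' object has no attribute 'classes'
width = seg_meta.width
height = seg_meta.height
class_map = seg_meta.class_map        # the raw mask I actually want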
My neural network's output layer has the shape [1, 1024, 1024]: a mask in which each pixel value denotes the class id.
These are my config settings:
[property]
gpu-id=0
net-scale-factor=1
enable-dla=0
use-dla-core=0
model-color-format=0
model-engine-file=deeplab_fp16.engine
labelfile-path=labels_segment.txt
network-mode=2
num-detected-classes=2
gie-unique-id=1
network-type=2
cluster-mode=4
maintain-aspect-ratio=1
output-tensor-meta=1
Are there any additional steps needed to integrate a custom segmentation model into DeepStream? Is there a way to simply obtain the raw neural network output? I need the raw data for further processing downstream. I didn't find any useful demos for segmentation models except the outdated SegNet demo.
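Since output-tensor-meta=1 is set, the raw output should also be attached as NVDSINFER_TENSOR_OUTPUT_META user meta. This is roughly how I would try to read it, based on the deepstream-ssd-parser sample and forum posts (untested on my model; the float32 dtype and the 1024x1024 layout are assumptions about my output layer, and get_raw_mask is just a helper name):

import ctypes
import numpy as np
import pyds

def get_raw_mask(user_meta):
    # only present when output-tensor-meta=1 is set on nvinfer
    if user_meta.base_meta.meta_type != pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
        return None
    tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
    for i in range(tensor_meta.num_output_layers):
        layer = pyds.get_nvds_LayerInfo(tensor_meta, i)
        # assumption: a single float32 output layer with 1024*1024 elements
        ptr = ctypes.cast(pyds.get_ptr(layer.buffer), ctypes.POINTER(ctypes.c_float))
        return np.ctypeslib.as_array(ptr, shape=(1024, 1024)).copy()
    return None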
I confirmed that the model output is correct by performing inference using the Python TensorRT library.
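For reference, my standalone check looks roughly like this (pycuda-based; the engine path, the input preprocessing and the int32 output dtype are specific to my setup, and the binding order input-then-output is assumed):

import numpy as np
import pycuda.autoinit  # noqa: F401  (creates the CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def run_engine(engine_path, input_chw):
    # input_chw: preprocessed float32 array matching the engine's input binding
    with open(engine_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    with engine.create_execution_context() as context:
        h_in = np.ascontiguousarray(input_chw, dtype=np.float32)
        # output: class-id mask; shape/dtype assumed from my export, adjust as needed
        h_out = np.empty((1, 1024, 1024), dtype=np.int32)
        d_in = cuda.mem_alloc(h_in.nbytes)
        d_out = cuda.mem_alloc(h_out.nbytes)
        cuda.memcpy_htod(d_in, h_in)
        context.execute_v2(bindings=[int(d_in), int(d_out)])
        cuda.memcpy_dtoh(h_out, d_out)
    return h_out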
Edit: I investigated further and found that if I add another dimension to the network output, I can read some buffer content, but only when using the C demo as a base (the Python version still fails with the same error). The old segmentation test app (UNet) uses output dimensions of Classes x Height x Width, so I added a dimension to my network output, which was previously only Height x Width. When I read the NvDsInferSegmentationMeta struct, the width, height and classes fields all match my model, but the buffer content itself is very different from the output I get when running the TensorRT engine directly.
Here is my C osd_sink_pad_buffer_probe code:
static GstPadProbeReturn
osd_sink_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info,
    gpointer u_data)
{
  GstBuffer *buf = (GstBuffer *) info->data;
  guint num_rects = 0;
  NvDsObjectMeta *obj_meta = NULL;
  guint vehicle_count = 0;
  guint person_count = 0;
  NvDsMetaList *l_frame = NULL;
  NvDsMetaList *l_obj = NULL;
  NvDsDisplayMeta *display_meta = NULL;
  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
  gchar *user_meta_data = NULL;

  for (l_frame = batch_meta->frame_meta_list; l_frame != NULL;
      l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) (l_frame->data);
    int offset = 0;

    for (l_obj = frame_meta->frame_user_meta_list; l_obj != NULL;
        l_obj = l_obj->next) {
      NvDsUserMeta *user_meta = (NvDsUserMeta *) (l_obj->data);

      if (user_meta->base_meta.meta_type == NVDSINFER_SEGMENTATION_META) {
        //g_print ("Segmentation meta data found\n");
        NvDsInferSegmentationMeta *user_seg_data =
            (NvDsInferSegmentationMeta *) (user_meta->user_meta_data);
        guint classes = user_seg_data->classes;
        guint width = user_seg_data->width;
        guint height = user_seg_data->height;
        gint *class_map = user_seg_data->class_map;
        gfloat *class_probs = user_seg_data->class_probabilities_map;
        g_print ("classes = %d, width = %d, height = %d, ", classes, width, height);

        /* map class ids to pixel values: class 0 -> 255, background (-1) -> 0 */
        gint image[1024][1024];
        for (int i = 0; i < height; i++) {
          for (int j = 0; j < width; j++) {
            if (*(class_map + (width * i + j)) == 0) {
              image[i][j] = 255;
            } else if (*(class_map + (width * i + j)) == -1) {
              image[i][j] = 0;
            }
          }
        }
      }
    }
  }

  return GST_PAD_PROBE_OK;
}
My DeepStream output looks like this:
But the output from running my TensorRT engine directly (without DeepStream) looks like this: