I had a problem when working with my simple DeepStream app (streammux → nvpreprocess → pose_engine_nvinfer → my_custom_pose_decode_mask_to_keypoint_plugin → …)
When I use the roi param with full-HD resolution, the output is good, but if I change the roi param to another resolution, the output looks off: the bounding boxes are not drawn correctly.
I have a question. Is the instance mask scaled to input resolution or to ROI resolution? Could you guys give me some info?
What are the model's input and output? There is a PeopleSegNet model (peoplesegnet); it outputs bounding-box coordinates and a segmentation mask for each detected person in the input image. Why is there no segmentation mask in your pictures?
About "Is the instance mask scaled to input resolution or to ROI resolution": from the code in attach_metadata_detector, which is open source, the bbox is converted to coordinates in the original frame, but the mask does not get this conversion.
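To illustrate the bbox conversion described above, here is a minimal sketch of the kind of mapping attach_metadata_detector performs: undo the network-input scaling, then shift by the ROI origin. The function and parameter names (`scale_x`, `roi_left`, etc.) are illustrative, not the actual DeepStream struct fields, and this assumes simple scaling with no letterboxing/padding.

```python
# Hypothetical sketch of an ROI -> original-frame bbox conversion,
# in the spirit of what attach_metadata_detector does for detections.
# scale_x = net_input_width / roi_width, scale_y = net_input_height / roi_height.
def bbox_roi_to_frame(bbox, scale_x, scale_y, roi_left, roi_top):
    """bbox = (left, top, width, height) in network-input coordinates."""
    left, top, width, height = bbox
    return (
        left / scale_x + roi_left,   # undo network-input scaling, then
        top / scale_y + roi_top,     # shift by the ROI origin in the frame
        width / scale_x,             # sizes only need the scale undone
        height / scale_y,
    )

# Example: ROI at (100, 50) scaled down by 2x into the network input.
print(bbox_roi_to_frame((32, 18, 64, 36), 0.5, 0.5, 100, 50))
```

Note that the mask, by contrast, stays in network-input (ROI-relative) coordinates because this step is skipped for it.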
My model output only has CIF and CAF; there are no bounding-box coordinates. However, I built a custom plugin to convert CIF and CAF to keypoints and then construct the bounding-box coordinates from those keypoints.
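For reference, constructing a bounding box from decoded keypoints usually amounts to taking the min/max extents, optionally with a margin. This is a generic sketch, not the poster's actual plugin code:

```python
def keypoints_to_bbox(keypoints, margin=0.0):
    """keypoints: list of (x, y) tuples.
    Returns (left, top, width, height) enclosing all keypoints,
    expanded by an optional margin on every side."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    left = min(xs) - margin
    top = min(ys) - margin
    width = (max(xs) + margin) - left
    height = (max(ys) + margin) - top
    return (left, top, width, height)

print(keypoints_to_bbox([(10, 20), (30, 5), (25, 40)]))
```

Note that a bbox built this way inherits the keypoints' coordinate space, so if the keypoints are in ROI/network-input coordinates, the bbox still needs the ROI-to-frame conversion afterwards.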
Thanks for your confirmation. I debugged nvinfer and also confirmed that the mask inferred on the ROI is not converted to original-frame coordinates. In my case, what is the solution? Should I write the conversion code myself? Is there any example code in the DeepStream source that converts a model's ROI mask output to original-frame coordinates? Thanks
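If you end up writing the mask conversion yourself, the mapping per mask pixel is the same affine transform as for the bbox: scale from the network-input grid up to the ROI size, then offset by the ROI origin in the frame. A minimal sketch, assuming the mask covers the whole ROI with no letterboxing (all names here are hypothetical, not DeepStream API):

```python
def mask_point_to_frame(mx, my, net_w, net_h, roi):
    """Map a point (mx, my) on a net_w x net_h mask grid into
    original-frame coordinates, where roi = (left, top, width, height)
    is the processed ROI expressed in frame coordinates."""
    roi_left, roi_top, roi_w, roi_h = roi
    fx = roi_left + mx * roi_w / net_w   # scale up to ROI size, shift by ROI x
    fy = roi_top + my * roi_h / net_h    # same for y
    return (fx, fy)

# Example: 320x180 network input, ROI of 960x540 placed at (100, 50).
print(mask_point_to_frame(160, 90, 320, 180, (100, 50, 960, 540)))
```

For a full raster mask you would typically not loop per pixel; instead, resize the mask to the ROI size (e.g. nearest-neighbor for binary masks) and composite it into the frame at the ROI offset.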