When I try to deploy the yolov5s-seg instance segmentation model in DeepStream, the target's mask is either not displayed or displayed incompletely.
The image size is 1920×1080 and the network input size is 640×640. Through debugging, we found that the mask is not rescaled back to the original image size. Therefore, in the post-processing I restore the mask to the original image size; the mask data consists of values between 0 and 1.
In addition, the following tests confirm that there is no problem with the incoming data.
The gray mask pictures are created during post-processing and are used only to check whether the mask is correct. However, the mask in the video file generated after inference is still wrong.
You need to scale the output back according to the model preprocessing. How did you do the preprocessing scaling for your yolov5s-seg model? Is there a "keep-aspect-ratio" operation? What is the size of the mask output matrix (640×640 or another size)?
The image size is 1920×1080 and the network input is 640×640, with the "keep-aspect-ratio" operation. The mask output size should be 640×640, but I have rescaled the mask to 1920×1080.
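For clarity, here is a minimal sketch of the letterbox geometry implied by "keep-aspect-ratio": the 1920×1080 frame is scaled by a single factor and the rest of the 640×640 network input is padding. The function name `letterbox_params` is hypothetical, not part of DeepStream or YOLOv5.

```python
# Hypothetical helper: compute the scale and padding used when a
# source frame is letterboxed into the network input while keeping
# the aspect ratio.
def letterbox_params(src_w, src_h, net_w=640, net_h=640):
    scale = min(net_w / src_w, net_h / src_h)  # one factor for both axes
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (net_w - new_w) // 2   # left/right padding in pixels
    pad_y = (net_h - new_h) // 2   # top/bottom padding in pixels
    return scale, pad_x, pad_y

scale, pad_x, pad_y = letterbox_params(1920, 1080)
# For 1920x1080 -> 640x640: scale = 1/3, pad_x = 0, pad_y = 140,
# so rows 0..139 and 500..639 of the 640x640 mask are padding,
# not image content.
```

This is why resizing the full 640×640 mask straight to 1920×1080 stretches padding into the displayed mask.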
[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=1920
height=1080
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0
Suppose your final display resolution is 1920×1080. Since your model accepts the padded, scaled image as input, the output is also padded. So you need to scale only the valid part of the output (removing the padding) to the display resolution.
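A dependency-free sketch of that step, assuming a symmetric letterbox with 140 rows of top/bottom padding as computed for 1920×1080 → 640×640 (the helper name `unpad_and_resize` is hypothetical; in a real pipeline you would likely use `cv2.resize` on the cropped region instead of the nearest-neighbor indexing used here):

```python
import numpy as np

def unpad_and_resize(mask, pad_x, pad_y, out_w, out_h):
    """Crop the letterbox padding off a square network-output mask,
    then resize the valid region to the display resolution using
    nearest-neighbor index mapping."""
    net_h, net_w = mask.shape
    valid = mask[pad_y:net_h - pad_y, pad_x:net_w - pad_x]
    ys = np.arange(out_h) * valid.shape[0] // out_h  # row lookup table
    xs = np.arange(out_w) * valid.shape[1] // out_w  # column lookup table
    return valid[np.ix_(ys, xs)]

# Toy check: mark the whole valid region (rows 140..499) as object.
mask = np.zeros((640, 640), dtype=np.float32)
mask[140:500, :] = 1.0
full = unpad_and_resize(mask, pad_x=0, pad_y=140, out_w=1920, out_h=1080)
# full has shape (1080, 1920) and is all 1.0: no padding leaked in.
```

If you instead resize the whole 640×640 mask to 1920×1080, the 140-pixel padding bands get stretched into the frame and the mask appears shifted and vertically squashed, which matches an "incomplete mask" symptom.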
We just see the bus object. The gray picture is saved in the NvDsInferParseYolov5Seg function; the RGB picture is output by the pipeline into the video. Both come from the first frame of the input video. I think the mask data is changed somewhere, but I can't locate it right now.
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.